by Jerin Paul

How I developed a C.N.N. that recognizes emotions and broke into the Kaggle top 10

A baby starts to recognize its parents’ faces when it is just a couple of weeks old. As it grows, this innate ability improves. By the time it is a few months old, it starts to display social cues and is able to understand basic emotions like a smile.

Thanks to millions of years of evolution, we are able to understand each other without using a single word. One look is all it takes to tell whether a person is crestfallen or elated. Well, I tried teaching computers to do just that. This article is a detailed account of how the whole experiment turned out. Follow along as we recreate the network.

Cut to the chase, Paul, and please give me the code. Don’t want fancy reading? No problem. You can find the code for this project here.

A Brief Introduction

“The best and most beautiful things in the world cannot be seen or even touched. They must be felt with the heart” ― Helen Keller

Helen Keller described the essence of human emotions excellently in the quote above. The ability to read emotions was once reserved for living beings; that is no longer the case. Machine learning is catching on at a mind-numbing pace. The arrival of convolutional neural networks was a breakthrough that changed the way computers “look” at the world.

Facial expressions are nothing more than the arrangement of facial muscles to convey a certain emotional state to the observer. Emotions can be divided into seven broad categories: Anger, Disgust, Fear, Happiness, Sadness, Surprise, and Neutral. In this M.L. project, we will train a model to differentiate between these.

We will train a convolutional neural network on the FER2013 dataset and tune its hyper-parameters to improve the model. We will train it on Google Colab, a research project created to disseminate M.L. education. Colab allocates you resources such as a G.P.U. or T.P.U., which can be used to train your model faster. The best part is that it is completely free.

Peek at the data

We will start by uploading the FER2013.csv file to our drive so that we can access it from Google Colab. There are 35,887 images in this dataset, classified into seven emotions. The data file contains three columns: Class, Image data, and Usage.

Class: a digit from 0 to 6 that represents the emotion depicted in the corresponding picture. Each emotion is mapped to an integer as shown below.

0 - 'Angry'
1 - 'Disgust'
2 - 'Fear'
3 - 'Happy'
4 - 'Sad'
5 - 'Surprise'
6 - 'Neutral'

Image data: a string of 2,304 numbers, which are the pixel intensity values of the image. We will cover this in detail in a while.

Usage: denotes whether the corresponding row should be used to train the network or to test it.
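
To make the three columns concrete, here is a minimal sketch of how a single row could be decoded. The `EMOTIONS` dictionary and the `parse_row` helper are illustrative names that do not appear in the original code, the sample row is synthetic (every pixel is zero) rather than taken from the dataset, and the `Training` usage value is an assumption about how the file labels its training rows.

```python
# Map each class digit to its emotion, following the list above.
EMOTIONS = {
    0: 'Angry', 1: 'Disgust', 2: 'Fear', 3: 'Happy',
    4: 'Sad', 5: 'Surprise', 6: 'Neutral',
}

def parse_row(row):
    """Split one CSV row into (emotion name, pixel list, usage)."""
    label, pixels, usage = row.strip().split(',')
    pixel_values = [int(p) for p in pixels.split()]
    return EMOTIONS[int(label)], pixel_values, usage

# Synthetic row: class 3, 2,304 black pixels, assumed usage 'Training'.
sample_row = '3,' + ' '.join(['0'] * 2304) + ',Training\n'
emotion, pixel_values, usage = parse_row(sample_row)
print(emotion, len(pixel_values), usage)  # Happy 2304 Training
```

The 2,304 values are exactly 48 x 48, which is why each row can later be reshaped into a square image.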

Decomposing an image.

As we all know, images are composed of pixels, and these pixels are nothing more than numbers. Colored images have three color channels (red, green, and blue), and each channel is represented by a grid (a 2-dimensional array). Each cell in the grid stores a number between 0 and 255 which denotes the intensity of that cell.

When these three channels are stacked together, we get the image that we see.
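
As a rough illustration of that idea, the NumPy sketch below builds a single-channel gray-scale grid and a three-channel color image. The 48x48 size matches the dataset images, but the pixel values and variable names are made up for the example.

```python
import numpy as np

# A gray-scale image is a single 2-D grid of intensities (0 to 255).
gray = np.zeros((48, 48), dtype=np.uint8)

# A color image stacks three such grids: red, green, and blue.
red = np.full((48, 48), 255, dtype=np.uint8)
green = np.zeros((48, 48), dtype=np.uint8)
blue = np.zeros((48, 48), dtype=np.uint8)
rgb = np.stack([red, green, blue], axis=-1)

print(gray.shape)  # (48, 48)
print(rgb.shape)   # (48, 48, 3)
print(rgb[0, 0])   # the top-left pixel: full red, no green, no blue
```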

Importing Necessary Libraries

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.models import Sequential  # Initialise our neural network model as a sequential network
from keras.layers import Conv2D  # Convolution operation
from keras.layers import BatchNormalization  # Normalizes the outputs of the previous layer
from keras.regularizers import l2
from keras.layers import Activation  # Applies activation function
from keras.layers import Dropout  # Prevents overfitting by randomly converting few outputs to zero
from keras.layers import MaxPooling2D  # Maxpooling function
from keras.layers import Flatten  # Converting 2D arrays into a 1D linear vector
from keras.layers import Dense  # Regular fully connected neural network
from keras import optimizers
from keras.callbacks import ReduceLROnPlateau, EarlyStopping, TensorBoard, ModelCheckpoint
from sklearn.metrics import accuracy_score

Define Data Loading Mechanism

Now, we will define the load_data() function which will efficiently parse the data file and extract necessary data and then convert it into a usable image format.

All the images in our dataset are 48x48 in dimension. Since these images are gray-scale, there is only one channel. We will extract the image data and rearrange it into a 48x48 array. We then convert it to unsigned integers and divide by 255 to normalize the data. 255 is the maximum possible value of a single cell, so dividing every element by 255 ensures that all our values range between 0 and 1.

We will check the Usage column and store the data in separate lists, one for training the network and the other for testing it.

def load_data(dataset_path):
    data = []
    test_data = []
    test_labels = []
    labels = []

    with open(dataset_path, 'r') as file:
        for line_no, line in enumerate(file.readlines()):
            # Line 0 is the header row; the samples follow it
            if 0 < line_no <= 35887:
                curr_class, line, set_type = line.split(',')
                # Rebuild the 48x48 image from the space-separated pixel string
                image_data = np.asarray([int(x) for x in line.split()]).reshape(48, 48)
                # Scale the intensities from 0..255 down to 0..1
                image_data = image_data.astype(np.uint8) / 255.0

                # Keep the private test set apart; everything else is training data
                if set_type.strip() == 'PrivateTest':
                    test_data.append(image_data)
                    test_labels.append(int(curr_class))
                else:
                    data.append(image_data)
                    labels.append(int(curr_class))

    # Add the channel axis and one-hot encode the labels
    test_data = np.expand_dims(test_data, -1)
    test_labels = to_categorical(test_labels, num_classes=7)
    data = np.expand_dims(data, -1)
    labels = to_categorical(labels, num_classes=7)

    return np.array(data), np.array(labels), np.array(test_data), np.array(test_labels)

Once our data is segregated, we will expand the dimensions of both the testing and training data by one to accommodate the channel. Then, we will one-hot encode all the labels using the to_categorical() function and return all the lists as numpy arrays.
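
The sketch below shows those two steps on toy data. It uses plain NumPy (`np.eye`) as a stand-in for `to_categorical()`, so it illustrates the resulting shapes and values rather than the Keras call itself; the three blank images and their labels are invented for the example.

```python
import numpy as np

# Three fake 48x48 gray-scale images and their class labels (illustrative).
images = np.zeros((3, 48, 48))
labels = np.array([0, 3, 6])

# Add a trailing channel axis: (3, 48, 48) -> (3, 48, 48, 1).
images = np.expand_dims(images, -1)

# One-hot encode: row i has a 1 in column labels[i] and 0 elsewhere,
# the same values that to_categorical(labels, num_classes=7) yields.
one_hot = np.eye(7)[labels]

print(images.shape)  # (3, 48, 48, 1)
print(one_hot[1])    # [0. 0. 0. 1. 0. 0. 0.]
```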

We will load the data by calling the load_data() function.

dataset_path = "/content/gdrive/My Drive/Colab Notebooks/Emotion Recognition/Data/fer2013.csv"
train_data, train_labels, test_data, test_labels = load_data(dataset_path)
print("Number of images in Training set:", len(train_data))
print("Number of images in Test set:", len(test_data))

Our data is loaded and now let us get to the best part, defining the network.

Defining the model.

We will use Keras to create a Sequential convolutional network, which means that our neural network will be a linear stack of layers. This network will have the following components:

  1. Convolutional Layers: These layers are the building blocks of our network. They compute the dot product between their weights and the small regions of the input to which they are linked. This is how these layers learn certain features from the images.
  2. Activation functions: These functions are applied to the outputs of the layers in the network. In this project, we will use two of them: ReLU and Softmax.
  3. Pooling Layers: These layers downsample their input along the spatial dimensions (width and height). This reduces the amount of spatial data and the processing power that is required.
  4. Dense layers: These layers are present at the end of a C.N.N. They take in all the feature data generated by the convolution layers and do the decision making.
  5. Dropout Layers: These layers randomly turn off a few neurons in the network to prevent overfitting.
  6. Batch Normalization: This normalizes the output of a previous activation layer by subtracting the batch mean and dividing by the batch standard deviation. This speeds up the training process.

# Create the empty sequential model before adding layers to it
model = Sequential()

# Block 1: 64 filters
model.add(Conv2D(64, (3, 3), activation='relu', input_shape=(48, 48, 1), kernel_regularizer=l2(0.01)))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.5))

# Block 2: 128 filters
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# Block 3: 256 filters
model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# Block 4: 512 filters
model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(512, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))

# Classifier: flatten the feature maps and narrow down to the 7 emotions
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
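
To see how the convolution and pooling layers shrink a 48x48 input on its way to the dense layers, the spatial sizes can be traced by hand. This is a back-of-the-envelope sketch in plain Python rather than output from Keras; `conv_out` and `pool_out` are hypothetical helper names, and the arithmetic assumes stride-1 convolutions and non-overlapping 2x2 pooling, as in the layer definitions above.

```python
def conv_out(size, kernel=3, padding='valid'):
    """Spatial size after a stride-1 convolution."""
    return size if padding == 'same' else size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (stride = pool size)."""
    return size // pool

size = 48
size = conv_out(size, padding='valid')  # the first Conv2D has no padding: 46
sizes = []
for _ in range(4):                      # each of the four blocks ends in MaxPooling2D
    size = pool_out(size)               # 'same' convolutions leave the size unchanged
    sizes.append(size)

flattened = size * size * 512           # the last block has 512 filters
print(sizes)      # [23, 11, 5, 2]
print(flattened)  # 2048
```

Under these assumptions, Flatten() hands a vector of 2,048 values to the first Dense layer.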

We will compile the network using the Adam optimizer with a variable learning rate. Since we are dealing with a classification problem that involves multiple categories, we will use categorical_crossentropy as our loss function.
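
For intuition about what this loss measures, here is a small plain-Python sketch of categorical cross-entropy for a single sample. It is a simplified stand-in for the Keras loss, not its implementation, and the two probability vectors are hypothetical softmax outputs invented for the example.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# One-hot target for class 3 ('Happy') and two hypothetical predictions.
target = [0, 0, 0, 1, 0, 0, 0]
confident = [0.01, 0.01, 0.01, 0.94, 0.01, 0.01, 0.01]
unsure = [0.15, 0.15, 0.15, 0.10, 0.15, 0.15, 0.15]

loss_confident = categorical_crossentropy(target, confident)
loss_unsure = categorical_crossentropy(target, unsure)
print(round(loss_confident, 3))  # 0.062
print(round(loss_unsure, 3))     # 2.303
```

The loss is small when the network puts most of its probability on the correct class and grows quickly as that probability drops.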

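# learning_rate is one of the hyper-parameters defined in the training section (0.001).
# Newer Keras versions expect learning_rate= instead of lr= here.
adam = optimizers.Adam(lr = learning_rate)
model.compile(optimizer = adam, loss = 'categorical_crossentropy', metrics = ['accuracy'])
model.summary()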

Callback functions

Callback functions are functions that are called after every epoch during the training process. We will be using the following callback functions:

  1. ReduceLROnPlateau: Training a neural network can plateau at times, and we stop seeing any progress during this stage. This function monitors the validation loss for signs of a plateau and multiplies the learning rate by the specified factor if a plateau is detected.
lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=0.9, patience=3)
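
The behaviour of this callback can be approximated with a few lines of plain Python. The sketch below is a simplified model of the idea, not the Keras implementation (it ignores options such as cooldown and a minimum learning rate); `simulate_lr` is a hypothetical helper and the validation losses are invented.

```python
def simulate_lr(val_losses, lr=0.001, factor=0.9, patience=3):
    """Return the learning rate in effect after each epoch."""
    best = float('inf')
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0      # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:      # plateau: shrink the learning rate
                lr *= factor
                wait = 0
        history.append(lr)
    return history

# Hypothetical validation losses: two improvements, then a plateau.
losses = [1.8, 1.6, 1.6, 1.61, 1.62, 1.5]
history = simulate_lr(losses)
print([round(lr, 6) for lr in history])
# [0.001, 0.001, 0.001, 0.001, 0.0009, 0.0009]
```

With factor=0.9 each reduction is gentle, so several plateaus are needed before the learning rate changes substantially.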

2. EarlyStopping: At times, progress stalls while training a neural network and we stop seeing any improvement in the monitored quantity (the validation accuracy, in this case). Most of the time, this means that the network won’t converge any further and there is no point in continuing the training process. This function waits for a specified number of epochs and terminates the training if the monitored quantity has not improved.

# Keras 2.3 and later log this metric as 'val_accuracy' rather than 'val_acc'.
early_stopper = EarlyStopping(monitor='val_acc', min_delta=0, patience=6, mode='auto')
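
The patience logic can be sketched in plain Python as follows. This is a simplified illustration of the idea rather than the Keras source; `stopping_epoch` is a hypothetical helper and the accuracy values are invented.

```python
def stopping_epoch(val_accuracies, patience=6, min_delta=0):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('-inf')
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc - best > min_delta:
            best, wait = acc, 0       # new best: reset the counter
        else:
            wait += 1
            if wait >= patience:      # no progress for `patience` epochs
                return epoch
    return None

# Hypothetical accuracies: progress for three epochs, then a stall.
accs = [0.40, 0.52, 0.58, 0.58, 0.57, 0.58, 0.56, 0.58, 0.57, 0.60]
stopped_at = stopping_epoch(accs)
print(stopped_at)  # 9
```

In this toy run, training ends at epoch 9, so the improvement at epoch 10 is never seen. That is the trade-off a patience value controls.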

3. ModelCheckpoint: Training neural networks generally takes a lot of time, and anything can happen during this period that results in the loss of all the variables and weights. Creating checkpoints is a good habit because it saves your model as training progresses; with save_best_only=True, as used here, a checkpoint is written whenever the validation loss improves. If your training stops, you can load the checkpoint and resume the process.

checkpointer = ModelCheckpoint('/content/gdrive/My Drive/Colab Notebooks/Emotion Recognition/Model/weights.hd5', monitor='val_loss', verbose=1, save_best_only=True)

Time to train

All our hard work is about to be put to the test. But before we fit the model, let us define some hyper-parameters.

epochs = 100
batch_size = 64
learning_rate = 0.001

Our data will pass through the model up to 100 times, in batches of 64 images (early stopping may end the training sooner). We will use 20% of our training data to validate the model after every epoch.
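
A quick calculation shows what these settings mean in terms of samples and batches. The row counts below are the commonly cited sizes of the FER2013 Training and PublicTest splits; they are an assumption here, since the article does not state them.

```python
import math

# Assumed sample counts for FER2013 (not stated in the article).
training_rows = 28709
public_test_rows = 3589

# load_data() keeps every row that is not 'PrivateTest' for training.
samples = training_rows + public_test_rows
validation_split = 0.2
batch_size = 64

# Keras holds out the last fraction of the samples for validation.
train_samples = int(samples * (1 - validation_split))
val_samples = samples - train_samples
steps_per_epoch = math.ceil(train_samples / batch_size)

print(train_samples, val_samples, steps_per_epoch)  # 25838 6460 404
```

Note that validation_split takes the held-out samples from the end of the arrays before any shuffling, so the validation set is the last 20% of the rows in file order.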

model.fit(
    train_data,
    train_labels,
    epochs = epochs,
    batch_size = batch_size,
    validation_split = 0.2,
    shuffle = True,
    callbacks=[lr_reducer, checkpointer, early_stopper]
)

Now that the network is being trained, I suggest that you go and finish that book you started or go for a run. It took me about an hour on Google Colab.

Test the model

Remember the private set we stored separately? That was for this very moment. This is the moment of truth and this is where we will reap the fruit of our labor.

predicted_test_labels = np.argmax(model.predict(test_data), axis=1)
test_labels = np.argmax(test_labels, axis=1)
print("Accuracy score = ", accuracy_score(test_labels, predicted_test_labels))
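
To make the argmax step less abstract, the sketch below runs the same computation on four made-up softmax outputs. It computes the accuracy directly with NumPy instead of calling accuracy_score, and all of the numbers are invented for illustration.

```python
import numpy as np

# Hypothetical softmax outputs for four images (each row sums to 1).
probabilities = np.array([
    [0.70, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],  # predicts 0 (Angry)
    [0.05, 0.05, 0.05, 0.70, 0.05, 0.05, 0.05],  # predicts 3 (Happy)
    [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.70],  # predicts 6 (Neutral)
    [0.05, 0.05, 0.60, 0.10, 0.10, 0.05, 0.05],  # predicts 2 (Fear)
])
true_labels = np.array([0, 3, 6, 4])  # the last image is really 'Sad'

# argmax picks the most probable class for each row.
predicted = np.argmax(probabilities, axis=1)

# Accuracy is the fraction of predictions that match the true labels.
accuracy = float(np.mean(predicted == true_labels))
print(predicted.tolist(), accuracy)  # [0, 3, 6, 2] 0.75
```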

Well, the results came back and we scored 63.167%. At first glance that isn’t much, but it put us in ninth position in the Facial Emotion Recognition Kaggle competition.

Now, pat yourself on the back and start brainstorming about the ways in which you can improve this model. We can use better hyper-parameters or create a different network architecture altogether to achieve higher accuracies.

Save the model

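Quickly save the model: write its architecture to a JSON file with model.to_json() and its weights to an HDF5 file with model.save_weights(). The model_from_json function from keras.models can rebuild the model from that JSON later.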

from keras.models import model_from_json

model_json = model.to_json()
with open("/content/gdrive/My Drive/Colab Notebooks/Emotion Recognition/FERmodel.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
model.save_weights("/content/gdrive/My Drive/Colab Notebooks/Emotion Recognition/FERmodel.h5")
print("Saved model to disk")
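
Restoring the model later is the mirror image of saving it. The helper below is a minimal sketch under the assumption that the two files were written as shown above; `load_saved_model` is a hypothetical name, not something from the original project.

```python
def load_saved_model(json_path, weights_path):
    """Rebuild a model from its JSON architecture and HDF5 weights."""
    # Imported inside the function so that the helper can be defined
    # before Keras itself has been loaded.
    from keras.models import model_from_json

    with open(json_path, 'r') as json_file:
        model = model_from_json(json_file.read())
    model.load_weights(weights_path)
    return model
```

Calling it with the FERmodel.json and FERmodel.h5 paths used above should return a model that is ready for predict(). The optimizer state is not part of these files, so the model has to be compiled again before any further training.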

Wrapping it all up

We started off by defining a loading mechanism and loading the images. Then we created a training set and a testing set. Then we defined a fine model and defined a few callback functions. We went over the basic components of a convolutional neural network and then we trained our network.

I extended this project by creating a Python application that is able to detect faces and recognize their emotions in real time. That will be covered in a later post.

We just accomplished something that was part of science fiction a few decades ago. Yet there is a lot left to learn. The internet provides us with a plethora of information to constantly create and learn. May the learning never cease.

Translated from: https://www.freecodecamp.org/news/facial-emotion-recognition-develop-a-c-n-n-and-break-into-kaggle-top-10-f618c024faa7/
