
Deep Learning

According to dogtime.com, there are 266 different dog breeds, and the number alone makes telling them apart feel daunting. Most people know only about 5–10 breeds, because you won’t find a chapter called “266 Different Dog Breeds” in a bachelor’s curriculum.

Overview

The main aim of this project is to build an algorithm to classify the different Dog Breeds from the dataset.


This seems like a simple task, but from a machine learning perspective it is not! The images are in random order, the dogs appear at arbitrary positions in the frame, the photos were shot under different lighting conditions, and no preprocessing has been done on the data. It is just a dataset of plain dog pictures.

So, the first step is to give the dataset a look.


Environment and tools

  • scikit-learn


  • Keras


  • numpy


  • Pandas


  • matplotlib


Data

The dataset used for this project is the Stanford Dogs Dataset. It contains a total of 20,580 images of 120 different dog breeds from around the world, and was built using images and annotations from ImageNet for the task of fine-grained image categorization.

Importing Libraries

    import os
    import sys
    import keras
    import tarfile
    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt
    from keras.models import Model, Sequential
    from sklearn.preprocessing import LabelBinarizer
    from keras.preprocessing.image import ImageDataGenerator
    from keras.layers import Add, Dropout, Flatten, Dense, Activation

Data Preprocessing

I found 5 directories to be unusable and hence didn’t use them. So, I imported a total of 115 breeds.

    import cv2

    BASEPATH = './Images'
    LABELS = set()
    paths = []

    for d in os.listdir(BASEPATH):
        LABELS.add(d)
        paths.append((BASEPATH + '/' + d, d))

    # Resize to 224x224 and convert from OpenCV's BGR order to RGB
    def load_and_preprocess_image(path):
        image = cv2.imread(path)
        image = cv2.resize(image, (224, 224))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        return image

    X, y = [], []
    i = 0
    for path, label in paths:
        i += 1
        # Faulty directories
        if i == 18 or i == 23 or i == 41 or i == 49 or i == 90:
            continue
        if path == "./Images/.DS_Store":
            continue
        for image_path in os.listdir(path):
            image = load_and_preprocess_image(path + "/" + image_path)
            X.append(image)
            y.append(label)
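Skipping folders by their position in the os.listdir output is fragile, because that listing comes back in arbitrary order. A minimal sketch of a name-based alternative follows; the helper name list_breed_dirs and its skip argument are hypothetical and not part of the original code:

```python
import os

def list_breed_dirs(base_path, skip=()):
    """Return (path, label) pairs for every usable breed folder, sorted by name.

    `skip` is a hypothetical collection of folder names to leave out.
    """
    pairs = []
    for name in sorted(os.listdir(base_path)):
        full = os.path.join(base_path, name)
        # Ignore hidden files such as .DS_Store and anything that is not a folder.
        if name.startswith('.') or not os.path.isdir(full):
            continue
        if name in skip:
            continue
        pairs.append((full, name))
    return pairs
```

With something like this, the faulty folders could be listed by name once, and the result would not depend on how the operating system happens to order the directory.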

The folder names follow the pattern ‘n8725563753-Husky’, so we need to clean them up and keep only the ‘Husky’ part.

    Y = []
    # Clean the directory names to get the target labels
    for i in y:
        # Split on the first hyphen only, so breed names that contain hyphens stay intact
        Y.append(i.split('-', 1)[1])
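Some breed names contain hyphens themselves, so where the split happens matters. A small sketch of the difference, assuming a hypothetical helper clean_label and an illustrative folder name with a made-up ID prefix:

```python
def clean_label(folder_name):
    """Strip the ID prefix, keeping everything after the first hyphen."""
    # maxsplit=1 keeps hyphens that belong to the breed name itself.
    return folder_name.split('-', 1)[1]

hyphenated = 'n0000000000-soft-coated_wheaten_terrier'  # illustrative name

print(clean_label('n8725563753-Husky'))
print(clean_label(hyphenated))
# Splitting on every hyphen would keep only the first fragment:
print(hyphenated.split('-')[1])
```

The last line shows how an unrestricted split would shorten the label to just ‘soft’, which could silently merge distinct breeds.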

Label Binarizer

LabelBinarizer comes from sklearn.preprocessing and converts string labels into a one-hot binary representation. Why are we using it here? We can’t use ‘Husky’ as a target in a model; we need to convert it into a usable numeric type.

    encoder = LabelBinarizer()
    # Binarize the cleaned labels in Y (not the raw folder names in y)
    Y = encoder.fit_transform(np.array(Y))
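To make the encoding concrete, here is a small sketch on three toy labels (the label values and the scores array are made up for illustration). It also shows inverse_transform, which maps a row of scores back to a class name:

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

labels = np.array(['Husky', 'Pug', 'Beagle', 'Husky'])
encoder = LabelBinarizer()
one_hot = encoder.fit_transform(labels)

print(encoder.classes_)   # classes are stored in sorted order
print(one_hot)            # one row per sample, one column per class

# inverse_transform picks the highest-scoring column in each row
# and returns the matching class name.
scores = np.array([[0.1, 0.7, 0.2]])
print(encoder.inverse_transform(scores))
```

Keeping the fitted encoder around is useful later, since it is the only record of which column corresponds to which breed.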

Splitting Data

We use train_test_split from sklearn.model_selection.

train_test_split is a function in scikit-learn’s model_selection module that splits data arrays into two subsets: one for training and one for testing. With this function, you don’t need to divide the dataset manually.

By default, train_test_split partitions the data randomly. You can also specify a random state to make the split reproducible.

    from sklearn.model_selection import train_test_split

    X = np.array(X)
    # Name the label splits Y_train / Y_test, as used by the training code
    x_train, x_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25, random_state=87)
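The effect of random_state can be checked on toy data. The sketch below also passes stratify, which is an optional addition not used in the original code; it keeps class proportions the same in both subsets:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_toy = np.arange(80).reshape(40, 2)       # 40 toy samples with 2 features
y_toy = np.array([0] * 20 + [1] * 20)      # two balanced toy classes

split_a = train_test_split(X_toy, y_toy, test_size=0.25, random_state=87, stratify=y_toy)
split_b = train_test_split(X_toy, y_toy, test_size=0.25, random_state=87, stratify=y_toy)
xa_train, xa_test, ya_train, ya_test = split_a

print(xa_train.shape, xa_test.shape)
# The same seed gives exactly the same test subset.
print(np.array_equal(split_a[1], split_b[1]))
# Stratification keeps the two classes balanced in the test subset.
print(int(ya_test.sum()), len(ya_test))
```

With many classes and only a few hundred images per class, a stratified split may give a more reliable test score, but that is a design choice rather than something the original post does.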

After this, we convert x_train and x_test to ‘float32’ and scale the pixel values to the range [0, 1].

    x_train = x_train.astype("float32") / 255.0
    x_test = x_test.astype("float32") / 255.0
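As a quick illustration of what this scaling does, here is a sketch on a tiny made-up image (the pixel values are arbitrary stand-ins for a real photo):

```python
import numpy as np

# A fake 2x2 RGB image with 8-bit pixel values.
image = np.array([[[0, 128, 255], [64, 64, 64]],
                  [[255, 255, 255], [10, 20, 30]]], dtype=np.uint8)

scaled = image.astype("float32") / 255.0
print(scaled.dtype, scaled.min(), scaled.max())
```

Converting before dividing matters: the result stays float32, which takes half the memory of NumPy’s default float64 for an array of this kind.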

Viewing the Data

These are the pictures our model will learn from.


Transfer Learning

Transfer learning could be a full topic on its own, so I’ll only scratch the surface here.

Transfer learning is a machine learning technique where a model trained on one task is repurposed for a second, related task.

Why do we use transfer learning? You don’t want to train a model with millions of parameters from scratch for every project. Instead, you take a pre-trained model and retrain only some of the layers to adapt it to your requirements.

    from keras.applications import inception_v3

    input_size = 224
    num_classes = 115

    inception_bottleneck = inception_v3.InceptionV3(weights='imagenet', include_top=False, pooling='avg')
    temp_train = inception_bottleneck.predict(x_train, batch_size=32, verbose=1)
    temp_test = inception_bottleneck.predict(x_test, batch_size=32, verbose=1)

    print('InceptionV3 train bottleneck features shape: {} size: {:,}'.format(temp_train.shape, temp_train.size))
    print('InceptionV3 test bottleneck features shape: {} size: {:,}'.format(temp_test.shape, temp_test.size))

We set the include_top parameter to False, which means we do not load the network’s final Dense (classification) layer; instead, we add our own layers to adapt the model to our dataset.
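With include_top=False and pooling='avg', the network returns its last convolutional feature map averaged over the spatial positions, giving one 2048-value vector per image. The NumPy sketch below shows only that pooling step; the random numbers stand in for real activations, and the 5x5 grid size is an assumption for 224x224 inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the last convolutional output: (batch, height, width, channels).
feature_maps = rng.random((4, 5, 5, 2048), dtype=np.float32)

# Global average pooling: average over the two spatial axes.
bottleneck = feature_maps.mean(axis=(1, 2))
print(bottleneck.shape)
```

These per-image vectors are what the post calls bottleneck features, and they are what the small classifier on top is trained on.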

Dense Layers

After this, we add 3 Dense layers with 1024, 512, and 115 units; the last one matches the number of classes.


    model = Sequential()
    model.add(Flatten())
    model.add(Dense(1024, activation='elu'))
    model.add(Dropout(0.45))
    model.add(Dense(512, activation='elu'))
    model.add(Dropout(0.35))
    model.add(Dense(num_classes, activation='softmax'))

Then, we compile this.


    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
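For readers unfamiliar with the loss named here, the following NumPy sketch shows what a softmax output and categorical cross-entropy compute on toy numbers. It is an illustration of the formulas, not Keras’s internal implementation, and the function names and the eps clipping value are choices made for this example:

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    # Mean over samples of -log(probability assigned to the true class).
    probs = np.clip(probs, eps, 1.0)
    return float(-(one_hot * np.log(probs)).sum(axis=1).mean())

logits = np.array([[2.0, 1.0, 0.1], [0.5, 0.5, 0.5]])
targets = np.array([[1, 0, 0], [0, 0, 1]])
probs = softmax(logits)

print(probs.sum(axis=1))   # each row sums to 1
print(categorical_crossentropy(targets, probs))
```

The loss is small when the model puts high probability on the true class, which is why one-hot targets are needed.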

Finally, we train the model.

    history = model.fit(temp_train, Y_train,
                        epochs=15,
                        batch_size=32,
                        validation_data=(temp_test, Y_test))

Notice that we used temp_train and temp_test here instead of x_train and x_test. These are the features already extracted by InceptionV3, so the Sequential model extends the pre-trained Inception network rather than learning from raw pixels from scratch.

Loss plots

    score = model.evaluate(temp_test, Y_test, verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], score[1] * 100))

    # Summarize history for accuracy
    plt.subplot(211)
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')

    # Summarize history for loss
    plt.subplot(212)
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')

    plt.subplots_adjust(right=3, top=3)
    plt.show()

Results & Conclusion

After all this, we reached 77.31% accuracy. Honestly, considering that there were 115 different classes, the model did a pretty good job.
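To put that number in perspective, uniform random guessing over 115 classes would be right less than 1% of the time. The sketch below computes that baseline and shows how accuracy is derived from one-hot targets and predicted probabilities; the toy targets and probabilities are made up, and the accuracy helper is a hypothetical name:

```python
import numpy as np

num_classes = 115
chance = 1.0 / num_classes
print("chance accuracy: {:.2%}".format(chance))

def accuracy(one_hot_targets, predicted_probs):
    # A prediction counts as correct when the highest score is on the true class.
    true_idx = one_hot_targets.argmax(axis=1)
    pred_idx = predicted_probs.argmax(axis=1)
    return float(np.mean(true_idx == pred_idx))

targets = np.eye(3)[[0, 1, 2, 1]]
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.7, 0.2]])
print(accuracy(targets, probs))
```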


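Visualizing results

    # Assumption: `results` holds the predicted breed names, decoded from the model output
    actual = encoder.inverse_transform(Y_test)
    results = encoder.inverse_transform(model.predict(temp_test))

    for i in range(9):
        plt.subplot(330 + 1 + i)
        plt.xlabel("Actual: " + actual[i] + ", Predicted: " + results[i])
        plt.imshow(x_test[i])
    plt.subplots_adjust(right=3, top=3)
    plt.show()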

Luckily, in this subset of the data there are no misclassifications. Hehe.

Improvements that can be made

I still think that adding one more Dense layer could make a difference, and preprocessing the data will surely help, but we’ll give that a shot later. :D

