Volume 3, Chapter 11: Vehicle Identification

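This chapter fine-tunes a pre-trained CNN with the mxnet library to recognize 164 vehicle make and model classes, reaching 96.54% rank-5 accuracy on the test set despite having very little training data.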

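1. Introduction to the Stanford Cars Dataset

The Stanford Cars dataset contains 16,185 images covering 196 classes of cars, but it suffers from extreme class imbalance: some makes and models are heavily over-represented (for example, Audi and BMW each have more than 1,000 data points, while Tesla has only 77 examples).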
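The imbalance described above can be checked by counting images per make. The following is a minimal sketch, assuming the exported CSV has a header row followed by `filename, make, model` columns; the sample rows are made up for illustration and are not taken from the real file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real data lives in complete_dataset.csv.
# Assumed column order: filename, make, model.
SAMPLE_CSV = """filename,make,model
car_ims/000001.jpg,Audi,TT
car_ims/000002.jpg,Audi,A5
car_ims/000003.jpg,BMW,X3
car_ims/000004.jpg,Tesla,Model S
car_ims/000005.jpg,Audi,TT
"""


def count_by_make(csv_file):
    """Return a Counter mapping each make to its number of images."""
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    return Counter(row[1] for row in reader)


counts = count_by_make(io.StringIO(SAMPLE_CSV))

# print the makes from most to least frequent
for make, n in counts.most_common():
    print("{}: {}".format(make, n))
```

To run it on the real data, pass `open("complete_dataset.csv", newline="")` instead of the in-memory sample.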

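You can download the dataset archive here:

https://ai.stanford.edu/~jkrause/cars/car_dataset.html

Or download it from Baidu Netdisk:

Link: https://pan.baidu.com/s/1G_KFRIXk_BPR4V49alQ-2g
Extraction code: xrse

The DevKit shipped with the dataset contains MATLAB metadata files that are hard to parse; they hold the vehicle make, model, and name. They need to be parsed into a form we can use, and the netdisk archive already includes the exported complete_dataset.csv file.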

2. Configuration File and Data Processing

Create the configuration file, car_config.py.

# import the necessary packages
from os import path

# define the base path to the cars dataset
BASE_PATH = "/raid/datasets/cars"

# based on the base path, derive the images path and meta file path
IMAGES_PATH = path.sep.join([BASE_PATH, "car_ims"])
LABELS_PATH = path.sep.join([BASE_PATH, "complete_dataset.csv"])

# define the path to the output training, validation, and testing
# lists
MX_OUTPUT = BASE_PATH
TRAIN_MX_LIST = path.sep.join([MX_OUTPUT, "lists/train.lst"])
VAL_MX_LIST = path.sep.join([MX_OUTPUT, "lists/val.lst"])
TEST_MX_LIST = path.sep.join([MX_OUTPUT, "lists/test.lst"])

# define the path to the output training, validation, and testing
# image records
TRAIN_MX_REC = path.sep.join([MX_OUTPUT, "rec/train.rec"])
VAL_MX_REC = path.sep.join([MX_OUTPUT, "rec/val.rec"])
TEST_MX_REC = path.sep.join([MX_OUTPUT, "rec/test.rec"])

# define the path to the label encoder
LABEL_ENCODER_PATH = path.sep.join([BASE_PATH, "output/le.cpickle"])

# define the RGB means from the ImageNet dataset
R_MEAN = 123.68
G_MEAN = 116.779
B_MEAN = 103.939

# define the percentage of validation and testing images relative
# to the number of training images
NUM_CLASSES = 164
NUM_VAL_IMAGES = 0.15
NUM_TEST_IMAGES = 0.15

# define the batch size
BATCH_SIZE = 32
NUM_DEVICES = 1

Create build_dataset.py, which builds the training, validation, and testing .lst files.

# import the necessary packages
import car_config as config
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import progressbar
import pickle
import os

# read the contents of the labels file, then initialize the list of
# image paths and labels
print("[INFO] loading image paths and labels...")
rows = open(config.LABELS_PATH).read()
rows = rows.strip().split("\n")[1:]
trainPaths = []
trainLabels = []

# loop over the rows
for row in rows:
    # unpack the row, then update the image paths and labels list
    # (filename, make) = row.split(",")[:2]
    (filename, make, model) = row.split(",")[:3]
    filename = filename[filename.rfind("/") + 1:]
    trainPaths.append(os.sep.join([config.IMAGES_PATH, filename]))
    trainLabels.append("{}:{}".format(make, model))

# now that we have the total number of images in the dataset that
# can be used for training, compute the number of images that
# should be used for validation and testing
numVal = int(len(trainPaths) * config.NUM_VAL_IMAGES)
numTest = int(len(trainPaths) * config.NUM_TEST_IMAGES)

# our class labels are represented as strings so we need to encode
# them
print("[INFO] encoding labels...")
le = LabelEncoder().fit(trainLabels)
trainLabels = le.transform(trainLabels)

# perform stratified sampling from the training set to construct a
# validation set
print("[INFO] constructing validation data...")
split = train_test_split(trainPaths, trainLabels, test_size=numVal,
    stratify=trainLabels)
(trainPaths, valPaths, trainLabels, valLabels) = split

# perform stratified sampling from the training set to construct a
# testing set
print("[INFO] constructing testing data...")
split = train_test_split(trainPaths, trainLabels, test_size=numTest,
    stratify=trainLabels)
(trainPaths, testPaths, trainLabels, testLabels) = split

# construct a list pairing the training, validation, and testing
# image paths along with their corresponding labels and output list
# files
datasets = [
    ("train", trainPaths, trainLabels, config.TRAIN_MX_LIST),
    ("val", valPaths, valLabels, config.VAL_MX_LIST),
    ("test", testPaths, testLabels, config.TEST_MX_LIST)]

# loop over the dataset tuples
for (dType, paths, labels, outputPath) in datasets:
    # open the output file for writing
    print("[INFO] building {}...".format(outputPath))
    f = open(outputPath, "w")

    # initialize the progress bar
    widgets = ["Building List: ", progressbar.Percentage(), " ",
        progressbar.Bar(), " ", progressbar.ETA()]
    pbar = progressbar.ProgressBar(maxval=len(paths),
        widgets=widgets).start()

    # loop over each of the individual images + labels
    for (i, (path, label)) in enumerate(zip(paths, labels)):
        # write the image index, label, and output path to file
        row = "\t".join([str(i), str(label), path])
        f.write("{}\n".format(row))
        pbar.update(i)

    # close the output file
    pbar.finish()
    f.close()

# write the label encoder to file
print("[INFO] serializing label encoder...")
f = open(config.LABEL_ENCODER_PATH, "wb")
f.write(pickle.dumps(le))
f.close()
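To make the split arithmetic in build_dataset.py concrete, here is a small sketch of how the two 15% hold-outs work out for the 16,185 images, along with the tab-separated row format written to each .lst file. The helper names `split_sizes` and `lst_row` are hypothetical and exist only for this illustration; the image path in the example is likewise a placeholder.

```python
def split_sizes(num_images, val_fraction=0.15, test_fraction=0.15):
    """Mirror the script: both hold-outs are computed from the full image count."""
    num_val = int(num_images * val_fraction)
    num_test = int(num_images * test_fraction)
    num_train = num_images - num_val - num_test
    return (num_train, num_val, num_test)


def lst_row(index, label, image_path):
    """Format one .lst line: image index, integer label, image path."""
    return "\t".join([str(index), str(label), image_path])


sizes = split_sizes(16185)
print("train={}, val={}, test={}".format(*sizes))
print(lst_row(0, 12, "/raid/datasets/cars/car_ims/000001.jpg"))
```

Because both hold-out sizes are taken from the total rather than from what remains after the first split, roughly 70% of the images are left for training.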

Command to generate the train.rec file, whose path must exactly match the TRAIN_MX_REC path in the configuration:

Generate the test.rec dataset:

Generate the val.rec dataset:
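The record files are typically produced with mxnet's `im2rec` tool. The sketch below only assembles one command line per split and prints it. It is an assumption, not the author's exact command: it presumes the compiled `im2rec` binary is on your PATH and accepts `resize`, `encoding`, and `quality` as `key=value` parameters, and that the image root can be left empty because the .lst files store absolute paths. Check the options against your own mxnet build before running anything.

```python
def build_im2rec_command(lst_path, rec_path, im2rec_bin="im2rec",
                         resize=256, quality=100):
    """Assemble an im2rec argument list (assumed key=value option syntax)."""
    return [
        im2rec_bin,
        lst_path,
        "",        # image root: empty, since the .lst holds absolute paths
        rec_path,
        "resize={}".format(resize),
        "encoding=.jpg",
        "quality={}".format(quality),
    ]


# paths follow the layout defined in car_config.py
BASE = "/raid/datasets/cars"
commands = [
    build_im2rec_command("{}/lists/{}.lst".format(BASE, split),
                         "{}/rec/{}.rec".format(BASE, split))
    for split in ("train", "val", "test")
]

for cmd in commands:
    # quote the empty argument so the printed line can be pasted into a shell
    print(" ".join(arg if arg else '""' for arg in cmd))
```

Once the options are confirmed for your installation, each list can be passed to `subprocess.run(cmd, check=True)`.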

3. Fine-tuning VGG on the Stanford Cars Dataset

First, download the pre-trained VGG16 weights:

http://data.dmlc.ml/models/imagenet/vgg/

Then create the fine_tune_cars.py file.

# import the necessary packages
import mxnet as mx
import car_config as config
import argparse
import logging
import os

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--vgg", required=False,
    help="path to pre-trained VGGNet for fine-tuning",
    default="D:/Project/deeplearn/预训练模型/vgg16_zoo/vgg16")
ap.add_argument("-c", "--checkpoints", required=False,
    help="path to output checkpoint directory", default="checkpoints")
ap.add_argument("-p", "--prefix", required=False,
    help="name of model prefix", default="vggnet")
ap.add_argument("-s", "--start-epoch", type=int, default=0,
    help="epoch to restart training at")
args = vars(ap.parse_args())

# set the logging level and output file
logging.basicConfig(level=logging.DEBUG,
    filename="training_{}.log".format(args["start_epoch"]),
    filemode="w")

# determine the batch
batchSize = config.BATCH_SIZE * config.NUM_DEVICES

# construct the training image iterator
trainIter = mx.io.ImageRecordIter(
    path_imgrec=config.TRAIN_MX_REC,
    data_shape=(3, 224, 224),
    batch_size=batchSize,
    rand_crop=True,
    rand_mirror=True,
    rotate=15,
    max_shear_ratio=0.1,
    mean_r=config.R_MEAN,
    mean_g=config.G_MEAN,
    mean_b=config.B_MEAN,
    preprocess_threads=config.NUM_DEVICES * 2)

# construct the validation image iterator
valIter = mx.io.ImageRecordIter(
    path_imgrec=config.VAL_MX_REC,
    data_shape=(3, 224, 224),
    batch_size=batchSize,
    mean_r=config.R_MEAN,
    mean_g=config.G_MEAN,
    mean_b=config.B_MEAN)

# initialize the optimizer and the training contexts
opt = mx.optimizer.SGD(learning_rate=1e-4, momentum=0.9, wd=0.0005,
    rescale_grad=1.0 / batchSize)
ctx = [mx.gpu(3)]

# construct the checkpoints path, initialize the model argument and
# auxiliary parameters, and whether uninitialized parameters should
# be allowed
checkpointsPath = os.path.sep.join([args["checkpoints"],
    args["prefix"]])
argParams = None
auxParams = None
allowMissing = False

# if there is no specific model starting epoch supplied, then we
# need to build the network architecture
if args["start_epoch"] <= 0:
    # load the pre-trained VGG16 model
    print("[INFO] loading pre-trained model...")
    (symbol, argParams, auxParams) = mx.model.load_checkpoint(
        args["vgg"], 0)
    allowMissing = True

    # grab the layers from the pre-trained model, then find the
    # dropout layer *prior* to the final FC layer (i.e., the layer
    # that contains the number of class labels)
    # HINT: you can find layer names like this:
    # for layer in layers:
    #     print(layer.name)
    # then, append the string '_output' to the layer name
    layers = symbol.get_internals()
    net = layers["drop7_output"]

    # construct a new FC layer using the desired number of output
    # class labels, followed by a softmax output
    net = mx.sym.FullyConnected(data=net,
        num_hidden=config.NUM_CLASSES, name="fc8")
    net = mx.sym.SoftmaxOutput(data=net, name="softmax")

    # construct a new set of network arguments, removing any previous
    # arguments pertaining to FC8 (this will allow us to train the
    # final layer)
    argParams = dict({k: argParams[k] for k in argParams
        if "fc8" not in k})

# otherwise, a specific checkpoint was supplied
else:
    # load the checkpoint from disk
    print("[INFO] loading epoch {}...".format(args["start_epoch"]))
    (net, argParams, auxParams) = mx.model.load_checkpoint(
        checkpointsPath, args["start_epoch"])

# initialize the callbacks and evaluation metrics
batchEndCBs = [mx.callback.Speedometer(batchSize, 50)]
epochEndCBs = [mx.callback.do_checkpoint(checkpointsPath)]
metrics = [mx.metric.Accuracy(), mx.metric.TopKAccuracy(top_k=5),
    mx.metric.CrossEntropy()]

# construct the model and train it
print("[INFO] training network...")
model = mx.mod.Module(symbol=net, context=ctx)
model.fit(
    trainIter,
    eval_data=valIter,
    num_epoch=65,
    begin_epoch=args["start_epoch"],
    initializer=mx.initializer.Xavier(),
    arg_params=argParams,
    aux_params=auxParams,
    optimizer=opt,
    allow_missing=allowMissing,
    eval_metric=metrics,
    batch_end_callback=batchEndCBs,
    epoch_end_callback=epochEndCBs)
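The key step of the network surgery in fine_tune_cars.py is dropping the old FC8 parameters so that the new 164-way layer gets freshly initialized. The sketch below isolates that filtering step on a plain dictionary. The parameter names are illustrative stand-ins for what a VGG16 checkpoint might contain, the string values stand in for real weight arrays, and `drop_layer_params` is a hypothetical helper.

```python
def drop_layer_params(arg_params, layer_name):
    """Return a copy of arg_params without entries belonging to layer_name."""
    return {k: v for (k, v) in arg_params.items() if layer_name not in k}


# Illustrative parameter names; real values would be mxnet NDArrays.
pretrained = {
    "fc7_weight": "4096x4096 weights",
    "fc7_bias": "4096 biases",
    "fc8_weight": "1000x4096 weights (ImageNet classes)",
    "fc8_bias": "1000 biases (ImageNet classes)",
}

kept = drop_layer_params(pretrained, "fc8")
print(sorted(kept))
```

Setting `allow_missing=True` in `model.fit` is what lets training start with the `fc8` entries absent; the supplied initializer fills them in.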

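When fine-tuning a large network such as VGG16 on a small dataset such as Stanford Cars, overfitting is inevitable. Even with data augmentation, the network has too many parameters and there are too few training examples. Getting the initial learning rate right is therefore very important, even more so than when training a network from scratch. Take the time to fine-tune the network and explore a range of initial learning rates; doing so gives you the best chance of reaching the highest accuracy during fine-tuning.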

4. Evaluating Our Vehicle Classifier

Create the test_cars.py file.

# import the necessary packages
import car_config as config
from customize.tools.ranked import rank5_accuracy
import mxnet as mx
import argparse
import pickle
import os

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-c", "--checkpoints", required=True,
    help="path to output checkpoint directory")
ap.add_argument("-p", "--prefix", required=True,
    help="name of model prefix")
ap.add_argument("-e", "--epoch", type=int, required=True,
    help="epoch # to load")
args = vars(ap.parse_args())

# load the label encoder
le = pickle.loads(open(config.LABEL_ENCODER_PATH, "rb").read())

# construct the testing image iterator
testIter = mx.io.ImageRecordIter(
    path_imgrec=config.TEST_MX_REC,
    data_shape=(3, 224, 224),
    batch_size=config.BATCH_SIZE,
    mean_r=config.R_MEAN,
    mean_g=config.G_MEAN,
    mean_b=config.B_MEAN)

# load our pre-trained model
print("[INFO] loading pre-trained model...")
checkpointsPath = os.path.sep.join([args["checkpoints"],
    args["prefix"]])
(symbol, argParams, auxParams) = mx.model.load_checkpoint(
    checkpointsPath, args["epoch"])

# construct the model
model = mx.mod.Module(symbol=symbol, context=[mx.gpu(0)])
model.bind(data_shapes=testIter.provide_data,
    label_shapes=testIter.provide_label)
model.set_params(argParams, auxParams)

# initialize the list of predictions and targets
print("[INFO] evaluating model...")
predictions = []
targets = []

# loop over the predictions in batches
for (preds, _, batch) in model.iter_predict(testIter):
    # convert the batch of predictions and labels to NumPy
    # arrays
    preds = preds[0].asnumpy()
    labels = batch.label[0].asnumpy().astype("int")

    # update the predictions and targets lists, respectively
    predictions.extend(preds)
    targets.extend(labels)

# apply array slicing to the targets since mxnet will return the
# next full batch size rather than the *actual* number of labels
targets = targets[:len(predictions)]

# compute the rank-1 and rank-5 accuracies
(rank1, rank5) = rank5_accuracy(predictions, targets)
print("[INFO] rank-1: {:.2f}%".format(rank1 * 100))
print("[INFO] rank-5: {:.2f}%".format(rank5 * 100))
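The `rank5_accuracy` helper is imported from a custom `customize.tools.ranked` module that is not shown here. As a rough guide to what such a helper computes, the following is a minimal pure-Python sketch: for each sample it sorts the class indexes by descending probability, then checks whether the true label is the first entry (rank-1) or among the first five (rank-5). The toy probabilities are made up for illustration, and the real helper may differ.

```python
def rank5_accuracy(preds, labels):
    """Return (rank-1, rank-5) accuracy as fractions in [0, 1]."""
    rank1 = 0
    rank5 = 0

    for (p, gt) in zip(preds, labels):
        # class indexes sorted from most to least probable
        order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)

        # the ground-truth label appears in the top-5 predictions
        if gt in order[:5]:
            rank5 += 1

        # the ground-truth label is the single top prediction
        if gt == order[0]:
            rank1 += 1

    total = float(len(labels))
    return (rank1 / total, rank5 / total)


# three toy samples over seven classes
toy_preds = [
    [0.05, 0.60, 0.10, 0.05, 0.10, 0.05, 0.05],  # top prediction: class 1
    [0.30, 0.05, 0.25, 0.15, 0.10, 0.10, 0.05],  # top prediction: class 0
    [0.02, 0.03, 0.05, 0.10, 0.20, 0.25, 0.35],  # top prediction: class 6
]
toy_labels = [1, 2, 0]

(toy_rank1, toy_rank5) = rank5_accuracy(toy_preds, toy_labels)
print("rank-1: {:.2f}%".format(toy_rank1 * 100))
print("rank-5: {:.2f}%".format(toy_rank5 * 100))
```

In the toy data only the first sample is a rank-1 hit, the second is a rank-5 hit only, and the third misses both.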

The results show that we obtain 84.22% rank-1 and 96.54% rank-5 accuracy on the test set.

5. Visualizing the Vehicle Classification Results

Create the vis_classification.py file.

# due to mxnet seg-fault issue, need to place OpenCV import at the
# top of the file
import cv2

# import the necessary packages
import car_config as config
from customize.tools.imagetoarraypreprocessor import ImageToArrayPreprocessor
from customize.tools.aspectawarepreprocessor import AspectAwarePreprocessor
from customize.tools.meanpreprocessor import MeanPreprocessor
import numpy as np
import mxnet as mx
import argparse
import pickle
import imutils
import os

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-c", "--checkpoints", required=True,
    help="path to the checkpoint directory")
ap.add_argument("-p", "--prefix", required=True,
    help="name of model prefix")
ap.add_argument("-e", "--epoch", type=int, required=True,
    help="epoch # to load")
ap.add_argument("-s", "--sample-size", type=int, default=10,
    help="# of testing images to sample")
args = vars(ap.parse_args())

# load the label encoder, followed by the testing dataset file,
# then sample the testing set
le = pickle.loads(open(config.LABEL_ENCODER_PATH, "rb").read())
rows = open(config.TEST_MX_LIST).read().strip().split("\n")
rows = np.random.choice(rows, size=args["sample_size"])

# load our pre-trained model
print("[INFO] loading pre-trained model...")
checkpointsPath = os.path.sep.join([args["checkpoints"],
    args["prefix"]])
model = mx.model.FeedForward.load(checkpointsPath, args["epoch"])

# compile the model
model = mx.model.FeedForward(
    ctx=[mx.gpu(0)],
    symbol=model.symbol,
    arg_params=model.arg_params,
    aux_params=model.aux_params)

# initialize the image pre-processors
sp = AspectAwarePreprocessor(width=224, height=224)
mp = MeanPreprocessor(config.R_MEAN, config.G_MEAN, config.B_MEAN)
iap = ImageToArrayPreprocessor(dataFormat="channels_first")

# loop over the testing images
for row in rows:
    # grab the target class label and the image path from the row
    (target, imagePath) = row.split("\t")[1:]
    target = int(target)

    # load the image from disk and pre-process it by resizing the
    # image and applying the pre-processors
    image = cv2.imread(imagePath)
    orig = image.copy()
    orig = imutils.resize(orig, width=min(500, orig.shape[1]))
    image = iap.preprocess(mp.preprocess(sp.preprocess(image)))
    image = np.expand_dims(image, axis=0)

    # classify the image and grab the indexes of the top-5 predictions
    preds = model.predict(image)[0]
    idxs = np.argsort(preds)[::-1][:5]

    # show the true class label (inverse_transform expects an
    # array-like, so wrap the single label in a list)
    print("[INFO] actual={}".format(le.inverse_transform([target])[0]))

    # format and display the top predicted class label
    label = le.inverse_transform([idxs[0]])[0]
    label = label.replace(":", " ")
    label = "{}: {:.2f}%".format(label, preds[idxs[0]] * 100)
    cv2.putText(orig, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
        0.6, (0, 255, 0), 2)

    # loop over the top-5 predictions and display them
    for i in idxs:
        print("\t[INFO] predicted={}, probability={:.2f}%".format(
            le.inverse_transform([i])[0], preds[i] * 100))

    # show the image
    cv2.imshow("Image", orig)
    cv2.waitKey(0)

Our fine-tuned VGG16 network correctly identifies the make and model of a vehicle with over 84% rank-1 and over 95% rank-5 accuracy.
