Contents

1. Download the MNIST dataset

2. Generate the MNIST training, validation, and test image sets

3. Build the LMDB database files

4. Prepare the LeNet-5 network definition file (.prototxt)

5. Prepare the solver configuration file (_solver.prototxt)

6. Start training and generate log files

7. Plot the training log (visualize training metrics): plot_training_log.py

7.2 Parse the log into txt files (fields available for plotting): parse_log.py

8. Model testing and evaluation (for selecting a better model)

8.1 Test model accuracy

8.2 Evaluate model performance

9. Handwritten digit recognition (model deployment)

10. Retraining with data augmentation

10.1 How to augment the training data

10.2 Resume training from the last saved state

10.3 Real-time augmentation during training: third-party Caffe layers

11. caffe-augmentation

Realtime data augmentation

How to use


Environment:

OS: Ubuntu 18.04 LTS

A working Caffe installation

1. Download the MNIST dataset

Here we use the MNIST dataset packaged by Bengio's group.

(Aside: Yoshua Bengio, the AI scholar who fought alone, and his utopian ideals.)

In a terminal, enter:

wget http://deeplearning.net/data/mnist/mnist.pkl.gz

This downloads the compressed file mnist.pkl.gz into the current folder.
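As an alternative, if wget is unavailable, the same file can be fetched with Python's standard library (a sketch; it assumes the URL is still reachable):

import urllib.request

url = 'http://deeplearning.net/data/mnist/mnist.pkl.gz'
urllib.request.urlretrieve(url, 'mnist.pkl.gz')  # save into the current folder
print('Downloaded mnist.pkl.gz')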

2. Generate the MNIST training, validation, and test image sets

The archive mnist.pkl.gz contains the MNIST training set (train), validation set (validate), and test set (test), exported with pickle and compressed with gzip, so it can be read directly via Python's gzip module. Each dataset is a tuple. The first element stores the handwritten-digit images, one per row, as 1-D float numpy arrays of length 28*28 = 784; each is a normalized single-channel grayscale image, where the maximum value 1 means white and the minimum value 0 means black. The second element stores the corresponding digit labels as a 1-D integer numpy array whose entries match the images by index.
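As a quick sanity check of this layout, a few lines of Python can print the shapes and labels (a sketch; exact dtypes may vary):

import gzip
import pickle

with gzip.open('mnist.pkl.gz', 'rb') as f:
    # encoding='latin1' lets Python 3 read this Python 2 pickle
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

imgs, labels = train_set
print(imgs.shape, imgs.dtype)      # expected: (50000, 784), float values in [0, 1]
print(labels.shape, labels.dtype)  # expected: (50000,), integer labels
print(labels[:10])                 # the first ten digit labels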

Knowing this data structure, a Python script can convert the data into images.

Run the convert_mnist.py script below. It creates an mnist folder under the current directory, and inside it three subfolders, train, val, and test, holding the generated images of the corresponding datasets.

train holds 50,000 images; val and test hold 10,000 images each.

# convert_mnist.py
import os
import gzip
import pickle

from matplotlib import pyplot

# Load the dataset from the compressed pickle file
print('Loading data from mnist.pkl.gz ...')
with gzip.open('mnist.pkl.gz', 'rb') as f:
    # encoding='latin1' lets Python 3 read this Python 2 pickle
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

# Create the mnist folder under the current path
imgs_dir = 'mnist'
os.system('mkdir -p {}'.format(imgs_dir))

# datasets is a dict of dataname-dataset key-value pairs
datasets = {'train': train_set, 'val': valid_set, 'test': test_set}
for dataname, dataset in datasets.items():
    print('Converting {} dataset ...'.format(dataname))
    data_dir = os.sep.join([imgs_dir, dataname])  # join the path
    os.system('mkdir -p {}'.format(data_dir))     # create the subfolder
    # i is the sample index; zip() pairs each image with its label
    for i, (img, label) in enumerate(zip(*dataset)):
        filename = '{:0>6d}_{}.jpg'.format(i, label)
        filepath = os.sep.join([data_dir, filename])
        img = img.reshape((28, 28))  # restore the 1-D array to a 2-D image
        # pyplot.imsave automatically rescales to a [0, 255] grayscale image
        pyplot.imsave(filepath, img, cmap='gray')
        if (i + 1) % 10000 == 0:
            print('{} images converted!'.format(i + 1))

Naming rule for the images: the first field is a 6-digit image index, followed by an underscore and the label of that image, with a .jpg extension.

3. Build the LMDB database files

We use the convert_imageset tool provided by Caffe.

First look at the command's help text.

Run the following in a terminal:

/home/yang/caffe/build/tools/convert_imageset -help

The key parts of the output are:

convert_imageset: Convert a set of images to the leveldb/lmdb
format used as input for Caffe.
Usage:
    convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME
The ImageNet dataset for the training demo is at
    http://www.image-net.org/download-images
...
Flags from tools/convert_imageset.cpp:
  -backend (The backend {lmdb, leveldb} for storing the result) type: string default: "lmdb"
  -check_size (When this option is on, check that all the datum have the same size) type: bool default: false
  -encode_type (Optional: What type should we encode the image as ('png','jpg',...).) type: string default: ""
  -encoded (When this option is on, the encoded image will be save in datum) type: bool default: false
  -gray (When this option is on, treat images as grayscale ones) type: bool default: false
  -resize_height (Height images are resized to) type: int32 default: 0
  -resize_width (Width images are resized to) type: int32 default: 0
  -shuffle (Randomly shuffle the order of images and their labels) type: bool default: false

The command needs a .txt list file of image paths and labels; each line holds one image's full path, a space, and its label.

For example, part of train.txt looks like this:

mnist/train/033247_5.jpg 5
mnist/train/025404_9.jpg 9
mnist/train/026385_8.jpg 8
mnist/train/013058_5.jpg 5
mnist/train/006524_5.jpg 5

...

We need to turn the paths and labels of all images under the train, val, and test folders generated in step 2 into three list files: train.txt, val.txt, and test.txt.

Run the following three commands to generate train.txt, val.txt, and test.txt respectively:

python gen_caffe_imglist.py mnist/train train.txt
python gen_caffe_imglist.py mnist/val val.txt
python gen_caffe_imglist.py mnist/test test.txt

The gen_caffe_imglist.py script is shown below. Its first argument is the folder containing the images (a relative path); the second argument is the name (path) of the list file to generate.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Mon Dec  3 18:35:55 2018
@author: yang
"""import os
import sysinput_path = sys.argv[1].rstrip(os.sep)
output_path = sys.argv[2]filenames = os.listdir(input_path)with open(output_path, 'w') as f:for filename in filenames:filepath = os.sep.join([input_path, filename])label = filename[:filename.rfind('.')].split('_')[1]line = '{} {}\n'.format(filepath, label)f.write(line)
f.close()

This produces the image-list files with labels for all three datasets. Now call Caffe's convert_imageset tool to do the conversion:

/home/yang/caffe/build/tools/convert_imageset ./ train.txt train_lmdb --gray --shuffle
/home/yang/caffe/build/tools/convert_imageset ./ val.txt val_lmdb --gray --shuffle
/home/yang/caffe/build/tools/convert_imageset ./ test.txt test_lmdb --gray --shuffle

This creates three LMDB folders in the current directory: train_lmdb, val_lmdb, and test_lmdb.
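To double-check the conversion, a small sketch like the following can read one record back from train_lmdb (this assumes the python lmdb package and pycaffe are installed; it is not part of the original workflow):

import sys
import lmdb
import numpy as np
sys.path.append('/home/yang/caffe/python')
from caffe.proto import caffe_pb2

env = lmdb.open('train_lmdb', readonly=True)
with env.begin() as txn:
    key, value = next(iter(txn.cursor()))  # first key-value pair
    datum = caffe_pb2.Datum()
    datum.ParseFromString(value)           # decode the serialized Datum
    img = np.frombuffer(datum.data, dtype=np.uint8)
    img = img.reshape(datum.channels, datum.height, datum.width)
    print(key, datum.label, img.shape)     # expect a (1, 28, 28) grayscale image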

4. Prepare the LeNet-5 network definition file (.prototxt)

lenet_train_val.prototxt is the file used for training the model; its content differs slightly from the deployment file lenet.prototxt.

lenet_train_val.prototxt is shown below (remember to adjust the input lmdb source paths in the data layers at the top):

name: "LeNet"
layer {name: "mnist"type: "Data"top: "data"top: "label"include {phase: TRAIN}transform_param {mean_value: 128scale: 0.00390625}data_param {source: "train_lmdb"batch_size: 64backend: LMDB}
}
layer {name: "mnist"type: "Data"top: "data"top: "label"include {phase: TEST}transform_param {mean_value: 128scale: 0.00390625}data_param {source: "val_lmdb"batch_size: 100backend: LMDB}
}
layer {name: "conv1"type: "Convolution"bottom: "data"top: "conv1"param {lr_mult: 1}param {lr_mult: 2}convolution_param {num_output: 20kernel_size: 5stride: 1weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "pool1"type: "Pooling"bottom: "conv1"top: "pool1"pooling_param {pool: MAXkernel_size: 2stride: 2}
}
layer {name: "conv2"type: "Convolution"bottom: "pool1"top: "conv2"param {lr_mult: 1}param {lr_mult: 2}convolution_param {num_output: 50kernel_size: 5stride: 1weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "pool2"type: "Pooling"bottom: "conv2"top: "pool2"pooling_param {pool: MAXkernel_size: 2stride: 2}
}
layer {name: "ip1"type: "InnerProduct"bottom: "pool2"top: "ip1"param {lr_mult: 1}param {lr_mult: 2}inner_product_param {num_output: 500weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "relu1"type: "ReLU"bottom: "ip1"top: "ip1"
}
layer {name: "ip2"type: "InnerProduct"bottom: "ip1"top: "ip2"param {lr_mult: 1}param {lr_mult: 2}inner_product_param {num_output: 10weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "accuracy"type: "Accuracy"bottom: "ip2"bottom: "label"top: "accuracy"include {phase: TEST}
}
layer {name: "loss"type: "SoftmaxWithLoss"bottom: "ip2"bottom: "label"top: "loss"
}
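Note the transform_param in the data layers: Caffe first subtracts mean_value and then multiplies by scale, and 0.00390625 = 1/256, so pixel values are mapped to roughly [-0.5, 0.5]. A quick check of the equivalent arithmetic in Python:

import numpy as np

pixels = np.arange(256, dtype=np.float32)   # all possible 8-bit pixel values
transformed = (pixels - 128) * 0.00390625   # mean_value, then scale, as Caffe applies them
print(transformed.min(), transformed.max()) # -0.5 0.49609375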

5. Prepare the solver configuration file (_solver.prototxt)

lenet_solver.prototxt is shown below.

Make sure the path of the net definition file is correct (here a relative path, just the file name): net: "lenet_train_val.prototxt"

Also check the folder for snapshots of intermediate training state: snapshot_prefix: "snapshot". Create the snapshot folder under the current directory first.

# The train/validate net protocol buffer definition
net: "lenet_train_val.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 36000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "snapshot"
# solver mode: CPU or GPU
solver_mode: GPU
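For reference, Caffe's "inv" learning-rate policy decays the rate as lr = base_lr * (1 + gamma * iter)^(-power). A quick check of the schedule configured above:

base_lr, gamma, power = 0.01, 0.0001, 0.75
for it in (0, 500, 5000, 36000):
    lr = base_lr * (1 + gamma * it) ** (-power)
    print(it, lr)  # by 36000 iterations the rate has decayed to about 0.0032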

6. Start training and generate log files

First create a folder for the logs under the current directory: trainLog.

Then run the command below.

The train command needs the solver configuration file lenet_solver.prototxt.

Other parameters can be listed with -help.

/home/yang/caffe/build/tools/caffe train -solver lenet_solver.prototxt -gpu 0 -log_dir ./trainLog

When training finishes, the snapshot folder contains solver state files (*.solverstate) and network weight files (*.caffemodel) for the different snapshot iterations.
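As a side note, the weights in a snapshot can be inspected from pycaffe. This is just a sketch (it assumes pycaffe was built and the snapshot exists) that prints each layer's parameter shapes:

import sys
sys.path.append('/home/yang/caffe/python')
import caffe

net = caffe.Net('lenet_train_val.prototxt',
                'snapshot/lenet_solver_iter_5000.caffemodel', caffe.TEST)
for name, params in net.params.items():
    # each entry holds one layer's weight and bias blobs
    print(name, [p.data.shape for p in params])
# expected: conv1 [(20, 1, 5, 5), (20,)], conv2 [(50, 20, 5, 5), (50,)], ...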

7. Plot the training log (visualize training metrics): plot_training_log.py

Caffe ships a tool for visualizing logs:

python /home/yang/caffe/tools/extra/plot_training_log.py

Run the following command to see its help:

python /home/yang/caffe/tools/extra/plot_training_log.py -help
This script mainly serves as the basis of your customizations.
Customization is a must.
You can copy, paste, edit them in whatever way you want.
Be warned that the fields in the training log may change in the future.
You had better check the data files and change the mapping from field name to
field index in create_field_index before designing your own plots.
Usage:
    ./plot_training_log.py chart_type[0-7] /where/to/save.png /path/to/first.log ...
Notes:
    1. Supporting multiple logs.
    2. Log file name must end with the lower-cased ".log".
Supported chart types:
    0: Test accuracy  vs. Iters
    1: Test accuracy  vs. Seconds
    2: Test loss  vs. Iters
    3: Test loss  vs. Seconds
    4: Train learning rate  vs. Iters
    5: Train learning rate  vs. Seconds
    6: Train loss  vs. Iters
    7: Train loss  vs. Seconds

The tool ships as plot_training_log.py.example under /home/yang/caffe/tools/extra.
Copy that file and rename it plot_training_log.py; the Python script can then draw the plots.
Its arguments are: the chart type to draw, the output image path and file name, and the path(s) of the training log file(s).

Eight chart types are supported:
0: test accuracy vs. iterations
1: test accuracy vs. training time (seconds)
2: test loss vs. iterations
3: test loss vs. training time
4: learning rate vs. iterations
5: learning rate vs. training time
6: training loss vs. iterations
7: training loss vs. training time

Run the following commands in a terminal to generate all eight plots:

python /home/yang/caffe/tools/extra/plot_training_log.py 0 test_acc_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 1 test_acc_vs_time.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 2 test_loss_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 3 test_loss_vs_time.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 4 lr_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 5 lr_vs_time.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 6 train_loss_vs_iters.png caffeLeNetTrain20181203.log
python /home/yang/caffe/tools/extra/plot_training_log.py 7 train_loss_vs_time.png caffeLeNetTrain20181203.log

Below is the resulting accuracy vs. iterations plot:

7.2 Parse the log into txt files (fields available for plotting): parse_log.py

yang@yang-System-Product-Name:~/caffe/data/mnist_Bengio/trainLog$ python /home/yang/caffe/tools/extra/parse_log.py -h
usage: parse_log.py [-h] [--verbose] [--delimiter DELIMITER]
                    logfile_path output_dir

Parse a Caffe training log into two CSV files containing training and testing
information

positional arguments:
  logfile_path          Path to log file
  output_dir            Directory in which to place output CSV files

optional arguments:
  -h, --help            show this help message and exit
  --verbose             Print some extra info (e.g., output filenames)
  --delimiter DELIMITER
                        Column delimiter in output files (default: ',')

parse_log.py splits a log file into two CSV-style text files.
For example, entering the following in a terminal

python ./tools/extra/parse_log.py ./examples/myfile/a.log ./examples/myfile/

produces a.log.train and a.log.test under the myfile/ directory; from these two files you can draw whatever plots you want with matplotlib.

For our log, run:

/home/yang/caffe/tools/extra/parse_log.py caffeLeNetTrain20181203.log ./

This parses caffeLeNetTrain20181203.log and generates two text files in the current folder ./: caffeLeNetTrain20181203.log.train and caffeLeNetTrain20181203.log.test.
They contain the fields NumIters, Seconds, LearningRate, accuracy (test file only), and loss.

Below is an excerpt of the parsed training log, caffeLeNetTrain20181203.log.train:

NumIters,Seconds,LearningRate,loss
0.0,0.155351,0.01,2.33102
100.0,0.482773,0.01,0.167176
200.0,0.806851,0.00992565,0.1556
300.0,1.1295,0.00985258,0.0575197
400.0,1.460222,0.00978075,0.0952922
500.0,1.897946,0.00971013,0.0684174
600.0,2.216532,0.00964069,0.0514046

Below is an excerpt of the parsed test log, caffeLeNetTrain20181203.log.test:

NumIters,Seconds,LearningRate,accuracy,loss
0.0,0.129343,0.00971013,0.0919,2.33742
500.0,1.895023,0.00971013,0.976,0.0833776
1000.0,3.602925,0.00937411,0.9794,0.0671232
1500.0,5.299409,0.00906403,0.9853,0.0522081
2000.0,6.99157,0.00877687,0.9856,0.0475213
2500.0,8.691082,0.00851008,0.9859,0.0473052

With Python's pandas and matplotlib, each of these fields can be plotted as a curve:

import pandas as pd
import matplotlib.pyplot as plt

The script below plots the training and validation (test) loss vs. NumIters curves:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec  4 10:53:28 2018
@author: yang
"""
import pandas as pd
import matplotlib.pyplot as plt

train_log = pd.read_csv("caffeLeNetTrain20181203.log.train")
test_log = pd.read_csv("caffeLeNetTrain20181203.log.test")
_, ax1 = plt.subplots()
ax1.set_title("train loss and test loss")
ax1.plot(train_log["NumIters"], train_log["loss"], alpha=0.5)
ax1.plot(test_log["NumIters"], test_log["loss"], 'g')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
plt.legend(loc='upper left')
ax2 = ax1.twinx()
#ax2.plot(test_log["NumIters"], test_log["LearningRate"], 'r')
#ax2.plot(test_log["NumIters"], test_log["LearningRate"], 'm')
#ax2.set_ylabel('test LearningRate')
#plt.legend(loc='upper right')
plt.show()
print('Done.')

8. Model testing and evaluation (for selecting a better model)

8.1 Test model accuracy

Once training is done, the model needs to be tested and evaluated.

In fact, during training the model's accuracy was already evaluated on val_lmdb every 500 iterations.
But MNIST also has a test set besides the validation set, and model selection should be judged on the test set (it measures generalization).

Make a small change to the data-layer section at the top of lenet_train_val.prototxt: delete the TRAIN data layer, and change the TEST layer's data source to the test_lmdb path:

lenet_test.prototxt:

name: "LeNet"
layer {name: "mnist"type: "Data"top: "data"top: "label"include {phase: TEST}transform_param {mean_value: 128scale: 0.00390625}data_param {source: "test_lmdb"batch_size: 100backend: LMDB}
}...

Now run Caffe's test command, generating a log file:

/home/yang/caffe/build/tools/caffe test -model lenet_test.prototxt -weights ./snapshot/lenet_solver_iter_5000.caffemodel -gpu 0 -iterations 100 -log_dir ./testLog

Parameters of the test command:
-model specifies the test network definition file lenet_test.prototxt
-weights specifies the weight file generated at one snapshot iteration, lenet_solver_iter_5000.caffemodel (the best one still has to be chosen among all such weight files)

-gpu 0: use GPU 0
-iterations 100
-iterations plays the same role as the test_iter parameter in lenet_solver.prototxt:
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100

To sweep all 10,000 images under test,
we need -iterations * batch_size = 10,000,

where batch_size is the batch size specified in the test data layer of lenet_test.prototxt.

Afterwards, the testLog folder contains caffe.INFO, which records the terminal output,
and the test log file caffe.yang-System-Product-Name.yang.log.INFO.20181204-095911.3660.

Part of the terminal output follows (note the final Loss and accuracy):

...
I1204 09:59:16.654608  3660 caffe.cpp:281] Running for 100 iterations.
I1204 09:59:16.670982  3660 caffe.cpp:304] Batch 0, accuracy = 0.98
I1204 09:59:16.671051  3660 caffe.cpp:304] Batch 0, loss = 0.0443168
I1204 09:59:16.672643  3660 caffe.cpp:304] Batch 1, accuracy = 1
I1204 09:59:16.672709  3660 caffe.cpp:304] Batch 1, loss = 0.0175841
I1204 09:59:16.674376  3660 caffe.cpp:304] Batch 2, accuracy = 0.99
I1204 09:59:16.674437  3660 caffe.cpp:304] Batch 2, loss = 0.0308315
...
I1204 09:59:16.795164  3671 data_layer.cpp:73] Restarting data prefetching from start.
I1204 09:59:16.795873  3660 caffe.cpp:304] Batch 97, accuracy = 0.98
I1204 09:59:16.795882  3660 caffe.cpp:304] Batch 97, loss = 0.0427303
I1204 09:59:16.797765  3660 caffe.cpp:304] Batch 98, accuracy = 0.97
I1204 09:59:16.797775  3660 caffe.cpp:304] Batch 98, loss = 0.107767
I1204 09:59:16.798722  3660 caffe.cpp:304] Batch 99, accuracy = 0.99
I1204 09:59:16.798730  3660 caffe.cpp:304] Batch 99, loss = 0.0540964
I1204 09:59:16.798734  3660 caffe.cpp:309] Loss: 0.0391683
I1204 09:59:16.798739  3660 caffe.cpp:321] accuracy = 0.9879
I1204 09:59:16.798746  3660 caffe.cpp:321] loss = 0.0391683 (* 1 = 0.0391683 loss)

The program computes the accuracy of every batch, then reports an overall accuracy at the end.
When training produced only a few snapshots, you can pick the best model by hand, looking for the region where validation loss is low and accuracy is high.
When there are many snapshots, use the test set to choose: write a script that sweeps all the models and keeps one with low loss and high accuracy, as sketched below.
In general, the more data there is, the more likely the lowest-loss and highest-accuracy models coincide; when they differ, the lowest-loss model usually generalizes a little better.
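A hedged sketch of such a sweep (the file paths, the "Loss:" log pattern, and reading glog output from stderr are assumptions based on the outputs shown above):

import glob
import re
import subprocess

CAFFE = '/home/yang/caffe/build/tools/caffe'
best_loss, best_model = float('inf'), None
for weights in sorted(glob.glob('snapshot/*.caffemodel')):
    result = subprocess.run(
        [CAFFE, 'test', '-model', 'lenet_test.prototxt',
         '-weights', weights, '-gpu', '0', '-iterations', '100'],
        capture_output=True, text=True)
    m = re.search(r'\] Loss: ([0-9.eE+-]+)', result.stderr)  # glog logs to stderr
    if m and float(m.group(1)) < best_loss:
        best_loss, best_model = float(m.group(1)), weights
print('Best snapshot: {} (test loss {})'.format(best_model, best_loss))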

After all, the training, validation, and test sets are all just samples of the true data distribution;
a model selected on more data simply deserves more confidence than one selected on less.

Next, in the training-log folder, use parse_log.py from section 7.2 to parse the training log caffe.yang-System-Product-Name.yang.log.INFO.20181203-184414.8457 and obtain the validation-set (TEST) csv file.

The Python script below plots val_loss vs. iterations and val_accuracy vs. iterations in a single figure.

val_loss_val_accuracy.py:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec  4 10:53:28 2018
@author: yang
"""
import pandas as pd
import matplotlib.pyplot as plt

val_log = pd.read_csv("caffeLeNetTrain20181203.log.test")  # validation set
_, ax1 = plt.subplots()
ax1.set_title("val loss and val accuracy")
ax1.plot(val_log["NumIters"], val_log["loss"], 'g')
ax1.set_xlabel('iterations')
ax1.set_ylabel('val loss')
plt.legend(loc='center left')
ax2 = ax1.twinx()
ax2.plot(val_log["NumIters"], val_log["accuracy"], 'm')
ax2.set_ylabel('val accuracy')
plt.legend(loc='center right')
plt.show()
print('Done.')

This produces val_loss_val_accuracy_vs_iterNums1.png.

As iterations increase, the model's loss on the validation set falls and its accuracy rises, so the 35,000-iteration model is a reasonable choice for deployment.

Here we also test the 35,000-iteration snapshot on the test set, to compare with the 5,000-iteration snapshot above.

In a terminal, enter:

/home/yang/caffe/build/tools/caffe test -model lenet_test.prototxt -weights ./snapshot/lenet_solver_iter_35000.caffemodel -gpu 0 -iterations 100 -log_dir ./testLog

The last three lines of the output:

I1204 14:58:12.377961  6560 caffe.cpp:309] Loss: 0.0267361
I1204 14:58:12.377966  6560 caffe.cpp:321] accuracy = 0.9904
I1204 14:58:12.377972  6560 caffe.cpp:321] loss = 0.0267361 (* 1 = 0.0267361 loss)

Both Loss and accuracy are indeed better than at 5,000 iterations!

8.2 Evaluate model performance

Evaluating model performance mainly concerns time and space, i.e. the runtime and memory footprint of one forward pass of the model.

Caffe supports this kind of evaluation with a built-in tool:

the /home/yang/caffe/build/tools/caffe time command.

Only the network definition .prototxt file is needed.

Take lenet.prototxt from examples/mnist/ under the Caffe root directory, then run:

/home/yang/caffe/build/tools/caffe time -model lenet.prototxt -gpu 0

Part of the output:

yang@yang-System-Product-Name:~/caffe/data/mnist_Bengio$ /home/yang/caffe/build/tools/caffe time -model lenet.prototxt -gpu 0
/home/yang/caffe/build/tools/caffe: /home/yang/anaconda2/lib/libtiff.so.5: no version information available (required by /usr/local/lib/libopencv_imgcodecs.so.3.4)
I1204 15:12:33.080821  6687 caffe.cpp:339] Use GPU with device ID 0
I1204 15:12:33.266084  6687 net.cpp:53] Initializing net from parameters:
...
I1204 15:12:33.266204  6687 layer_factory.hpp:77] Creating layer data
I1204 15:12:33.266217  6687 net.cpp:86] Creating Layer data
I1204 15:12:33.266227  6687 net.cpp:382] data -> data
I1204 15:12:33.275761  6687 net.cpp:124] Setting up data
I1204 15:12:33.275779  6687 net.cpp:131] Top shape: 64 1 28 28 (50176)
I1204 15:12:33.275794  6687 net.cpp:139] Memory required for data: 200704
I1204 15:12:33.275801  6687 layer_factory.hpp:77] Creating layer conv1
I1204 15:12:33.275822  6687 net.cpp:86] Creating Layer conv1
I1204 15:12:33.275828  6687 net.cpp:408] conv1 <- data
I1204 15:12:33.275837  6687 net.cpp:382] conv1 -> conv1
I1204 15:12:33.680294  6687 net.cpp:124] Setting up conv1
I1204 15:12:33.680315  6687 net.cpp:131] Top shape: 64 20 24 24 (737280)
I1204 15:12:33.680322  6687 net.cpp:139] Memory required for data: 3149824
...
I1204 15:12:33.685878  6687 net.cpp:244] This network produces output prob
I1204 15:12:33.685887  6687 net.cpp:257] Network initialization done.
I1204 15:12:33.685910  6687 caffe.cpp:351] Performing Forward
I1204 15:12:33.703292  6687 caffe.cpp:356] Initial loss: 0
I1204 15:12:33.703311  6687 caffe.cpp:357] Performing Backward
I1204 15:12:33.703316  6687 caffe.cpp:365] *** Benchmark begins ***
I1204 15:12:33.703320  6687 caffe.cpp:366] Testing for 50 iterations.
I1204 15:12:33.705480  6687 caffe.cpp:394] Iteration: 1 forward-backward time: 2.14998 ms.
I1204 15:12:33.707129  6687 caffe.cpp:394] Iteration: 2 forward-backward time: 1.63258 ms.
I1204 15:12:33.709730  6687 caffe.cpp:394] Iteration: 3 forward-backward time: 2.58979 ms.
...
I1204 15:12:33.783918  6687 caffe.cpp:397] Average time per layer:
I1204 15:12:33.783921  6687 caffe.cpp:400]       data   forward: 0.0011584 ms.
I1204 15:12:33.783926  6687 caffe.cpp:403]       data   backward: 0.00117824 ms.
I1204 15:12:33.783929  6687 caffe.cpp:400]      conv1   forward: 0.449037 ms.
I1204 15:12:33.783933  6687 caffe.cpp:403]      conv1   backward: 0.251798 ms.
I1204 15:12:33.783936  6687 caffe.cpp:400]      pool1   forward: 0.0626419 ms.
I1204 15:12:33.783941  6687 caffe.cpp:403]      pool1   backward: 0.00116608 ms.
I1204 15:12:33.783943  6687 caffe.cpp:400]      conv2   forward: 0.194311 ms.
I1204 15:12:33.783947  6687 caffe.cpp:403]      conv2   backward: 0.190176 ms.
I1204 15:12:33.783965  6687 caffe.cpp:400]      pool2   forward: 0.0201024 ms.
I1204 15:12:33.783969  6687 caffe.cpp:403]      pool2   backward: 0.00117952 ms.
I1204 15:12:33.783972  6687 caffe.cpp:400]        ip1   forward: 0.0706387 ms.
I1204 15:12:33.783977  6687 caffe.cpp:403]        ip1   backward: 0.0717856 ms.
I1204 15:12:33.783980  6687 caffe.cpp:400]      relu1   forward: 0.00906752 ms.
I1204 15:12:33.783984  6687 caffe.cpp:403]      relu1   backward: 0.0011584 ms.
I1204 15:12:33.783988  6687 caffe.cpp:400]        ip2   forward: 0.0247597 ms.
I1204 15:12:33.783993  6687 caffe.cpp:403]        ip2   backward: 0.0221478 ms.
I1204 15:12:33.783996  6687 caffe.cpp:400]       prob   forward: 0.0119437 ms.
I1204 15:12:33.784000  6687 caffe.cpp:403]       prob   backward: 0.00113536 ms.
I1204 15:12:33.784006  6687 caffe.cpp:408] Average Forward pass: 0.938644 ms.
I1204 15:12:33.784010  6687 caffe.cpp:410] Average Backward pass: 0.637078 ms.
I1204 15:12:33.784014  6687 caffe.cpp:412] Average Forward-Backward: 1.61356 ms.
I1204 15:12:33.784021  6687 caffe.cpp:414] Total Time: 80.678 ms.
I1204 15:12:33.784029  6687 caffe.cpp:415] *** Benchmark ends ***

My machine's GPU is an NVIDIA GeForce GTX 960 with 2 GB of VRAM; one LeNet forward pass takes less than 1 ms on average:

Average Forward pass: 0.938644 ms.

Now time it on the CPU: drop the -gpu 0 argument from the command above.

Enter:

/home/yang/caffe/build/tools/caffe time -model lenet.prototxt

The tail of the output:

I1204 15:18:10.153908  6768 caffe.cpp:397] Average time per layer:
I1204 15:18:10.153916  6768 caffe.cpp:400]       data   forward: 0.00064 ms.
I1204 15:18:10.153939  6768 caffe.cpp:403]       data   backward: 0.0009 ms.
I1204 15:18:10.153951  6768 caffe.cpp:400]      conv1   forward: 2.21126 ms.
I1204 15:18:10.153965  6768 caffe.cpp:403]      conv1   backward: 3.18376 ms.
I1204 15:18:10.153981  6768 caffe.cpp:400]      pool1   forward: 2.59676 ms.
I1204 15:18:10.153996  6768 caffe.cpp:403]      pool1   backward: 0.0006 ms.
I1204 15:18:10.154012  6768 caffe.cpp:400]      conv2   forward: 6.02428 ms.
I1204 15:18:10.154027  6768 caffe.cpp:403]      conv2   backward: 4.72778 ms.
I1204 15:18:10.154043  6768 caffe.cpp:400]      pool2   forward: 1.6211 ms.
I1204 15:18:10.154058  6768 caffe.cpp:403]      pool2   backward: 0.00072 ms.
I1204 15:18:10.154073  6768 caffe.cpp:400]        ip1   forward: 0.3852 ms.
I1204 15:18:10.154086  6768 caffe.cpp:403]        ip1   backward: 0.2337 ms.
I1204 15:18:10.154150  6768 caffe.cpp:400]      relu1   forward: 0.04076 ms.
I1204 15:18:10.154165  6768 caffe.cpp:403]      relu1   backward: 0.0005 ms.
I1204 15:18:10.154181  6768 caffe.cpp:400]        ip2   forward: 0.03236 ms.
I1204 15:18:10.154196  6768 caffe.cpp:403]        ip2   backward: 0.01712 ms.
I1204 15:18:10.154213  6768 caffe.cpp:400]       prob   forward: 0.04284 ms.
I1204 15:18:10.154230  6768 caffe.cpp:403]       prob   backward: 0.02084 ms.
I1204 15:18:10.154249  6768 caffe.cpp:408] Average Forward pass: 12.9634 ms.
I1204 15:18:10.154259  6768 caffe.cpp:410] Average Backward pass: 8.19254 ms.
I1204 15:18:10.154268  6768 caffe.cpp:412] Average Forward-Backward: 21.2 ms.
I1204 15:18:10.154278  6768 caffe.cpp:414] Total Time: 1060 ms.
I1204 15:18:10.154287  6768 caffe.cpp:415] *** Benchmark ends ***

This machine's CPU is an Intel® Core™ i7-6700K @ 4.00GHz × 8, and Caffe's basic linear algebra subroutines come from OpenBLAS.

An average forward pass takes 12.96 ms, far slower than on the GPU.

9. Handwritten digit recognition (model deployment)

With a trained model, we can now recognize handwritten digits. The test here uses the images of the test dataset together with the test.txt list file generated earlier.

The script that does this, recognize_digit.py, is shown below:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec  4 15:29:44 2018
@author: yang
"""
import sys
sys.path.append('/home/yang/caffe/python')
import numpy as np
import cv2
import caffe

MEAN = 128
SCALE = 0.00390625

imglist = sys.argv[1]  # first argument: path to the test.txt list file

caffe.set_mode_gpu()
caffe.set_device(0)
net = caffe.Net('lenet.prototxt', './snapshot/lenet_solver_iter_36000.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1, 1, 28, 28)

with open(imglist, 'r') as f:
    line = f.readline()
    while line:
        imgpath, label = line.split()
        line = f.readline()
        # apply the same preprocessing as transform_param during training
        image = cv2.imread(imgpath, cv2.IMREAD_GRAYSCALE).astype(np.float) - MEAN
        image *= SCALE
        net.blobs['data'].data[...] = image
        output = net.forward()
        pred_label = np.argmax(output['prob'][0])
        print('Predicted digit for {} is {}'.format(imgpath, pred_label))

Run it in a terminal. Because 10,000 images are too many to read on screen, redirect standard output to the file lenet_model_test.txt.

The test.txt argument is the list of test-set image paths and labels.

python recognize_digit.py test.txt >& lenet_model_test.txt

Part of the prediction output:

Predicted digit for mnist/test/005120_2.jpg is 2
Predicted digit for mnist/test/006110_1.jpg is 1
Predicted digit for mnist/test/004019_6.jpg is 6
Predicted digit for mnist/test/009045_7.jpg is 7
Predicted digit for mnist/test/004194_4.jpg is 4
Predicted digit for mnist/test/006253_7.jpg is 7
Predicted digit for mnist/test/000188_0.jpg is 0
Predicted digit for mnist/test/001068_8.jpg is 8
Predicted digit for mnist/test/007297_8.jpg is 8
Predicted digit for mnist/test/000003_0.jpg is 0
Predicted digit for mnist/test/009837_7.jpg is 7
Predicted digit for mnist/test/000093_3.jpg is 3
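Since each file name carries the ground-truth label, the overall accuracy can be computed from the redirected output. A sketch (it assumes the line format and naming rule shown above):

import re

total = correct = 0
with open('lenet_model_test.txt') as f:
    for line in f:
        m = re.match(r'Predicted digit for (\S+) is (\d)', line)
        if not m:
            continue
        imgpath, pred = m.groups()
        truth = imgpath[imgpath.rfind('_') + 1]  # label digit before ".jpg"
        total += 1
        correct += (pred == truth)
print('accuracy: {:.4f} ({}/{})'.format(correct / total, correct, total))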

10. Retraining with data augmentation

For ways to augment training data, see: https://github.com/frombeijingwithlove/dlcv_for_beginners/tree/master/chap6/data_augmentation

Because MNIST is grayscale, we augment only with translation (via random cropping) and rotation.

10.1 How to augment the training data

In the working directory, download run_augmentation.py and image_augmentation.py from the link above.

run_augmentation.py is shown below:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec  4 16:00:27 2018
@author: yang
"""
import os
import argparse
import random
import math
from multiprocessing import Process, cpu_count

import cv2

import image_augmentation as ia

def parse_args():
    parser = argparse.ArgumentParser(
        description='A Simple Image Data Augmentation Tool',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('input_dir',
                        help='Directory containing images')
    parser.add_argument('output_dir',
                        help='Directory for augmented images')
    parser.add_argument('num',
                        help='Number of images to be augmented',
                        type=int)
    parser.add_argument('--num_procs',
                        help='Number of processes for paralleled augmentation',
                        type=int, default=cpu_count())
    parser.add_argument('--p_mirror',
                        help='Ratio to mirror an image',
                        type=float, default=0.5)
    parser.add_argument('--p_crop',
                        help='Ratio to randomly crop an image',
                        type=float, default=1.0)
    parser.add_argument('--crop_size',
                        help='The ratio of cropped image size to original image size, in area',
                        type=float, default=0.8)
    parser.add_argument('--crop_hw_vari',
                        help='Variation of h/w ratio',
                        type=float, default=0.1)
    parser.add_argument('--p_rotate',
                        help='Ratio to randomly rotate an image',
                        type=float, default=1.0)
    parser.add_argument('--p_rotate_crop',
                        help='Ratio to crop out the empty part in a rotated image',
                        type=float, default=1.0)
    parser.add_argument('--rotate_angle_vari',
                        help='Variation range of rotate angle',
                        type=float, default=10.0)
    parser.add_argument('--p_hsv',
                        help='Ratio to randomly change gamma of an image',
                        type=float, default=1.0)
    parser.add_argument('--hue_vari',
                        help='Variation of hue',
                        type=int, default=10)
    parser.add_argument('--sat_vari',
                        help='Variation of saturation',
                        type=float, default=0.1)
    parser.add_argument('--val_vari',
                        help='Variation of value',
                        type=float, default=0.1)
    parser.add_argument('--p_gamma',
                        help='Ratio to randomly change gamma of an image',
                        type=float, default=1.0)
    parser.add_argument('--gamma_vari',
                        help='Variation of gamma',
                        type=float, default=2.0)
    args = parser.parse_args()
    args.input_dir = args.input_dir.rstrip('/')
    args.output_dir = args.output_dir.rstrip('/')
    return args

def generate_image_list(args):
    filenames = os.listdir(args.input_dir)
    num_imgs = len(filenames)
    num_ave_aug = int(math.floor(args.num/num_imgs))
    rem = args.num - num_ave_aug*num_imgs
    lucky_seq = [True]*rem + [False]*(num_imgs-rem)
    random.shuffle(lucky_seq)
    img_list = [
        (os.sep.join([args.input_dir, filename]), num_ave_aug+1 if lucky else num_ave_aug)
        for filename, lucky in zip(filenames, lucky_seq)
    ]
    random.shuffle(img_list)  # in case the file size are not uniformly distributed
    length = float(num_imgs) / float(args.num_procs)
    indices = [int(round(i * length)) for i in range(args.num_procs + 1)]
    return [img_list[indices[i]:indices[i + 1]] for i in range(args.num_procs)]

def augment_images(filelist, args):
    for filepath, n in filelist:
        img = cv2.imread(filepath)
        filename = filepath.split(os.sep)[-1]
        dot_pos = filename.rfind('.')
        imgname = filename[:dot_pos]
        ext = filename[dot_pos:]
        print('Augmenting {} ...'.format(filename))
        for i in range(n):
            img_varied = img.copy()
            varied_imgname = '{}_{:0>3d}_'.format(imgname, i)
            if random.random() < args.p_mirror:
                img_varied = cv2.flip(img_varied, 1)
                varied_imgname += 'm'
            if random.random() < args.p_crop:
                img_varied = ia.random_crop(
                    img_varied,
                    args.crop_size,
                    args.crop_hw_vari)
                varied_imgname += 'c'
            if random.random() < args.p_rotate:
                img_varied = ia.random_rotate(
                    img_varied,
                    args.rotate_angle_vari,
                    args.p_rotate_crop)
                varied_imgname += 'r'
            if random.random() < args.p_hsv:
                img_varied = ia.random_hsv_transform(
                    img_varied,
                    args.hue_vari,
                    args.sat_vari,
                    args.val_vari)
                varied_imgname += 'h'
            if random.random() < args.p_gamma:
                img_varied = ia.random_gamma_transform(
                    img_varied,
                    args.gamma_vari)
                varied_imgname += 'g'
            output_filepath = os.sep.join([
                args.output_dir,
                '{}{}'.format(varied_imgname, ext)])
            cv2.imwrite(output_filepath, img_varied)

def main():
    args = parse_args()
    params_str = str(args)[10:-1]
    if not os.path.exists(args.output_dir):
        os.mkdir(args.output_dir)
    print('Starting image data augmentation for {}\n'
          'with\n{}\n'.format(args.input_dir, params_str))
    sublists = generate_image_list(args)
    processes = [Process(target=augment_images, args=(x, args, )) for x in sublists]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print('\nDone!')

if __name__ == '__main__':
    main()

image_augmentation.py is shown below:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Dec  4 15:59:29 2018
@author: yang
"""
import numpy as np
import cv2

crop_image = lambda img, x0, y0, w, h: img[y0:y0+h, x0:x0+w]

def random_crop(img, area_ratio, hw_vari):
    h, w = img.shape[:2]
    hw_delta = np.random.uniform(-hw_vari, hw_vari)
    hw_mult = 1 + hw_delta
    w_crop = int(round(w*np.sqrt(area_ratio*hw_mult)))
    if w_crop > w - 2:
        w_crop = w - 2
    h_crop = int(round(h*np.sqrt(area_ratio/hw_mult)))
    if h_crop > h - 2:
        h_crop = h - 2
    x0 = np.random.randint(0, w-w_crop-1)
    y0 = np.random.randint(0, h-h_crop-1)
    return crop_image(img, x0, y0, w_crop, h_crop)

def rotate_image(img, angle, crop):
    h, w = img.shape[:2]
    angle %= 360
    M_rotate = cv2.getRotationMatrix2D((w/2, h/2), angle, 1)
    img_rotated = cv2.warpAffine(img, M_rotate, (w, h))
    if crop:
        angle_crop = angle % 180
        if angle_crop > 90:
            angle_crop = 180 - angle_crop
        theta = angle_crop * np.pi / 180.0
        hw_ratio = float(h) / float(w)
        tan_theta = np.tan(theta)
        numerator = np.cos(theta) + np.sin(theta) * tan_theta
        r = hw_ratio if h > w else 1 / hw_ratio
        denominator = r * tan_theta + 1
        crop_mult = numerator / denominator
        w_crop = int(round(crop_mult*w))
        h_crop = int(round(crop_mult*h))
        x0 = int((w-w_crop)/2)
        y0 = int((h-h_crop)/2)
        img_rotated = crop_image(img_rotated, x0, y0, w_crop, h_crop)
    return img_rotated

def random_rotate(img, angle_vari, p_crop):
    angle = np.random.uniform(-angle_vari, angle_vari)
    crop = False if np.random.random() > p_crop else True
    return rotate_image(img, angle, crop)

def hsv_transform(img, hue_delta, sat_mult, val_mult):
    img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float)
    img_hsv[:, :, 0] = (img_hsv[:, :, 0] + hue_delta) % 180
    img_hsv[:, :, 1] *= sat_mult
    img_hsv[:, :, 2] *= val_mult
    img_hsv[img_hsv > 255] = 255
    return cv2.cvtColor(np.round(img_hsv).astype(np.uint8), cv2.COLOR_HSV2BGR)

def random_hsv_transform(img, hue_vari, sat_vari, val_vari):
    hue_delta = np.random.randint(-hue_vari, hue_vari)
    sat_mult = 1 + np.random.uniform(-sat_vari, sat_vari)
    val_mult = 1 + np.random.uniform(-val_vari, val_vari)
    return hsv_transform(img, hue_delta, sat_mult, val_mult)

def gamma_transform(img, gamma):
    gamma_table = [np.power(x / 255.0, gamma) * 255.0 for x in range(256)]
    gamma_table = np.round(np.array(gamma_table)).astype(np.uint8)
    return cv2.LUT(img, gamma_table)

def random_gamma_transform(img, gamma_vari):
    log_gamma_vari = np.log(gamma_vari)
    alpha = np.random.uniform(-log_gamma_vari, log_gamma_vari)
    gamma = np.exp(alpha)
    return gamma_transform(img, gamma)

We use these two Python scripts to augment the 50,000 images under mnist/train, generating 250,000 new images; merged with the original 50,000, this gives 300,000 images, six times the original dataset!

Turn off every option except rotation and translation, and set the rotation range to ±15 degrees.

Enter the following in a terminal:

python run_augmentation.py mnist/train/ mnist/augmented 250000 --rotate_angle=15 --p_mirror=0 --p_hsv=0 --p_gamma=0

This generates 250,000 translated and rotated images under mnist/augmented/, named consistently with the parsing rule of gen_caffe_imglist.py. Next, generate the path/label list file for these images:

python gen_caffe_imglist.py mnist/augmented augmented.txt

Then merge the original training list and the augmented list into train_aug.txt:

cat train.txt augmented.txt > train_aug.txt

Now build a separate lmdb folder for train_aug.txt.

Because the perturbed images are no longer necessarily 28*28, here you must pass --resize_width=28 and --resize_height=28 to fix the lmdb image size at 28*28; also use --shuffle to randomize the input order:

/home/yang/caffe/build/tools/convert_imageset ./ train_aug.txt train_aug_lmdb --resize_width=28 --resize_height=28 --gray --shuffle

Next copy lenet_train_val.prototxt to a new file named lenet_train_val_aug.prototxt,

and change the source lmdb path of the training data layer in lenet_train_val_aug.prototxt to:

source: "train_aug_lmdb"

Then create a snapshot_aug folder in the working directory, plus a train_aug_Log folder for the log files.

Also copy lenet_solver.prototxt to lenet_aug_solver.prototxt, and in lenet_aug_solver.prototxt change

the net parameter to: net: "lenet_train_val_aug.prototxt"

and the snapshot prefix to: snapshot_prefix: "snapshot_aug"

Now training can start, with the log written to the train_aug_Log folder:

/home/yang/caffe/build/tools/caffe train -solver lenet_aug_solver.prototxt -gpu 0 -log_dir ./train_aug_Log

The last few lines of the output:

I1204 17:12:28.571137  5109 solver.cpp:414]     Test net output #0: accuracy = 0.9911
I1204 17:12:28.571157  5109 solver.cpp:414]     Test net output #1: loss = 0.0319057 (* 1 = 0.0319057 loss)

Then, under train_aug_Log, rename the log file to mnist_train_with_augmentation.log.

The pre-augmentation training log is mnist_train.log under the trainLog folder.

Now use Caffe's plot_training_log.py tool to draw comparison plots for the two runs:

python /home/yang/caffe/tools/extra/plot_training_log.py

The following command plots validation accuracy vs. iterations for both runs:

python /home/yang/caffe/tools/extra/plot_training_log.py 0 test_acc_vs_iters.png trainLog/mnist_train.log train_aug_Log/mnist_train_with_augmentation.log

The following command plots validation loss vs. iterations for both runs:

python /home/yang/caffe/tools/extra/plot_training_log.py 2 test_loss_vs_iters.png trainLog/mnist_train.log train_aug_Log/mnist_train_with_augmentation.log

10.2 Resume training from the last saved state

The original training data had only 50,000 images; with a batch size of 50, 1,000 iterations make one epoch.

The augmented data has 300,000 images (50 * 6,000 = 300,000), so one epoch now takes 6,000 iterations, and 36,000 iterations is only 6 epochs.

To continue from the 36,000-iteration state and train up to 20 epochs, the maximum iteration count becomes 120,000:

120000 / 6000 = 20 (epochs)

In lenet_aug_solver.prototxt, change the maximum iteration count from:

max_iter: 36000

to: max_iter: 120000

Then run the following command to resume training from the 36,000-iteration solver state:

/home/yang/caffe/build/tools/caffe train -solver lenet_aug_solver.prototxt -snapshot snapshot_aug/lenet_aug_solver_iter_36000.solverstate -gpu 0 -log_dir ./train_aug_Log

Part of the training output:

I1204 18:44:26.388453  6406 solver.cpp:414]     Test net output #0: accuracy = 0.9911
I1204 18:44:26.388473  6406 solver.cpp:414]     Test net output #1: loss = 0.0305995 (* 1 = 0.0305995 loss)
I1204 18:44:26.388478  6406 solver.cpp:332] Optimization Done.
I1204 18:44:26.388481  6406 caffe.cpp:250] Optimization Done.

Now plot accuracy vs. iterations:

python /home/yang/caffe/tools/extra/plot_training_log.py 0 test_acc_vs_iters_120000.png train_aug_Log/mnist_train_augmentation_iter_120000.log

and loss vs. iterations:

python /home/yang/caffe/tools/extra/plot_training_log.py 2 test_loss_vs_iters_120000.png train_aug_Log/mnist_train_augmentation_iter_120000.log

10.3 Real-time augmentation during training: third-party Caffe layers

Note: perturbing the original samples offline like this is only one way to augment data, and not the best one: the amount of extra data is limited, and it consumes additional disk space.

The best approach is to perturb the data in real time during training, which is equivalent to infinitely many random perturbations.

Caffe's data layer already ships the most basic perturbations: random cropping and mirroring.

There are open-source third-party Caffe layers on GitHub that implement real-time perturbation with all the common augmentation types; search GitHub for "caffe augmentation".

For example:

https://github.com/kevinlin311tw/caffe-augmentation

11. caffe-augmentation

Caffe with real-time data augmentation

Data augmentation is a simple yet effective way to enrich training data. However, we don't want to re-create a dataset (such as ImageNet) with millions of images every time we change our augmentation strategy. To address this problem, this project provides real-time training data augmentation. During training, Caffe will augment training data with random combinations of different geometric transformations (scaling, rotation, cropping), image variations (blur, sharpening, JPEG compression), and lighting adjustments.

Realtime data augmentation

Realtime data augmentation is implemented within the ImageData layer. We provide several augmentations as below:

  • Geometric transform: random flipping, cropping, resizing, rotation
  • Smooth filtering
  • JPEG compression
  • Contrast & brightness adjustment

How to use

You could specify your network prototxt as:

layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/your/imagenet_mean.binaryproto"
    contrast_adjustment: true
    smooth_filtering: true
    jpeg_compression: true
    rotation_angle_interval: 30
    display: true
  }
  image_data_param {
    source: "/home/your/image/list.txt"
    batch_size: 32
    shuffle: true
    new_height: 256
    new_width: 256
  }
}

You could also find a toy example at /examples/SSDH/train_val.prototxt

Note: ImageData Layer is currently not supported in TEST mode


Reference:

https://github.com/frombeijingwithlove/dlcv_for_beginners
