In an earlier post, http://blog.csdn.net/fengbingchun/article/details/49849225, I briefly introduced training MNIST with Caffe, simply following the example shipped with Caffe. This post walks through the execution flow and trims the training code down to a dozen or so lines:

1. First, register all layers: the constructor of the LayerRegisterer class in layer_factory.hpp runs, calling the static AddCreator and Registry functions of the LayerRegistry class. For more on layer registration in Caffe, see http://blog.csdn.net/fengbingchun/article/details/54310956
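To make the mechanism concrete, below is a minimal, standalone sketch of how this registration works. Caffe's actual Layer, LayerParameter, and Dtype template machinery are reduced to placeholders here, and the names ConvolutionLayer, CreateConvolutionLayer, and g_conv_registerer are illustrative only:

// Minimal standalone sketch of Caffe's layer registration mechanism
// (layer_factory.hpp), with Caffe types reduced to placeholders.
#include <cstdio>
#include <map>
#include <string>

struct LayerParameter {};             // placeholder for caffe::LayerParameter
struct Layer { virtual ~Layer() {} }; // placeholder for caffe::Layer<Dtype>

class LayerRegistry {
 public:
  typedef Layer* (*Creator)(const LayerParameter&);
  // Registry(): a function-local static map, created on first use
  static std::map<std::string, Creator>& Registry() {
    static std::map<std::string, Creator> registry;
    return registry;
  }
  // AddCreator(): called once per layer type, before main()
  static void AddCreator(const std::string& type, Creator creator) {
    Registry()[type] = creator;
  }
  static Layer* CreateLayer(const std::string& type, const LayerParameter& param) {
    return Registry()[type](param); // look up the creator and invoke it
  }
};

// A global LayerRegisterer is defined per layer by Caffe's REGISTER_LAYER_CLASS
// macro; its constructor runs at static-initialization time and fills the registry.
class LayerRegisterer {
 public:
  LayerRegisterer(const std::string& type, LayerRegistry::Creator creator) {
    LayerRegistry::AddCreator(type, creator);
  }
};

struct ConvolutionLayer : Layer {};
static Layer* CreateConvolutionLayer(const LayerParameter&) { return new ConvolutionLayer; }
static LayerRegisterer g_conv_registerer("Convolution", CreateConvolutionLayer);

int main() {
  LayerParameter param;
  Layer* layer = LayerRegistry::CreateLayer("Convolution", param);
  std::printf("created layer: %p\n", static_cast<void*>(layer));
  delete layer;
  return 0;
}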

2. Specify whether the execution mode is CPU or GPU;

3. Construct a SolverParameter object to hold the solver's parameters;

4. Call the ReadProtoFromTextFile function to parse the solver text file (lenet_solver.prototxt). Every field name used in this file must exist in message SolverParameter in caffe.proto, otherwise parsing fails. The file's contents are as follows:

# The train/test net protocol buffer definition
# Proto filename for the train net, possibly combined with one or more test nets.
# Here the train net and the test net are defined in the same file.
net: "E:/GitCode/Caffe_Test/test_data/model/mnist/lenet_train_test.prototxt"
# Solver type: which optimization method to use
solver_type: SGD
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
# batch size * test_iter = number of test images, i.e. 100 * 100 = 10000
test_iter: 100
# Carry out testing every 500 training iterations,
# i.e. the number of training iterations between two test phases.
test_interval: 500
# The base learning rate
base_lr: 0.01
# The momentum value
momentum: 0.9
# The weight decay
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
# Parameters used to compute the learning rate (see the sketch below)
gamma: 0.0001
power: 0.75
# Display info (e.g. the loss) every 100 training iterations.
# If display = 0, no info will be displayed.
display: 100
# The maximum number of training iterations
max_iter: 10000
# Snapshot intermediate results every 5000 training iterations
snapshot: 5000
# The prefix (save location) for the snapshot files
snapshot_prefix: "E:/GitCode/Caffe_Test/test_data/model/mnist/lenet"
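Of the fields above, lr_policy, gamma, and power deserve a worked example: with the "inv" policy, Caffe computes the learning rate at iteration iter as base_lr * (1 + gamma * iter)^(-power). A small standalone sketch of the decay over this solver's 10000 iterations:

// Learning-rate decay under the "inv" policy:
//   lr = base_lr * (1 + gamma * iter)^(-power)
#include <cmath>
#include <cstdio>

int main() {
	const double base_lr = 0.01, gamma = 0.0001, power = 0.75;
	for (int iter = 0; iter <= 10000; iter += 5000) {
		double lr = base_lr * std::pow(1.0 + gamma * iter, -power);
		printf("iter %5d: lr = %.6f\n", iter, lr);
	}
	// prints approximately:
	// iter     0: lr = 0.010000
	// iter  5000: lr = 0.007378
	// iter 10000: lr = 0.005946
	return 0;
}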

5. Convert the raw MNIST data into LMDB database format; the LMDB is read during both train and test (see the sketch below);
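The conversion is done by the mnist_convert() call in the code at the end of this post. Below is a minimal sketch of what such a conversion looks like, modeled on Caffe's examples/mnist/convert_mnist_data.cpp; the function name mnist_to_lmdb and the error handling are illustrative assumptions, not the post's actual implementation:

// Sketch: read the raw MNIST idx files and write each (image, label) pair
// into an LMDB as a serialized caffe::Datum.
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <boost/scoped_ptr.hpp>
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/db.hpp"

// MNIST files store multi-byte integers big-endian; convert to host order
static uint32_t swap_endian(uint32_t v) {
	v = ((v << 8) & 0xFF00FF00) | ((v >> 8) & 0x00FF00FF);
	return (v << 16) | (v >> 16);
}

int mnist_to_lmdb(const std::string& image_file, const std::string& label_file,
	const std::string& db_path) {
	std::ifstream images(image_file.c_str(), std::ios::binary);
	std::ifstream labels(label_file.c_str(), std::ios::binary);
	if (!images || !labels) return -1;

	uint32_t magic, num_items, num_labels, rows, cols;
	images.read(reinterpret_cast<char*>(&magic), 4);
	images.read(reinterpret_cast<char*>(&num_items), 4);
	images.read(reinterpret_cast<char*>(&rows), 4);
	images.read(reinterpret_cast<char*>(&cols), 4);
	labels.read(reinterpret_cast<char*>(&magic), 4);
	labels.read(reinterpret_cast<char*>(&num_labels), 4);
	num_items = swap_endian(num_items);
	rows = swap_endian(rows);
	cols = swap_endian(cols);

	boost::scoped_ptr<caffe::db::DB> db(caffe::db::GetDB("lmdb"));
	db->Open(db_path, caffe::db::NEW);
	boost::scoped_ptr<caffe::db::Transaction> txn(db->NewTransaction());

	std::string pixels(rows * cols, '\0');
	char label;
	caffe::Datum datum;
	datum.set_channels(1);
	datum.set_height(rows);
	datum.set_width(cols);
	for (uint32_t i = 0; i < num_items; ++i) {
		images.read(&pixels[0], rows * cols);
		labels.read(&label, 1);
		datum.set_data(pixels);
		datum.set_label(label);
		char key[16];
		snprintf(key, sizeof(key), "%08u", i); // keys sort in insertion order
		std::string value;
		datum.SerializeToString(&value);
		txn->Put(std::string(key), value);
		if (i % 1000 == 999) { txn->Commit(); txn.reset(db->NewTransaction()); }
	}
	txn->Commit();
	return 0;
}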

6. According to the solver type, new an SGDSolver object and initialize it:

(1) Call the GetSolver function to new an SGDSolver object;

(2) Call the Init function of the Solver class;

(3) Call SolverParameter's DebugString function to print the parsed lenet_solver.prototxt; its output is as follows:

test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "E:/GitCode/Caffe_Test/test_data/model/mnist/lenet"
net: "E:/GitCode/Caffe_Test/test_data/model/mnist/lenet_train_test.prototxt"
solver_type: SGD

(4) Call the ReadNetParamsFromTextFileOrDie function to parse lenet_train_test.prototxt. (The position of each layer block in this file, and of the fields within each layer, is not fixed; their order does not matter, although layers are conventionally listed top to bottom following the processing order.) The fields are annotated below:

name: "LeNet" # net name
layer { # memory required: (50176+64)*4=200960 bytes
  name: "mnist" # layer name
  type: "Data" # layer type: data layer; data enters Caffe through data layers, read from LEVELDB or LMDB
  top: "data" # top name, shape: 64 1 28 28 (50176)
  top: "label" # top name, shape: 64 (64)
  include { # specifies when this layer is included in the net
    phase: TRAIN # included only in the TRAIN phase
  }
  transform_param { # image preprocessing
    scale: 0.00390625 # scale pixel values into the range [0, 1)
  }
  data_param { # data parameters
    source: "E:/GitCode/Caffe_Test/test_data/MNIST/train" # data path
    batch_size: 64 # number of images processed at a time
    backend: LMDB # storage backend
  }
}
layer { # memory required: (78400+100)*4=314000 bytes
  name: "mnist" # layer name
  type: "Data" # layer type: data layer; data enters Caffe through data layers, read from LEVELDB or LMDB
  top: "data" # top name, shape: 100 1 28 28 (78400)
  top: "label" # top name, shape: 100 (100)
  include { # specifies when this layer is included in the net
    phase: TEST # included only in the TEST phase
  }
  transform_param { # image preprocessing
    scale: 0.00390625 # scale pixel values into the range [0, 1)
  }
  data_param { # data parameters
    source: "E:/GitCode/Caffe_Test/test_data/MNIST/test" # data path
    batch_size: 100 # number of images processed at a time
    backend: LMDB # storage backend
  }
}
# In the TEST phase a split layer, label_mnist_1_split, is created automatically, as logged below:
# layer_factory.hpp:75] Creating layer label_mnist_1_split
# net.cpp:110] Creating Layer label_mnist_1_split
# net.cpp:476] label_mnist_1_split <- label
# net.cpp:432] label_mnist_1_split -> label_mnist_1_split_0
# net.cpp:432] label_mnist_1_split -> label_mnist_1_split_1
# net.cpp:155] Setting up label_mnist_1_split
# net.cpp:163] Top shape: 100 (100)
# net.cpp:163] Top shape: 100 (100)
layer { # memory required: 737280*4=2949120 / 1152000*4=4608000 bytes (train/test)
  name: "conv1" # layer name
  type: "Convolution" # layer type: convolution layer
  bottom: "data" # bottom name
  top: "conv1" # top name, shape: 64 20 24 24 (737280) / 100 20 24 24 (1152000)
  param { # parameters used during training
    lr_mult: 1 # the multiplier on the global learning rate (for the weights)
  }
  param { # parameters used during training
    lr_mult: 2 # the multiplier on the global learning rate (for the bias)
  }
  convolution_param { # convolution layer parameters
    num_output: 20 # number of output feature maps
    kernel_size: 5 # kernel size (the kernel holds the learned weights)
    stride: 1 # stride
    weight_filler { # the filler for the weights
      type: "xavier" # initialize the weights with the xavier filler
    }
    bias_filler { # the filler for the bias
      type: "constant" # initialize the bias with the constant filler
    }
  }
}
layer { # memory required: 184320*4=737280 / 288000*4=1152000 bytes (train/test)
  name: "pool1" # layer name
  type: "Pooling" # layer type: pooling layer
  bottom: "conv1" # bottom name
  top: "pool1" # top name, shape: 64 20 12 12 (184320) / 100 20 12 12 (288000)
  pooling_param { # pooling parameters
    pool: MAX # pooling method: max pooling
    kernel_size: 2 # pooling window size
    stride: 2 # stride
  }
}
layer { # memory required: 204800*4=819200 / 320000*4=1280000 bytes (train/test)
  name: "conv2" # layer name
  type: "Convolution" # layer type: convolution layer
  bottom: "pool1" # bottom name
  top: "conv2" # top name, shape: 64 50 8 8 (204800) / 100 50 8 8 (320000)
  param { # parameters used during training
    lr_mult: 1 # the multiplier on the global learning rate (for the weights)
  }
  param { # parameters used during training
    lr_mult: 2 # the multiplier on the global learning rate (for the bias)
  }
  convolution_param { # convolution layer parameters
    num_output: 50 # number of output feature maps
    kernel_size: 5 # kernel size (the kernel holds the learned weights)
    stride: 1 # stride
    weight_filler { # the filler for the weights
      type: "xavier" # initialize the weights with the xavier filler
    }
    bias_filler { # the filler for the bias
      type: "constant" # initialize the bias with the constant filler
    }
  }
}
layer { # memory required: 51200*4=204800 / 80000*4=320000 bytes (train/test)
  name: "pool2" # layer name
  type: "Pooling" # layer type: pooling layer
  bottom: "conv2" # bottom name
  top: "pool2" # top name, shape: 64 50 4 4 (51200) / 100 50 4 4 (80000)
  pooling_param { # pooling parameters
    pool: MAX # pooling method: max pooling
    kernel_size: 2 # pooling window size
    stride: 2 # stride
  }
}
layer { # memory required: 32000*4=128000 / 50000*4=200000 bytes (train/test)
  name: "ip1" # layer name
  type: "InnerProduct" # layer type: fully-connected (inner product) layer
  bottom: "pool2" # bottom name
  top: "ip1" # top name, shape: 64 500 (32000) / 100 500 (50000)
  param { # parameters used during training
    lr_mult: 1 # the multiplier on the global learning rate (for the weights)
  }
  param { # parameters used during training
    lr_mult: 2 # the multiplier on the global learning rate (for the bias)
  }
  inner_product_param { # fully-connected layer parameters
    num_output: 500 # number of outputs
    weight_filler { # the filler for the weights
      type: "xavier" # initialize the weights with the xavier filler
    }
    bias_filler { # the filler for the bias
      type: "constant" # initialize the bias with the constant filler
    }
  }
}
# ReLU: Given an input value x, the ReLU layer computes the output as x if x > 0 and
# negative_slope * x if x <= 0. When the negative slope parameter is not set,
# it is equivalent to the standard ReLU function of taking max(x, 0).
# It also supports in-place computation, meaning that the bottom and
# the top blob can be the same, to save memory.
layer { # memory required: 32000*4=128000 / 50000*4=200000 bytes (train/test)
  name: "relu1" # layer name
  type: "ReLU" # layer type
  bottom: "ip1" # bottom name
  top: "ip1" # top name (in-place), shape: 64 500 (32000) / 100 500 (50000)
}
layer { # memory required: 640*4=2560 / 1000*4=4000 bytes (train/test)
  name: "ip2" # layer name
  type: "InnerProduct" # layer type: fully-connected (inner product) layer
  bottom: "ip1" # bottom name
  top: "ip2" # top name, shape: 64 10 (640) / 100 10 (1000)
  param { # parameters used during training
    lr_mult: 1 # the multiplier on the global learning rate (for the weights)
  }
  param { # parameters used during training
    lr_mult: 2 # the multiplier on the global learning rate (for the bias)
  }
  inner_product_param { # fully-connected layer parameters
    num_output: 10 # number of outputs
    weight_filler { # the filler for the weights
      type: "xavier" # initialize the weights with the xavier filler
    }
    bias_filler { # the filler for the bias
      type: "constant" # initialize the bias with the constant filler
    }
  }
}
# In the TEST phase a split layer, ip2_ip2_0_split, is created automatically, as logged below:
# layer_factory.hpp:75] Creating layer ip2_ip2_0_split
# net.cpp:110] Creating Layer ip2_ip2_0_split
# net.cpp:476] ip2_ip2_0_split <- ip2
# net.cpp:432] ip2_ip2_0_split -> ip2_ip2_0_split_0
# net.cpp:432] ip2_ip2_0_split -> ip2_ip2_0_split_1
# net.cpp:155] Setting up ip2_ip2_0_split
# net.cpp:163] Top shape: 100 10 (1000)
# net.cpp:163] Top shape: 100 10 (1000)
layer { # memory required: 1*4=4 bytes
  name: "accuracy" # layer name
  type: "Accuracy" # layer type: computes the classification accuracy of the output
  bottom: "ip2" # bottom name
  bottom: "label" # bottom name
  top: "accuracy" # top name, shape: (1)
  include { # specifies when this layer is included in the net
    phase: TEST # included only in the TEST phase
  }
}
# SoftmaxWithLoss: Computes the multinomial logistic loss for a one-of-many
# classification task, passing real-valued predictions through a
# softmax to get a probability distribution over classes.
layer { # memory required: 1*4=4 / 1*4=4 bytes (train/test)
  name: "loss" # layer name
  type: "SoftmaxWithLoss" # layer type
  bottom: "ip2" # bottom name
  bottom: "label" # bottom name
  top: "loss" # top name, shape: (1) / (1)
}
# Total memory occupied by the train net:
# 200960+2949120+737280+819200+204800+128000+128000+2560+4 = 5169924 bytes
# Total memory occupied by the test net:
# 314000+(100+100)*4+4608000+1152000+1280000+320000+200000+200000+4000+(1000+1000)*4+4+4 = 8086808 bytes
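As a sanity check on the shapes and memory figures annotated above, the spatial sizes follow from the usual convolution/pooling arithmetic (no padding is used in this net):

// out = (in - kernel_size) / stride + 1 for both conv and pooling here
#include <cstdio>

int main() {
	int conv1 = (28 - 5) / 1 + 1;    // 24 -> conv1 top: N x 20 x 24 x 24
	int pool1 = (conv1 - 2) / 2 + 1; // 12 -> pool1 top: N x 20 x 12 x 12
	int conv2 = (pool1 - 5) / 1 + 1; //  8 -> conv2 top: N x 50 x 8 x 8
	int pool2 = (conv2 - 2) / 2 + 1; //  4 -> pool2 top: N x 50 x 4 x 4
	printf("conv1=%d pool1=%d conv2=%d pool2=%d\n", conv1, pool1, conv2, pool2);
	// e.g. train-phase conv1 blob: 64 * 20 * 24 * 24 = 737280 floats
	//      = 737280 * 4 = 2949120 bytes, matching the annotation above
	printf("conv1 bytes (batch 64): %d\n", 64 * 20 * 24 * 24 * 4);
	return 0;
}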

The visualization of lenet_train_test.prototxt is shown in the figure below (generated with http://ethereon.github.io/netscope/quickstart.html); this view shows the TRAIN-phase pipeline only, not the TEST phase:

[Figure: Netscope visualization of the LeNet train net]

(5) Create the train Net object and call the InitTrainNet function of the Net class to build the training network; the train net prints as follows:

name: "LeNet"
state {
  phase: TRAIN
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "E:/GitCode/Caffe_Test/test_data/MNIST/train"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
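Note that the accuracy layer (marked include { phase: TEST }) does not appear in this train net: during Net initialization, Caffe filters each layer's include/exclude rules against the current NetState. A simplified sketch of the include-rule check follows; Caffe's actual logic, in Net::FilterNet/Net::StateMeetsRule, also handles exclude rules as well as level and stage conditions:

// Simplified: a layer with no include rule is always kept; otherwise it is
// kept only if some include rule matches the net's phase.
#include "caffe/proto/caffe.pb.h"

bool layer_included(const caffe::NetState& state, const caffe::LayerParameter& layer) {
	if (layer.include_size() == 0) return true;
	for (int i = 0; i < layer.include_size(); ++i) {
		if (!layer.include(i).has_phase() || layer.include(i).phase() == state.phase())
			return true;
	}
	return false;
}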

(6) Create the test Net object and call the InitTestNets function of the Net class to build the test network; the test net prints as follows:

name: "LeNet"
state {
  phase: TEST
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "E:/GitCode/Caffe_Test/test_data/MNIST/test"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

(7) Call the PreSolve function of the SGDSolver class, sketched below.
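PreSolve essentially allocates the solver's bookkeeping blobs, one per learnable parameter: history_ keeps the momentum-smoothed updates, while update_ and temp_ are scratch space. Roughly, for the Caffe version used here (details may differ across versions, and shared_ptr is boost::shared_ptr in the caffe namespace):

template <typename Dtype>
void SGDSolver<Dtype>::PreSolve() {
	// one history/update/temp blob per learnable parameter, with the same shape
	const vector<shared_ptr<Blob<Dtype> > >& net_params = this->net_->params();
	history_.clear();
	update_.clear();
	temp_.clear();
	for (int i = 0; i < net_params.size(); ++i) {
		const vector<int>& shape = net_params[i]->shape();
		history_.push_back(shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
		update_.push_back(shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
		temp_.push_back(shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
	}
}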

7. Call the Solve function of the Solver class to start training and testing:

(1) Whenever the iteration count is a multiple of 500 (test_interval: 500 in lenet_solver.prototxt), run the test network's Forward pass 100 times (test_iter: 100 in lenet_solver.prototxt; with the test-phase batch size of 100 in lenet_train_test.prototxt, 100 * 100 = 10000 exactly covers all test images). The test network produces two outputs, accuracy and loss;

(2) Run one ForwardBackward pass of the training network; it produces a single output, the loss;

(3) Update the training network's weights and biases;

(4) Every 5000 training iterations (snapshot: 5000 in lenet_solver.prototxt), save intermediate results, namely the .caffemodel and .solverstate files;

(5) Repeat steps (1) through (4) above for 10000 iterations (max_iter: 10000 in lenet_solver.prototxt); the loop is sketched below.
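Putting steps (1) through (4) together, the core of Solver::Solve looks roughly like the following; this is a sketch, not verbatim Caffe code, and omits the display logic, loss averaging, and the final snapshot:

// Rough sketch of the Solver<Dtype>::Solve() / Step() main loop
while (iter_ < param_.max_iter()) {
	// (1) every test_interval iterations, run each test net test_iter times
	if (param_.test_interval() && iter_ % param_.test_interval() == 0) {
		TestAll(); // outputs the test net's accuracy and loss
	}
	// (2) one forward/backward pass of the train net; outputs the loss
	Dtype loss = net_->ForwardBackward(bottom_vec);
	// (3) compute the SGD updates (momentum, weight decay, current lr)
	//     and apply them to the weights and biases
	ComputeUpdateValue();
	net_->Update();
	++iter_;
	// (4) every snapshot iterations, write the .caffemodel and .solverstate
	if (param_.snapshot() && iter_ % param_.snapshot() == 0) {
		Snapshot();
	}
}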

The trimmed-down MNIST training code is as follows:

#include "funset.hpp"
#include "common.hpp"int mnist_train()
{caffe::Caffe::set_mode(caffe::Caffe::CPU);const std::string filename{ "E:/GitCode/Caffe_Test/test_data/model/mnist/lenet_solver.prototxt" };caffe::SolverParameter solver_param;if (!caffe::ReadProtoFromTextFile(filename.c_str(), &solver_param)) {fprintf(stderr, "parse solver.prototxt fail\n");return -1;}mnist_convert(); // convert MNIST to LMDBboost::shared_ptr<caffe::Solver<float> > solver(caffe::GetSolver<float>(solver_param));solver->Solve();fprintf(stderr, "train finish\n");return 0;
}

The final training result is shown below:

[Figure: console output of the final training run]

GitHub:https://github.com/fengbingchun/Caffe_Test
