Preface

This post walks through training a Caffe model from MATLAB step by step, much like the train command of caffe.exe.

As usual, the reference posts first:

http://caffe.berkeleyvision.org/tutorial/interfaces.html

http://www.cnblogs.com/denny402/p/5110204.html

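A small complaint: MATLAB tutorials for Caffe are really scarce, because the experts have all gone off to play with Python... o(╯□╰)o. Anyway, let's begin.

[Note] For precise terminology, please refer to the official Caffe site and other experts' blogs; I write in a plain, informal style and am not that careful with wording.

1. Loading the model

A quick look at the Caffe home page gives us this: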

solver = caffe.Solver('./models/bvlc_reference_caffenet/solver.prototxt');

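What does this line do? It loads the solver, and with it the model.

Let's try it with the solver from the MNIST example, which everyone has. I used an absolute path; a relative path works just as well.

[Note] My solver may have been modified; an earlier post describes the changes and the reasons. Here are the download links:

lenet_solver1.prototxt: link: http://pan.baidu.com/s/1qXWQrhy password: we0e

lenet_train_test1.prototxt: link: http://pan.baidu.com/s/1miawrxQ password: ghxt

Mean file: link: http://pan.baidu.com/s/1miFDNHe password: 48az

The mnist_data used below: link: http://pan.baidu.com/s/1bp62Enl password: royk

MATLAB stopped responding at this point. After some Googling, there seem to be two likely causes: first, the DLLs are not linked, which is the same reason caffe.set_mode_gpu() makes MATLAB hang for many people; second, an error inside the prototxt. I won't mention that this cost me a whole afternoon.

The first case can be ruled out: Caffe has worked smoothly so far, so a DLL problem is unlikely. That leaves a path problem in the prototxt, so let's see what it contains: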

net: "examples/mnist/lenet_train_test1.prototxt"
snapshot_prefix: "examples/mnist/lenet"

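These are the two path-related lines. Our MATLAB working folder is E:\CaffeDev\caffe-master\matlab\demo, which is nowhere near those paths. To be safe, my fix was to copy everything MNIST training needs into the MATLAB working folder.


The mnist_data folder holds the LMDB files of the MNIST dataset plus lenet.prototxt. If you do not want to build them yourself, download them from the links above; if you do, an earlier post explains how.

With the files moved, the paths inside the prototxt files have to be updated: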

lenet_solver1.prototxt

# The train/test net protocol buffer definition
net: "lenet_train_test1.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every iteration
display: 1
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "mnist_data/lenet"
# solver mode: CPU or GPU
solver_mode: CPU
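
As a quick sanity check on the learning-rate settings in this solver file: Caffe describes the "inv" policy as base_lr * (1 + gamma * iter) ^ (-power). The sketch below only evaluates that formula with the values copied from the file; the variable names (lr_gamma, lr_power, inv_lr) are my own and not part of Caffe.

```matlab
% Values copied from lenet_solver1.prototxt
base_lr  = 0.01;
lr_gamma = 0.0001;   % "gamma" in the solver file (renamed to avoid shadowing gamma())
lr_power = 0.75;     % "power" in the solver file (renamed to avoid shadowing power())

% Caffe's "inv" policy: lr = base_lr * (1 + gamma * iter)^(-power)
inv_lr = @(iter) base_lr .* (1 + lr_gamma .* iter) .^ (-lr_power);

iters = [0 500 5000 10000];
lrs = inv_lr(iters);
for k = 1:numel(iters)
    fprintf('iter %5d  lr = %.6f\n', iters(k), lrs(k));
end

% test_iter (100) times the test batch size (100) is the number of
% test images seen in one test pass
test_images = 100 * 100;
fprintf('images per test pass: %d\n', test_images);
```

The learning rate decays smoothly from 0.01 and never reaches zero, so max_iter alone decides when training stops.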

lenet_train_test1.prototxt (modified part)

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "mean.binaryproto"
    scale: 0.00390625
  }
  data_param {
    source: "mnist_data/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "mean.binaryproto"
    scale: 0.00390625
  }
  data_param {
    source: "mnist_data/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

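Before moving on, it is best to run a quick test from a .bat file to confirm that caffe.exe can read this prototxt and train with it; only continue once errors at this step have been ruled out.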
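
Relative paths in a prototxt are resolved against the current working folder of the process, which is why copying the files next to the MATLAB scripts fixes the hang. A minimal sketch of a pre-flight check follows: it pulls the two paths out of the solver text and reports whether they exist from the current folder. The solver text is embedded inline here so the snippet runs on its own; the variable names are mine.

```matlab
% Solver text embedded inline for illustration; in practice use
%   solver_text = fileread('lenet_solver1.prototxt');
solver_text = sprintf('%s\n', ...
    'net: "lenet_train_test1.prototxt"', ...
    'snapshot_prefix: "mnist_data/lenet"');

% Pull out the net definition path and the snapshot prefix
net_tok  = regexp(solver_text, '^net:\s*"([^"]+)"', ...
                  'tokens', 'once', 'lineanchors');
snap_tok = regexp(solver_text, '^snapshot_prefix:\s*"([^"]+)"', ...
                  'tokens', 'once', 'lineanchors');
net_path     = net_tok{1};
snapshot_dir = fileparts(snap_tok{1});   % folder part of the prefix

% exist() returns 2 for a file and 7 for a folder
fprintf('working folder : %s\n', pwd);
fprintf('net file       : %s (found: %d)\n', ...
        net_path, exist(net_path, 'file') == 2);
fprintf('snapshot folder: %s (found: %d)\n', ...
        snapshot_dir, exist(snapshot_dir, 'dir') == 7);
```

If either line reports found: 0, change the working folder or fix the path before calling caffe.Solver.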

Now load the model again:

addpath('..')
caffe.reset_all
solver = caffe.Solver('lenet_solver1.prototxt');

Display it:

>> solver
solver =
  Solver with properties:
          net: [1x1 caffe.Net]
    test_nets: [1x1 caffe.Net]

2. Training the model

2.1 Training the model in one shot

solver.solve();

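There is a catch here: after running this, MATLAB looks idle and prints nothing at all. I thought something had gone wrong, so I inspected the two nets loaded by the solver:

The model's input was empty. Had the LMDB data not been read? I tried LevelDB and various path changes, such as adding a "./" prefix, and none of them helped. Then another possibility came to mind: loading the model does not read any data, and data is only read at run time. Since the solver's solve method trains the model in one go and prints nothing, MATLAB might already have been training. To verify this: run → go eat → come back → check. Sure enough, trained models had appeared at this path: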

snapshot_prefix: "mnist_data/lenet"

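To rule out that these models had been trained on empty data, I tested them with the MNIST classification_demo, and the handwritten digits were all classified correctly. That confirms the idea: solver.solve() trains the model in one shot without producing any output, and MATLAB looks frozen while it runs.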
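
Because solver.solve() prints nothing, one way to tell whether training has progressed is to look for snapshot files. The sketch below lists files named after the snapshot_prefix from the solver file, following the <prefix>_iter_<N>.caffemodel naming seen later in this post. The helper iter_of is a hypothetical name of mine; run it from a second MATLAB session or after solve() returns.

```matlab
snapshot_prefix = 'mnist_data/lenet';   % same value as in the solver file

% Hypothetical helper: extract the iteration number from a snapshot name
iter_of = @(name) str2double( ...
    regexp(name, '_iter_(\d+)\.caffemodel$', 'tokens', 'once'));

files = dir([snapshot_prefix '_iter_*.caffemodel']);
if isempty(files)
    disp('No snapshot yet; training may still be running.');
else
    for k = 1:numel(files)
        fprintf('%s  (iteration %d, written %s)\n', ...
            files(k).name, iter_of(files(k).name), files(k).date);
    end
end
```

With snapshot: 5000 and max_iter: 10000 in the solver file, the first file should only show up after 5000 iterations, so an empty listing early on does not mean training failed.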

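2.2 Training the model step by step

Back to the official site again:

The idea is that the step command trains for a given number of iterations, after which we can do something else before training further.

And here I got stuck. Setting step to 1 means we want to fetch the loss and accuracy at every iteration, but then what? How do we continue training? Most of the tutorials I found were for Python. Inspired by them, and after rereading the Caffe site several times, I found: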

net.blobs('data').set_data(data);
net.forward_prefilled();
prob = net.blobs('prob').get_data();

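A rough translation of the explanation given there:

net.forward takes n-dimensional input array(s) and returns the data of the output blob(s).

net.forward_prefilled runs the forward pass on the data already sitting in the input blob(s); it takes no input and produces no output.

After reading these two explanations, a question: if step trains one iteration and then pauses, is our data still stored in the blobs?

[Initial idea] If so, net.forward_prefilled could be used to carry on training. Let's try:

% Training
clear
clc
addpath('..')
caffe.reset_all
solver = caffe.Solver('lenet_solver1.prototxt');
loss = [];
accuracy = [];
for i = 1:10000
    disp('.')
    solver.step(1);
    iter = solver.iter();
    solver.net.forward_prefilled();
end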

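Update log 2016-10-21

[After experimenting] The idea above does train, but then it struck me: why is backward_prefilled not needed? Moreover, training works just as well with solver.net.forward_prefilled removed. Presumably solver.step already includes the forward and backward passes, so the training code I actually use is:

% Training
clear
clc
addpath('..')
caffe.reset_all
solver = caffe.Solver('lenet_solver1.prototxt');
loss = [];
accuracy = [];
for i = 1:10000
    disp('.')
    solver.step(1);
    iter = solver.iter();
end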

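The next task is to fetch the loss and accuracy at each iteration. No need to think twice: use the blobs. To train faster I switched to the GPU build of caffe-windows. The code:

% Training
clear
clc
close all
format long % show more digits; Caffe's loss seems to carry several decimal places
addpath('..')
caffe.reset_all % reset Caffe, otherwise loading two nets hangs
solver = caffe.Solver('lenet_solver1.prototxt'); % load the solver and its nets
loss = []; % used to track two adjacent loss values
accuracy = []; % used to track two adjacent accuracy values
hold on % keep earlier line segments on the plot
accuracy_init = 0;
loss_init = 0;
for i = 1:10000
    solver.step(1); % fetch loss and accuracy after every single iteration
    iter = solver.iter();
    loss = solver.net.blobs('loss').get_data(); % loss on the training batch
    % accuracy blob of the test (validation) net; it is only refreshed
    % when the solver runs a test pass (every test_interval iterations)
    accuracy = solver.test_nets(1).blobs('accuracy').get_data();
    % draw the loss curve as a segment from the previous point
    x = [i-1, i];
    y = [loss_init loss];
    plot(x, y, 'r-')
    drawnow
    loss_init = loss;
end

This gives us a live curve: every iteration adds one loss value to the line plot.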
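
The per-iteration loss is computed on a single mini-batch, so the raw curve tends to be jagged. A minimal sketch of smoothing it with a causal moving average is shown below. It uses a simulated loss sequence so that it runs without Caffe; in the real loop the values would be the ones read from the 'loss' blob. The window length w = 50 and all variable names are arbitrary choices of mine.

```matlab
% Simulated per-iteration losses (stand-in for values read from the 'loss' blob)
n = 1000;
raw_loss = 2.3 * exp(-(1:n) / 200) + 0.05 * sin(37 * (1:n));

% Causal moving average over the last w iterations
w = 50;                                   % window length, an arbitrary choice
smooth_loss = filter(ones(1, w) / w, 1, raw_loss);

% filter() treats samples before the first one as zero, so the first w-1
% outputs average over fewer real samples; rescale them accordingly
k = min(1:n, w);                          % number of real samples in each window
smooth_loss = smooth_loss .* (w ./ k);

plot(1:n, raw_loss, 'r-', 1:n, smooth_loss, 'b-');
legend('raw loss', 'moving average');
xlabel('iteration');
ylabel('loss');
```

Collecting the losses in a preallocated vector and plotting once (or every few hundred iterations) is also much cheaper than calling plot and drawnow on every iteration.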

To make sure the training went correctly, test the resulting model:

E:\CaffeDev-GPU\caffe-master\Build\x64\Release\caffe.exe test --model=lenet_train_test1.prototxt -weights=mnist_data/lenet_iter_10000.caffemodel -gpu=0
pause

The result:

I1021 21:00:07.988450  8132 net.cpp:261] This network produces output accuracy
I1021 21:00:07.989449  8132 net.cpp:261] This network produces output loss
I1021 21:00:07.989449  8132 net.cpp:274] Network initialization done.
I1021 21:00:07.992449  8132 caffe.cpp:253] Running for 50 iterations.
I1021 21:00:07.999449  8132 caffe.cpp:276] Batch 0, accuracy = 0.96
I1021 21:00:07.999449  8132 caffe.cpp:276] Batch 0, loss = 0.168208
I1021 21:00:08.002449  8132 caffe.cpp:276] Batch 1, accuracy = 0.95
I1021 21:00:08.002449  8132 caffe.cpp:276] Batch 1, loss = 0.152652
I1021 21:00:08.005450  8132 caffe.cpp:276] Batch 2, accuracy = 0.88
I1021 21:00:08.005450  8132 caffe.cpp:276] Batch 2, loss = 0.320218
I1021 21:00:08.007450  8132 caffe.cpp:276] Batch 3, accuracy = 0.92
I1021 21:00:08.008450  8132 caffe.cpp:276] Batch 3, loss = 0.320782
I1021 21:00:08.010450  8132 caffe.cpp:276] Batch 4, accuracy = 0.88
I1021 21:00:08.011451  8132 caffe.cpp:276] Batch 4, loss = 0.354194
I1021 21:00:08.013450  8132 caffe.cpp:276] Batch 5, accuracy = 0.91
I1021 21:00:08.013450  8132 caffe.cpp:276] Batch 5, loss = 0.604682
I1021 21:00:08.015450  8132 caffe.cpp:276] Batch 6, accuracy = 0.88
I1021 21:00:08.015450  8132 caffe.cpp:276] Batch 6, loss = 0.310961
I1021 21:00:08.017451  8132 caffe.cpp:276] Batch 7, accuracy = 0.95
I1021 21:00:08.017451  8132 caffe.cpp:276] Batch 7, loss = 0.18691
I1021 21:00:08.019450  8132 caffe.cpp:276] Batch 8, accuracy = 0.93
I1021 21:00:08.019450  8132 caffe.cpp:276] Batch 8, loss = 0.302631
I1021 21:00:08.022451  8132 caffe.cpp:276] Batch 9, accuracy = 0.96
I1021 21:00:08.022451  8132 caffe.cpp:276] Batch 9, loss = 0.10867
I1021 21:00:08.024451  8132 caffe.cpp:276] Batch 10, accuracy = 0.94
I1021 21:00:08.024451  8132 caffe.cpp:276] Batch 10, loss = 0.283927
I1021 21:00:08.026451  8132 caffe.cpp:276] Batch 11, accuracy = 0.91
I1021 21:00:08.027451  8132 caffe.cpp:276] Batch 11, loss = 0.389279
I1021 21:00:08.029451  8132 caffe.cpp:276] Batch 12, accuracy = 0.87
I1021 21:00:08.029451  8132 caffe.cpp:276] Batch 12, loss = 0.618325
I1021 21:00:08.031451  8132 caffe.cpp:276] Batch 13, accuracy = 0.91
I1021 21:00:08.031451  8132 caffe.cpp:276] Batch 13, loss = 0.464931
I1021 21:00:08.033452  8132 caffe.cpp:276] Batch 14, accuracy = 0.91
I1021 21:00:08.033452  8132 caffe.cpp:276] Batch 14, loss = 0.348089
I1021 21:00:08.035451  8132 caffe.cpp:276] Batch 15, accuracy = 0.88
I1021 21:00:08.035451  8132 caffe.cpp:276] Batch 15, loss = 0.45388
I1021 21:00:08.037451  8132 caffe.cpp:276] Batch 16, accuracy = 0.93
I1021 21:00:08.038452  8132 caffe.cpp:276] Batch 16, loss = 0.277403
I1021 21:00:08.040452  8132 caffe.cpp:276] Batch 17, accuracy = 0.9
I1021 21:00:08.040452  8132 caffe.cpp:276] Batch 17, loss = 0.48363
I1021 21:00:08.042453  8132 caffe.cpp:276] Batch 18, accuracy = 0.91
I1021 21:00:08.042453  8132 caffe.cpp:276] Batch 18, loss = 0.519036
I1021 21:00:08.044452  8132 caffe.cpp:276] Batch 19, accuracy = 0.88
I1021 21:00:08.045452  8132 caffe.cpp:276] Batch 19, loss = 0.364235
I1021 21:00:08.047452  8132 caffe.cpp:276] Batch 20, accuracy = 0.9
I1021 21:00:08.047452  8132 caffe.cpp:276] Batch 20, loss = 0.414757
I1021 21:00:08.049453  8132 caffe.cpp:276] Batch 21, accuracy = 0.9
I1021 21:00:08.049453  8132 caffe.cpp:276] Batch 21, loss = 0.387713
I1021 21:00:08.051452  8132 caffe.cpp:276] Batch 22, accuracy = 0.93
I1021 21:00:08.051452  8132 caffe.cpp:276] Batch 22, loss = 0.308721
I1021 21:00:08.053452  8132 caffe.cpp:276] Batch 23, accuracy = 0.93
I1021 21:00:08.053452  8132 caffe.cpp:276] Batch 23, loss = 0.328804
I1021 21:00:08.055454  8132 caffe.cpp:276] Batch 24, accuracy = 0.92
I1021 21:00:08.055454  8132 caffe.cpp:276] Batch 24, loss = 0.385196
I1021 21:00:08.058454  8132 caffe.cpp:276] Batch 25, accuracy = 0.93
I1021 21:00:08.058454  8132 caffe.cpp:276] Batch 25, loss = 0.255955
I1021 21:00:08.061453  8132 caffe.cpp:276] Batch 26, accuracy = 0.92
I1021 21:00:08.061453  8132 caffe.cpp:276] Batch 26, loss = 0.49177
I1021 21:00:08.063453  8132 caffe.cpp:276] Batch 27, accuracy = 0.89
I1021 21:00:08.064453  8132 caffe.cpp:276] Batch 27, loss = 0.366904
I1021 21:00:08.066453  8132 caffe.cpp:276] Batch 28, accuracy = 0.93
I1021 21:00:08.066453  8132 caffe.cpp:276] Batch 28, loss = 0.309272
I1021 21:00:08.068454  8132 caffe.cpp:276] Batch 29, accuracy = 0.88
I1021 21:00:08.068454  8132 caffe.cpp:276] Batch 29, loss = 0.520516
I1021 21:00:08.070453  8132 caffe.cpp:276] Batch 30, accuracy = 0.92
I1021 21:00:08.070453  8132 caffe.cpp:276] Batch 30, loss = 0.358098
I1021 21:00:08.072453  8132 caffe.cpp:276] Batch 31, accuracy = 0.94
I1021 21:00:08.072453  8132 caffe.cpp:276] Batch 31, loss = 0.157759
I1021 21:00:08.074455  8132 caffe.cpp:276] Batch 32, accuracy = 0.91
I1021 21:00:08.075454  8132 caffe.cpp:276] Batch 32, loss = 0.336977
I1021 21:00:08.077455  8132 caffe.cpp:276] Batch 33, accuracy = 0.95
I1021 21:00:08.077455  8132 caffe.cpp:276] Batch 33, loss = 0.116172
I1021 21:00:08.079454  8132 caffe.cpp:276] Batch 34, accuracy = 0.93
I1021 21:00:08.079454  8132 caffe.cpp:276] Batch 34, loss = 0.136695
I1021 21:00:08.081454  8132 caffe.cpp:276] Batch 35, accuracy = 0.89
I1021 21:00:08.082454  8132 caffe.cpp:276] Batch 35, loss = 0.648639
I1021 21:00:08.084455  8132 caffe.cpp:276] Batch 36, accuracy = 0.91
I1021 21:00:08.084455  8132 caffe.cpp:276] Batch 36, loss = 0.256923
I1021 21:00:08.086454  8132 caffe.cpp:276] Batch 37, accuracy = 0.93
I1021 21:00:08.086454  8132 caffe.cpp:276] Batch 37, loss = 0.321325
I1021 21:00:08.088454  8132 caffe.cpp:276] Batch 38, accuracy = 0.92
I1021 21:00:08.088454  8132 caffe.cpp:276] Batch 38, loss = 0.28317
I1021 21:00:08.090456  8132 caffe.cpp:276] Batch 39, accuracy = 0.9
I1021 21:00:08.090456  8132 caffe.cpp:276] Batch 39, loss = 0.352922
I1021 21:00:08.093456  8132 caffe.cpp:276] Batch 40, accuracy = 0.93
I1021 21:00:08.093456  8132 caffe.cpp:276] Batch 40, loss = 0.298536
I1021 21:00:08.095455  8132 caffe.cpp:276] Batch 41, accuracy = 0.88
I1021 21:00:08.095455  8132 caffe.cpp:276] Batch 41, loss = 0.817203
I1021 21:00:08.097455  8132 caffe.cpp:276] Batch 42, accuracy = 0.89
I1021 21:00:08.097455  8132 caffe.cpp:276] Batch 42, loss = 0.324021
I1021 21:00:08.100455  8132 caffe.cpp:276] Batch 43, accuracy = 0.92
I1021 21:00:08.100455  8132 caffe.cpp:276] Batch 43, loss = 0.270256
I1021 21:00:08.102455  8132 caffe.cpp:276] Batch 44, accuracy = 0.89
I1021 21:00:08.102455  8132 caffe.cpp:276] Batch 44, loss = 0.443635
I1021 21:00:08.104455  8132 caffe.cpp:276] Batch 45, accuracy = 0.92
I1021 21:00:08.104455  8132 caffe.cpp:276] Batch 45, loss = 0.316793
I1021 21:00:08.106456  8132 caffe.cpp:276] Batch 46, accuracy = 0.9
I1021 21:00:08.106456  8132 caffe.cpp:276] Batch 46, loss = 0.353561
I1021 21:00:08.109457  8132 caffe.cpp:276] Batch 47, accuracy = 0.94
I1021 21:00:08.109457  8132 caffe.cpp:276] Batch 47, loss = 0.304726
I1021 21:00:08.111456  8132 caffe.cpp:276] Batch 48, accuracy = 0.88
I1021 21:00:08.111456  8132 caffe.cpp:276] Batch 48, loss = 0.643014
I1021 21:00:08.113456  8132 caffe.cpp:276] Batch 49, accuracy = 0.93
I1021 21:00:08.113456  8132 caffe.cpp:276] Batch 49, loss = 0.214009
I1021 21:00:08.113456  8132 caffe.cpp:281] Loss: 0.355134
I1021 21:00:08.113456  8132 caffe.cpp:293] accuracy = 0.9134
I1021 21:00:08.113456  8132 caffe.cpp:293] loss = 0.355134 (* 1 = 0.355134 loss)
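
Two things are worth noting about this log. First, "Running for 50 iterations" is the default of the -iterations flag of caffe test, so with a test batch size of 100 this run covers 5,000 test images, not all 10,000. Second, the final accuracy line is simply the mean of the per-batch accuracies. The sketch below shows that averaging on the first three batch lines copied from the log above; a real script would read the whole log instead of the inline text, and the variable names are mine.

```matlab
% Three lines copied from the log above; a real script would read the whole log
log_text = sprintf('%s\n', ...
    'I1021 21:00:07.999449  8132 caffe.cpp:276] Batch 0, accuracy = 0.96', ...
    'I1021 21:00:08.002449  8132 caffe.cpp:276] Batch 1, accuracy = 0.95', ...
    'I1021 21:00:08.005450  8132 caffe.cpp:276] Batch 2, accuracy = 0.88');

% Each token is a 1x1 cell holding the number after "accuracy = "
tok = regexp(log_text, 'Batch \d+, accuracy = ([\d.]+)', 'tokens');
batch_acc = cellfun(@(t) str2double(t{1}), tok);

mean_acc = mean(batch_acc);
fprintf('%d batches, mean accuracy = %.4f\n', numel(batch_acc), mean_acc);
```

For these three batches the mean is 0.93; over all 50 batches the same computation gives the 0.9134 reported at the end of the log.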

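Appendix: reading the loss from the log file

If training is run from a DOS window with the caffe train command, the loss and accuracy have to be extracted from Caffe's default log file. Finding it is easy:


Sort by time, find the most recent file whose name starts with caffe.exe, and open it in Notepad++ to see the log:

There are many ways to read the file; I use a regular expression to match the loss values.

First copy the log file out, then run: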

% my loss
clear;
clc;
close all;
train_log_file = 'caffe.exe.BINGO-PC.Bingo.log.INFO.20160924-193528.13464';
train_interval = 100;
test_interval = 500;
[~, string_output] = dos(['type ', train_log_file]); % read the log via the DOS "type" command
pat = '1 = .*? loss';
o1 = regexp(string_output, pat, 'start'); % 'start' returns the start index of each match
o2 = regexp(string_output, pat, 'end');   % 'end' returns the end index of each match
o3 = regexp(string_output, pat, 'match'); % 'match' returns the matched substrings
loss = zeros(1, size(o1, 2));
for i = 1:size(o1, 2)
    % skip the leading "1 = " (4 characters) and the trailing " loss" (5 characters)
    loss(i) = str2double(string_output(o1(i)+4 : o2(i)-5));
end
plot(loss)
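
The pattern above matches every "(* 1 = ... loss)" fragment, so it picks up both training and test lines. As a hedged alternative, the sketch below uses regexp tokens to pull the numbers out directly and keeps training and test losses apart. The sample lines are invented in the shape of Caffe's "Train net output" / "Test net output" messages, with made-up numbers; a real script would load the log with fileread. The names num, train_loss and test_loss are mine.

```matlab
% Invented sample lines in the shape of Caffe's solver output; the numbers
% are made up for illustration. A real script would use fileread(train_log_file).
log_text = sprintf('%s\n', ...
    'Train net output #0: loss = 2.35 (* 1 = 2.35 loss)', ...
    'Test net output #1: loss = 2.31 (* 1 = 2.31 loss)', ...
    'Train net output #0: loss = 0.42 (* 1 = 0.42 loss)');

num = '([-+.\deE]+)';   % crude pattern for a decimal or scientific number

% One capture group per match, so each token is a 1x1 cell
train_tok = regexp(log_text, ['Train net output #\d+: loss = ' num], 'tokens');
test_tok  = regexp(log_text, ['Test net output #\d+: loss = ' num], 'tokens');

train_loss = cellfun(@(t) str2double(t{1}), train_tok);
test_loss  = cellfun(@(t) str2double(t{1}), test_tok);

plot(train_loss, 'r-');
xlabel('logged training step');
ylabel('loss');
```

Working with tokens avoids the hand-counted +4 and -5 offsets, which break silently if the log format changes.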
