Deep Learning Caffe Hands-On Notes (6): Running Your Own Data Through a Siamese Network with Caffe on Windows

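We have finally reached the point of running your own data through a Siamese network. There is plenty of material about Siamese networks online, on forums, and in chat groups, but very little of it is hands-on. Is that because it is too easy? I searched everywhere and found almost nothing practical, and the few scattered hands-on write-ups were all based on Ubuntu, where running Caffe is much simpler than on Windows. So, as far as my own search goes, this may well be the first detailed write-up on running your own data through a Siamese network on Windows. This post covers how to run your own data with the Caffe Siamese network on Windows; the next one will cover how to adjust and build the network structure.
Let's start training. (One note: I wrote this post when I was just starting to learn, so it has many shortcomings. An expert pointed out several of them in the comments section; please watch out for those points when you train.)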

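1. Prepare the data
A Siamese network takes an image pair as input. So first put the image data into a folder, then create an index file. Each line of the index file holds two image names separated by whitespace, and those two names represent one image pair.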
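To make the index-file format concrete, here is a minimal sketch of a reader for it. It is not the converter's own code: ParsePairList and ParsePairListFromString are hypothetical names, and unlike the converter shown later (which reads whitespace-separated tokens regardless of line breaks) this sketch is line-based and skips lines that do not contain two names.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One image pair: the two file names found on a single line of the index file.
using ImagePair = std::pair<std::string, std::string>;

// Reads whitespace-separated "first second" name pairs, one pair per line.
// Blank lines are skipped; a line with only one name is ignored as malformed.
std::vector<ImagePair> ParsePairList(std::istream& in) {
    std::vector<ImagePair> pairs;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string first, second;
        if (fields >> first >> second) {
            pairs.emplace_back(first, second);
        }
    }
    return pairs;
}

// Convenience wrapper for trying the parser on an in-memory string.
std::vector<ImagePair> ParsePairListFromString(const std::string& text) {
    std::istringstream in(text);
    return ParsePairList(in);
}
```

Calling ParsePairListFromString on a short string with made-up names such as "a.jpg b.jpg" is an easy way to check an index file's layout before running a long conversion.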

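2. Convert the data
Converting the data requires a conversion executable. Where does it come from? This part nearly brought me to tears. After a long search I finally found suitable code, but it contains a big pitfall. I don't know whether whoever wrote it overlooked the problem or dug the hole on purpose: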

/*
 * convertImgToSiamese.cpp
 */
#include <algorithm>
#include <fstream>
#include <string>
#include <cstdio>
#include <utility>
#include <vector>
//#include <cstdlib>
#include "boost/scoped_ptr.hpp"
#include "gflags/gflags.h"
#include "glog/logging.h"
#include "leveldb/db.h"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"
#include "caffe/util/rng.hpp"
//#include "caffe/util/format.hpp"
#include "caffe/util/math_functions.hpp"
#include "opencv2/opencv.hpp"
#include "google/protobuf/text_format.h"
#include "stdint.h"
#include <iostream>
#include <cmath>

using namespace caffe;
using std::pair;
using boost::scoped_ptr;
using namespace cv;
using namespace std;

DEFINE_bool(gray, false, "when this option is on, treat images as grayscale ones");
DEFINE_bool(shuffle, false, "randomly shuffle the order of images and their labels");
DEFINE_string(backend, "leveldb", "the backend {lmdb, leveldb} for storing the result");
DEFINE_int32(resize_width, 0, "Width images are resized to");
DEFINE_int32(resize_height, 0, "Height images are resized to");
DEFINE_bool(check_size, false, "When this option is on, check that all the datum have the same size");
DEFINE_bool(encoded, false, "When this option is on, the encoded image will be save in datum");
DEFINE_string(encode_type, "", "Optional: What type should we encode the image as ('png','jpg',...).");
DEFINE_int32(channel, 3, "channel numbers of the image");     //1

// Reads one image, resizes it to Height x Width, and copies it into Pixels
// in channel-major (channel, row, column) order.
static bool ReadImageToMemory(const string &FileName, const int Height, const int Width, char *Pixels)   //2
{
    // read image
    //cv::Mat OriginImage = cv::imread(FileName, cv::IMREAD_GRAYSCALE);
    cv::Mat OriginImage = cv::imread(FileName);     //3. read color image
    CHECK(OriginImage.data) << "Failed to read the image.\n";

    // resize the image
    cv::Mat ResizeImage;
    cv::resize(OriginImage, ResizeImage, cv::Size(Width, Height));
    CHECK(ResizeImage.rows == Height) << "The height of the image is not equal to the input height.\n";
    CHECK(ResizeImage.cols == Width) << "The width of the image is not equal to the input width.\n";
    CHECK(ResizeImage.channels() == 3) << "The channel count of the image is not equal to three.\n";    //4. should output the warning here
    // LOG(INFO) << "height " << ResizeImage.rows << " ";
    // LOG(INFO) << "width " << ResizeImage.cols << " ";
    // LOG(INFO) << "channels " << ResizeImage.channels() << "\n";

    // copy the image data to Pixels
    for (int HeightIndex = 0; HeightIndex < Height; ++HeightIndex)
    {
        const uchar* ptr = ResizeImage.ptr<uchar>(HeightIndex);
        int img_index = 0;
        for (int WidthIndex = 0; WidthIndex < Width; ++WidthIndex)
        {
            for (int ChannelIndex = 0; ChannelIndex < ResizeImage.channels(); ++ChannelIndex)
            {
                int datum_index = (ChannelIndex * Height + HeightIndex) * Width + WidthIndex;
                *(Pixels + datum_index) = static_cast<char>(ptr[img_index++]);
            }
        }
    }
    return true;
}

int main(int argc, char** argv)
{
    //::google::InitGoogleLogging(argv[0]);
#ifndef GFLAGS_GFLAGS_H_
    namespace gflags = google;
#endif
    gflags::SetUsageMessage("Convert a set of color images to the leveldb\n"
        "format used as input for Caffe.\n"
        "Usage:\n"
        "    convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME\n");
    caffe::GlobalInit(&argc, &argv);

    // Report an error when too few arguments are given
    if (argc < 4)
    {
        gflags::ShowUsageWithFlagsRestrict(argv[0], "tools/convert_imageset");
        return 1;
    }

    // Read the two image names of each pair
    std::ifstream infile(argv[2]);
    std::vector<std::pair<std::string, std::string> > lines;
    std::string filename;
    std::string pairname;
    while (infile >> filename >> pairname)
    {
        lines.push_back(std::make_pair(filename, pairname));
    }

    // Shuffle the order of the pairs
    if (FLAGS_shuffle)
    {
        LOG(INFO) << "Shuffling data";
        shuffle(lines.begin(), lines.end());
    }
    LOG(INFO) << "A total of " << lines.size() << " images.";

    // Set the image height and width
    int resize_height = std::max<int>(0, FLAGS_resize_height);
    int resize_width = std::max<int>(0, FLAGS_resize_width);
    int channel = std::max<int>(1, FLAGS_channel);     //5. add channel info

    // Open leveldb
    leveldb::DB* db;
    leveldb::Options options;
    options.create_if_missing = true;
    options.error_if_exists = true;
    leveldb::Status status = leveldb::DB::Open(options, argv[3], &db);
    CHECK(status.ok()) << "Failed to open leveldb " << argv[3]
        << ". Is it already existing?";

    // Storing to leveldb
    std::string root_folder(argv[1]);
    //char* Pixels = new char[2 * resize_height * resize_width];
    char* Pixels = new char[2 * resize_height * resize_width * channel];    //6. add channel
    const int kMaxKeyLength = 10;   //10
    char key[kMaxKeyLength];
    std::string value;

    caffe::Datum datum;
    //datum.set_channels(2);  // one channel for each image in the pair
    datum.set_channels(2 * channel);                //7. 3 channels for each image in the pair
    datum.set_height(resize_height);
    datum.set_width(resize_width);

    // int line_size = (int)(lines.size()/2);
    // std::cout<<"number of lines: "<<line_size<<endl;
    for (int LineIndex = 0; LineIndex < lines.size(); LineIndex++)
    {
        //int PairIndex = LineIndex + line_size;
        // int PairIndex = caffe::caffe_rng_rand() % lines.size();
        char* FirstImagePixel = Pixels;
        // cout<<root_folder + lines[LineIndex].first<<endl;
        ReadImageToMemory(root_folder + lines[LineIndex].first, resize_height, resize_width, FirstImagePixel);  //8. add channel here
        //char *SecondImagePixel = Pixels + resize_width * resize_height;
        char *SecondImagePixel = Pixels + resize_width * resize_height * channel;       //10. add channel
        ReadImageToMemory(root_folder + lines[LineIndex].second, resize_height, resize_width, SecondImagePixel);  //9. add channel here

        // set image pair data
        // datum.set_data(Pixels, 2 * resize_height * resize_width);
        datum.set_data(Pixels, 2 * resize_height * resize_width * channel);     //11. correct

        // set label
        // for training, first 1000 pairs are true; for testing, first 1000 pairs are true
        // if (LineIndex<4000)   //train: 912,3000 true pairs, 81,1080 false pairs;
        // test: 35600 true pairs, 33500 false pairs
        if (LineIndex < 9123000)
        {
            datum.set_label(1);
        }
        else
        {
            datum.set_label(0);
        }
        // printf("first index: %d, second index: %d, labels: %d \n", lines[LineIndex].second, lines[PairIndex].second, datum.label());

        // serialize datum to string
        datum.SerializeToString(&value);
        int key_value = (int)(LineIndex);
        _snprintf(key, kMaxKeyLength, "%08d", key_value);
        string keystr(key);
        cout << "label: " << datum.label() << ' ' << "key index: " << keystr << endl;
        //sprintf_s(key, kMaxKeyLength, "%08d", LineIndex);
        db->Put(leveldb::WriteOptions(), std::string(key), value);
    }

    delete db;
    delete[] Pixels;
    return 0;
}
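The copy loop in ReadImageToMemory rearranges OpenCV's interleaved pixel layout (row, column, channel) into the planar channel-major layout that a Caffe Datum stores, and main writes the second image of the pair directly after the first. The sketch below reproduces that index arithmetic on plain byte vectors, without OpenCV or Caffe. InterleavedToPlanar and PackPair are hypothetical names chosen for illustration and do not appear in the converter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts one interleaved image (height x width x channels, as OpenCV stores
// it) into planar channel-major order (channels x height x width), writing
// the result into `out` starting at `offset`.
void InterleavedToPlanar(const std::vector<unsigned char>& hwc,
                         int height, int width, int channels,
                         std::vector<unsigned char>& out, std::size_t offset) {
    const std::size_t count =
        static_cast<std::size_t>(height) * width * channels;
    assert(hwc.size() == count);
    assert(out.size() >= offset + count);
    std::size_t src = 0;
    for (int h = 0; h < height; ++h) {
        for (int w = 0; w < width; ++w) {
            for (int c = 0; c < channels; ++c) {
                // Same index formula as the converter's copy loop.
                const std::size_t dst =
                    (static_cast<std::size_t>(c) * height + h) * width + w;
                out[offset + dst] = hwc[src++];
            }
        }
    }
}

// Packs two images of identical size into one buffer: the first image's
// planes come first, then the second image's planes (2 * channels in total).
std::vector<unsigned char> PackPair(const std::vector<unsigned char>& first,
                                    const std::vector<unsigned char>& second,
                                    int height, int width, int channels) {
    const std::size_t per_image =
        static_cast<std::size_t>(height) * width * channels;
    std::vector<unsigned char> packed(2 * per_image);
    InterleavedToPlanar(first, height, width, channels, packed, 0);
    InterleavedToPlanar(second, height, width, channels, packed, per_image);
    return packed;
}
```

For two color images the packed buffer therefore holds six planes, which is why the converter calls datum.set_channels(2 * channel).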

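This cpp file contains a big pitfall, which is described further down. Under caffe-windows-master\build_cpu_only\, use the same method as in the earlier posts: make a copy of the convert_imageset folder, delete the release folder inside the copy, and rename the remaining three files. Then open the copied project in VS, remove the original cpp file, and add the code above as the project's cpp file. I won't include screenshots for these steps. The earlier posts in this series have detailed screenshots, so refer to those if anything is unclear.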

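After the build, the bin directory contains the convert_imagese_siamese.exe executable, which performs the data conversion. Create a new folder under the data folder to hold the data and the executable, then write a data conversion script:

Convert the training set and the validation set separately.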

3. Start training
With the dataset ready, it is time to train. Write a training script:

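The network is still the Siamese network definition and solver (hyperparameter) file that ship with Caffe. I won't go over what needs to be changed in those two files here; refer to the earlier posts if anything is unclear.
This is where the big pitfall shows up.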

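A cv::resize error. What happened? At first I thought my OpenCV installation was at fault, so I configured OpenCV in the project, but that did not help. I tried the usual route of asking other people, but nobody could answer, and nobody tried. Relying on yourself beats relying on others, so I went through the source line by line. It was an eye-opener: the data conversion code has a big pitfall in it.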

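// Set the image height and width
int resize_height = std::max<int>(0, FLAGS_resize_height);
int resize_width = std::max<int>(0, FLAGS_resize_width);
// The data is resized to these dimensions. Jumping to the definitions of
// FLAGS_resize_width and FLAGS_resize_height shows that they come from two
// DEFINE_int32 macros, and this is the pitfall!
......
DEFINE_int32(resize_width, 0, "Width images are resized to");
DEFINE_int32(resize_height, 0, "Height images are resized to");
// The target size defaults to 0. No wonder resize fails: with both
// dimensions at 0, of course it raises an error!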

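Fix: change the 0 in the two definitions to the image size you want. Because DEFINE_int32 declares gflags command-line flags and 0 is only their default value, passing --resize_width and --resize_height on the command line should also work without editing the source.
Rebuild, convert the data again, and training can begin.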
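One way to surface this pitfall before any image is processed is to validate the requested size up front. The following is a minimal sketch and not part of the original converter; CheckResizeFlags is a hypothetical helper name, and where to call it (for example right after the flags are read in main) is an assumption.

```cpp
#include <cassert>
#include <string>

// Returns an empty string when the requested size is usable, otherwise a
// message explaining what is wrong. Hypothetical helper, not part of Caffe.
std::string CheckResizeFlags(int resize_height, int resize_width) {
    if (resize_height <= 0 || resize_width <= 0) {
        return "resize_height and resize_width must both be positive; got " +
               std::to_string(resize_height) + " x " +
               std::to_string(resize_width);
    }
    return "";
}
```

With a check like this, leaving both flags at their default of 0 would produce a readable message naming the flags instead of an error from inside cv::resize.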

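4. An open question (now resolved)
Once training started, Caffe reported an error: the shared weights of the first convolution layer had mismatched dimensions. One layer was 20*1*5*5 and the layer sharing with it was 20*5*5*5. I had no choice but to turn this weight sharing off. I did not know the exact cause, since the two convolution layers sharing weights are clearly defined identically. If anyone knows the reason, I would appreciate the explanation. Thanks in advance.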

After turning off weight sharing for the first layer, training ran successfully.

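Later I checked the glog log and found that the problem lay in how the Slice layer splits the data. The Slice layer is explained in detail in this post:
http://blog.csdn.net/u012235274/article/details/52438479
The datum now holds two 3-channel images, so 6 channels in total. With slice_point: 1 the first branch receives 1 channel and the second receives 5, which is exactly the 20*1*5*5 versus 20*5*5*5 mismatch. The fix is to remove slice_point: 1 so that the data is split evenly, and the dimension mismatch goes away.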
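The arithmetic behind this fix can be checked with a small sketch. It mimics how a channel-wise slice divides channels among its outputs, under the assumption that an empty slice-point list means an equal split and that each slice point marks where the next output starts. SliceChannelCounts is a hypothetical function written for illustration and is not Caffe's implementation.

```cpp
#include <cassert>
#include <vector>

// Returns the channel count of each output of a channel-wise slice.
// With no slice points the channels are divided equally among `num_tops`
// outputs; otherwise each slice point marks where the next output starts.
std::vector<int> SliceChannelCounts(int total_channels, int num_tops,
                                    const std::vector<int>& slice_points) {
    std::vector<int> counts;
    if (slice_points.empty()) {
        assert(num_tops > 0 && total_channels % num_tops == 0);
        counts.assign(num_tops, total_channels / num_tops);
        return counts;
    }
    assert(static_cast<int>(slice_points.size()) == num_tops - 1);
    int previous = 0;
    for (int point : slice_points) {
        assert(point > previous && point < total_channels);
        counts.push_back(point - previous);
        previous = point;
    }
    counts.push_back(total_channels - previous);
    return counts;
}
```

For 6 channels and two outputs, a slice point of 1 yields 1 and 5 channels, while no slice point yields 3 and 3. For the original grayscale case with 2 channels, a slice point of 1 yields 1 and 1, which is why the stock configuration works there.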

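Closing remarks:
The Siamese network in the Caffe example joins two LeNet branches and trains on image pairs with the ContrastiveLoss loss function. But I think Siamese is more of an idea: any network that mines the relationship between two or more branches can be called a Siamese network, and the idea is not limited to LeNet.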
