A Knowledge Summary of Open-Source Deep Learning Frameworks

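1 Overview
- 1.1 Overview of open-source frameworks
- 1.2 How to learn an open-source framework
2 Open-source frameworks
- 2.1 Caffe
  - (1) The typical Caffe workflow
  - (2) Caffe: image classification from custom model to testing
  - (3) Testing with Caffe
- 2.2 TensorFlow
  - (1) Features
  - (2) TensorFlow: image classification from custom model to testing
  - (3) Testing with TensorFlow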

This is a summary article, a short write-up to help me get a clear picture of the 12 major open-source deep learning frameworks.

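1 Overview

1.1 Overview of open-source frameworks

First, an overview table of the major open-source frameworks:
Besides these there are niche frameworks such as tiny-dnn, ConvNetJS, MarVin, and Neon, as well as mobile frameworks such as CoreML. Many factors go into choosing a framework: how complete its open-source ecosystem is, what your project needs, and which language you know best. There are also open-source tools for converting models between frameworks, such as MMDNN, which lower the cost of moving from one framework to another.
Overall, when it comes to choosing a framework, here are a few suggestions from Yousan:
(1) Whatever else you do, you must know TensorFlow and PyTorch; they are currently the two frameworks that developers like most and that have the richest set of open-source projects.

(2) If you develop algorithms for mobile devices, you cannot do without Caffe.

(3) If you are very familiar with Matlab, you should not miss MatConvNet.

(4) If you want efficiency and a light footprint, you need to know Darknet and MXNet.

(5) If you are lazy and want to finish the task with the least code, use Keras.

(6) If you are a Java programmer, learning Deeplearning4j is the right call.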

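1.2 How to learn an open-source framework

To master an open-source framework, you usually need to do the following:

(1) Know how to prepare and use data for different tasks.

(2) Know how to define a model.

(3) Know how to visualize the training process and the results.

(4) Know how to train and how to test.

Every framework ships with a number of official examples, most commonly an MNIST data interface plus a pretrained model, so that users can get results quickly. That is clearly not enough: learning should not stop at running the official demo but should aim at solving real problems.

We need to master the whole process: a custom data-reading interface, building a custom network, training the model, visualizing the model, and testing and deploying the model.
For every framework we study, we go through this whole pipeline; only then can a training task be called truly complete.
In addition, the same model is used for all frameworks: a network with 3 convolutional layers and 2 fully connected layers, built from convolution + BN + activation blocks. Some versions use strided convolution and some use pooling; the difference is small.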

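Input image: a 48×48×3 RGB color image.

First convolutional layer: 12 channels, 3*3 kernel.

Second convolutional layer: 24 channels, 3*3 kernel.

Third convolutional layer: 48 channels, 3*3 kernel.

First fully connected layer: 128 output channels.

Second fully connected layer: 2 output channels, i.e. the number of classes.

The network structure is as follows:

This is about the simplest possible network structure. The optimization setup differs slightly from framework to framework. Since the goal here is not to compare the performance of the frameworks, no special effort was made to keep everything identical.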

2 Open-source frameworks

2.1 Caffe

GitHub: https://github.com/BVLC/caffe

(1) The typical Caffe workflow:

The steps in this workflow are decoupled from one another, which makes Caffe simple and elegant to use.

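(2) Caffe: image classification from custom model to testing

Here is an example classification task; the data is available at https://github.com/longpeng2008/LongPeng_ML_Course. It contains 500 smiling images and 500 non-smiling images, placed under the data directory. The images have already been resized to 60*60; a preview follows:
① Prepare the training data
These are the non-smiling images:

These are the smiling images:

To run a training job, Caffe needs the following: train.prototxt as the network configuration file, solver.prototxt as the optimization parameter file, and a list of training files.

In addition, in most cases a pretrained model is needed to initialize the weights.
② Prepare the network configuration file
We prepare a convolutional neural network with three 3*3 convolutional layers; its train.prototxt looks like this: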

name: "mouth"
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "clc-label"
  image_data_param {
    source: "all_shuffle_train.txt"
    batch_size: 96
    shuffle: true
  }
  transform_param {
    mean_value: 104.008
    mean_value: 116.669
    mean_value: 122.675
    crop_size: 48
    mirror: true
  }
  include: { phase: TRAIN }
}
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "clc-label"
  image_data_param {
    source: "all_shuffle_val.txt"
    batch_size: 30
    shuffle: false
  }
  transform_param {
    mean_value: 104.008
    mean_value: 116.669
    mean_value: 122.675
    crop_size: 48
    mirror: false
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 12
    pad: 1
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 20
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
      std: 0.1
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "conv2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 40
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
      std: 0.1
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "ip1-mouth"
  type: "InnerProduct"
  bottom: "conv3"
  top: "pool-mouth"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 128
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  bottom: "pool-mouth"
  top: "fc-mouth"
  name: "fc-mouth"
  type: "InnerProduct"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 1
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  bottom: "fc-mouth"
  bottom: "clc-label"
  name: "loss"
  type: "SoftmaxWithLoss"
  top: "loss"
}
layer {
  bottom: "fc-mouth"
  bottom: "clc-label"
  top: "acc"
  name: "acc"
  type: "Accuracy"
  include {
    phase: TRAIN
  }
  include {
    phase: TEST
  }
}

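As you can see, in this Caffe configuration file every layer is defined in the form layer{}. A layer's bottom and top are its input and output, and type is its kind: some are data layers, some are convolutional layers, and some are loss layers.

We use Netscope to visualize this model.

The visualization shows clearly that the network's input is the data layer, followed by 3 convolutional layers, each followed by a ReLU layer; the final ip1-mouth and fc-mouth are fully connected layers. The loss and acc layers compute the loss and the accuracy respectively.
Each layer has some configuration parameters; conv1, for example, has learning-rate multipliers, the kernel size, the number of output channels, and the initialization method. These can be studied in detail later.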
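As a rough check on the configuration above, the spatial size after each convolution and the number of learnable parameters can be worked out by hand. The sketch below assumes the usual convolution output-size rule, floor((in + 2*pad - kernel) / stride) + 1, and the 48×48 crop from transform_param; the helper names are made up for illustration and are not part of Caffe.

```python
def conv_out_size(size, kernel, stride, pad):
    """Spatial output size of a convolution: floor((in + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1


def count_params(in_channels=3, in_size=48):
    """Walk the conv1-conv3 / ip1-mouth / fc-mouth stack and count weights plus biases."""
    conv_channels = [12, 20, 40]  # num_output of conv1, conv2, conv3 in train.prototxt
    params = {}
    channels, size = in_channels, in_size
    for index, out_channels in enumerate(conv_channels, start=1):
        # 3x3 kernels: one weight per (output channel, input channel, 3, 3) plus one bias each
        params["conv%d" % index] = out_channels * channels * 3 * 3 + out_channels
        size = conv_out_size(size, kernel=3, stride=2, pad=1)
        channels = out_channels
    flat = channels * size * size  # number of inputs seen by ip1-mouth
    params["ip1-mouth"] = flat * 128 + 128
    params["fc-mouth"] = 128 * 2 + 2
    return size, flat, params


final_size, flat_features, layer_params = count_params()
print(final_size, flat_features, sum(layer_params.values()))
```

Under these assumptions almost all of the parameters sit in ip1-mouth, which is typical for small networks that flatten a feature map straight into a fully connected layer.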
③ Prepare the training list
Looking at the data layer above, image_data_param contains

source: "all_shuffle_train.txt"

What is it? It is the list of inputs used for training, and its content looks like this:

../../../../datas/mouth/1/182smile.jpg 1
../../../../datas/mouth/1/435smile.jpg 1
../../../../datas/mouth/0/40neutral.jpg 0
../../../../datas/mouth/1/206smile.jpg 1
../../../../datas/mouth/0/458neutral.jpg 0
../../../../datas/mouth/0/158neutral.jpg 0
../../../../datas/mouth/1/322smile.jpg 1
../../../../datas/mouth/1/83smile.jpg 1
../../../../datas/mouth/0/403neutral.jpg 0
../../../../datas/mouth/1/425smile.jpg 1
../../../../datas/mouth/1/180smile.jpg 1
../../../../datas/mouth/1/233smile.jpg 1
../../../../datas/mouth/1/213smile.jpg 1
../../../../datas/mouth/1/144smile.jpg 1
../../../../datas/mouth/0/327neutral.jpg 0

The format is image path + space + label, which is Caffe's default input format for image classification.
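A list in this format can be generated with a few lines of Python. The following is a minimal sketch, assuming one sub-directory per integer label (as in the datas/mouth/0 and datas/mouth/1 paths above); the function names and the fixed shuffle seed are illustrative choices, not part of the original project.

```python
import os
import random


def build_list_lines(class_dirs, seed=0):
    """Return shuffled 'path label' lines, one per image.

    class_dirs maps an integer label to a list of image paths.
    """
    lines = []
    for label, paths in sorted(class_dirs.items()):
        for path in paths:
            lines.append("%s %d" % (path, label))
    random.Random(seed).shuffle(lines)  # fixed seed so the list is reproducible
    return lines


def collect_images(root, extensions=(".jpg", ".png")):
    """List image files under root/<label>/, assuming one sub-directory per label."""
    class_dirs = {}
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if os.path.isdir(folder) and name.isdigit():
            files = sorted(f for f in os.listdir(folder) if f.lower().endswith(extensions))
            class_dirs[int(name)] = [os.path.join(folder, f) for f in files]
    return class_dirs
```

Writing the result of build_list_lines(collect_images(root)) to all_shuffle_train.txt, one line per entry, gives a file the ImageData layer can read. Paths containing spaces would break the format, so they are best avoided.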
④ Prepare the optimization configuration file:

net: "./train.prototxt"
test_iter: 100
test_interval: 10
base_lr: 0.00001
momentum: 0.9
type: "Adam"
lr_policy: "fixed"
display: 100
max_iter: 10000
snapshot: 2000
snapshot_prefix: "./snaps/conv3_finetune"
solver_mode: GPU

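Here, net is the path to the network configuration. test_interval is the number of training iterations between two tests. test_iter is the number of batches used per test; if it equals 1, only one batch of data is used for testing, and if the batch size is too small the metrics for a classification task are not reliable. It is therefore best for each test to use all the test data, so one usually sets test_iter * test_batchsize = size of the test set.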
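The relation test_iter * test_batchsize = size of the test set can be turned into a small helper. This is only a sketch: the function name and the example size of 200 images are assumptions for illustration, while the batch size of 30 is the one from the TEST data layer above.

```python
import math


def required_test_iter(num_test_images, test_batch_size):
    """Smallest test_iter whose batches cover every test image at least once."""
    return int(math.ceil(num_test_images / float(test_batch_size)))


# Hypothetical validation set of 200 images with the TEST batch_size of 30.
print(required_test_iter(200, 30))
```

When the set size is not a multiple of the batch size, the last batch wraps around to the start of the list, so a few images are counted twice; choosing a batch size that divides the set size avoids this.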

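base_lr, momentum, type, and lr_policy are optimization-related parameters; base_lr and lr_policy determine how the learning rate changes over time. type is the optimization method, to be discussed later. max_iter is the maximum number of iterations, snapshot is the number of iterations between saved snapshots, and snapshot_prefix is the path prefix of the saved results; models saved by Caffe have the suffix .caffemodel. solver_mode selects GPU or CPU training.
⑤ Training and visualizing the results
Training uses the C++ interface, with the following commands: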

SOLVER=./solver.prototxt
WEIGHTS=./init.caffemodel
../../../../libs/Caffe_Long/build/tools/caffe train -solver $SOLVER -weights $WEIGHTS -gpu 0 2>&1 | tee log.txt

Here, caffe train specifies training. A script can then be used to visualize the training results:
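One possible visualization script starts by pulling the loss values out of log.txt. The sketch below assumes the log contains lines of the form "Iteration N, loss = X" (some Caffe versions insert a parenthesized speed report after the iteration number); the exact format can differ between builds, so check your own log first. The sample lines are made up for illustration.

```python
import re

# Matches both "Iteration 100, loss = 0.69" and
# "Iteration 100 (12.3 iter/s, 8.1s/100 iters), loss = 0.69".
LOSS_PATTERN = re.compile(r"Iteration (\d+)(?: \([^)]*\))?, loss = ([0-9.eE+-]+)")


def parse_loss(log_lines):
    """Return (iteration, loss) pairs found in an iterable of log lines."""
    points = []
    for line in log_lines:
        match = LOSS_PATTERN.search(line)
        if match:
            points.append((int(match.group(1)), float(match.group(2))))
    return points


# Made-up sample lines in the assumed format.
sample = [
    "I0101 10:00:00.000000  1234 solver.cpp:228] Iteration 0, loss = 0.693147",
    "I0101 10:00:05.000000  1234 solver.cpp:244]     Train net output #0: acc = 0.5",
    "I0101 10:00:09.000000  1234 solver.cpp:228] Iteration 100, loss = 0.41",
]
print(parse_loss(sample))
```

The resulting pairs can be passed to any plotting library, for example as the x and y values of a matplotlib line plot.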

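(3) Testing with Caffe

Now we test with our own images.

① Differences between train.prototxt and test.prototxt
The network configuration for testing differs from the one for training: the test network has no acc layer and no loss layer, and the softmax output is the classification result. The input layer is also different: there is no label input and no image list, but the input size must be specified. Let us look at test.prototxt and its visualization.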

name: "mouth"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 48 dim: 48 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 12
    pad: 1
    kernel_size: 3
    stride: 2
    weight_filler {
      type: "xavier"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 20
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
      std: 0.1
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "conv2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 40
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
      std: 0.1
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "ip1-mouth"
  type: "InnerProduct"
  bottom: "conv3"
  top: "pool-mouth"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 128
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  bottom: "pool-mouth"
  top: "fc-mouth"
  name: "fc-mouth"
  type: "InnerProduct"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 1
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  bottom: "fc-mouth"
  name: "loss"
  type: "Softmax"
  top: "prob"
}

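② Testing with Python
Since Python is widely used, we test with Python below. All it has to do is load the model, load the images, and output the results.
The complete code follows, with explanations: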

# -*- coding: utf-8 -*-
# --- Code segment 1: import the basic libraries and add Caffe's Python path ---
import sys
sys.path.insert(0, '../../../../../libs/Caffe_Long/python/')
import caffe
import os, shutil
import numpy as np
from PIL import Image as PILImage
from PIL import ImageMath
import matplotlib.pyplot as plt
import time
import cv2

# --- Code segment 2: add an argument parser to make the parameters easier to manage ---
debug = True
import argparse

def parse_args():
    parser = argparse.ArgumentParser(description='test resnet model for portrait segmentation')
    parser.add_argument('--model', dest='model_proto', help='the model', default='test.prototxt', type=str)
    parser.add_argument('--weights', dest='model_weight', help='the weights', default='./test.caffemodel', type=str)
    parser.add_argument('--testsize', dest='testsize', help='inference size', default=60, type=int)
    parser.add_argument('--src', dest='img_folder', help='the src image folder', type=str, default='./')
    parser.add_argument('--gt', dest='gt', help='the gt', type=int, default=0)
    args = parser.parse_args()
    return args

def start_test(model_proto, model_weight, img_folder, testsize):
    # --- Code segment 3: initialize the network ---
    caffe.set_device(0)
    #caffe.set_mode_cpu()
    net = caffe.Net(model_proto, model_weight, caffe.TEST)
    imgs = os.listdir(img_folder)
    pos = 0
    neg = 0
    for imgname in imgs:
        # --- Code segment 4: read the image and preprocess it. Recall that training
        # used BGR input with the image mean subtracted; the image fed to the network
        # also has to be resized to the matching size. Preprocessing is done with
        # Caffe's Transformer class: set_mean sets the mean and set_transpose reorders
        # the dimensions, because a Caffe blob is laid out as batch, channel, height,
        # width while a numpy image is height, width, channel ---
        imgtype = imgname.split('.')[-1]
        imgid = imgname.split('.')[0]
        if imgtype != 'png' and imgtype != 'jpg' and imgtype != 'JPG' and imgtype != 'jpeg' and imgtype != 'tif' and imgtype != 'bmp':
            print("{} error".format(imgtype))
            continue
        imgpath = os.path.join(img_folder, imgname)
        img = cv2.imread(imgpath)
        if img is None:
            print("---------img is empty--------- {}".format(imgpath))
            continue
        img = cv2.resize(img, (testsize, testsize))
        transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
        transformer.set_mean('data', np.array([104.008, 116.669, 122.675]))
        transformer.set_transpose('data', (2, 0, 1))
        # --- Code segment 5: run the network, get the output, and visualize it ---
        out = net.forward_all(data=np.asarray([transformer.preprocess('data', img)]))
        result = out['prob'][0]
        print("---------result prob--------- {} -------result size-------- {}".format(result, result.shape))
        probneutral = result[0]
        print("prob neutral {}".format(probneutral))
        probsmile = result[1]
        print("prob smile {}".format(probsmile))
        problabel = -1
        probstr = 'none'
        # Note: pos counts images predicted as neutral, neg those predicted as smile.
        if probneutral > probsmile:
            probstr = "neutral:" + str(probneutral)
            pos = pos + 1
        else:
            probstr = "smile:" + str(probsmile)
            neg = neg + 1
        if debug:
            showimg = cv2.resize(img, (256, 256))
            cv2.putText(showimg, probstr, (30, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 1)
            cv2.imshow("test", showimg)
            k = cv2.waitKey(0)
            if k == ord('q'):
                break
    print("pos= {}".format(pos))
    print("neg= {}".format(neg))

if __name__ == '__main__':
    args = parse_args()
    start_test(args.model_proto, args.model_weight, args.img_folder, args.testsize)
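To make the effect of set_transpose and set_mean concrete, here is a small pure-Python sketch of the same two steps on nested lists: reorder height-width-channel data into channel-height-width and subtract one mean value per channel. It only illustrates the data layout and is not Caffe's implementation; to_chw_minus_mean is a made-up name, and the tiny 1×2 image is invented for the example.

```python
def to_chw_minus_mean(image_hwc, mean_per_channel):
    """Convert an H x W x C nested list to C x H x W and subtract a per-channel mean."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(mean_per_channel)
    return [
        [
            [image_hwc[y][x][c] - mean_per_channel[c] for x in range(width)]
            for y in range(height)
        ]
        for c in range(channels)
    ]


mean_bgr = [104.008, 116.669, 122.675]  # the mean values used in the prototxt above
# A made-up 1 x 2 image in BGR order, as cv2.imread would return it.
image = [[[104.008, 116.669, 122.675], [105.008, 118.669, 125.675]]]
blob = to_chw_minus_mean(image, mean_bgr)
print(len(blob), len(blob[0]), len(blob[0][0]))
```

The outer index of the result is the channel, which is the order a Caffe blob expects once the batch dimension is added in front.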

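With the introduction above we have learned the basics of using Caffe, but we should not stop here. Caffe is an excellent open-source framework, and its source code is well worth reading closely.

As for how to read the Caffe code, the Caffe code-reading series is recommended.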

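2.2 TensorFlow

(1) Features

TensorFlow's defining feature is the computation graph: the graph is defined first and then executed. In the graph-and-session style used in this article, every TensorFlow program therefore has two parts:

(1) Build the computation graph, which represents the data flow of the computation. What does this do? It simply defines a set of operations; you can think of it as the counterpart of writing the prototxt in Caffe.

(2) Run a session, which executes the operations in the graph; this can be seen as the counterpart of training in Caffe. TensorFlow sessions are much more flexible than Caffe, though: because the interface is Python, fetching intermediate results for analysis and debugging is far more convenient.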
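The define-then-run idea can be illustrated without TensorFlow at all. The toy sketch below is not TensorFlow's API; Node, placeholder, add, mul, and run are made-up names that only mimic the two phases: first describe the computation as a graph, then evaluate it with concrete inputs.

```python
class Node(object):
    """A node in a toy computation graph: an operation plus its input nodes."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = inputs
        self.name = name


def placeholder(name):
    return Node("placeholder", name=name)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def run(node, feed):
    """Evaluate a node, taking placeholder values from the feed dict (the 'session' step)."""
    if node.op == "placeholder":
        return feed[node.name]
    values = [run(child, feed) for child in node.inputs]
    if node.op == "add":
        return values[0] + values[1]
    if node.op == "mul":
        return values[0] * values[1]
    raise ValueError("unknown op: %s" % node.op)


# Phase 1: build the graph; nothing is computed yet.
x = placeholder("x")
w = placeholder("w")
y = add(mul(x, w), x)

# Phase 2: run it with concrete values; the same graph can be run many times.
print(run(y, {"x": 3.0, "w": 2.0}))
```

Because any node can be passed to run, intermediate results such as mul(x, w) can be inspected the same way, which is the flexibility the text attributes to sessions.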

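(2) TensorFlow: image classification from custom model to testing

① What is TensorFlow
TensorFlow = Tensor + Flow
A tensor is an N-dimensional array, similar to a blob in Caffe; flow refers to computation based on a data-flow graph. Running a neural network means data flowing from one layer to the next. In Caffe, every intermediate layer has a bottom and a top, which expresses the same flow of data through processing steps. TensorFlow emphasizes this process more directly.
② Data preparation
As described in the previous section, data preparation in Caffe only needs a list file in which each line stores an image and a label id; that is the default input format of the ImageData layer in Caffe classification networks. If you want your own input format, you can write a custom data layer. Caffe's official Data layer and ImageData layer are very stable and have hardly changed, which is one reason I prefer Caffe: input data should be simple. By comparison, TensorFlow's data input interfaces are much more complex and change very quickly; if you are interested, see the Zhihu article 《从 Caffe 到 TensorFlow 1，IO 操作》 ("From Caffe to TensorFlow 1: IO operations").
We will not enumerate TensorFlow's data IO methods here. We first fix our data format: the same as for Caffe, a list whose lines hold an image and a label id. Then we look at how to read the data into TensorFlow for training.
We define a class called ImageData that mimics the way Caffe is used. The code is as follows: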

import tensorflow as tf
from tensorflow.contrib.data import Dataset
from tensorflow.python.framework import dtypes
from tensorflow.python.framework.ops import convert_to_tensor
import numpy as np

class ImageData:
    def read_txt_file(self):
        self.img_paths = []
        self.labels = []
        for line in open(self.txt_file, 'r'):
            items = line.split(' ')
            self.img_paths.append(items[0])
            self.labels.append(int(items[1]))

    def __init__(self, txt_file, batch_size, num_classes, image_size, buffer_scale=100):
        self.image_size = image_size
        self.batch_size = batch_size
        self.txt_file = txt_file  # txt list file, stored as: imagename id
        self.num_classes = num_classes
        buffer_size = batch_size * buffer_scale
        # Read the image list
        self.read_txt_file()
        self.dataset_size = len(self.labels)
        print("num of train datas= {}".format(self.dataset_size))
        # Convert to tensors
        self.img_paths = convert_to_tensor(self.img_paths, dtype=dtypes.string)
        self.labels = convert_to_tensor(self.labels, dtype=dtypes.int32)
        # Create the dataset
        data = Dataset.from_tensor_slices((self.img_paths, self.labels))
        print("data type= {}".format(type(data)))
        data = data.map(self.parse_function)
        data = data.repeat(1000)
        data = data.shuffle(buffer_size=buffer_size)
        # Group the data into batches
        self.data = data.batch(batch_size)
        print("self.data type= {}".format(type(self.data)))

    def augment_dataset(self, image, size):
        distorted_image = tf.image.random_brightness(image, max_delta=63)
        distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)
        # Subtract off the mean and divide by the standard deviation of the pixels.
        float_image = tf.image.per_image_standardization(distorted_image)
        return float_image

    def parse_function(self, filename, label):
        label_ = tf.one_hot(label, self.num_classes)
        img = tf.read_file(filename)
        img = tf.image.decode_jpeg(img, channels=3)
        img = tf.image.convert_image_dtype(img, dtype=tf.float32)
        img = tf.random_crop(img, [self.image_size[0], self.image_size[1], 3])
        img = tf.image.random_flip_left_right(img)
        img = self.augment_dataset(img, self.image_size)
        return img, label_

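Let us analyze the code above. The class is ImageData, and it contains several functions: the constructor __init__, the data-reading function read_txt_file, the preprocessing function parse_function, and the augmentation function augment_dataset. Going straight to the constructor, it has these steps:

(1) Read the arguments: the text list file txt_file, the batch size batch_size, the number of classes num_classes, the target image size image_size, and a buffer scaling factor buffer_scale=100.

(2) With these values in hand, read_txt_file is called. The code is simple: self.img_paths and self.labels store the file list from the input txt and the corresponding labels, much as in Caffe.

(3) Next, img_paths and labels are each converted to a Tensor with convert_to_tensor; Tensor is TensorFlow's internal data structure.

(4) Create the dataset with Dataset.from_tensor_slices. This step combines the image paths and labels into one data structure, whose interface we later use to read data in a loop for training. Creating the dataset does not by itself load any image data; data.map applies the preprocessing, and operations such as reading the image, converting its type, and random cropping and flipping are done there.

data = data.repeat(1000) repeats the dataset 1000 times, which is enough for training 1000 epochs. data = data.shuffle(buffer_size=buffer_size) shuffles the data; buffer_size controls the size of the shuffle buffer, and the more memory you have, the larger a value you can use.

(5) Assign self.data. Each training step takes one batch of data, so self.data = data.batch(batch_size) takes one batch at a time from the dataset created above.

That completes the data interface. Next, the training code shows how to read the data through an iterator.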
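What repeat, shuffle, and batch do to the stream of samples can be sketched with plain Python generators. This is a rough model of the behavior, not TensorFlow's implementation: the function names mirror the Dataset methods for readability, and the buffered shuffle here is a simplified version of the idea that only buffer_size items are held in memory at a time.

```python
import random


def repeat(items, count):
    """Yield the whole sequence `count` times, one epoch after another."""
    for _ in range(count):
        for item in items:
            yield item


def shuffle(stream, buffer_size, seed=0):
    """Buffered shuffle: hold up to `buffer_size` items and emit a random one each step."""
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        buffer.append(item)
        if len(buffer) > buffer_size:
            yield buffer.pop(rng.randrange(len(buffer)))
    while buffer:  # drain what is left at the end of the stream
        yield buffer.pop(rng.randrange(len(buffer)))


def batch(stream, batch_size):
    """Group consecutive items into lists of `batch_size` (the last one may be shorter)."""
    current = []
    for item in stream:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current


samples = ["img%d.jpg" % i for i in range(5)]
pipeline = batch(shuffle(repeat(samples, 2), buffer_size=3), batch_size=4)
batches = list(pipeline)
print(batches)
```

Because repeat comes before shuffle, as in the class above, samples from neighboring epochs can end up in the same batch; a small buffer also means an item can only move a limited distance from its original position.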

③ Model definition
With the data interface in place, we define a network:

import tensorflow as tf

# Debug flag; assumed to be a module-level variable, since the function below reads it.
debug = True

def simpleconv3net(x):
    x_shape = tf.shape(x)
    with tf.variable_scope("conv3_net"):
        conv1 = tf.layers.conv2d(x, name="conv1", filters=12, kernel_size=[3,3], strides=(2,2),
                                 activation=tf.nn.relu,
                                 kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                 bias_initializer=tf.contrib.layers.xavier_initializer())
        bn1 = tf.layers.batch_normalization(conv1, training=True, name='bn1')
        conv2 = tf.layers.conv2d(bn1, name="conv2", filters=24, kernel_size=[3,3], strides=(2,2),
                                 activation=tf.nn.relu,
                                 kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                 bias_initializer=tf.contrib.layers.xavier_initializer())
        bn2 = tf.layers.batch_normalization(conv2, training=True, name='bn2')
        conv3 = tf.layers.conv2d(bn2, name="conv3", filters=48, kernel_size=[3,3], strides=(2,2),
                                 activation=tf.nn.relu,
                                 kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                 bias_initializer=tf.contrib.layers.xavier_initializer())
        bn3 = tf.layers.batch_normalization(conv3, training=True, name='bn3')
        # Flatten the 5 x 5 x 48 feature map for the fully connected layers
        conv3_flat = tf.reshape(bn3, [-1, 5 * 5 * 48])
        dense = tf.layers.dense(inputs=conv3_flat, units=128, activation=tf.nn.relu, name="dense",
                                kernel_initializer=tf.contrib.layers.xavier_initializer())
        # Kept as in the original: the logits layer also applies ReLU
        logits = tf.layers.dense(inputs=dense, units=2, activation=tf.nn.relu, name="logits",
                                 kernel_initializer=tf.contrib.layers.xavier_initializer())
        if debug:
            print("x size= {}".format(x.shape))
            print("relu_conv1 size= {}".format(conv1.shape))
            print("relu_conv2 size= {}".format(conv2.shape))
            print("relu_conv3 size= {}".format(conv3.shape))
            print("dense size= {}".format(dense.shape))
            print("logits size= {}".format(logits.shape))
        return logits

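The code above defines our network, a simple 3-layer convolutional network. tf.layers provides various network layers; here we use tf.layers.conv2d, tf.layers.batch_normalization, and tf.layers.dense, which are the convolutional, BN, and fully connected layers respectively. Take one convolutional layer as an example: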

conv1 = tf.layers.conv2d(x, name="conv1", filters=12, kernel_size=[3,3], strides=(2,2),
                         activation=tf.nn.relu,
                         kernel_initializer=tf.contrib.layers.xavier_initializer(),
                         bias_initializer=tf.contrib.layers.xavier_initializer())

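x is the input, name is the layer's name, filters is the number of kernels, kernel_size is the kernel size, strides is the convolution stride, activation is the activation function, and kernel_initializer and bias_initializer are the initialization methods. As you can see, the activation function is built into the convolutional layer; for the full parameter list, consult the API documentation. Networks can also be defined through other interfaces: tf.nn, tf.layers, and tf.contrib overlap with one another, which I find somewhat confusing. tf.layers is used here because it offers rich parameters and suits training a model from scratch.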
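The reshape to 5 * 5 * 48 in simpleconv3net follows from these parameters. tf.layers.conv2d defaults to padding='valid' (no padding), so each 3×3, stride-2 convolution maps a side length n to floor((n - 3) / 2) + 1. A short check, with a helper name chosen only for this example:

```python
def valid_conv_size(size, kernel, stride):
    """Output size of a convolution without padding: floor((in - kernel) / stride) + 1."""
    return (size - kernel) // stride + 1


size = 48  # input images are 48 x 48
sizes = []
for _ in range(3):  # conv1, conv2, conv3: 3x3 kernels, stride 2
    size = valid_conv_size(size, kernel=3, stride=2)
    sizes.append(size)

flat_features = size * size * 48  # conv3 has 48 filters
print(sizes, flat_features)
```

If the input size or the padding mode changes, the hard-coded 5 * 5 * 48 in the reshape has to change with it.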
④ Model training

from dataset import *  # also brings tf and np into scope
from net import simpleconv3net
import sys
import os
import cv2

# ------- 1. Define some global variables -------
txtfile = sys.argv[1]
batch_size = 64
num_classes = 2
image_size = (48,48)
learning_rate = 0.0001
debug = False

if __name__ == "__main__":
    # ------- 2. Load the network, define the loss, and build the computation graph -------
    dataset = ImageData(txtfile, batch_size, num_classes, image_size)
    iterator = dataset.data.make_one_shot_iterator()
    dataset_size = dataset.dataset_size
    batch_images, batch_labels = iterator.get_next()
    Ylogits = simpleconv3net(batch_images)
    print("Ylogits size= {}".format(Ylogits.shape))
    Y = tf.nn.softmax(Ylogits)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=Ylogits, labels=batch_labels)
    cross_entropy = tf.reduce_mean(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(batch_labels, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Make sure the batch-normalization update ops run with every training step
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    with tf.control_dependencies(update_ops):
        train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
    saver = tf.train.Saver()
    in_steps = 100
    checkpoint_dir = 'checkpoints/'
    if not os.path.exists(checkpoint_dir):
        os.mkdir(checkpoint_dir)
    log_dir = 'logs/'
    if not os.path.exists(log_dir):
        os.mkdir(log_dir)
    summary = tf.summary.FileWriter(logdir=log_dir)
    loss_summary = tf.summary.scalar("loss", cross_entropy)
    acc_summary = tf.summary.scalar("acc", accuracy)
    image_summary = tf.summary.image("image", batch_images)

    # ------- 3. Run the session and save the variables; debug code can be added to inspect intermediate results -------
    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)
        steps = 10000
        for i in range(steps):
            _, cross_entropy_, accuracy_, batch_images_, batch_labels_, loss_summary_, acc_summary_, image_summary_ = sess.run(
                [train_step, cross_entropy, accuracy, batch_images, batch_labels,
                 loss_summary, acc_summary, image_summary])
            if i % in_steps == 0:
                print("{} iterations,loss= {} acc= {}".format(i, cross_entropy_, accuracy_))
                saver.save(sess, checkpoint_dir + 'model.ckpt', global_step=i)
                summary.add_summary(loss_summary_, i)
                summary.add_summary(acc_summary_, i)
                summary.add_summary(image_summary_, i)
                #print "predict=",Ylogits," labels=",batch_labels
                if debug:
                    imagedebug = batch_images_[0].copy()
                    imagedebug = np.squeeze(imagedebug)
                    print("{} {}".format(imagedebug, imagedebug.shape))
                    print(np.max(imagedebug))
                    imagelabel = batch_labels_[0].copy()
                    print(np.squeeze(imagelabel))
                    imagedebug = cv2.cvtColor((imagedebug*255).astype(np.uint8), cv2.COLOR_RGB2BGR)
                    cv2.namedWindow("debug image", 0)
                    cv2.imshow("debug image", imagedebug)
                    k = cv2.waitKey(0)
                    if k == ord('q'):
                        break

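⑤ Visualization
A very convenient feature of TensorFlow is visualization with TensorBoard. We will not go into how TensorBoard works; using it takes just three steps.

Step 1: create the log directory.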

log_dir = 'logs/'
if not os.path.exists(log_dir):
    os.mkdir(log_dir)

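Step 2: create the summary operations and give them tags. To record the loss, the accuracy, and the images seen during iteration, we create the following variables: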

loss_summary = tf.summary.scalar("loss", cross_entropy)
acc_summary = tf.summary.scalar("acc", accuracy)
image_summary = tf.summary.image("image", batch_images)

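Step 3: evaluate the summaries in the session, as in the code below; the training script then writes them to the log with summary.add_summary: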

_, cross_entropy_, accuracy_, batch_images_, batch_labels_, loss_summary_, acc_summary_, image_summary_ = sess.run(
    [train_step, cross_entropy, accuracy, batch_images, batch_labels,
     loss_summary, acc_summary, image_summary])

To view the training process and the final results, run:

tensorboard --logdir=logs

The loss and acc curves are as follows:

(3) Testing with TensorFlow

The model has been trained above; the next goal is to use it for inference. Again, here is the code.

import tensorflow as tf
from net import simpleconv3net
import sys
import numpy as np
import cv2
import os

testsize = 48
x = tf.placeholder(tf.float32, [1,testsize,testsize,3])
y = simpleconv3net(x)
y = tf.nn.softmax(y)

# sys.argv[1] is the checkpoint to restore, sys.argv[2] the list of test images
lines = open(sys.argv[2]).readlines()
count = 0
acc = 0
posacc = 0
negacc = 0
poscount = 0
negcount = 0

with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    saver = tf.train.Saver()
    saver.restore(sess, sys.argv[1])
    # test one by one, you can change it into batch inputs
    for line in lines:
        imagename, label = line.strip().split(' ')
        img = tf.read_file(imagename)
        img = tf.image.decode_jpeg(img, channels=3)
        img = tf.image.convert_image_dtype(img, dtype=tf.float32)
        img = tf.image.resize_images(img, (testsize,testsize), method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
        img = tf.image.per_image_standardization(img)
        imgnumpy = img.eval()
        imgs = np.zeros([1,testsize,testsize,3], dtype=np.float32)
        imgs[0:1,] = imgnumpy
        result = sess.run(y, feed_dict={x: imgs})
        result = np.squeeze(result)
        if result[0] > result[1]:
            predict = 0
        else:
            predict = 1
        count = count + 1
        # Counts are grouped by the predicted class
        if str(predict) == '0':
            negcount = negcount + 1
            if str(label) == str(predict):
                negacc = negacc + 1
                acc = acc + 1
        else:
            poscount = poscount + 1
            if str(label) == str(predict):
                posacc = posacc + 1
                acc = acc + 1
        print(result)

print("acc = {}".format(float(acc) / float(count)))
print("poscount= {}".format(poscount))
print("posacc = {}".format(float(posacc) / float(poscount)))
print("negcount= {}".format(negcount))
print("negacc = {}".format(float(negacc) / float(negcount)))

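As the code above shows, the model has to be defined just as in training, which corresponds to the deploy definition used by Caffe at test time.

Then the saver's restore function loads the parameters, each image is read and put into the format the network expects, and sess.run returns the final result.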
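The script above groups its positive and negative counts by the predicted class. If accuracy per ground-truth class is wanted instead, one possible way to compute it from the collected labels and predictions is sketched below; per_class_accuracy and the sample values are made up for illustration.

```python
def per_class_accuracy(labels, predictions, num_classes=2):
    """Return overall accuracy and a list of per-class accuracies (None if a class is absent)."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for label, predicted in zip(labels, predictions):
        total[label] += 1
        if label == predicted:
            correct[label] += 1
    overall = sum(correct) / float(len(labels)) if labels else None
    per_class = [
        correct[c] / float(total[c]) if total[c] else None for c in range(num_classes)
    ]
    return overall, per_class


# Made-up labels and predictions: 0 = neutral, 1 = smile.
labels = [0, 0, 1, 1, 1]
predictions = [0, 1, 1, 1, 0]
print(per_class_accuracy(labels, predictions))
```

Returning None for a class with no samples avoids the division by zero that the ratio would otherwise hit when the test list contains only one class.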
