1. Basic layer definitions and parameters

To define a network with Caffe, you first need to understand its basic layer interfaces. The five categories of layers are introduced below.

Vision Layers

Vision layers are declared in the header ./include/caffe/vision_layers.hpp. They usually take images as input and produce images as output. These layers work on the 2D spatial structure of the image and process the input according to that structure; in particular, most vision layers apply an operation to local regions of the input and produce corresponding regions in the output. Other layers, by contrast, ignore the spatial structure and simply treat the input as one large one-dimensional vector.
Convolution:

Layer type: Convolution
CPU implementation: ./src/caffe/layers/convolution_layer.cpp

CUDA GPU implementation: ./src/caffe/layers/convolution_layer.cu
Parameters (ConvolutionParameter convolution_param)
Required:

num_output (c_o): the number of filters
// the number of filters, i.e. the number of output channels
kernel_size (or kernel_h and kernel_w): specifies height and width of each filter
// the height and width of each filter
Strongly Recommended:
weight_filler [default type: 'constant' value: 0]
Optional:
bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
// the additive bias term
pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
// padding added to each side of the input image
stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
// the step between successive applications of the filter
group (g) [default 1]: If g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the ith output group channels will be only connected to the ith input group channels.
// restricts connectivity: the input channels are split into g groups, and the ith group of output channels is connected only to the ith group of input channels

Each filter produces one feature map.
Input size: $n \times c_i \times h_i \times w_i$ (batch size, channels, height, width)
Output size: $n \times c_o \times h_o \times w_o$, where $h_o = (h_i + 2 \cdot pad_h - kernel_h) / stride_h + 1$ and $w_o$ likewise.
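
A minimal prototxt sketch of a convolution layer, modeled on CaffeNet's conv1; the blob names (data, conv1) and the filler settings are only illustrative:

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # learning rate multipliers for the weights and biases
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 96     # 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # apply the filters every 4 pixels
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}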

Pooling:
The pooling layer reduces the dimensionality of the features by collapsing each neighbouring region into a single value. The currently available types are max, average, and stochastic pooling.
Parameters:
kernel_size: the height and width of each pooling region
pool: the pooling type (MAX, AVE, or STOCHASTIC)
pad: the padding added to each side of the input
stride: the interval at which to apply the pooling regions to the input
Input size:
$n \times c \times h_i \times w_i$
Output size:
$n \times c \times h_o \times w_o$, where $h_o$ and $w_o$ are computed in the same way as for convolution.
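
For example, a max-pooling layer over 3x3 regions with stride 2 could be written as follows (blob names are illustrative):

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX        # MAX, AVE, or STOCHASTIC
    kernel_size: 3   # pool over 3x3 regions
    stride: 2        # step two pixels between pooling regions
  }
}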

Local Response Normalization (LRN):
Layer type: LRN
CPU Implementation: ./src/caffe/layers/lrn_layer.cpp
CUDA GPU Implementation: ./src/caffe/layers/lrn_layer.cu
Parameters (LRNParameter lrn_param)
Optional
local_size [default 5]: the number of channels to sum over (for cross channel LRN) or the side length of the square region to sum over (for within channel LRN)
alpha [default 1]: the scaling parameter (see below)
beta [default 0.75]: the exponent (see below)
norm_region [default ACROSS_CHANNELS]: whether to sum over adjacent channels (ACROSS_CHANNELS) or nearby spatial locations (WITHIN_CHANNEL)
The local response normalization layer performs a kind of "lateral inhibition" by normalizing over local input regions. In ACROSS_CHANNELS mode, the local regions extend across nearby channels, but have no spatial extent (i.e., they have shape local_size x 1 x 1). In WITHIN_CHANNEL mode, the local regions extend spatially, but are in separate channels (i.e., they have shape 1 x local_size x local_size). Each input value is divided by $(1+(\alpha/n)\sum_i x_i^2)^{\beta}$, where $n$ is the size of each local region, and the sum is taken over the region centered at that value (zero padding is added where necessary).
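
A sketch of an LRN layer in prototxt; the alpha and beta values shown here are the ones commonly used in the CaffeNet/AlexNet reference model, and the blob names are illustrative:

layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}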

im2col
Converts an image into column vectors by laying out local patches as columns; Caffe's convolution implementation uses it to turn the convolution into a matrix multiplication.

Loss Layers

Loss layers drive learning: the network is trained by minimizing a loss function, which is computed in the forward pass and whose gradients are propagated in the backward pass.
Softmax:
This layer computes the multinomial logistic loss of its input, $l(\theta) = -\log(o_y)$, where $o_y$ is the predicted probability of the true class $y$.
Note the difference from softmax-loss (SoftmaxWithLoss), which expands $o_y$:

$\tilde{l}(y,z) = -\log\left(\frac{e^{z_y}}{\sum_{j=1}^{m} e^{z_j}}\right) = \log\left(\sum_{j=1}^{m} e^{z_j}\right) - z_y$

where $z_i = \omega_i^T x + b_i$ is the linear prediction for class $i$, and $z_y$ is the one for the true class $y$.
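
In a prototxt, the softmax loss layer takes the class scores and the labels as its two bottoms; a minimal sketch, assuming a score blob named fc8 and a label blob named label:

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"    # raw class scores z
  bottom: "label"  # ground-truth labels y
  top: "loss"
}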

Sum-of-Squares / Euclidean:
Layer type: EuclideanLoss
The Euclidean loss layer computes the loss between its two input vectors:

$\frac{1}{2N}\sum_{i=1}^{N} \lVert x_i^1 - x_i^2 \rVert_2^2$
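
A minimal sketch of its use in a prototxt; the blob names pred and target are illustrative:

layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "pred"    # predictions x^1
  bottom: "target"  # targets x^2
  top: "loss"
}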
Hinge / Margin:
Layer type: HingeLoss
Options: L1 or L2 norm
Input: n*c*h*w predictions, n*1*1*1 labels
Output: 1*1*1*1 computed loss
Example:

# L1 Norm
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "pred"
  bottom: "label"
}

# L2 Norm
layer {
  name: "loss"
  type: "HingeLoss"
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}

The hinge loss layer computes a one-vs-all hinge loss (L1) or squared hinge loss (L2).
Sigmoid Cross-Entropy:
Layer type: SigmoidCrossEntropyLoss

template <typename Dtype>
void SigmoidCrossEntropyLossLayer<Dtype>::Forward_cpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // The forward pass computes the sigmoid outputs.
  sigmoid_bottom_vec_[0] = bottom[0];
  sigmoid_layer_->Forward(sigmoid_bottom_vec_, sigmoid_top_vec_);
  // Compute the loss (negative log likelihood)
  const int count = bottom[0]->count();
  const int num = bottom[0]->num();
  // Stable version of loss computation from input data
  const Dtype* input_data = bottom[0]->cpu_data();
  const Dtype* target = bottom[1]->cpu_data();
  Dtype loss = 0;
  for (int i = 0; i < count; ++i) {
    loss -= input_data[i] * (target[i] - (input_data[i] >= 0)) -
        log(1 + exp(input_data[i] - 2 * input_data[i] * (input_data[i] >= 0)));
  }
  top[0]->mutable_cpu_data()[0] = loss / num;
}
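
For reference, a minimal sketch of how this loss is declared in a prototxt; the blob names pred and label are illustrative:

layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "pred"    # raw scores; the sigmoid is applied inside the layer
  bottom: "label"   # targets in [0, 1]
  top: "loss"
}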

Infogain:

template <typename Dtype>
void InfogainLossLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const Dtype* bottom_data = bottom[0]->cpu_data();
  const Dtype* bottom_label = bottom[1]->cpu_data();
  const Dtype* infogain_mat = NULL;
  if (bottom.size() < 3) {
    infogain_mat = infogain_.cpu_data();
  } else {
    infogain_mat = bottom[2]->cpu_data();
  }
  int num = bottom[0]->num();
  int dim = bottom[0]->count() / bottom[0]->num();
  Dtype loss = 0;
  for (int i = 0; i < num; ++i) {
    int label = static_cast<int>(bottom_label[i]);
    for (int j = 0; j < dim; ++j) {
      Dtype prob = std::max(bottom_data[i * dim + j], Dtype(kLOG_THRESHOLD));
      loss -= infogain_mat[label * dim + j] * log(prob);
    }
  }
  top[0]->mutable_cpu_data()[0] = loss / num;
}

template <typename Dtype>
void InfogainLossLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[1]) {
    LOG(FATAL) << this->type()
               << " Layer cannot backpropagate to label inputs.";
  }
  if (propagate_down.size() > 2 && propagate_down[2]) {
    LOG(FATAL) << this->type()
               << " Layer cannot backpropagate to infogain inputs.";
  }
  if (propagate_down[0]) {
    const Dtype* bottom_data = bottom[0]->cpu_data();
    const Dtype* bottom_label = bottom[1]->cpu_data();
    const Dtype* infogain_mat = NULL;
    if (bottom.size() < 3) {
      infogain_mat = infogain_.cpu_data();
    } else {
      infogain_mat = bottom[2]->cpu_data();
    }
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    int num = bottom[0]->num();
    int dim = bottom[0]->count() / bottom[0]->num();
    const Dtype scale = - top[0]->cpu_diff()[0] / num;
    for (int i = 0; i < num; ++i) {
      const int label = static_cast<int>(bottom_label[i]);
      for (int j = 0; j < dim; ++j) {
        Dtype prob = std::max(bottom_data[i * dim + j], Dtype(kLOG_THRESHOLD));
        bottom_diff[i * dim + j] = scale * infogain_mat[label * dim + j] / prob;
      }
    }
  }
}

INSTANTIATE_CLASS(InfogainLossLayer);
REGISTER_LAYER_CLASS(InfogainLoss);
}  // namespace caffe

Accuracy and Top-k:

This layer scores the accuracy of the output with respect to the target. It is only a measurement, not a loss to backpropagate: no backward pass is performed.
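
A sketch of a top-5 accuracy layer, typically restricted to the TEST phase; the blob names are illustrative:

layer {
  name: "accuracy-top5"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy-top5"
  accuracy_param {
    top_k: 5   # count a hit if the true label is among the 5 highest scores
  }
  include { phase: TEST }
}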

Activation / Neuron Layers

In general, activation / neuron layers are element-wise operators: they take one bottom blob and produce one top blob of the same size. In the layers below, we omit the input and output sizes, since they are identical:
Input: $n \times c \times h \times w$
Output: $n \times c \times h \times w$

ReLU / Rectified-Linear and Leaky-ReLU:
Parameters (ReLUParameter relu_param)
Optional
negative_slope [default 0]: specifies whether to leak the negative part by multiplying it with the slope value rather than setting it to 0.

layer {name: "relu1"type: "ReLU"bottom: "conv1"top: "conv1"
}

The ReLU function is defined as follows, for an input value $x$:

$f(x) = \begin{cases} x & \text{if } x > 0, \\ \text{negative\_slope} \cdot x & \text{otherwise.} \end{cases}$

When negative_slope is left at its default of 0, this is equivalent to the standard $\max(0, x)$; for details see my other short post:
http://blog.csdn.net/swfa1/article/details/45601789

Sigmoid:
Layer type: Sigmoid
Example:

layer {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: "Sigmoid"
}

Formula:

$f(x) = \mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$
TanH / Hyperbolic Tangent:
Layer type: TanH
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "TanH"
}

Formula:

$f(x) = \tanh(x)$
Absolute Value:
Layer type: AbsVal

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "AbsVal"
}

Formula:

$f(x) = \mathrm{abs}(x) = |x|$
Power:
Layer type: Power
Parameters:
power [default 1]
scale [default 1]
shift [default 0]
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "Power"
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}

Formula:

$f(x) = (\text{shift} + \text{scale} \cdot x)^{\text{power}}$

BNLL:
Layer type: BNLL

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "BNLL"
}

Formula:
The BNLL (binomial normal log likelihood) layer computes the output as

$f(x) = \log(1 + e^{x})$

Data Layers

Data layers bring input data into the network; they read from databases (LevelDB or LMDB), from HDF5 files, or directly from image files on disk.

Common Layers

InnerProduct:
Layer type: InnerProduct
Parameters:
Required:
num_output (c_o): the number of filters
Strongly recommended:
weight_filler [default type: 'constant' value: 0]
Optional:
bias_filler [default type: 'constant' value: 0]
bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
Example:

layer {
  name: "fc8"
  type: "InnerProduct"
  # learning rate and decay multipliers for the weights
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

Purpose:
The inner product layer (also known as the fully connected layer) treats the input as a single vector and produces its output in the form of a single vector, i.e. the height and width of the output blob are both 1.

After studying for a while, I found that some of the layers above are not described in much detail, so the Slice, ArgMax, and Eltwise layers are explained in more detail below.
Slice layer
Splits an input blob into several output blobs along a given dimension, so that each piece can be processed separately in the rest of the network.
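
A sketch of a Slice layer that splits a 10-channel blob into two blobs of 3 and 7 channels along the channel axis; the blob names and sizes are illustrative:

layer {
  name: "slice"
  type: "Slice"
  bottom: "input"
  top: "part1"   # channels 0-2
  top: "part2"   # channels 3-9
  slice_param {
    axis: 1          # slice along the channel dimension
    slice_point: 3   # the first output gets the channels before index 3
  }
}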
ArgMaxLayer
Compute the index of the $K$ max values for each datum across all dimensions ($C \times H \times W$).

Intended for use after a classification layer to produce a prediction. If parameter out_max_val is set to true, output is a vector of pairs (max_ind, max_val) for each image. The axis parameter specifies an axis along which to maximise.
NOTE: does not implement Backwards operation.
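A sketch of an ArgMax layer that turns softmax probabilities into a predicted class index; the blob names are illustrative:

layer {
  name: "argmax"
  type: "ArgMax"
  bottom: "prob"
  top: "pred_label"
  argmax_param {
    top_k: 1             # keep only the single best class
    out_max_val: false   # output indices only, not (index, value) pairs
  }
}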
Eltwise layer
Compute element-wise operations, such as product and sum, across multiple input blobs.
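
For example, an element-wise sum of two equally-sized blobs (blob names are illustrative; the operation can also be PROD or MAX):

layer {
  name: "fuse"
  type: "Eltwise"
  bottom: "branch1"
  bottom: "branch2"
  top: "fuse"
  eltwise_param {
    operation: SUM
  }
}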

2. AlexNet network definition

3. How to add a new layer

Add a class declaration for your layer to the appropriate one of common_layers.hpp, data_layers.hpp, loss_layers.hpp, neuron_layers.hpp, or vision_layers.hpp. Include an inline implementation of type and the *Blobs() methods to specify blob number requirements. Omit the *_gpu declarations if you'll only be implementing CPU code.

Implement your layer in layers/your_layer.cpp.

SetUp for initialization: reading parameters, allocating buffers, etc.

Forward_cpu for the function your layer computes

Backward_cpu for its gradient

(Optional) Implement the GPU versions Forward_gpu and Backward_gpu in layers/your_layer.cu.

Add your layer to proto/caffe.proto, updating the next available ID. Also declare parameters, if needed, in this file.

Make your layer createable by adding it to layer_factory.cpp.

Write tests in test/test_your_layer.cpp. Use test/test_gradient_check_util.hpp to check that your Forward and Backward implementations are in numerical agreement.

The above is the answer from someone on GitHub, and the steps are quite clear. To be concrete, suppose we want to add a new vision layer called Aaa_Layer:

1. Open the hpp file for the category the layer belongs to; here that is vision_layers.hpp. Add the declaration of the layer yourself, or copy the ConvolutionLayer code and change the class name and constructor name to Aaa_Layer. If you are not using the GPU, remove all the *_gpu declarations.

2. Implement your layer: write Aaa_Layer.cpp and add it to src/caffe/layers, implementing mainly SetUp, Forward_cpu, and Backward_cpu.

3. If a GPU implementation is needed, implement Forward_gpu and Backward_gpu in Aaa_Layer.cu.

4. Modify src/caffe/proto/caffe.proto: find LayerType, add Aaa, and update the next available ID. If the layer has parameters, add an AaaParameter message.

5. Add the corresponding code in src/caffe/layer_factory.cpp.

6. Write test_Aaa_layer.cpp in src/caffe/test, and use include/caffe/test/test_gradient_check_util.hpp to check that the forward and backward passes agree numerically. A usage sketch for the new layer follows below.
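
Once the hypothetical Aaa layer is registered, it could be used in a net prototxt like any other layer. Everything in this sketch (the layer name, the aaa_param message and its contents) is made up for illustration:

layer {
  name: "aaa1"
  type: "Aaa"          # the type string registered in layer_factory.cpp
  bottom: "data"
  top: "aaa1"
  aaa_param {          # only if an AaaParameter message was added to caffe.proto
    # hypothetical fields defined in AaaParameter
  }
}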
