Body estimation code reproduction: analyzing the Stacked Hourglass network architecture with Keras (the most detailed analysis on the web)
Contents
- Code
- Network architecture analysis
- Before we start
- Structure inside one stack
- create_left_half_blocks(bottom, bottleneck, hglayer, num_channels)
- bottom_layer(lf8, bottleneck, hgid, num_channels)
- create_right_half_blocks(leftfeatures, bottleneck, hglayer, num_channels)
- Processing between two stacks
- create_heads(prelayerfeatures, rf1, num_classes, hgid, num_channels)
- create_front_module(input, num_channels, bottleneck)
- Recap: the whole process inside one hourglass
- Stacking multiple hourglasses in a for loop to build the Stacked Hourglass network
GitHub repository used in this article:
https://github.com/yuanyuanli85/Stacked_Hourglass_Network_Keras/tree/master/src/net
You can follow it while reproducing the network.
Code
from keras.models import *
from keras.layers import *
from keras.optimizers import Adam, RMSprop
from keras.losses import mean_squared_error
import keras.backend as K


def create_hourglass_network(num_classes, num_stacks, num_channels, inres, outres, bottleneck):
    input = Input(shape=(inres[0], inres[1], 3))

    # feature maps extracted by the front module (7x7 conv + three bottleneck blocks)
    front_features = create_front_module(input, num_channels, bottleneck)

    head_next_stage = front_features

    outputs = []
    for i in range(num_stacks):
        # each stack also carries part of its information straight from
        # its input feature maps (the Add in create_heads)
        head_next_stage, head_to_loss = hourglass_module(head_next_stage, num_classes, num_channels, bottleneck, i)
        outputs.append(head_to_loss)

    model = Model(inputs=input, outputs=outputs)
    rms = RMSprop(lr=5e-4)  # newer Keras releases call this argument learning_rate
    model.compile(optimizer=rms, loss=mean_squared_error, metrics=["accuracy"])

    return model


def hourglass_module(bottom, num_classes, num_channels, bottleneck, hgid):
    '''
    In the first hourglass, bottom is the initial feature maps;
    in later hourglasses it is the output of the previous hourglass.
    :param bottom:
    :param num_classes:
    :param num_channels:
    :param bottleneck:
    :param hgid:
    :return:
    '''
    # create left features , f1, f2, f4, and f8
    left_features = create_left_half_blocks(bottom, bottleneck, hgid, num_channels)

    # create right features, connect with left features
    rf1 = create_right_half_blocks(left_features, bottleneck, hgid, num_channels)

    # add 1x1 conv with two heads, head_next_stage is sent to next stage
    # head_parts is used for intermediate supervision
    head_next_stage, head_parts = create_heads(bottom, rf1, num_classes, hgid, num_channels)

    # the output of this hourglass is head_next_stage
    return head_next_stage, head_parts


def bottleneck_block(bottom, num_out_channels, block_name):
    # skip layer
    if K.int_shape(bottom)[-1] == num_out_channels:
        _skip = bottom
    else:
        _skip = Conv2D(num_out_channels, kernel_size=(1, 1), activation='relu', padding='same',
                       name=block_name + 'skip')(bottom)

    # residual: 3 conv blocks, [num_out_channels/2 -> num_out_channels/2 -> num_out_channels]
    # (integer division: the filter count must be an int under Python 3)
    _x = Conv2D(num_out_channels // 2, kernel_size=(1, 1), activation='relu', padding='same',
                name=block_name + '_conv_1x1_x1')(bottom)
    _x = BatchNormalization()(_x)
    _x = Conv2D(num_out_channels // 2, kernel_size=(3, 3), activation='relu', padding='same',
                name=block_name + '_conv_3x3_x2')(_x)
    _x = BatchNormalization()(_x)
    _x = Conv2D(num_out_channels, kernel_size=(1, 1), activation='relu', padding='same',
                name=block_name + '_conv_1x1_x3')(_x)
    _x = BatchNormalization()(_x)
    _x = Add(name=block_name + '_residual')([_skip, _x])

    return _x


def bottleneck_mobile(bottom, num_out_channels, block_name):
    # skip layer
    if K.int_shape(bottom)[-1] == num_out_channels:
        _skip = bottom
    else:
        _skip = SeparableConv2D(num_out_channels, kernel_size=(1, 1), activation='relu', padding='same',
                                name=block_name + 'skip')(bottom)

    # residual: 3 conv blocks, [num_out_channels/2 -> num_out_channels/2 -> num_out_channels]
    _x = SeparableConv2D(num_out_channels // 2, kernel_size=(1, 1), activation='relu', padding='same',
                         name=block_name + '_conv_1x1_x1')(bottom)
    _x = BatchNormalization()(_x)
    _x = SeparableConv2D(num_out_channels // 2, kernel_size=(3, 3), activation='relu', padding='same',
                         name=block_name + '_conv_3x3_x2')(_x)
    _x = BatchNormalization()(_x)
    _x = SeparableConv2D(num_out_channels, kernel_size=(1, 1), activation='relu', padding='same',
                         name=block_name + '_conv_1x1_x3')(_x)
    _x = BatchNormalization()(_x)
    _x = Add(name=block_name + '_residual')([_skip, _x])

    return _x


def create_front_module(input, num_channels, bottleneck):
    # front module, input to 1/4 resolution
    # 1 7x7 conv + maxpooling
    # 3 residual block
    _x = Conv2D(64, kernel_size=(7, 7), strides=(2, 2), padding='same', activation='relu',
                name='front_conv_1x1_x1')(input)
    _x = BatchNormalization()(_x)

    _x = bottleneck(_x, num_channels // 2, 'front_residual_x1')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(_x)

    _x = bottleneck(_x, num_channels // 2, 'front_residual_x2')
    _x = bottleneck(_x, num_channels, 'front_residual_x3')

    return _x


def create_left_half_blocks(bottom, bottleneck, hglayer, num_channels):
    # create left half blocks for hourglass module
    # f1, f2, f4 , f8 : 1, 1/2, 1/4 1/8 resolution
    hgname = 'hg' + str(hglayer)

    f1 = bottleneck(bottom, num_channels, hgname + '_l1')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(f1)

    f2 = bottleneck(_x, num_channels, hgname + '_l2')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(f2)

    f4 = bottleneck(_x, num_channels, hgname + '_l4')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(f4)

    f8 = bottleneck(_x, num_channels, hgname + '_l8')

    return (f1, f2, f4, f8)


def connect_left_to_right(left, right, bottleneck, name, num_channels):
    '''
    :param left: connect left feature to right feature
    :param name: layer name
    :return:
    '''
    # left -> 1 bottleneck
    # right -> upsampling
    # Add -> left + right
    _xleft = bottleneck(left, num_channels, name + '_connect')
    _xright = UpSampling2D()(right)
    add = Add()([_xleft, _xright])
    out = bottleneck(add, num_channels, name + '_connect_conv')
    return out


def bottom_layer(lf8, bottleneck, hgid, num_channels):
    # blocks in lowest resolution
    # 3 bottleneck blocks + Add
    lf8_connect = bottleneck(lf8, num_channels, str(hgid) + "_lf8")

    _x = bottleneck(lf8, num_channels, str(hgid) + "_lf8_x1")
    _x = bottleneck(_x, num_channels, str(hgid) + "_lf8_x2")
    _x = bottleneck(_x, num_channels, str(hgid) + "_lf8_x3")

    rf8 = Add()([_x, lf8_connect])

    return rf8


def create_right_half_blocks(leftfeatures, bottleneck, hglayer, num_channels):
    lf1, lf2, lf4, lf8 = leftfeatures

    rf8 = bottom_layer(lf8, bottleneck, hglayer, num_channels)
    rf4 = connect_left_to_right(lf4, rf8, bottleneck, 'hg' + str(hglayer) + '_rf4', num_channels)
    rf2 = connect_left_to_right(lf2, rf4, bottleneck, 'hg' + str(hglayer) + '_rf2', num_channels)
    rf1 = connect_left_to_right(lf1, rf2, bottleneck, 'hg' + str(hglayer) + '_rf1', num_channels)

    return rf1


def create_heads(prelayerfeatures, rf1, num_classes, hgid, num_channels):
    # two head, one head to next stage, one head to intermediate features
    head = Conv2D(num_channels, kernel_size=(1, 1), activation='relu', padding='same',
                  name=str(hgid) + '_conv_1x1_x1')(rf1)
    head = BatchNormalization()(head)

    # for head as intermediate supervision, use 'linear' as activation.
    head_parts = Conv2D(num_classes, kernel_size=(1, 1), activation='linear', padding='same',
                        name=str(hgid) + '_conv_1x1_parts')(head)

    # use linear activation
    head = Conv2D(num_channels, kernel_size=(1, 1), activation='linear', padding='same',
                  name=str(hgid) + '_conv_1x1_x2')(head)
    head_m = Conv2D(num_channels, kernel_size=(1, 1), activation='linear', padding='same',
                    name=str(hgid) + '_conv_1x1_x3')(head_parts)

    head_next_stage = Add()([head, head_m, prelayerfeatures])
    return head_next_stage, head_parts


def euclidean_loss(x, y):
    return K.sqrt(K.sum(K.square(x - y)))
Network architecture analysis
Before we start
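The name stack hourglass originally means a network built by stacking several hourglass structures; "stack" does not refer to anything inside a single hourglass. In the analysis below, however, we treat one hourglass as made up of a part inside the stack and a part outside the stack, which makes it easier to see how the pieces correspond to each other.
- If anything is ambiguous or unclear, feel free to leave a comment.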
Structure inside one stack
create_left_half_blocks(bottom, bottleneck, hglayer, num_channels)
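- bottom is the input of the whole left_half_blocks, i.e. the input of f1 in each stack
- bottleneck selects which block the network is built from: bottleneck_block() or bottleneck_mobile()
- hglayer is a parameter used to compose layer names
- num_channels is the number of output channels of each block
- f1, f2, f4, f8 are four bottleneck blocks. Internally they use $1 \times 1$ convolutions to adjust the channel dimension, and the output channel count stays the same throughout. In other words, each of f1, f2, f4, f8 is a residual structure and all four have the same number of channels. Between consecutive f levels, however, there is one pooling step, so by f8 the resolution has dropped to $\frac{1}{8}$ of the hourglass input
- Finally, create_left_half_blocks returns the output feature maps of the four bottleneck blocks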
def bottleneck_block(bottom, num_out_channels, block_name):
    # skip layer
    if K.int_shape(bottom)[-1] == num_out_channels:
        _skip = bottom
    else:
        _skip = Conv2D(num_out_channels, kernel_size=(1, 1), activation='relu', padding='same',
                       name=block_name + 'skip')(bottom)

    # residual: 3 conv blocks, [num_out_channels/2 -> num_out_channels/2 -> num_out_channels]
    # (integer division: the filter count must be an int under Python 3)
    _x = Conv2D(num_out_channels // 2, kernel_size=(1, 1), activation='relu', padding='same',
                name=block_name + '_conv_1x1_x1')(bottom)
    _x = BatchNormalization()(_x)
    _x = Conv2D(num_out_channels // 2, kernel_size=(3, 3), activation='relu', padding='same',
                name=block_name + '_conv_3x3_x2')(_x)
    _x = BatchNormalization()(_x)
    _x = Conv2D(num_out_channels, kernel_size=(1, 1), activation='relu', padding='same',
                name=block_name + '_conv_1x1_x3')(_x)
    _x = BatchNormalization()(_x)
    _x = Add(name=block_name + '_residual')([_skip, _x])

    return _x


def create_left_half_blocks(bottom, bottleneck, hglayer, num_channels):
    # create left half blocks for hourglass module
    # f1, f2, f4 , f8 : 1, 1/2, 1/4 1/8 resolution
    hgname = 'hg' + str(hglayer)

    f1 = bottleneck(bottom, num_channels, hgname + '_l1')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(f1)

    f2 = bottleneck(_x, num_channels, hgname + '_l2')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(f2)

    f4 = bottleneck(_x, num_channels, hgname + '_l4')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(f4)

    f8 = bottleneck(_x, num_channels, hgname + '_l8')

    return (f1, f2, f4, f8)
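To make the resolution bookkeeping of the left half concrete, here is a minimal sketch in plain Python (no Keras needed). It assumes that MaxPool2D(pool_size=(2, 2), strides=(2, 2)) keeps its default 'valid' padding, so each pooling halves the side length and rounds down. The helper name left_half_sides and the input side of 64 are made up for this illustration and are not part of the repository.

```python
def left_half_sides(side):
    """Spatial side length of f1, f2, f4, f8 for a square hourglass input.

    Hypothetical helper for illustration. The bottleneck blocks use
    padding='same' with stride 1, so only the three poolings change the size.
    """
    sides = {"f1": side}
    for name in ("f2", "f4", "f8"):
        side = side // 2  # 2x2 max pooling, stride 2, 'valid' padding
        sides[name] = side
    return sides


print(left_half_sides(64))
# {'f1': 64, 'f2': 32, 'f4': 16, 'f8': 8}
```

The channel count does not appear in the sketch because every block outputs num_channels feature maps; only the spatial size changes from level to level.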
bottom_layer(lf8, bottleneck, hgid, num_channels)
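- lf8 is the output of f8
- hgid is, again, a parameter used to compose layer names
- The bottom layer part consists of the following steps:
  - the output of f8 feeds two branches: one bottleneck block produces lf8_connect, another produces _x
  - _x then passes through two more bottleneck blocks
  - finally, _x and lf8_connect are added together to give rf8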
def bottom_layer(lf8, bottleneck, hgid, num_channels):
    # blocks in lowest resolution
    # 3 bottleneck blocks + Add
    lf8_connect = bottleneck(lf8, num_channels, str(hgid) + "_lf8")

    _x = bottleneck(lf8, num_channels, str(hgid) + "_lf8_x1")
    _x = bottleneck(_x, num_channels, str(hgid) + "_lf8_x2")
    _x = bottleneck(_x, num_channels, str(hgid) + "_lf8_x3")

    rf8 = Add()([_x, lf8_connect])

    return rf8
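Since hgid only exists to build layer names, it may help to see which names it produces. Keras requires layer names to be unique within one model, so every hourglass needs its own prefix. The sketch below is plain Python and only reproduces the string concatenation from bottom_layer and bottleneck_block; the two helper functions are hypothetical, and it assumes the input already has num_channels channels, so that no skip convolution is created.

```python
def bottleneck_layer_names(block_name):
    """Explicitly named layers of one bottleneck block without a skip conv.

    Illustrative helper only; the suffixes are copied from bottleneck_block.
    """
    suffixes = ("_conv_1x1_x1", "_conv_3x3_x2", "_conv_1x1_x3", "_residual")
    return [block_name + s for s in suffixes]


def bottom_layer_names(hgid):
    """All explicitly named layers that bottom_layer creates for one hourglass."""
    blocks = [str(hgid) + tail for tail in ("_lf8", "_lf8_x1", "_lf8_x2", "_lf8_x3")]
    names = []
    for block in blocks:
        names.extend(bottleneck_layer_names(block))
    return names


names = bottom_layer_names(0) + bottom_layer_names(1)
print(names[:4])
# ['0_lf8_conv_1x1_x1', '0_lf8_conv_3x3_x2', '0_lf8_conv_1x1_x3', '0_lf8_residual']
print(len(names), len(set(names)))
# 32 32
```

The two counts being equal shows that the names of two different hourglasses never collide, which is the whole purpose of passing hgid down.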
create_right_half_blocks(leftfeatures, bottleneck, hglayer, num_channels)
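leftfeatures is the tuple of left-side features (lf1, lf2, lf4, lf8); lf8 is handled separately in bottom_layer.
We walk through the left-to-right fusion for f1 in detail; f2 and f4 are fused in exactly the same way:
- First, the output of f1, lf1, goes through one bottleneck block
- The output rf2 is upsampled with UpSampling2D and then added to the feature coming from the left
- These two steps, followed by one more bottleneck block on the sum, are what connect_left_to_right does
- Note that create_right_half_blocks has to start from rf8 and build from the innermost level outwards, i.e. rf8, then rf4, then rf2, then rf1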
def connect_left_to_right(left, right, bottleneck, name, num_channels):
    '''
    :param left: connect left feature to right feature
    :param name: layer name
    :return:
    '''
    # left -> 1 bottleneck
    # right -> upsampling
    # Add -> left + right
    _xleft = bottleneck(left, num_channels, name + '_connect')
    _xright = UpSampling2D()(right)
    add = Add()([_xleft, _xright])
    out = bottleneck(add, num_channels, name + '_connect_conv')
    return out


def create_right_half_blocks(leftfeatures, bottleneck, hglayer, num_channels):
    lf1, lf2, lf4, lf8 = leftfeatures

    rf8 = bottom_layer(lf8, bottleneck, hglayer, num_channels)
    rf4 = connect_left_to_right(lf4, rf8, bottleneck, 'hg' + str(hglayer) + '_rf4', num_channels)
    rf2 = connect_left_to_right(lf2, rf4, bottleneck, 'hg' + str(hglayer) + '_rf2', num_channels)
    rf1 = connect_left_to_right(lf1, rf2, bottleneck, 'hg' + str(hglayer) + '_rf1', num_channels)

    return rf1
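The core of connect_left_to_right, upsample the right feature and add it to the left one, can be sketched on a single-channel toy map in plain Python. This is only an illustration under simplifying assumptions: the two bottleneck blocks are left out, UpSampling2D is taken with its defaults (factor 2, nearest-neighbour), and the helpers upsample_nearest_2x and add_maps are invented for this sketch.

```python
def upsample_nearest_2x(grid):
    """Nearest-neighbour 2x upsampling of a 2-D list, like UpSampling2D's default."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def add_maps(a, b):
    """Element-wise sum; both maps must have identical shapes, as Add requires."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("shape mismatch between left and upsampled right features")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


right = [[1, 2],
         [3, 4]]                      # low-resolution right-side feature (2x2)
left = [[10] * 4 for _ in range(4)]   # higher-resolution left-side feature (4x4)

fused = add_maps(left, upsample_nearest_2x(right))
print(fused)
# [[11, 11, 12, 12], [11, 11, 12, 12], [13, 13, 14, 14], [13, 13, 14, 14]]
```

The shape check also explains a practical constraint: the side length entering the hourglass should be divisible by 8. With a side of 60, for example, the poolings give 30, 15 and 7, and upsampling 7 yields 14, which no longer matches the left feature of size 15.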
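- The part completed so far corresponds to the portion drawn with green lines in the figure below
- Next we describe the operations performed before entering a stack and after leaving a stack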
Processing between two stacks
create_heads(prelayerfeatures, rf1, num_classes, hgid, num_channels)
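- prelayerfeatures is the final feature map produced by the previous stack; for the first stack, prelayerfeatures is the feature map obtained by passing input through create_front_module()
- head is a plain convolution layer (not a bottleneck block this time) with relu activation, followed by batch normalization
- head_parts is a convolution layer with "linear" activation
- head and head_parts then each go through one more $1 \times 1$ convolution that adjusts the channel dimension, giving head and head_m
- head, head_m and prelayerfeatures are then added together to form the input feature of the next stack
- head_parts is the feature map that is ultimately used for intermediate supervision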
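The reason head_m exists becomes clearer when the channel counts are written down. The following plain-Python sketch only tracks channel numbers, it builds no layers; the helper head_channels, the key head_relu (the first 1×1 conv of create_heads) and the example values num_channels=256 and num_classes=16 are assumptions made for illustration.

```python
def head_channels(num_channels, num_classes):
    """Channel count of every tensor in create_heads (hypothetical helper)."""
    channels = {
        "rf1": num_channels,
        "head_relu": num_channels,       # 1x1 conv + relu + BatchNormalization
        "head_parts": num_classes,       # 1x1 conv, linear activation
        "head": num_channels,            # 1x1 conv, linear, applied to head_relu
        "head_m": num_channels,          # 1x1 conv, linear, applied to head_parts
        "prelayerfeatures": num_channels,
    }
    # Add() sums element-wise, so its three inputs must agree in channel count.
    merged = {channels[k] for k in ("head", "head_m", "prelayerfeatures")}
    assert len(merged) == 1, "inputs of Add must share one channel count"
    channels["head_next_stage"] = merged.pop()
    return channels


info = head_channels(num_channels=256, num_classes=16)
print(info["head_parts"], info["head_next_stage"])
# 16 256
```

head_parts has only num_classes channels, so it cannot be added to the other two tensors directly; the extra 1×1 convolution that produces head_m maps it back to num_channels first.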
create_front_module(input, num_channels, bottleneck)
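- This is just an ordinary convolutional network that extracts the initial features: one 7x7 convolution with stride 2, one max pooling and three bottleneck blocks, which together bring the input down to 1/4 resolution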
def create_front_module(input, num_channels, bottleneck):
    # front module, input to 1/4 resolution
    # 1 7x7 conv + maxpooling
    # 3 residual block
    _x = Conv2D(64, kernel_size=(7, 7), strides=(2, 2), padding='same', activation='relu',
                name='front_conv_1x1_x1')(input)
    _x = BatchNormalization()(_x)

    _x = bottleneck(_x, num_channels // 2, 'front_residual_x1')
    _x = MaxPool2D(pool_size=(2, 2), strides=(2, 2))(_x)

    _x = bottleneck(_x, num_channels // 2, 'front_residual_x2')
    _x = bottleneck(_x, num_channels, 'front_residual_x3')

    return _x
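As a quick check of the "1/4 resolution" comment, the output shape of the front module can be worked out by hand. The sketch below is plain Python and assumes the usual Keras size rules: a stride-2 convolution with padding='same' gives ceil(n / 2), and a 2x2 stride-2 max pooling with the default 'valid' padding gives floor(n / 2). The helper front_module_shape and the 256x256 input with num_channels=256 are example choices, not values taken from the article.

```python
import math


def front_module_shape(in_h, in_w, num_channels):
    """(height, width, channels) after create_front_module; illustrative helper."""
    # 7x7 conv, stride 2, padding='same': output side is ceil(input / 2)
    h, w = math.ceil(in_h / 2), math.ceil(in_w / 2)
    # 2x2 max pooling, stride 2, default 'valid' padding: output side is floor(input / 2)
    h, w = h // 2, w // 2
    # the last bottleneck block outputs num_channels feature maps
    return h, w, num_channels


print(front_module_shape(256, 256, 256))
# (64, 64, 256)
```

So a 256x256 image reaches the first hourglass as a 64x64 feature map, and that 64x64 grid is the "full" resolution of f1 inside each hourglass.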
Recap: the whole process inside one hourglass
def hourglass_module(bottom, num_classes, num_channels, bottleneck, hgid):
    '''
    In the first hourglass, bottom is the initial feature maps;
    in later hourglasses it is the output of the previous hourglass.
    :param bottom:
    :param num_classes:
    :param num_channels:
    :param bottleneck:
    :param hgid:
    :return:
    '''
    # create left features , f1, f2, f4, and f8
    left_features = create_left_half_blocks(bottom, bottleneck, hgid, num_channels)

    # create right features, connect with left features
    rf1 = create_right_half_blocks(left_features, bottleneck, hgid, num_channels)

    # add 1x1 conv with two heads, head_next_stage is sent to next stage
    # head_parts is used for intermediate supervision
    head_next_stage, head_parts = create_heads(bottom, rf1, num_classes, hgid, num_channels)

    # the output of this hourglass is head_next_stage
    return head_next_stage, head_parts
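- Clearly, an hourglass module has to contain both the part inside a stack and the part outside it
- First, the feature map of the previous hourglass comes in as bottom, and create_left_half_blocks turns it into left_features
- Then, still inside the stack, create_right_half_blocks produces rf1, the final output feature of the stack
- Finally, create_heads() merges the features and returns both the head_parts feature map used for intermediate supervision and head_next_stage, the overall output feature of this hourglass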
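One way to keep the whole picture in mind is to count how many times the bottleneck function is called. The tally below is a plain-Python sketch read off the functions discussed above; count_bottlenecks is a hypothetical helper and the stack counts 1, 2 and 8 are arbitrary examples.

```python
def count_bottlenecks(num_stacks):
    """Number of bottleneck calls in the whole network (illustrative tally)."""
    front = 3        # front_residual_x1 .. front_residual_x3
    left = 4         # f1, f2, f4, f8
    bottom = 4       # lf8_connect plus the chain of three blocks
    right = 3 * 2    # three connect_left_to_right calls, two blocks each
    per_hourglass = left + bottom + right
    return front + num_stacks * per_hourglass


print(count_bottlenecks(1), count_bottlenecks(2), count_bottlenecks(8))
# 17 31 115
```

Each hourglass therefore contributes 14 bottleneck blocks, while create_heads adds only plain 1×1 convolutions on top.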
Stacking multiple hourglasses in a for loop to build the Stacked Hourglass network
def create_hourglass_network(num_classes, num_stacks, num_channels, inres, outres, bottleneck):
    input = Input(shape=(inres[0], inres[1], 3))

    # feature maps extracted by the front module (7x7 conv + three bottleneck blocks)
    front_features = create_front_module(input, num_channels, bottleneck)

    head_next_stage = front_features

    outputs = []
    for i in range(num_stacks):
        # each stack also carries part of its information straight from
        # its input feature maps (the Add in create_heads)
        head_next_stage, head_to_loss = hourglass_module(head_next_stage, num_classes, num_channels, bottleneck, i)
        outputs.append(head_to_loss)

    model = Model(inputs=input, outputs=outputs)
    rms = RMSprop(lr=5e-4)  # newer Keras releases call this argument learning_rate
    model.compile(optimizer=rms, loss=mean_squared_error, metrics=["accuracy"])

    return model
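Because the model has one output per stack and is compiled with a single loss function, Keras applies that loss to every output and sums the results (with default loss weights of 1). That is how intermediate supervision shows up in training. The toy sketch below imitates this in plain Python on flattened 4-pixel "heatmaps"; the helpers mse and total_loss and all the numbers are invented for illustration.

```python
def mse(pred, target):
    """Mean squared error over two equally sized flat lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def total_loss(stack_outputs, target):
    """Sum of per-stack losses, every stack compared against the same target."""
    return sum(mse(out, target) for out in stack_outputs)


target = [0.0, 1.0, 0.0, 0.0]        # toy flattened ground-truth heatmap
stack_outputs = [
    [0.5, 0.5, 0.0, 0.0],            # first stack: rough guess
    [0.0, 1.0, 0.0, 0.0],            # second stack: exact
]
print(total_loss(stack_outputs, target))
# 0.125

# Targets passed to fit must also be a list with one entry per output.
targets_for_fit = [target] * len(stack_outputs)
print(len(targets_for_fit))
# 2
```

In practice this means the same ground-truth heatmaps are supplied num_stacks times when training, once for each head_parts output.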