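I have been in a slump lately, and I can't even remember when it started. I'm working on adjusting and on keeping faith in myself. Enough preamble.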

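I. Walkthrough of the resnet50_fpn_backbone provided in the source code

1. The body of the backbone: the feature outputs extracted by the ResNet layers

The basic building block of ResNet is the residual structure, which comes in two forms (left and right in the figure). ResNet-50 uses the second one, the bottleneck structure. The only real difference between ResNet-50, 101, and 152 is the number of bottlenecks in each layer group.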

import torch
import torch.nn as nn


class ResNet(nn.Module):

    def __init__(self, block, blocks_num, num_classes=1000, include_top=True, norm_layer=None):
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.include_top = include_top
        self.in_channel = 64  # depth of the feature map obtained after max pooling

        self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2,
                               padding=3, bias=False)  # 224*224*3 -> 112*112*64
        self.bn1 = norm_layer(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)  # 56*56*64
        self.layer1 = self._make_layer(block, 64, blocks_num[0])
        self.layer2 = self._make_layer(block, 128, blocks_num[1], stride=2)
        self.layer3 = self._make_layer(block, 256, blocks_num[2], stride=2)
        self.layer4 = self._make_layer(block, 512, blocks_num[3], stride=2)
        # a 3*224*224 image becomes 2048*7*7 after layer4
        if self.include_top:
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))  # output size = (1, 1)
            self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    # For ResNet-50: block = Bottleneck, blocks_num = [3, 4, 6, 3].
    # channel is the output channel count of the first conv in each block of this layer,
    # channel * block.expansion is the output channel count of the whole layer,
    # and stride says whether this layer downsamples.
    def _make_layer(self, block, channel, block_num, stride=1):
        norm_layer = self._norm_layer  # batch norm
        downsample = None
        # self.in_channel is the input channel count of the layer;
        # channel * block.expansion is its output channel count
        if stride != 1 or self.in_channel != channel * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                norm_layer(channel * block.expansion))

        layers = []
        layers.append(block(self.in_channel, channel, downsample=downsample,
                            stride=stride, norm_layer=norm_layer))
        self.in_channel = channel * block.expansion

        for _ in range(1, block_num):
            layers.append(block(self.in_channel, channel, norm_layer=norm_layer))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        if self.include_top:
            x = self.avgpool(x)
            # torch.flatten multiplies together the sizes of the dimensions from start_dim
            # to end_dim (inclusive) and leaves the others unchanged. With the defaults
            # start_dim=0, end_dim=-1 it returns 1-D data; here start_dim=1 keeps the batch dimension.
            x = torch.flatten(x, 1)
            x = self.fc(x)

        return x


# Bottleneck is the block class from the same source code (not shown here)
resnet_backbone = ResNet(Bottleneck, [3, 4, 6, 3],  # the 50-layer ResNet
                         include_top=False)

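The source shows the structure: [3, 4, 6, 3] selects the 50-layer variant. Unlike the earlier single-output backbone, we need outputs from several levels here, so the return_layers dictionary holds several entries.

return_layers = {'layer1': '0', 'layer2': '1', 'layer3': '2', 'layer4': '3'}  # tells the getter which layers' outputs to extract

The source then builds another class. The class itself is not important; the first thing it does is this:

body = IntermediateLayerGetter(backbone, return_layers=return_layers)  # similar to torchvision's create_feature_extractor, but it can only address top-level submodules

It does much the same job as the create_feature_extractor function from the previous lesson. With this we have the outputs of ResNet-50's four feature stages.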
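To make the numbers above concrete, here is a small arithmetic sketch in plain Python (no PyTorch needed). It checks that [3, 4, 6, 3] really adds up to 50 weighted layers and lists the shapes of the four extracted outputs for a 224*224 input. The helper names are hypothetical, and it assumes the standard Bottleneck with expansion 4 and three convolutions per block. The size arithmetic is only exact for inputs divisible by 32.

```python
# Illustrative arithmetic only; helper names are hypothetical.
RETURN_LAYERS = {"layer1": "0", "layer2": "1", "layer3": "2", "layer4": "3"}
EXPANSION = 4              # assumed Bottleneck.expansion (standard ResNet)
CONVS_PER_BOTTLENECK = 3   # 1x1 -> 3x3 -> 1x1


def count_weighted_layers(blocks_num):
    """conv1 + every bottleneck conv + the final fc (shortcut convs are not counted)."""
    return 1 + CONVS_PER_BOTTLENECK * sum(blocks_num) + 1


def body_output_shapes(input_size=224):
    """Return {output_key: (channels, height, width)} for layer1..layer4."""
    size = input_size // 2 // 2          # conv1 (stride 2), then maxpool (stride 2)
    shapes = {}
    for i, channel in enumerate([64, 128, 256, 512]):
        if i > 0:
            size //= 2                   # layer2..layer4 are built with stride=2
        key = RETURN_LAYERS[f"layer{i + 1}"]
        shapes[key] = (channel * EXPANSION, size, size)
    return shapes


print(count_weighted_layers([3, 4, 6, 3]))   # 50
print(body_output_shapes(224))
```

The same count gives 101 for [3, 4, 23, 3], which is consistent with the remark that the deeper variants differ only in the number of bottlenecks per layer group.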

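2. The FPN part of the backbone: fusing the features extracted by the ResNet layers and producing the backbone's final output

As the figure shows, the initializer creates 8 convolution layers, and forward() adds 3 upsampling steps and one max pool.

The FPN output is a dictionary, just like the body output, only with one extra key, "pool".

Of the FPN's eight convolutions, the four on the left take their input channel counts from in_channels_list and all output out_channels. The four that follow use out_channels for both input and output.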
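The channel bookkeeping described above can be written down as a short plain-Python sketch. The function and layer names are hypothetical. Two assumptions are not stated in the text: the first four convolutions are 1x1 and the last four are 3x3, and the extra "pool" output is a max pool with kernel size 1 and stride 2, as in torchvision's FPN. The ResNet-50 channel list and out_channels = 256 are used for illustration.

```python
def fpn_conv_specs(in_channels_list, out_channels):
    """List (name, in_channels, out_channels, kernel_size) for the FPN convolutions."""
    specs = []
    for i, in_ch in enumerate(in_channels_list):
        # lateral conv: brings each backbone level to out_channels (assumed 1x1)
        specs.append((f"inner_{i}", in_ch, out_channels, 1))
    for i in range(len(in_channels_list)):
        # output conv applied after the top-down fusion (assumed 3x3)
        specs.append((f"layer_{i}", out_channels, out_channels, 3))
    return specs


def fpn_output_sizes(body_sizes):
    """Map FPN output keys to spatial sizes; 'pool' subsamples the last level."""
    sizes = {str(i): s for i, s in enumerate(body_sizes)}
    # max pool with kernel_size=1, stride=2, padding=0 (assumption)
    sizes["pool"] = (body_sizes[-1] - 1) // 2 + 1
    return sizes


for spec in fpn_conv_specs([256, 512, 1024, 2048], 256):
    print(spec)
print(fpn_output_sizes([56, 28, 14, 7]))
```

With four input levels there are three adjacent pairs to fuse, which is where the three upsampling steps in forward() come from.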

3. Pass the resulting backbone to FasterRCNN as one of its arguments to create the Faster R-CNN model.

model = FasterRCNN(backbone=backbone, num_classes=21)

II. Switching to MobileNet V3 + FPN (MBConv)

1. The MBConv module

2. Code, part one: extracting the backbone

import torchvision
from torchvision.models.feature_extraction import create_feature_extractor

monile_v3_backbone = torchvision.models.mobilenet_v3_large()
return_layers = {"features.6": "0", "features.12": "1", "features.16": "2"}
monile_v3_backbone = create_feature_extractor(monile_v3_backbone, return_nodes=return_layers)

The InvertedResidualConfig module:

from functools import partial
from typing import Callable, List

from torch import nn, Tensor

# ConvNormActivation, SElayer and _make_divisible come from the surrounding
# torchvision source (not shown here).


class InvertedResidualConfig:
    # Stores information listed at Tables 1 and 2 of the MobileNetV3 paper
    def __init__(self, input_channels: int, kernel: int, expanded_channels: int, out_channels: int, use_se: bool,
                 activation: str, stride: int, dilation: int, width_mult: float):
        self.input_channels = self.adjust_channels(input_channels, width_mult)
        self.kernel = kernel
        self.expanded_channels = self.adjust_channels(expanded_channels, width_mult)
        self.out_channels = self.adjust_channels(out_channels, width_mult)
        self.use_se = use_se
        self.use_hs = activation == "HS"
        self.stride = stride
        self.dilation = dilation

    @staticmethod
    def adjust_channels(channels: int, width_mult: float):
        return _make_divisible(channels * width_mult, 8)


class InvertedResidual(nn.Module):
    # Implemented as described at section 5 of MobileNetV3 paper
    def __init__(self, cnf: InvertedResidualConfig, norm_layer: Callable[..., nn.Module],
                 se_layer: Callable[..., nn.Module] = partial(SElayer, scale_activation=nn.Hardsigmoid)):
        super().__init__()
        if not (1 <= cnf.stride <= 2):
            raise ValueError('illegal stride value')

        self.use_res_connect = cnf.stride == 1 and cnf.input_channels == cnf.out_channels

        layers: List[nn.Module] = []
        activation_layer = nn.Hardswish if cnf.use_hs else nn.ReLU

        # expand
        if cnf.expanded_channels != cnf.input_channels:
            layers.append(ConvNormActivation(cnf.input_channels, cnf.expanded_channels, kernel_size=1,
                                             norm_layer=norm_layer, activation_layer=activation_layer))

        # depthwise
        stride = 1 if cnf.dilation > 1 else cnf.stride
        layers.append(ConvNormActivation(cnf.expanded_channels, cnf.expanded_channels, kernel_size=cnf.kernel,
                                         stride=stride, dilation=cnf.dilation, groups=cnf.expanded_channels,
                                         norm_layer=norm_layer, activation_layer=activation_layer))
        if cnf.use_se:
            squeeze_channels = _make_divisible(cnf.expanded_channels // 4, 8)
            layers.append(se_layer(cnf.expanded_channels, squeeze_channels))

        # project
        layers.append(ConvNormActivation(cnf.expanded_channels, cnf.out_channels, kernel_size=1, norm_layer=norm_layer,
                                         activation_layer=None))

        self.block = nn.Sequential(*layers)
        self.out_channels = cnf.out_channels
        self._is_cn = cnf.stride > 1

    def forward(self, input: Tensor) -> Tensor:
        result = self.block(input)
        if self.use_res_connect:
            result += input
        return result

        # excerpt from the end of MobileNetV3.__init__
        self.features = nn.Sequential(*layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Sequential(
            nn.Linear(lastconv_output_channels, last_channel),
            nn.Hardswish(inplace=True),
            nn.Dropout(p=0.2, inplace=True),
            nn.Linear(last_channel, num_classes),
        )
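Both adjust_channels and the squeeze-channel computation above call _make_divisible, which is not shown. The following is a sketch of the rounding helper as it commonly appears in MobileNet reference implementations. It is written here as a standalone function named make_divisible, and the exact version in the author's torchvision release may differ in detail.

```python
def make_divisible(v, divisor, min_value=None):
    """Round v to the nearest multiple of divisor, never dropping more than 10%."""
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    if new_v < 0.9 * v:      # rounding down lost more than 10%, so bump up one step
        new_v += divisor
    return new_v


print(make_divisible(40 * 1.0, 8))   # 40: already a multiple of 8
print(make_divisible(72 // 4, 8))    # 24: 18 would round to 16, which loses more than 10%
print(make_divisible(960 // 4, 8))   # 240
```

Keeping channel counts at multiples of 8 is the reason the extracted feature widths (40, 112, 960) stay "round" even when width_mult is changed.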

2. Defining the FPN

in_channels_list = [40, 112, 960]
out_channels = 256
backbone = fpn.BackboneWithFPN(monile_v3_backbone, return_layers, in_channels_list, out_channels)

3. Defining the anchor_generator

anchor_sizes = ((64,), (128,), (256,), (512,))  # one size group per feature level; with 3 aspect ratios each, that is 3*4=12 anchor shapes in total
aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)
anchor_generator = AnchorsGenerator(sizes=anchor_sizes, aspect_ratios=aspect_ratios)
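The "3*4=12" count can be checked with a plain-Python sketch that enumerates the anchor shapes. It assumes the usual torchvision convention, where an aspect ratio is height / width and each anchor keeps an area of size**2. The author's AnchorsGenerator is presumed to follow the same convention, and the function name is hypothetical. The four size groups would then correspond to the three backbone features plus the extra "pool" output of the FPN.

```python
import math


def anchor_shapes(sizes, aspect_ratios):
    """Return one list of (width, height) anchors per feature level."""
    levels = []
    for level_sizes, level_ratios in zip(sizes, aspect_ratios):
        shapes = []
        for ratio in level_ratios:
            h_ratio = math.sqrt(ratio)   # ratio is taken as height / width
            w_ratio = 1.0 / h_ratio
            for size in level_sizes:
                shapes.append((w_ratio * size, h_ratio * size))
        levels.append(shapes)
    return levels


anchor_sizes = ((64,), (128,), (256,), (512,))
aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)
levels = anchor_shapes(anchor_sizes, aspect_ratios)
print(len(levels), sum(len(shapes) for shapes in levels))   # 4 levels, 12 shapes
for shapes in levels:
    print([(round(w, 1), round(h, 1)) for w, h in shapes])
```

Each spatial location on a given level therefore gets 3 anchors, not 12. The 12 is the total number of distinct shapes across all four levels.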

4. Defining the roi_pooler

roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=['0', '1', '2'],  # feature levels on which RoI pooling is performed
                                                output_size=[7, 7],  # size of the feature map output by RoI pooling
                                                sampling_ratio=2)  # sampling ratio
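With several feature maps in featmap_names, each proposal has to be assigned to one of them. The sketch below shows the heuristic from the FPN paper that multi-scale RoI pooling generally follows: larger boxes go to coarser levels. It is a plain-Python illustration, not the library's code. The function name, the default constants (224 and 4), and the choice k_min=3, k_max=5 (strides 8, 16, 32 for the three MobileNet features) are assumptions.

```python
import math


def map_roi_to_level(box, k_min=3, k_max=5, canonical_scale=224, canonical_level=4, eps=1e-6):
    """Return the index into featmap_names for a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    scale = math.sqrt((x2 - x1) * (y2 - y1))          # square root of the box area
    k = math.floor(canonical_level + math.log2(scale / canonical_scale) + eps)
    k = min(max(k, k_min), k_max)                     # clamp to the available levels
    return k - k_min


print(map_roi_to_level((0, 0, 32, 32)))      # small box -> finest level, index 0
print(map_roi_to_level((0, 0, 224, 224)))    # canonical box -> index 1
print(map_roi_to_level((0, 0, 800, 800)))    # large box -> coarsest level, index 2
```

Degenerate boxes with zero area are not handled in this sketch.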

5. Building the FasterRCNN model

model = FasterRCNN(backbone=backbone,
                   num_classes=num_classes,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)

III.

Outputs of stage 4, stage 5, and the final 1*1 convolution.

# EfficientNet-B0
import torch
import torchvision
from torchvision.models.feature_extraction import create_feature_extractor

# fpn, AnchorsGenerator and FasterRCNN come from the project source (not shown here).


def create_model(num_classes):
    # the variable name is kept from the MobileNet version above
    monile_v3_backbone = torchvision.models.efficientnet_b0()
    return_layers = {"features.3": "0", "features.5": "1", "features.8": "2"}
    monile_v3_backbone = create_feature_extractor(monile_v3_backbone, return_nodes=return_layers)
    img = torch.randn(1, 3, 224, 224)
    # outputs = monile_v3_backbone(img)
    in_channels_list = [40, 112, 1280]
    out_channels = 256
    backbone = fpn.BackboneWithFPN(monile_v3_backbone, return_layers, in_channels_list, out_channels)

    anchor_sizes = ((64,), (128,), (256,), (512,))  # one size group per feature level; with 3 aspect ratios each, that is 3*4=12 anchor shapes in total
    aspect_ratios = ((0.5, 1.0, 2.0),) * len(anchor_sizes)
    anchor_generator = AnchorsGenerator(sizes=anchor_sizes, aspect_ratios=aspect_ratios)

    roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=['0', '1', '2'],  # feature levels on which RoI pooling is performed
                                                    output_size=[7, 7],  # size of the feature map output by RoI pooling
                                                    sampling_ratio=2)  # sampling ratio

    model = FasterRCNN(backbone=backbone,
                       num_classes=num_classes,
                       rpn_anchor_generator=anchor_generator,
                       box_roi_pool=roi_pooler)
    return model


class MBConv(nn.Module):
    def __init__(self, cnf: MBConvConfig, stochastic_depth_prob: float, norm_layer: Callable[..., nn.Module],
                 se_layer: Callable[..., nn.Module] = SqueezeExcitation) -> None:
        super().__init__()

        if not (1 <= cnf.stride <= 2):
            raise ValueError('illegal stride value')

        self.use_res_connect = cnf.stride == 1 and cnf.input_channels == cnf.out_channels

        layers: List[nn.Module] = []
        activation_layer = nn.SiLU

        # expand
        expanded_channels = cnf.adjust_channels(cnf.input_channels, cnf.expand_ratio)
        if expanded_channels != cnf.input_channels:
            layers.append(ConvNormActivation(cnf.input_channels, expanded_channels, kernel_size=1,
                                             norm_layer=norm_layer, activation_layer=activation_layer))

        # depthwise
        layers.append(ConvNormActivation(expanded_channels, expanded_channels, kernel_size=cnf.kernel,
                                         stride=cnf.stride, groups=expanded_channels,
                                         norm_layer=norm_layer, activation_layer=activation_layer))

        # squeeze and excitation
        squeeze_channels = max(1, cnf.input_channels // 4)
        layers.append(se_layer(expanded_channels, squeeze_channels, activation=partial(nn.SiLU, inplace=True)))

        # project
        layers.append(ConvNormActivation(expanded_channels, cnf.out_channels, kernel_size=1, norm_layer=norm_layer,
                                         activation_layer=None))

        self.block = nn.Sequential(*layers)
        self.stochastic_depth = StochasticDepth(stochastic_depth_prob, "row")
        self.out_channels = cnf.out_channels

    def forward(self, input: Tensor) -> Tensor:
        result = self.block(input)
        if self.use_res_connect:
            result = self.stochastic_depth(result)
            result += input
        return result
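As a reading aid for the MBConv constructor, the sketch below lists which layers it would build for a given configuration and whether the residual connection is used. It is plain Python with a hypothetical function name. One simplification is labeled in the code: the expanded channel count is taken as the plain product, whereas the source rounds it with cnf.adjust_channels. The example configurations are illustrative values.

```python
def mbconv_plan(input_channels, out_channels, expand_ratio, kernel, stride):
    """Return (layers, use_res_connect); layers are (name, in_channels, out_channels)."""
    if not 1 <= stride <= 2:
        raise ValueError("illegal stride value")
    # Simplification: the source rounds this product with cnf.adjust_channels.
    expanded = input_channels * expand_ratio
    plan = []
    if expanded != input_channels:
        plan.append(("expand 1x1", input_channels, expanded))
    plan.append((f"depthwise {kernel}x{kernel} s{stride}", expanded, expanded))
    # for squeeze-excite the last entry is the squeeze width, not an output width
    plan.append(("squeeze-excite", expanded, max(1, input_channels // 4)))
    plan.append(("project 1x1", expanded, out_channels))
    use_res_connect = stride == 1 and input_channels == out_channels
    return plan, use_res_connect


print(mbconv_plan(32, 16, 1, 3, 1))   # expand_ratio 1: no expand conv, no residual
print(mbconv_plan(24, 24, 6, 3, 1))   # same width and stride 1: residual is used
```

Note that the squeeze width is derived from the block's input channels, not from the expanded channels, which is one place where this block differs from the MobileNetV3 InvertedResidual shown earlier.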
