maskrcnn_benchmark code walkthrough: poolers.py

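Preface:

In a deep object-detection network, the last stage is the RoI layer. There, RoI pooling takes the proposal boxes of various shapes produced by the RPN and pools each of them into a feature map of one fixed size; in maskrcnn_benchmark this step is carried out with the ROIAlign operation. The scales argument of Pooler is a list that gives, for each FPN feature level, the ratio by which the original image is scaled to reach that level: 1/4 for stage 2, 1/8 for stage 3, and so on. The annotated code follows: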

# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved.
import torch
import torch.nn.functional as F
from torch import nn

from maskrcnn_benchmark.layers import ROIAlign

from .utils import cat


class LevelMapper(object):
    """Determine which FPN level each RoI in a set of RoIs should map to based
    on the heuristic in the FPN paper.
    """

    # LevelMapper decides which feature level each RoI is pooled from.
    # Large RoIs are taken from the higher (coarser) levels and small RoIs
    # from the lower (finer) convolutional levels. It implements Eqn. (1)
    # of the FPN paper.

    def __init__(self, k_min, k_max, canonical_scale=224, canonical_level=4, eps=1e-6):
        """
        Arguments:
            k_min (int)
            k_max (int)
            canonical_scale (int)
            canonical_level (int)
            eps (float)
        """
        # k_min: the lowest level used by the FPN, usually 2 (FPN starts at stage 2)
        self.k_min = k_min
        # k_max: the highest level used by the FPN, usually 5 (FPN ends at stage 5)
        self.k_max = k_max
        # s0: the reference side length against which an RoI is judged large or
        # small. 224 follows the image size used for ImageNet pre-training;
        # tune it if necessary.
        self.s0 = canonical_scale
        # lvl0: the level that an RoI of side length s0 is mapped to (k0 in the paper)
        self.lvl0 = canonical_level
        # eps: keeps the argument of log2 positive when an RoI has (near) zero area
        self.eps = eps

    def __call__(self, boxlists):
        """
        Arguments:
            boxlists (list[BoxList])
        """
        # Compute level ids
        # Side length of each RoI, taken as the square root of its area
        s = torch.sqrt(cat([boxlist.area() for boxlist in boxlists]))

        # Eqn.(1) in FPN paper
        target_lvls = torch.floor(self.lvl0 + torch.log2(s / self.s0 + self.eps))
        # Clamp target_lvls to the valid range [k_min, k_max]
        target_lvls = torch.clamp(target_lvls, min=self.k_min, max=self.k_max)
        # Shift so that level k_min becomes index 0 into the list of feature maps
        return target_lvls.to(torch.int64) - self.k_min


class Pooler(nn.Module):
    """
    Pooler for Detection with or without FPN.
    It currently hard-code ROIAlign in the implementation,
    but that can be made more generic later on.
    Also, the requirement of passing the scales is not strictly necessary, as they
    can be inferred from the size of the feature map / size of original image,
    which is available thanks to the BoxList.
    """

    def __init__(self, output_size, scales, sampling_ratio):
        """
        Arguments:
            output_size (list[tuple[int]] or list[int]): output size for the pooled
                region, i.e. the size of the output feature
            scales (list[float]): scales for each Pooler, i.e. the spatial scale of
                each feature level relative to the input image
            sampling_ratio (int): sampling ratio for ROIAlign, i.e. the number of
                sampling points along the height and the width of each bin. The
                default in the paper is 2, which gives 2 * 2 = 4 samples per bin.
        """
        super(Pooler, self).__init__()
        # Build one pooling layer per scale
        poolers = []
        for scale in scales:
            poolers.append(
                ROIAlign(
                    output_size, spatial_scale=scale, sampling_ratio=sampling_ratio
                )
            )
        self.poolers = nn.ModuleList(poolers)
        self.output_size = output_size
        # get the levels in the feature map by leveraging the fact that the network always
        # downsamples by a factor of 2 at each level.
        # Lowest level used by the FPN
        lvl_min = -torch.log2(torch.tensor(scales[0], dtype=torch.float32)).item()
        # Highest level used by the FPN
        lvl_max = -torch.log2(torch.tensor(scales[-1], dtype=torch.float32)).item()
        self.map_levels = LevelMapper(lvl_min, lvl_max)

    # Convert the boxes to the RoI format: (batch index, x1, y1, x2, y2)
    def convert_to_roi_format(self, boxes):
        concat_boxes = cat([b.bbox for b in boxes], dim=0)
        device, dtype = concat_boxes.device, concat_boxes.dtype
        ids = cat(
            [
                torch.full((len(b), 1), i, dtype=dtype, device=device)
                for i, b in enumerate(boxes)
            ],
            dim=0,
        )
        rois = torch.cat([ids, concat_boxes], dim=1)
        return rois

    def forward(self, x, boxes):
        """
        Arguments:
            x (list[Tensor]): feature maps for each level
            boxes (list[BoxList]): boxes to be used to perform the pooling operation.
        Returns:
            result (Tensor)
        """
        # Number of feature levels to pool from
        num_levels = len(self.poolers)
        rois = self.convert_to_roi_format(boxes)
        if num_levels == 1:
            return self.poolers[0](x[0], rois)

        # Best-matching level for each RoI
        levels = self.map_levels(boxes)

        # Number of RoIs
        num_rois = len(rois)
        # Number of channels
        num_channels = x[0].shape[1]
        # Output size
        output_size = self.output_size[0]

        # Data type of the features and the device they live on
        dtype, device = x[0].dtype, x[0].device
        # Allocate the result tensor
        result = torch.zeros(
            (num_rois, num_channels, output_size, output_size),
            dtype=dtype,
            device=device,
        )
        for level, (per_level_feature, pooler) in enumerate(zip(x, self.poolers)):
            # Indices of all RoIs that should be pooled from this level
            idx_in_level = torch.nonzero(levels == level).squeeze(1)
            # The RoIs themselves
            rois_per_level = rois[idx_in_level]
            # Pool these similarly sized RoIs from the same feature level and
            # write the results back to their original positions
            result[idx_in_level] = pooler(per_level_feature, rois_per_level).to(dtype)

        return result


def make_pooler(cfg, head_name):
    # Size of the output feature map
    resolution = cfg.MODEL[head_name].POOLER_RESOLUTION
    # Spatial scale of each feature level used for pooling
    scales = cfg.MODEL[head_name].POOLER_SCALES
    # Number of sampling points along the height and the width of each bin.
    # The default in the paper is 2, which gives 2 * 2 = 4 samples per bin.
    sampling_ratio = cfg.MODEL[head_name].POOLER_SAMPLING_RATIO
    # Build the pooler
    pooler = Pooler(
        output_size=(resolution, resolution),
        scales=scales,
        sampling_ratio=sampling_ratio,
    )
    return pooler
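
The level assignment performed by LevelMapper can be checked by hand. The following is a minimal sketch in plain Python (no torch) of the same formula, floor(lvl0 + log2(sqrt(area) / s0 + eps)), clamped to [k_min, k_max] and shifted by k_min. The function name map_box_to_level is hypothetical and not part of maskrcnn_benchmark, and the area is assumed here to be simply width * height; the real BoxList.area() may count pixels slightly differently.

```python
import math


def map_box_to_level(box, k_min=2, k_max=5, canonical_scale=224,
                     canonical_level=4, eps=1e-6):
    """Return the index (0-based, relative to k_min) of the FPN level for one box.

    Hypothetical helper for illustration only. `box` is (x1, y1, x2, y2) and
    its area is assumed to be width * height.
    """
    x1, y1, x2, y2 = box
    # Side length of the box, taken as the square root of its area
    s = math.sqrt((x2 - x1) * (y2 - y1))
    # Eqn. (1) of the FPN paper; eps keeps log2 defined for a zero-area box
    lvl = math.floor(canonical_level + math.log2(s / canonical_scale + eps))
    # Clamp to the levels that actually exist
    lvl = min(max(lvl, k_min), k_max)
    # Shift so that level k_min maps to index 0 of the feature list
    return lvl - k_min


print(map_box_to_level((0, 0, 224, 224)))  # 2 (level 4)
print(map_box_to_level((0, 0, 56, 56)))    # 0 (level 2)
print(map_box_to_level((0, 0, 900, 900)))  # 3 (level 5, after clamping)
```

A box whose side equals the canonical scale of 224 lands on the canonical level 4; halving the side moves it down one level, and very large or very small boxes are clamped to the top or bottom level.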

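
Two smaller steps of Pooler are also easy to reproduce without torch: recovering the level range from the scales, and building the RoI rows that ROIAlign consumes. The sketch below assumes the scales (0.25, 0.125, 0.0625, 0.03125), which would correspond to FPN levels 2 through 5, and represents each box as a plain (x1, y1, x2, y2) tuple. The names levels_from_scales and to_roi_rows are hypothetical and exist only for this illustration.

```python
import math


def levels_from_scales(scales):
    """Recover (lvl_min, lvl_max) from the first and last spatial scale.

    Relies on each level downsampling by a factor of 2, so scale = 2 ** -level.
    """
    lvl_min = -math.log2(scales[0])
    lvl_max = -math.log2(scales[-1])
    return lvl_min, lvl_max


def to_roi_rows(boxes_per_image):
    """Flatten per-image box lists into rows of (image_index, x1, y1, x2, y2)."""
    rois = []
    for image_idx, boxes in enumerate(boxes_per_image):
        for box in boxes:
            # Prepend the index of the image this box belongs to
            rois.append((float(image_idx), *box))
    return rois


print(levels_from_scales((0.25, 0.125, 0.0625, 0.03125)))  # (2.0, 5.0)
print(to_roi_rows([[(0, 0, 10, 10)], [(1, 2, 3, 4), (5, 6, 7, 8)]]))
```

The image index in the first column is what lets a single pooling call handle the boxes of a whole batch at once.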