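In the previous article, "Implementing a 3D Object Detection Algorithm from Scratch (2): Point Cloud Data Preprocessing", we finished preprocessing the point cloud data.

Starting with this article, we implement the PointPillars network itself, following the network structure described in the first article of this series.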


Contents

  • 1. Basic PyTorch Modules
    • 1.1 The Empty Module
    • 1.2 The Sequential Module
  • 2. Implementing the Pillar Feature Net
    • 2.1 The VFE Module
    • 2.2 The Pillar Scatter Module

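1. Basic PyTorch Modules

In practice, to make the network easier to build and modify, two basic modules are usually implemented on top of PyTorch: an empty layer module (Empty) and a sequential container module (Sequential). Both live in pytorch_utils.py.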

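1.1 The Empty Module

As the name suggests, Empty is a layer that does nothing. Here it only serves as a placeholder that keeps the network structure complete (we will see how it is used later) and takes no part in the computation; if it ever needs to, it can be modified accordingly. It is simple to build: we create a class named Empty that inherits from nn.Module. The code is: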

import torch
import torch.nn as nn
import sys
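from collections import OrderedDict


class Empty(torch.nn.Module):
    def __init__(self, *args, **kwargs):
        # Accept and ignore any constructor arguments so Empty can stand in for any layer.
        super(Empty, self).__init__()

    def forward(self, *args, **kwargs):
        # One input: return it unchanged. No input: return None. Several inputs: return the tuple.
        if len(args) == 1:
            return args[0]
        elif len(args) == 0:
            return None
        return args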

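In this code, *args packs the extra positional arguments into a tuple and **kwargs packs the keyword arguments into a dict for the function body to use. When a function is defined with arg, *args and **kwargs, the ordinary positional parameter comes first, *args comes next, and **kwargs must come last; placing anything after **kwargs is a syntax error. Run the following code to see the result: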

def function(arg, *args, **kwargs):
    print(arg, args, kwargs)


function(6, 7, 8, 9, a=1, b=2, c=3)
# prints: 6 (7, 8, 9) {'a': 1, 'b': 2, 'c': 3}
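To make the role of Empty more concrete, here is a minimal sketch of using it as a drop-in replacement for a normalization layer. The class is repeated so the block runs on its own, and the helper name make_norm is hypothetical; it is not part of the original pytorch_utils.py.

```python
import torch
import torch.nn as nn


class Empty(nn.Module):
    """Same behaviour as the Empty class above, repeated so this block is self-contained."""

    def __init__(self, *args, **kwargs):
        super().__init__()

    def forward(self, *args, **kwargs):
        if len(args) == 1:
            return args[0]
        elif len(args) == 0:
            return None
        return args


def make_norm(num_features, use_norm):
    # Hypothetical helper: choose a real BatchNorm layer or the no-op placeholder.
    return nn.BatchNorm1d(num_features) if use_norm else Empty(num_features)


x = torch.randn(8, 64)
identity = make_norm(64, use_norm=False)
normed = make_norm(64, use_norm=True)

same = identity(x) is x          # Empty returns the very same tensor object
print(same)                      # True
print(normed(x).shape)           # torch.Size([8, 64])
```

Because both branches expose the same call interface, the surrounding forward code does not need an if/else around the normalization step.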

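1.2 The Sequential Module

PyTorch already ships with such a module. We describe it separately because, in the configuration file, the network hyperparameters are given as a dictionary. Rewriting the container makes it convenient to load those hyperparameters later, so that changing the network model only requires changing the values in the dictionary. We create a class named Sequential: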

class Sequential(torch.nn.Module):
    """A sequential container.
    Modules will be added to it in the order they are passed in the constructor.
    Alternatively, an ordered dict of modules can also be passed in.
    """

    def __init__(self, *args, **kwargs):
        super(Sequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            # A single OrderedDict: register each module under its key.
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            # Positional modules: register them under their index ('0', '1', ...).
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)
        # Keyword modules: register them under the keyword name.
        for name, module in kwargs.items():
            if sys.version_info < (3, 6):
                raise ValueError("kwargs only supported in py36+")
            if name in self._modules:
                raise ValueError("name exists.")
            self.add_module(name, module)

    def __getitem__(self, idx):
        if not (-len(self) <= idx and idx < len(self)):
            raise IndexError('index {} is out of range'.format(idx))
        if idx < 0:
            idx += len(self)
        it = iter(self._modules.values())
        for i in range(idx):
            next(it)
        return next(it)

    def __len__(self):
        return len(self._modules)

    def add(self, module, name=None):
        # Append a module, naming it by its position if no name is given.
        if name is None:
            name = str(len(self._modules))
            if name in self._modules:
                raise KeyError("name exists")
        self.add_module(name, module)

    def forward(self, input):
        # Feed the input through every registered module in order.
        for module in self._modules.values():
            input = module(input)
        return input

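Below are three ways to construct a Sequential. They build the same stack of layers (only the submodule names differ: the first form uses the indices '0' to '3'); run them and inspect the output: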

# 1. Positional modules
model = Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)

# 2. A single OrderedDict
model = Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU())
]))

# 3. Keyword arguments
model = Sequential(
    conv1=nn.Conv2d(1, 20, 5),
    relu1=nn.ReLU(),
    conv2=nn.Conv2d(20, 64, 5),
    relu2=nn.ReLU()
)
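The motivation given above is that hyperparameters arrive as a dictionary. The following is a rough sketch of how a layer stack might be built from such a dictionary. The config keys and the function name build_conv_stack are made up for illustration, and torch.nn.Sequential is used as a stand-in so the block runs on its own; the custom Sequential class accepts the same OrderedDict argument.

```python
from collections import OrderedDict

import torch
import torch.nn as nn

# Hypothetical config dictionary; the key names are assumptions for this sketch.
cfg = {"in_channels": 1, "channels": [20, 64], "kernel_size": 5}


def build_conv_stack(cfg):
    # Build conv + ReLU pairs in the order listed in cfg["channels"].
    layers = OrderedDict()
    in_ch = cfg["in_channels"]
    for i, out_ch in enumerate(cfg["channels"], start=1):
        layers["conv{}".format(i)] = nn.Conv2d(in_ch, out_ch, cfg["kernel_size"])
        layers["relu{}".format(i)] = nn.ReLU()
        in_ch = out_ch
    return nn.Sequential(layers)  # stand-in for the custom Sequential class


model = build_conv_stack(cfg)
out = model(torch.zeros(2, 1, 28, 28))
names = [name for name, _ in model.named_children()]
print(names)      # ['conv1', 'relu1', 'conv2', 'relu2']
print(out.shape)  # torch.Size([2, 64, 20, 20])
```

With this arrangement, adding a layer or changing a channel count only means editing the dictionary.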

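2. Implementing the Pillar Feature Net

We now implement the first part of the PointPillars network, the Pillar Feature Net, whose main job is to generate the pseudo-image. It consists of two modules: the VFE module and the Pillar Scatter module. The VFE code lives in vfe_utils.py.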

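2.1 The VFE Module

The VFE module partitions the scattered, unordered point cloud into individual pillars and then learns features on them (the original post illustrates this with a figure).
First we import the packages we need, including PyTorch and the Empty class written in the previous section.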

import torch
import torch.nn as nn
import torch.nn.functional as F
import sys
sys.path.append('../')
from ..model_utils.pytorch_utils import Empty

We first define a VoxelFeatureExtractor class. It performs no operation itself; it only serves as a base class that fixes the interface:

class VoxelFeatureExtractor(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()

    def get_output_feature_dim(self):
        raise NotImplementedError

    def forward(self, **kwargs):
        raise NotImplementedError

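Next we define the get_paddings_indicator function. Each pillar is stored as a fixed-length, zero-padded array of points; this function builds a boolean mask that is True for the real points and False for the padded slots.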

def get_paddings_indicator(actual_num, max_num, axis=0):
    """
    Create boolean mask by actually number of a padded tensor.
    Args:
        actual_num (torch.Tensor): number of real points in each pillar, shape (N,) when axis=0.
        max_num (int): padded length, i.e. the maximum number of points per pillar.
    Returns:
        torch.Tensor: boolean mask, True where the slot holds a real point.
    """
    actual_num = torch.unsqueeze(actual_num, axis+1)
    print('actual_num shape is: ', actual_num.shape)
    # actual_num: (N, 1) when axis=0
    max_num_shape = [1] * len(actual_num.shape)
    max_num_shape[axis+1] = -1
    max_num = torch.arange(max_num, dtype=torch.int, device=actual_num.device).view(max_num_shape)
    # tiled_actual_num: [[3,3,3,3,3], [4,4,4,4,4], [2,2,2,2,2]]
    # tiled_max_num: [[0,1,2,3,4], [0,1,2,3,4], [0,1,2,3,4]]
    paddings_indicator = actual_num.int() > max_num
    # paddings_indicator shape: [batch_size, max_num]
    return paddings_indicator
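The comments in the function use pillars holding 3, 4 and 2 points with a padded length of 5. The sketch below reproduces that case with a compact, hypothetical helper (paddings_mask, covering only the axis=0 case) so the resulting mask can be inspected directly.

```python
import torch


def paddings_mask(actual_num, max_num):
    # Hypothetical compact equivalent of get_paddings_indicator for axis=0.
    slots = torch.arange(max_num, device=actual_num.device).unsqueeze(0)  # (1, M)
    return slots < actual_num.unsqueeze(1)                                # (N, M)


num_points = torch.tensor([3, 4, 2])  # real points in each of 3 pillars
mask = paddings_mask(num_points, 5)   # padded length is 5

print(mask.int())
# row 0: 1 1 1 0 0
# row 1: 1 1 1 1 0
# row 2: 1 1 0 0 0
```

Slot j of pillar i is kept exactly when j is smaller than the number of real points in that pillar; the comparison broadcasts a (1, M) row of slot indices against an (N, 1) column of counts.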

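We then define the PFNLayer class, a simplified PointNet layer with 10 input features and 64 output features. As in the paper, the network consists of a single linear layer, here followed by BatchNorm, ReLU and a max over the points of each pillar. The code is: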

class PFNLayer(nn.Module):
    def __init__(self, in_channels, out_channels, use_norm=True, last_layer=False):
        """
        Pillar Feature Net Layer.
        The Pillar Feature Net could be composed of a series of these layers, but the PointPillars paper results
        only used a single PFNLayer.
        :param in_channels: <int>. Number of input channels.
        :param out_channels: <int>. Number of output channels.
        :param use_norm: <bool>. Whether to include BatchNorm.
        :param last_layer: <bool>. If last_layer, there is no concatenation of features.
        """
        super().__init__()
        self.name = 'PFNLayer'
        self.last_vfe = last_layer
        if not self.last_vfe:
            # Intermediate layers concatenate per-point and pooled features,
            # so each half gets out_channels // 2 channels.
            out_channels = out_channels // 2
        self.units = out_channels

        if use_norm:
            self.linear = nn.Linear(in_channels, self.units, bias=False)
            self.norm = nn.BatchNorm1d(self.units, eps=1e-3, momentum=0.01)
        else:
            self.linear = nn.Linear(in_channels, self.units, bias=True)
            self.norm = Empty(self.units)

    def forward(self, inputs):
        x = self.linear(inputs)
        # x: (number of pillars, points per pillar, channels)
        total_points, voxel_points, channels = x.shape
        # BatchNorm1d expects (batch, channels), so flatten the pillars and points first.
        x = self.norm(x.view(-1, channels)).view(total_points, voxel_points, channels)
        x = F.relu(x)
        # Max over the points of each pillar.
        x_max = torch.max(x, dim=1, keepdim=True)[0]
        if self.last_vfe:
            return x_max
        else:
            # The original used the undefined name inputs_shape here; the number of
            # points per pillar is voxel_points.
            x_repeat = x_max.repeat(1, voxel_points, 1)
            x_concatenated = torch.cat([x, x_repeat], dim=2)
            return x_concatenated
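To see why a max over the points makes this a PointNet-style layer, here is a small sketch of the last-layer path (linear, BatchNorm, ReLU, max) written as a standalone function. It checks that shuffling the points inside each pillar leaves the pillar feature unchanged. The figure of 100 points per pillar is an assumption for the example, and BatchNorm is put in eval mode so that the comparison does not depend on batch statistics.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# 3 pillars, 100 points per pillar (assumed), 10 input features, 64 output features.
num_pillars, max_points, in_channels, out_channels = 3, 100, 10, 64

linear = nn.Linear(in_channels, out_channels, bias=False)
norm = nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01).eval()


def pfn_last_layer(points):
    # Same sequence of operations as PFNLayer.forward with last_layer=True.
    x = linear(points)                            # (N, P, 64)
    n, p, c = x.shape
    x = norm(x.view(-1, c)).view(n, p, c)
    x = F.relu(x)
    return torch.max(x, dim=1, keepdim=True)[0]   # (N, 1, 64)


points = torch.randn(num_pillars, max_points, in_channels)
shuffled = points[:, torch.randperm(max_points), :]  # reorder the points in every pillar

with torch.no_grad():
    out = pfn_last_layer(points)
    out_shuffled = pfn_last_layer(shuffled)

order_invariant = torch.allclose(out, out_shuffled, atol=1e-5)
print(out.shape)  # torch.Size([3, 1, 64])
print(order_invariant)
```

Each point is transformed independently and the max is symmetric in its arguments, so the order in which points were stored in a pillar does not matter.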

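Next we implement the PillarFeatureNetOld2 class. It builds the pillar features and expands the original 4-dimensional point features (x, y, z, r) into 10-dimensional features (x, y, z, r, x_c, y_c, z_c, x_p, y_p, z_p), where the subscript c denotes the offset from the mean of the points in the pillar and the subscript p denotes the offset from the pillar center. The code is: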

class PillarFeatureNetOld2(VoxelFeatureExtractor):
    def __init__(self, num_input_features=4, use_norm=True, num_filters=(64, ), with_distance=False,
                 voxel_size=(0.2, 0.2, 4), pc_range=(0, -40, -3, 70.4, 40, 1)):
        """
        Pillar Feature Net.
        The network prepares the pillar features and performs forward pass through PFNLayers.
        :param num_input_features: <int>. Number of input features, either x, y, z or x, y, z, r.
        :param use_norm: <bool>. Whether to include BatchNorm.
        :param num_filters: (<int>: N). Number of features in each of the N PFNLayers.
        :param with_distance: <bool>. Whether to include Euclidean distance to points.
        :param voxel_size: (<float>: 3). Size of voxels (x, y and z sizes are all used here).
        :param pc_range: (<float>: 6). Point cloud range, only the x, y and z minimums are used.
        """
        super().__init__()
        self.name = 'PillarFeatureNetOld2'
        assert len(num_filters) > 0
        num_input_features += 6  # x_c, y_c, z_c, x_p, y_p, z_p
        if with_distance:
            num_input_features += 1
        self.with_distance = with_distance
        self.num_filters = num_filters

        # Create PillarFeatureNetOld layers
        num_filters = [num_input_features] + list(num_filters)
        pfn_layers = []
        for i in range(len(num_filters) - 1):
            in_filters = num_filters[i]
            out_filters = num_filters[i+1]
            if i < len(num_filters) - 2:
                last_layer = False
            else:
                last_layer = True
            pfn_layers.append(PFNLayer(in_filters, out_filters, use_norm, last_layer=last_layer))
        self.pfn_layers = nn.ModuleList(pfn_layers)

        # Need pillar (voxel) size and x/y offset in order to calculate pillar offset
        self.vx = voxel_size[0]
        self.vy = voxel_size[1]
        self.vz = voxel_size[2]
        self.x_offset = self.vx / 2 + pc_range[0]
        self.y_offset = self.vy / 2 + pc_range[1]
        self.z_offset = self.vz / 2 + pc_range[2]

    def get_output_feature_dim(self):
        return self.num_filters[-1]  # 64

    def forward(self, features, num_voxels, coords):
        """
        :param features: (N, max_points_of_each_voxel, 3 + C)
        :param num_voxels: (N), number of real points in each pillar
        :param coords: (N, 4), columns are (batch_idx, z, y, x)
        :return: (N, num_filters[-1])
        """
        dtype = features.dtype
        # Find distance of x, y, and z from cluster center (x, y, z mean)
        points_mean = features[:, :, :3].sum(dim=1, keepdim=True) / num_voxels.type_as(features).view(-1, 1, 1)
        print('points_mean shape is: ', points_mean.shape)
        f_cluster = features[:, :, :3] - points_mean

        # Find distance of x, y, and z from pillar center
        f_center = torch.zeros_like(features[:, :, :3])
        f_center[:, :, 0] = features[:, :, 0] - (coords[:, 3].to(dtype).unsqueeze(1) * self.vx + self.x_offset)
        f_center[:, :, 1] = features[:, :, 1] - (coords[:, 2].to(dtype).unsqueeze(1) * self.vy + self.y_offset)
        f_center[:, :, 2] = features[:, :, 2] - (coords[:, 1].to(dtype).unsqueeze(1) * self.vz + self.z_offset)
        print('f_center shape is: ', f_center.shape)

        # Combine together feature decorations
        features_ls = [features, f_cluster, f_center]
        if self.with_distance:  # False
            points_dist = torch.norm(features[:, :, :3], 2, 2, keepdim=True)
            features_ls.append(points_dist)
        features = torch.cat(features_ls, dim=-1)

        # The feature decorations were calculated without regard to whether pillar was empty.
        # Need to ensure that empty pillars remain set to zeros.
        voxel_count = features.shape[1]
        mask = get_paddings_indicator(num_voxels, voxel_count, axis=0)
        mask = torch.unsqueeze(mask, -1).type_as(features)
        features *= mask
        print('features shape is: ', features.shape)

        # Forward pass through PFNLayers
        for pfn in self.pfn_layers:
            features = pfn(features)
        # The last PFNLayer returns (N, 1, C). Squeeze only dim 1 so that a single
        # pillar (N == 1) still yields a 2-D tensor; the original used squeeze().
        return features.squeeze(1)
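The feature decoration is easier to follow on concrete numbers. The sketch below works through a single pillar containing two real points and one padded slot, using the default voxel_size and pc_range minimums from the constructor above; the point values and the pillar index are made up for illustration.

```python
import torch

voxel_size = torch.tensor([0.2, 0.2, 4.0])   # (vx, vy, vz), constructor defaults
pc_min = torch.tensor([0.0, -40.0, -3.0])    # (x_min, y_min, z_min), constructor defaults

# One pillar with 2 real points and 1 zero-padded slot: shape (N=1, P=3, C=4).
features = torch.tensor([[[2.05, 0.05, -1.0, 0.5],
                          [2.15, 0.15, -0.5, 0.3],
                          [0.00, 0.00,  0.0, 0.0]]])
num_points = torch.tensor([2])
coords = torch.tensor([[0, 0, 200, 10]])     # (batch_idx, z, y, x), illustrative values

# Offsets from the mean of the real points: (x_c, y_c, z_c).
points_mean = features[:, :, :3].sum(dim=1, keepdim=True) / num_points.view(-1, 1, 1).float()
f_cluster = features[:, :, :3] - points_mean

# Offsets from the pillar center: (x_p, y_p, z_p).
index_xyz = coords[:, [3, 2, 1]].float()                  # reorder to (x, y, z)
center = index_xyz * voxel_size + voxel_size / 2 + pc_min  # (2.1, 0.1, -1.0)
f_center = features[:, :, :3] - center.unsqueeze(1)

decorated = torch.cat([features, f_cluster, f_center], dim=-1)  # (1, 3, 10)
padded_before_mask = decorated[0, 2].clone()

# Zero out the padded slot, as the mask in forward() does.
mask = (torch.arange(features.shape[1]).unsqueeze(0) < num_points.unsqueeze(1)).unsqueeze(-1)
decorated = decorated * mask

print(decorated.shape)      # torch.Size([1, 3, 10])
print(padded_before_mask)   # non-zero offsets even though the slot is padding
print(decorated[0, 2])      # all zeros after masking
```

The padded slot starts as zeros, but subtracting the mean and the pillar center gives it non-zero offsets, which is why the mask is applied after the concatenation.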

2.2 The Pillar Scatter Module

This module generates the pseudo-image, whose shape is (1, 64, 496, 432). The file is pillar_scatter.py:

import torch
import torch.nn as nn


class PointPillarsScatter(nn.Module):
    def __init__(self, input_channels=64, **kwargs):
        """
        Point Pillar's Scatter.
        Converts learned features from dense tensor to sparse pseudo image.
        :param input_channels: <int>. Number of channels of each pillar feature.
        The output grid shape (nz, ny, nx) is passed to forward() as the keyword argument output_shape.
        """
        super().__init__()
        self.nchannels = input_channels

    def forward(self, voxel_features, coords, batch_size, **kwargs):
        output_shape = kwargs['output_shape']
        nz, ny, nx = output_shape
        # batch_canvas will be the final output.
        batch_canvas = []
        for batch_itt in range(batch_size):
            # Create the canvas for this sample
            canvas = torch.zeros(self.nchannels, nz*nx*ny, dtype=voxel_features.dtype,
                                 device=voxel_features.device)

            # Only include non-empty pillars
            batch_mask = coords[:, 0] == batch_itt
            this_coords = coords[batch_mask, :]
            # Flat index into the (nz, ny, nx) grid. The original multiplied the z index by nz,
            # which gives the same result for pillars (nz == 1, z index always 0) but not for nz > 1.
            indices = this_coords[:, 1].type(torch.long) * ny * nx + \
                this_coords[:, 2].type(torch.long) * nx + \
                this_coords[:, 3].type(torch.long)
            voxels = voxel_features[batch_mask, :]
            voxels = voxels.t()

            # Now scatter the blob back to the canvas.
            canvas[:, indices] = voxels

            # Append to a list for later stacking.
            batch_canvas.append(canvas)

        # Stack to 3-dim tensor (batch-size, nchannels, nrows*ncols)
        batch_canvas = torch.stack(batch_canvas, 0)

        # Undo the column stacking to final 4-dim tensor
        batch_canvas = batch_canvas.view(batch_size, self.nchannels * nz, ny, nx)
        return batch_canvas
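The scatter step can be checked by hand on a tiny grid. The sketch below places two pillar features with 2 channels onto a 2 x 3 canvas for a single sample; all sizes and values are illustrative, far smaller than the real 64-channel, 496 x 432 case.

```python
import torch

num_channels, ny, nx = 2, 2, 3

# Two non-empty pillars, each with a 2-channel feature vector.
voxel_features = torch.tensor([[1.0, 10.0],
                               [2.0, 20.0]])
# Columns are (batch_idx, z, y, x): pillar 0 sits at (y=0, x=1), pillar 1 at (y=1, x=2).
coords = torch.tensor([[0, 0, 0, 1],
                       [0, 0, 1, 2]])

canvas = torch.zeros(num_channels, ny * nx)
indices = coords[:, 2] * nx + coords[:, 3]   # flat positions 1 and 5
canvas[:, indices] = voxel_features.t()      # write each feature vector into its column
pseudo_image = canvas.view(1, num_channels, ny, nx)

print(pseudo_image[0, 0])
# tensor([[0., 1., 0.],
#         [0., 0., 2.]])
print(pseudo_image[0, 1])
# tensor([[ 0., 10.,  0.],
#         [ 0.,  0., 20.]])
```

Every grid cell without a pillar stays at zero, so the result is a dense image tensor that an ordinary 2D convolutional backbone can consume.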

At this point we have implemented the feature network and generated the pseudo-image.
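The spatial size of the pseudo-image follows from the point cloud range and the pillar size. The short calculation below is a sketch with a hypothetical helper, grid_size. Note that the default arguments of PillarFeatureNetOld2 do not produce 496 x 432; that size would result if the configuration used a pillar size of 0.16 m and the range (0, -39.68, -3, 69.12, 39.68, 1), which is an assumption here since the article does not show its configuration.

```python
def grid_size(pc_range, voxel_size):
    # Hypothetical helper: number of cells along x, y and z.
    x_min, y_min, z_min, x_max, y_max, z_max = pc_range
    vx, vy, vz = voxel_size
    return (round((x_max - x_min) / vx),
            round((y_max - y_min) / vy),
            round((z_max - z_min) / vz))


# Defaults taken from the PillarFeatureNetOld2 constructor.
default_grid = grid_size((0, -40, -3, 70.4, 40, 1), (0.2, 0.2, 4))
print(default_grid)   # (352, 400, 1)

# Assumed configuration that would yield the 496 x 432 pseudo-image.
assumed_grid = grid_size((0, -39.68, -3, 69.12, 39.68, 1), (0.16, 0.16, 4))
print(assumed_grid)   # (432, 496, 1)
```

The tuple is ordered (nx, ny, nz), while the pseudo-image is laid out as (batch, channels, ny, nx), which is why 496 appears before 432 in its shape.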

