3D Point Cloud Reconstruction 0-10: MVSNet Source Code Walkthrough (6): Depth Map Refinement and Loss

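The link below collects all of my notes on MVSNet (R-MVSNet) multi-view stereo depth inference and reconstruction. If you find any mistakes, please point them out and I will fix them right away. Anyone interested is welcome to add me on WeChat (17575010159) to discuss the technical details. If this post helped you, please remember to give it a like, because that is the greatest encouragement for me. My official account, with a large collection of resources, is listed at the end of the post.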

3D Point Cloud Reconstruction 0-00: MVSNet (R-MVSNet), table of contents for the full walkthrough: https://blog.csdn.net/weixin_43013761/article/details/102852209

Depth Map Refinement

This section covers two points: Depth Map Refinement and the loss (smart as you are, you have surely guessed that from the title). First, find the following code in mvsnet/train.py:

    if FLAGS.refinement:
        # take the reference image (view 0) out of the image stack
        ref_image = tf.squeeze(
            tf.slice(images, [0, 0, 0, 0, 0], [-1, 1, -1, -1, 3]), axis=1)
        # combine the reference image with the inferred depth_map to get the refined depth map
        refined_depth_map = depth_refine(
            depth_map, ref_image, FLAGS.max_d, depth_start, depth_interval, is_master_gpu)

The core function here is depth_refine, implemented in model.py. With my annotations it looks like this:

    def depth_refine(init_depth_map, image, depth_num, depth_start, depth_interval, is_master_gpu=True):
        """ refine depth image with the image """
        # normalization parameters: gather information about the unrefined depth map
        depth_shape = tf.shape(init_depth_map)
        # largest depth of the sampled depth range
        depth_end = depth_start + (tf.cast(depth_num, tf.float32) - 1) * depth_interval
        depth_start_mat = tf.tile(tf.reshape(
            depth_start, [depth_shape[0], 1, 1, 1]), [1, depth_shape[1], depth_shape[2], 1])
        # these per-pixel matrices are prepared for normalizing the depth map below
        depth_end_mat = tf.tile(tf.reshape(
            depth_end, [depth_shape[0], 1, 1, 1]), [1, depth_shape[1], depth_shape[2], 1])
        depth_scale_mat = depth_end_mat - depth_start_mat
        # normalize depth map (to 0~1)
        init_norm_depth_map = tf.div(init_depth_map - depth_start_mat, depth_scale_mat)
        # resize the input image to the same size as the depth map
        resized_image = tf.image.resize_bilinear(image, [depth_shape[1], depth_shape[2]])
        # refinement network; variables are created on the master GPU and reused on the others
        if is_master_gpu:
            norm_depth_tower = RefineNet({'color_image': resized_image, 'depth_image': init_norm_depth_map},
                                         is_training=True, reuse=False)
        else:
            norm_depth_tower = RefineNet({'color_image': resized_image, 'depth_image': init_norm_depth_map},
                                         is_training=True, reuse=True)
        # refined depth map, still in normalized form
        norm_depth_map = norm_depth_tower.get_output()
        # denormalize depth map: scale back to actual depth values
        refined_depth_map = tf.multiply(norm_depth_map, depth_scale_mat) + depth_start_mat
        return refined_depth_map
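The normalize and denormalize steps around the network can be checked by hand. Below is a minimal pure-Python sketch of that arithmetic, not the TensorFlow implementation; the helper names and the toy values (depth_start of 425.0, 192 depth samples, interval of 2.0) are my own assumptions, picked only for illustration.

```python
def depth_range(depth_start, depth_num, depth_interval):
    """Return (depth_start, depth_end) of the sampled depth range."""
    depth_end = depth_start + (depth_num - 1) * depth_interval
    return depth_start, depth_end


def normalize_depth(depth_map, depth_start, depth_end):
    """Map metric depths into the 0~1 range."""
    scale = depth_end - depth_start
    return [[(d - depth_start) / scale for d in row] for row in depth_map]


def denormalize_depth(norm_depth_map, depth_start, depth_end):
    """Inverse of normalize_depth: map 0~1 values back to metric depths."""
    scale = depth_end - depth_start
    return [[n * scale + depth_start for n in row] for row in norm_depth_map]


# toy values, chosen only for illustration
start, end = depth_range(depth_start=425.0, depth_num=192, depth_interval=2.0)
init_depth = [[425.0, 616.0], [807.0, 520.5]]
norm = normalize_depth(init_depth, start, end)
restored = denormalize_depth(norm, start, end)
```

Since the network sees only values in 0~1, the same refinement weights can serve scenes with very different depth ranges, and the final multiply-and-add puts the result back into each scene's own scale.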

It really is that simple. The corresponding figure from the paper is:

[Figure from the paper: depth map refinement]

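Simply put, the initial depth map init_depth_map has 1 channel and the reference image (after resizing) has 3 channels, which together make 4 channels. A series of convolutions then turns this input into a single-channel depth map, refined_depth_map, and that is all there is to it!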
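The channel bookkeeping can be sketched in plain Python. This is only an illustration under my own assumptions: the helper names are hypothetical, the convolution stack is replaced by a placeholder that predicts a zero residual, and the "add the residual back to the initial depth" step follows my reading of the paper rather than anything shown in the code above.

```python
def concat_depth_and_image(norm_depth, image):
    """Stack a 1-channel depth map and a 3-channel image into 4 channels per pixel.

    norm_depth: H x W list of floats (normalized depth)
    image:      H x W list of (r, g, b) tuples, already resized to H x W
    """
    return [[[d, *rgb] for d, rgb in zip(depth_row, image_row)]
            for depth_row, image_row in zip(norm_depth, image)]


def placeholder_residual(four_channel_input):
    """Hypothetical stand-in for the convolution stack: predicts a zero residual."""
    return [[0.0 for _ in row] for row in four_channel_input]


def refine(norm_depth, image):
    """Single-channel output: initial depth plus the predicted residual."""
    stacked = concat_depth_and_image(norm_depth, image)
    residual = placeholder_residual(stacked)
    return [[d + r for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(norm_depth, residual)]


norm_depth = [[0.0, 0.5], [1.0, 0.25]]
image = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)],
         [(0.7, 0.8, 0.9), (1.0, 0.0, 0.5)]]
stacked = concat_depth_and_image(norm_depth, image)
refined = refine(norm_depth, image)
```

With a zero residual the output equals the input depth, which shows why a residual design is convenient: the network only has to learn corrections to the initial depth map.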

Loss Explained

Next, let's look at how the loss is defined. Find the following code in mvsnet/train.py:

    # regression loss: this is the key part
    # depth_image is the ground-truth depth map, depth_map is the depth map inferred by the
    # network, and depth_interval is the smallest depth sampling step
    # the loss is computed twice: once against the unrefined depth map, once against the refined one
    loss0, less_one_temp, less_three_temp = mvsnet_regression_loss(
        depth_map, depth_image, depth_interval)
    loss1, less_one_accuracy, less_three_accuracy = mvsnet_regression_loss(
        refined_depth_map, depth_image, depth_interval)
    # average the two losses
    loss = (loss0 + loss1) / 2

The core function here, mvsnet_regression_loss, is implemented in mvsnet/loss.py. The annotated code is:

    def mvsnet_regression_loss(estimated_depth_image, depth_image, depth_interval):
        """ compute loss and accuracy
        estimated_depth_image: depth map inferred by the network
        depth_image: ground-truth depth map
        depth_interval: size of one depth sampling step
        """
        # non zero mean absolute loss
        # the ground-truth depth map is sometimes incomplete, so the loss is computed
        # only over the valid pixels of the ground-truth depth map
        masked_mae = non_zero_mean_absolute_diff(depth_image, estimated_depth_image, depth_interval)
        # less one accuracy: share of valid pixels where estimated_depth_image is within
        # 1 depth interval of depth_image
        less_one_accuracy = less_one_percentage(depth_image, estimated_depth_image, depth_interval)
        # less three accuracy: share of valid pixels where estimated_depth_image is within
        # 3 depth intervals of depth_image
        less_three_accuracy = less_three_percentage(depth_image, estimated_depth_image, depth_interval)
        return masked_mae, less_one_accuracy, less_three_accuracy
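To make the "valid pixels only" idea concrete, here is a minimal pure-Python sketch of a masked mean absolute error for a single depth map. It is not the TensorFlow code from loss.py: the function name is hypothetical, and I am assuming that a ground-truth value of 0.0 marks a pixel without a label and that the error is expressed in units of depth_interval.

```python
def masked_mean_absolute_diff(depth_true, depth_pred, depth_interval):
    """Mean absolute error over valid pixels, in units of depth_interval.

    Assumption: a ground-truth value of 0.0 marks a pixel without a label,
    so such pixels are skipped entirely.
    """
    total_error = 0.0
    valid_count = 0
    for true_row, pred_row in zip(depth_true, depth_pred):
        for d_true, d_pred in zip(true_row, pred_row):
            if d_true != 0.0:
                total_error += abs(d_true - d_pred)
                valid_count += 1
    if valid_count == 0:
        return 0.0
    return total_error / depth_interval / valid_count


# toy 2x2 depth maps; the pixel with ground truth 0.0 is ignored
depth_true = [[500.0, 0.0], [600.0, 700.0]]
depth_pred = [[502.0, 999.0], [596.0, 700.0]]
mae = masked_mean_absolute_diff(depth_true, depth_pred, depth_interval=2.0)
```

The wildly wrong prediction of 999.0 has no effect on the result, because its ground-truth pixel is masked out.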

However the details vary, the essence stays the same: the loss is computed from per-pixel differences. The formula in the paper is:
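For reference, the loss in the MVSNet paper has, as far as I recall, the form below. Treat the notation as my reconstruction rather than a verbatim quote: d(p) is the ground-truth depth at pixel p, \hat{d}_i(p) the initial estimate, \hat{d}_r(p) the refined estimate, p_valid the set of pixels with valid ground truth, and \lambda a weight between the two terms.

```latex
Loss = \sum_{p \in \mathbf{p}_{valid}}
       \underbrace{\left\| d(p) - \hat{d}_i(p) \right\|_1}_{Loss0}
       + \lambda \cdot
       \underbrace{\left\| d(p) - \hat{d}_r(p) \right\|_1}_{Loss1}
```

The training code weights the two terms equally through loss = (loss0 + loss1) / 2, which matches this form with \lambda = 1 up to a constant factor of 1/2.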

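One thing to note: when the depth maps are subtracted, the loss is computed only over valid pixels (implemented with a mask). In other words, only the foreground counts, and the background contributes nothing to the loss. Then there are the two functions from the code above: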

less_one_accuracy = less_one_percentage(depth_image, estimated_depth_image, depth_interval)
less_three_accuracy = less_three_percentage(depth_image, estimated_depth_image, depth_interval)

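Why are they there? They do not take part in the loss itself: as the train.py snippet shows, loss is built only from loss0 and loss1, which are the masked MAE values. The two functions return what share of the valid pixels lies within one and within three depth intervals of the ground truth, so they let you watch the accuracy climb as the network gets better. Personally I find them of marginal value; if you have time, feel free to test whether that is really the case.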
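Judging by their names and the annotated code, the two functions report the share of valid pixels whose error stays within a given number of depth intervals. A minimal pure-Python sketch of that idea follows; the function name, the use of 0.0 as the "no ground truth" marker, and the toy values are my own assumptions, not the TensorFlow implementation in loss.py.

```python
def within_k_intervals_ratio(depth_true, depth_pred, depth_interval, k):
    """Fraction of valid pixels whose absolute error is at most k intervals."""
    hits = 0
    valid_count = 0
    for true_row, pred_row in zip(depth_true, depth_pred):
        for d_true, d_pred in zip(true_row, pred_row):
            if d_true == 0.0:
                continue  # assumed marker for "no ground truth at this pixel"
            valid_count += 1
            if abs(d_true - d_pred) / depth_interval <= k:
                hits += 1
    return hits / valid_count if valid_count else 0.0


# toy 3x2 depth maps with one unlabeled pixel
depth_true = [[500.0, 0.0], [600.0, 700.0], [800.0, 900.0]]
depth_pred = [[502.0, 999.0], [596.0, 700.0], [810.0, 900.0]]
less_one = within_k_intervals_ratio(depth_true, depth_pred, 2.0, k=1)
less_three = within_k_intervals_ratio(depth_true, depth_pred, 2.0, k=3)
```

A threshold test like this has zero gradient almost everywhere, so it works as a monitoring number rather than as something to optimize directly.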

Closing Remarks

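By now the whole network should make sense, and you know where the depth map comes from, right? Next we will start going through the test source code, which also involves reconstructing the point cloud.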