Corners for Layout: End-to-End Layout Recovery from 360 Images

[paper] : Corners for Layout: End-to-End Layout Recovery from 360 Images

[TensorFlow] [Corners for Layout : PyTorch] [Equirectangular Convolutions : PyTorch]

Corners for Layout: End-to-End Layout Recovery from 360 Images

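Recovering the 3D layout of indoor scenes has been a core research problem for more than a decade, yet several major challenges remain unsolved. Among the most relevant ones, a large share of state-of-the-art methods make implicit or explicit assumptions about the scene, such as box-shaped or Manhattan layouts. In addition, current methods are computationally expensive and unsuitable for real-time applications such as robot navigation and AR/VR.

The paper proposes CFL (Corners for Layout), the first end-to-end model for 3D layout recovery from 360 images. The experiments show that it outperforms the state of the art while making fewer assumptions about the scene than other works and at a lower cost. The paper also shows that, by using Equirectangular Convolutions (EquiConvs), a convolution applied directly on the spherical projection, the model generalizes better to changes in camera position than conventional approaches.

This post focuses on the convolution. For the layout-recovery part, readers can refer to the original paper; it is skipped here and we go straight to the convolution operation.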

Equirectangular Convolutions

Spherical images are receiving an increasing attention due to the growing number of omnidirectional sensors in drones, robots and autonomous cars. A naïve application of convolutional networks to an equirectangular projection, is not, in principle, a good choice due to the space-varying distortions introduced by such projection.

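In short: omnidirectional sensors are increasingly common on drones, robots and autonomous cars, so spherical images are drawing more attention. The simple approach is to run a convolutional network directly on the equirectangular projection, but in principle this is a poor choice, because the projection introduces space-varying distortion (that is, the amount of distortion changes with the position in the image).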
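To make "space-varying distortion" concrete, here is a minimal sketch of how much an equirectangular image stretches content horizontally at each row. It relies on the standard property that a row at latitude θ is stretched by 1/cos(θ) relative to the equator. The function name `horizontal_stretch` and the row-to-latitude mapping θ = -(v - H/2)·π/H are assumptions made for this illustration.

```python
import math

def horizontal_stretch(v, height):
    """Horizontal stretch of image row v relative to the equator row.

    Assumed mapping: row height/2 is the equator, row 0 is the north pole.
    """
    latitude = -(v - height / 2.0) * math.pi / height
    # At the pole cos(latitude) is numerically close to zero, so the factor blows up.
    return 1.0 / math.cos(latitude)

H = 256
for v in (H // 2, H // 4, H // 16):
    print(f"row {v:3d}: stretch x{horizontal_stretch(v, H):.2f}")
```

The factor grows quickly towards the top and bottom rows, which is why a fixed square kernel covers very different regions of the sphere depending on the row it is applied to.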

In this section we present a convolution that we name EquiConv, which is defined in the spherical domain instead of the image domain and it is implicitly invariant to equirectangular representation distortions. The kernel in EquiConvs is defined as a spherical surface patch –see Figure 4.

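The paper proposes a convolution called EquiConv. It is defined in the spherical domain rather than in the image domain, and it is implicitly invariant to the distortions of the equirectangular representation.

The kernel in EquiConvs is defined as a spherical surface patch, as shown in Figure 4 of the paper.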

We parametrize its receptive field by the angles αw and αh. Thus, we directly define a convolution over the field of view. The kernel is rotated and applied along the sphere and its position is defined by the spherical coordinates (φ and θ in the figure) of its center.

Unlike standard kernels, that are parameterized by their size kw × kh, with EquiConvs we define the angular size (αw × αh) and resolution (rw × rh). In practice, we keep the aspect ratio, αw / rw = αh / rh , and we use square kernels, so we will refer the field of view as α (αw = αh) and the resolution as r (rw = rh) respectively from now on.

As we increase the resolution of the kernel, the angular distance between the elements decreases, with the intuitive upper limit of not giving more resolution to the kernel than the image itself. In other words, the kernel is defined in a sphere, being its radius less or equal to the image sphere radius. EquiConvs can also be seen as a general model for spherical Atrous Convolutions [4, 5] where the kernel size is what we call resolution, and the rate is the field of view of the kernel divided by the resolution. An example of the differences of EquiConvs by modifying α and r can be seen in Figure 5.

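The receptive field is parameterized by the angles αw and αh, so the convolution is defined directly over the field of view. The kernel is rotated and applied along the sphere, and its position is given by the spherical coordinates of its center (φ and θ in the figure).

Standard kernels are parameterized by their size kw × kh. EquiConvs instead define an angular size (αw × αh) and a resolution (rw × rh). In practice the aspect ratio is kept, αw / rw = αh / rh, and square kernels are used, so from now on the field of view is written α (αw = αh) and the resolution r (rw = rh).

As the resolution of the kernel increases, the angular distance between its elements decreases, with the intuitive upper limit that the kernel should not have more resolution than the image itself. In other words, the kernel is defined on a sphere whose radius is less than or equal to the radius of the image sphere. EquiConvs can also be seen as a general model for spherical atrous convolutions [4, 5], where the kernel size is what we call the resolution and the rate is the field of view of the kernel divided by the resolution. Figure 5 shows an example of how EquiConvs differ when α and r are modified.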
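The relation between α, r and the atrous "rate" can be sketched in a few lines. This is only one reading of the paragraph above: the rate is taken as α / r, and the "intuitive upper limit" is interpreted as not sampling more finely than one pixel at the equator of an image of width W (2π / W radians per pixel). Both function names are hypothetical.

```python
import math

def angular_rate(alpha, r):
    """Rate as described in the text: kernel field of view divided by resolution (radians)."""
    return alpha / r

def max_resolution(alpha, width):
    """Largest r whose rate is not finer than one equatorial pixel (assumed interpretation)."""
    pixel_angle = 2.0 * math.pi / width
    # The small epsilon guards against floating-point error when the ratio is an integer.
    return math.floor(alpha / pixel_angle + 1e-9)

alpha = math.radians(30.0)
for r in (3, 5, 7):
    print(f"r={r}: {math.degrees(angular_rate(alpha, r)):.2f} degrees between elements")
print("upper limit on r for a 1024-pixel-wide image:", max_resolution(alpha, 1024))
```

Increasing r with α fixed packs the elements closer together, while increasing α with r fixed spreads them out, which mirrors the role of the dilation rate in a standard atrous convolution.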

EquiConvs Details

In [7], they introduce deformable convolutions by learning additional offsets from the preceding feature maps. Offsets are added to the regular kernel locations in the Standard Convolution enabling free form deformation of the kernel.

Inspired by this work, we deform the shape of the kernels according to the geometrical priors of the equirectangular image projection. To do that, we generate offsets that are not learned but fixed given the spherical distortion model and constant over the same horizontal locations. Here, we describe how to obtain the distorted pixel locations from the original ones.

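EquiConvs are inspired by deformable convolutions. The difference is that the offsets in EquiConvs are not learned; they are computed directly from the geometric priors of the spherical projection. These priors are the spherical distortion model and the fact that the offsets are constant over the same horizontal locations (that is, within one row of the equirectangular image, the offsets are the same at every horizontal position).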

Let us define ( $u_{0,0}$ , $v_{0,0}$ ) as the pixel location on the equirectangular image where we apply the convolution operation (i.e. the image coordinate where the center of the kernel is located).

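First, we define the coordinates for every element in the kernel and afterwards we rotate them to the point of the sphere where the kernel is being applied. We define each point of the kernel as

$$\hat{p}_{ij} = \begin{bmatrix} \hat{x}_{ij} \\ \hat{y}_{ij} \\ \hat{z}_{ij} \end{bmatrix} = \begin{bmatrix} i \\ j \\ d \end{bmatrix} \tag{3}$$

where i and j are integers in the range $\left[-\frac{r-1}{2}, \frac{r-1}{2}\right]$ and d is the distance from the center of the sphere to the kernel grid. In order to cover the field of view α,

$$d = \frac{r}{2\tan\left(\frac{\alpha}{2}\right)} \tag{4}$$

We project each point into the sphere surface by normalizing the vectors, and rotate them to align the kernel center to the point where the kernel is applied.

$$p_{ij} = \begin{bmatrix} x_{ij} \\ y_{ij} \\ z_{ij} \end{bmatrix} = R_y(\phi_{0,0})\, R_x(\theta_{0,0})\, \frac{\hat{p}_{ij}}{|\hat{p}_{ij}|} \tag{5}$$

where $R_a(\beta)$ stands for a rotation matrix of an angle β around the a axis. $\phi_{0,0}$ and $\theta_{0,0}$ are the spherical angles of the center of the kernel (see Figure 4), and are defined as

$$\phi_{0,0} = \left(u_{0,0} - \frac{W}{2}\right)\frac{2\pi}{W}; \quad \theta_{0,0} = -\left(v_{0,0} - \frac{H}{2}\right)\frac{\pi}{H} \tag{6}$$

where W and H are, respectively, the width and height of the equirectangular image in pixels.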

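First, coordinates are defined for every element of the kernel, and the elements are then rotated to the point on the sphere where the kernel is applied.

i and j are integers in the range $\left[-\frac{r-1}{2}, \frac{r-1}{2}\right]$, and d is the distance from the center of the sphere to the kernel grid. To cover the field of view α, d is given by equation (4).

Each point is projected onto the sphere surface by normalizing its vector, and the points are rotated so that the kernel center is aligned with the point where the kernel is applied.

$R_a(\beta)$ denotes the rotation matrix of angle β about the a axis. $\phi_{0,0}$ and $\theta_{0,0}$ are the spherical angles of the kernel center (see Figure 4) and are defined by equation (6).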
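The steps above (build the kernel grid, normalize onto the unit sphere, rotate to the kernel center) can be sketched with NumPy. This is an illustrative sketch, not the authors' implementation. It assumes the index range $[-(r-1)/2, (r-1)/2]$ and $d = r / (2\tan(\alpha/2))$. The sign conventions of the rotation matrices are not stated in this post, so they are chosen here such that the kernel center lands at longitude φ = arctan2(x, z) and latitude θ = arcsin(y). The names `rot_x`, `rot_y` and `kernel_on_sphere` are hypothetical.

```python
import numpy as np

def rot_x(beta):
    # Assumed sign convention: tilts the +z axis towards +y by beta.
    c, s = np.cos(beta), np.sin(beta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,   s],
                     [0.0,  -s,   c]])

def rot_y(beta):
    # Assumed sign convention: swings the +z axis towards +x by beta.
    c, s = np.cos(beta), np.sin(beta)
    return np.array([[  c, 0.0,   s],
                     [0.0, 1.0, 0.0],
                     [ -s, 0.0,   c]])

def kernel_on_sphere(phi0, theta0, alpha, r):
    """Return an (r, r, 3) array of unit vectors for a kernel centred at (phi0, theta0)."""
    idx = np.arange(r) - (r - 1) / 2.0            # i, j in [-(r-1)/2, (r-1)/2]
    d = r / (2.0 * np.tan(alpha / 2.0))           # distance from sphere centre to kernel grid
    i, j = np.meshgrid(idx, idx, indexing="xy")   # i varies along axis 1, j along axis 0
    p_hat = np.stack([i, j, np.full_like(i, d)], axis=-1)   # grid points [i, j, d]
    p_hat /= np.linalg.norm(p_hat, axis=-1, keepdims=True)  # project onto the unit sphere
    rotation = rot_y(phi0) @ rot_x(theta0)
    return p_hat @ rotation.T                     # applies the rotation to every grid point

points = kernel_on_sphere(phi0=0.3, theta0=0.8, alpha=np.deg2rad(30.0), r=3)
center = points[1, 1]
print("center longitude, latitude:", np.arctan2(center[0], center[2]), np.arcsin(center[1]))
```

With these conventions the central element of an odd-sized kernel points at (φ0, θ0), and positive j moves towards the top of the image, so the row order would need flipping before pairing the result with a regular kernel stored top to bottom.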

Finally, the rest of elements are back-projected to the equirectangular image domain.

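First, we convert the unit sphere coordinates to latitude and longitude angles:

$$\phi_{ij} = \arctan\left(\frac{x_{ij}}{z_{ij}}\right); \quad \theta_{ij} = \arcsin\left(y_{ij}\right) \tag{7}$$

And then, to the original 2D equirectangular image domain:

$$u_{ij} = \left(\frac{\phi_{ij}}{2\pi} + \frac{1}{2}\right) W; \quad v_{ij} = \left(-\frac{\theta_{ij}}{\pi} + \frac{1}{2}\right) H \tag{8}$$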

In Figure 6 we show how these offsets are applied to a regular kernel; and in Figure 7 three kernel samples on the spherical and on the equirectangular images.

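Finally, the remaining elements are back-projected to the equirectangular image domain.

The unit sphere coordinates are first converted to latitude and longitude angles using equation (7), and then mapped to the original 2D equirectangular image domain using equation (8).

Figure 6 shows how these offsets are applied to a regular kernel, and Figure 7 shows three kernel samples on the spherical image and on the equirectangular image.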
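The pixel-to-angle and sphere-to-pixel mappings referred to as equations (6) to (8) can be checked with a small round trip. This sketch assumes the mappings φ = (u - W/2)·2π/W, θ = -(v - H/2)·π/H and their inverses. It uses `np.arctan2(x, z)` instead of arctan(x/z) so that the full longitude range is handled. The helper `angles_to_unit_vector` is not part of the post and is introduced only to close the loop.

```python
import numpy as np

def pixel_to_angles(u, v, width, height):
    """Longitude and latitude of an equirectangular pixel (assumed form of equation 6)."""
    phi = (u - width / 2.0) * 2.0 * np.pi / width
    theta = -(v - height / 2.0) * np.pi / height
    return phi, theta

def angles_to_unit_vector(phi, theta):
    """Hypothetical helper: unit vector consistent with phi = arctan2(x, z), theta = arcsin(y)."""
    return np.stack([np.cos(theta) * np.sin(phi),
                     np.sin(theta),
                     np.cos(theta) * np.cos(phi)], axis=-1)

def sphere_to_pixel(points, width, height):
    """Back-project unit vectors to pixel coordinates (assumed form of equations 7 and 8)."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    phi = np.arctan2(x, z)
    theta = np.arcsin(np.clip(y, -1.0, 1.0))   # clip guards against rounding just outside [-1, 1]
    u = (phi / (2.0 * np.pi) + 0.5) * width
    v = (-theta / np.pi + 0.5) * height
    return u, v

W, H = 1024, 512
phi, theta = pixel_to_angles(700.0, 100.0, W, H)
u, v = sphere_to_pixel(angles_to_unit_vector(phi, theta), W, H)
print(u, v)  # should recover the starting pixel up to floating-point error
```

Subtracting the positions of a regular kernel grid from the back-projected (u, v) values gives the fixed offsets. Because the horizontal coordinate is periodic, an implementation would typically wrap u modulo W before sampling.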
