Reading Notes: Learned Point Cloud Geometry Compression

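This paper proposes an end-to-end learned point cloud geometry compression (PCGC) model with a hyperprior. According to the paper, the model has few parameters, performs well, and parallelizes better.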

Worth noting in the Intro

Hyperpriors are used to improve the conditional probability modeling of latent features. The paper uses a variational autoencoder (VAE) to exploit the hyperprior.
End-to-end learning has already been shown to give better rate-distortion performance.

Model & Structure

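The overall framework is largely the same as the one in yesterday's note, [Point Cloud Reading Notes] Point Cloud Coding: Adopting a Deep Learning-based Approach, though this paper describes it in more detail.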

Pre-processing

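Pre-processing consists of three steps: voxelization, scaling, and partitioning.

Voxelization divides the space into small cubes (voxels), one after another.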

We also need one more concept: precision. For a voxel $(i, j, k)$, the point cloud precision sets the maximum achievable value in each dimension. For example, with 10-bit precision, $0 \leq i, j, k \leq 2^{10}-1$.
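To make the precision idea concrete, here is a minimal sketch that maps floating-point coordinates to integer voxel indices in $[0, 2^{b}-1]$. The function name `voxelize`, the bounding-box normalization, and the sample points are my own assumptions for illustration; the paper does not prescribe this particular procedure.

```python
def voxelize(points, bits=10):
    """Map float (x, y, z) points to integer voxel indices in [0, 2**bits - 1].

    Assumption: coordinates are normalized by the cloud's bounding box;
    this normalization is illustrative, not taken from the paper.
    """
    max_index = 2 ** bits - 1
    lo = [min(p[d] for p in points) for d in range(3)]
    hi = [max(p[d] for p in points) for d in range(3)]
    # Use one common extent so the aspect ratio of the cloud is preserved.
    extent = max(hi[d] - lo[d] for d in range(3)) or 1.0
    voxels = set()
    for p in points:
        # int(v + 0.5) rounds the non-negative scaled coordinate to nearest.
        voxels.add(tuple(int((p[d] - lo[d]) / extent * max_index + 0.5)
                         for d in range(3)))
    return sorted(voxels)


points = [(0.0, 0.0, 0.0), (0.5, 0.25, 1.0), (1.0, 1.0, 1.0)]
voxels = voxelize(points, bits=10)
```

With `bits=10`, every index stays between 0 and 1023, which is exactly the range the precision definition describes.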

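Scaling here is a plain linear multiplication. The scaling factor $s$ satisfies $s < 1$, so the step acts as downsampling. (The paper also lists, as future research, the option of learning an adaptive scaling factor inside the neural network.)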

$\begin{aligned}\hat{\mathbf{X}}_{n} &=\operatorname{ROUND}\left(\mathbf{X}_{n} \times s\right) \\&=\operatorname{ROUND}\left(i_{n} \times s, j_{n} \times s, k_{n} \times s\right)\end{aligned}$
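The scaling formula can be sketched in a few lines. This is only an illustration: I assume ROUND means round-half-up (implemented as `floor(v + 0.5)`), and the helper name `scale_and_round` and the sample voxels are hypothetical.

```python
import math


def scale_and_round(voxels, s):
    """Apply X_hat = ROUND(X * s) to integer voxel coordinates.

    Assumption: ROUND is round-half-up, implemented as floor(v + 0.5).
    Points that land on the same voxel after scaling are merged.
    """
    scaled = {tuple(math.floor(c * s + 0.5) for c in v) for v in voxels}
    return sorted(scaled)


voxels = [(0, 0, 0), (11, 21, 31), (12, 22, 32)]
downscaled = scale_and_round(voxels, s=0.5)
```

Here the last two input voxels collapse onto the same output voxel, which is why scaling with $s < 1$ behaves like downsampling.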

Partitioning splits the point cloud into non-overlapping cubes of size $W\times W\times W$. Note that only cubes that actually contain points are encoded and compressed.
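A minimal sketch of the partition step, under my own assumptions: cubes are indexed by integer division of the voxel coordinates, and $W = 64$ is just an illustrative value. The name `partition_into_cubes` is hypothetical.

```python
from collections import defaultdict


def partition_into_cubes(voxels, W):
    """Group voxels into non-overlapping W x W x W cubes.

    Returns a dict mapping cube index -> list of local coordinates in [0, W).
    Empty cubes never appear in the result, since only occupied cubes are coded.
    """
    cubes = defaultdict(list)
    for i, j, k in voxels:
        cube_index = (i // W, j // W, k // W)
        cubes[cube_index].append((i % W, j % W, k % W))
    return dict(cubes)


voxels = [(0, 0, 0), (3, 5, 7), (64, 0, 0), (70, 1, 2)]
cubes = partition_into_cubes(voxels, W=64)
```

Only two cube indices show up in `cubes`; every other cube in the volume is empty and is simply skipped.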

Cube-based Learned-PCGC

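The clearest idea of the paper is here: 2D transforms learned by CNN-based stacked autoencoders, which effectively exploit local and global spatial correlations, have shown good coding performance in image compression. The authors therefore use 3D CNNs to carry these image-compression ideas over to point clouds.

One point to make first: the analysis transform and the synthesis transform form an encoder-decoder pair, and both the main codec and the hyperprior codec use them.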

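In the hyperprior codec, the transforms serve to model the latent features. The hyper codec therefore uses three lightweight 3D convolutions that downsample further. (The original reads: "Given that hyperpriors are mainly used for latent feature entropy modeling, we apply three consecutive lightweight 3D convolutions (with further downsampling mechanism embedded) instead of in hyper codec." I suspect a typo here: it should be "instead of in main codec".)

In the main codec, a Voxception-ResNet (VRN) block is additionally placed among the three convolutions. The block is added to exploit the strengths of residual networks for feature extraction.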

Quantization

Because direct rounding is not differentiable, quantization here is approximated by adding noise to the original latent feature vector $y$:

$\hat y = y + \mu$

where $\mu \sim U(-\frac 1 {2}, \frac 1 {2})$, that is, uniform noise on $(-\frac 1 {2}, \frac 1 {2})$.
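The noise trick can be sketched as follows. This is a minimal illustration, assuming the noise is used in training and plain rounding at inference (the usual practice in learned compression; the function name `quantize` and the sample values are mine).

```python
import random


def quantize(y, training):
    """Quantize latent features y (a list of floats).

    Training: add noise mu ~ U(-1/2, 1/2) as a differentiable stand-in.
    Inference: plain rounding (an assumption of this sketch).
    """
    if training:
        return [v + random.uniform(-0.5, 0.5) for v in y]
    return [float(round(v)) for v in y]


random.seed(0)  # fixed seed so the sketch is reproducible
y = [0.2, 1.7, -3.4]
y_noisy = quantize(y, training=True)   # each value moves by at most 0.5
y_round = quantize(y, training=False)  # [0.0, 2.0, -3.0]
```

The noisy value stays within half a unit of $y$, just like a rounded value would, but the mapping from $y$ to $\hat y$ now has a well-defined gradient.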

Entropy Rate Modeling

The paper uses arithmetic coding to compress the quantized latent features. It also makes the following observation:

"Theoretically, the entropy bound of the source symbol (e.g., feature element) is closely related to its probability distribution, and more importantly, accurate rate estimation plays a key role in lossy compression for rate-distortion optimization."

The actual bitrate is:

$R_{\hat{y}}=E_{\hat{y}}\left[-\log _{2} p_{\hat{y}}(\hat{y})\right]$
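As a small worked example of the bitrate formula, the sketch below evaluates $E[-\log_2 p(\hat y)]$ for two made-up discrete distributions. The helper name `expected_bits` and both distributions are illustrative assumptions.

```python
import math


def expected_bits(pmf):
    """R = E[-log2 p(y_hat)] for a discrete distribution given as {symbol: prob}."""
    return sum(-p * math.log2(p) for p in pmf.values() if p > 0)


uniform = {-1: 0.25, 0: 0.25, 1: 0.25, 2: 0.25}
peaked = {-1: 0.05, 0: 0.85, 1: 0.05, 2: 0.05}
bits_uniform = expected_bits(uniform)  # 2.0 bits per symbol
bits_peaked = expected_bits(peaked)    # fewer bits: symbols are more predictable
```

The more sharply the probability model concentrates on the symbols that actually occur, the fewer bits are needed, which is why a better probability model directly lowers the rate.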

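Rate modeling can be improved with prior knowledge. If sufficient prior information $\hat z$ is available, we can use it to estimate $\hat y$ more accurately. In fact, the prior $\hat z$ here is itself obtained by further downsampling the latent features. (I really did not follow this part; corrections are welcome, thanks!)

I honestly cannot make sense of the following equations, so pointers from anyone who understands them would be appreciated!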

$p_{\hat{z} \mid \psi}(\hat{z} \mid \psi)=\prod_{i}\left(p_{\hat{z}_{i} \mid \psi^{(i)}}\left(\psi^{(i)}\right) * \mathcal{U}\left(-\frac{1}{2}, \frac{1}{2}\right)\right)\left(\hat{z}_{i}\right)$

$p_{\hat{y} \mid \hat{z}}(\hat{y} \mid \hat{z})=\prod_{i}\left(\mathcal{L}\left(\mu_{i}, \sigma_{i}\right) * \mathcal{U}\left(-\frac{1}{2}, \frac{1}{2}\right)\right)\left(\hat{y}_{i}\right)$
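One way to read the second equation: convolving a density with $\mathcal{U}(-\frac{1}{2}, \frac{1}{2})$ and evaluating the result at $\hat y_i$ is the same as integrating the density over the unit-width bin $[\hat y_i - \frac{1}{2}, \hat y_i + \frac{1}{2}]$, which is a difference of two CDF values. The sketch below does this for a Laplace distribution. Treating $\sigma_i$ as the Laplace scale parameter is my assumption, and the numeric values are made up; in the model, $\mu_i$ and $\sigma_i$ would presumably be predicted from $\hat z$.

```python
import math


def laplace_cdf(x, mu, sigma):
    """CDF of a Laplace distribution with location mu and scale sigma."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / sigma)
    return 1.0 - 0.5 * math.exp(-(x - mu) / sigma)


def likelihood(y_hat, mu, sigma):
    """(L(mu, sigma) * U(-1/2, 1/2))(y_hat): mass of the unit bin around y_hat."""
    return laplace_cdf(y_hat + 0.5, mu, sigma) - laplace_cdf(y_hat - 0.5, mu, sigma)


# Illustrative values only.
p_near = likelihood(0.0, mu=0.2, sigma=1.0)
p_far = likelihood(5.0, mu=0.2, sigma=1.0)
total = sum(likelihood(float(n), mu=0.2, sigma=1.0) for n in range(-60, 61))
```

The bins around the integers tile the real line, so the likelihoods sum to (almost exactly) one and form a valid probability table for the arithmetic coder. Values near $\mu_i$ get a high probability and therefore cost few bits.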

Rate-distortion Optimization

This has two parts: Rate Estimation and Distortion Measurement. Rate Estimation uses:
$\begin{aligned} &R_{\hat{y}}=\sum_{i}-\log _{2}\left(p_{\hat{y}_{i} \mid \hat{z}_{i}}\left(\hat{y}_{i} \mid \hat{z}_{i}\right)\right) \\ &R_{\hat{z}}=\sum_{i}-\log _{2}\left(p_{\hat{z}_{i} \mid \psi^{(i)}}\left(\hat{z}_{i} \mid \psi^{(i)}\right)\right) \end{aligned}$
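Both rate terms have the same shape: sum $-\log_2$ of the per-element likelihoods. A minimal sketch, with hypothetical likelihood values chosen so the arithmetic is easy to check by hand:

```python
import math


def rate_bits(likelihoods):
    """R = sum_i -log2(p_i), the estimated number of bits for all elements."""
    return sum(-math.log2(p) for p in likelihoods)


# Hypothetical per-element likelihoods for y_hat and z_hat.
p_y = [0.5, 0.25, 0.125]
p_z = [0.5, 0.5]
R_y = rate_bits(p_y)  # 1 + 2 + 3 = 6 bits
R_z = rate_bits(p_z)  # 1 + 1 = 2 bits
```

The total estimated rate is then $R_{\hat y} + R_{\hat z}$, since both the latent features and the hyperprior have to be transmitted.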
Distortion Measurement uses:
$D_{\mathrm{WBCE}}=\frac{1}{N_{o}} \sum^{N_{o}}-\log p_{\tilde{x}_{o}}+\alpha \frac{1}{N_{n}} \sum^{N_{n}}-\log \left(1-p_{\tilde{x}_{n}}\right)$
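My reading of the WBCE formula: the first sum runs over the $N_o$ occupied voxels, the second over the $N_n$ empty ones, $p_{\tilde x}$ is the predicted occupancy probability, and $\alpha$ weights the empty-voxel term. Under that reading, a minimal sketch looks like this (the function name `wbce`, the toy cube, and $\alpha = 3$ are assumptions for illustration):

```python
import math


def wbce(probs, occupancy, alpha):
    """Weighted binary cross-entropy over one cube.

    probs: predicted occupancy probability per voxel, each in (0, 1).
    occupancy: ground-truth labels, 1 for occupied and 0 for empty.
    alpha: weight on the empty-voxel term.
    Assumes at least one occupied and one empty voxel.
    """
    occupied = [p for p, o in zip(probs, occupancy) if o == 1]
    empty = [p for p, o in zip(probs, occupancy) if o == 0]
    loss_occupied = sum(-math.log(p) for p in occupied) / len(occupied)
    loss_empty = sum(-math.log(1.0 - p) for p in empty) / len(empty)
    return loss_occupied + alpha * loss_empty


occupancy = [1, 0, 0, 0]
good = wbce([0.9, 0.1, 0.1, 0.1], occupancy, alpha=3.0)
bad = wbce([0.1, 0.9, 0.9, 0.9], occupancy, alpha=3.0)
```

Averaging the two classes separately keeps the many empty voxels from drowning out the few occupied ones, and $\alpha$ then sets the balance between them.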
