Project page: https://nvlabs.github.io/nvdiffrec/
The GitHub project is named nvdiffrec.
Overall, like NeRF, it is optimized per object (i.e., per scene).

Input: multi-view images, camera poses, and background segmentation masks (the lighting is unknown).
Output: triangle meshes, textures, and lighting.

2. Related Work

Multi-view 3D Reconstruction

Classical methods:
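- Use inter-image correspondences to estimate depth maps:
  - Fuse the depth maps into a point cloud, optionally generating a mesh.
- Use voxel grids:
  - Estimate occupancy and color for each voxel.
Neural implicit representations
- Differentiable rendering; these methods rely on ray marching for rendering, which is computationally expensive.
Explicit surface representations
- Omitted.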

3. Our Approach

Input: multi-view images, camera poses, and background segmentation masks (the lighting is unknown).
Output: triangle meshes, textures, and lighting.

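Pipeline:

DMTet (Deep Marching Tetrahedra) outputs an SDF value and an offset for each vertex of the tetrahedral grid.
Marching Tetrahedra then extracts a mesh from these values.
A differentiable rasterizer shades the mesh with the texture and lighting and renders the result into 2D images.
A loss on the 2D images is used to optimize the whole process.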
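The mesh extraction step can be made concrete with a small sketch. Marching Tetrahedra places a mesh vertex on every grid edge whose endpoint SDF values have opposite signs, at the point where the linearly interpolated SDF is zero. The function name below is hypothetical and not taken from the nvdiffrec code; it only illustrates the standard interpolation.

```python
def interpolate_zero_crossing(pa, pb, sa, sb):
    """Return the point on edge (pa, pb) where the linearly interpolated SDF is zero.

    pa, pb: 3D endpoints of a tetrahedron edge (after any offsets are applied).
    sa, sb: SDF values at pa and pb; they must have opposite signs.
    """
    if sa * sb >= 0:
        raise ValueError("edge has no sign change, so no surface vertex lies on it")
    t = sa / (sa - sb)  # fraction of the way from pa to pb
    return tuple(a + t * (b - a) for a, b in zip(pa, pb))


p = interpolate_zero_crossing((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), -0.25, 0.75)
print(p)  # (0.25, 0.0, 0.0)
```

Because the vertex position is a smooth function of the two SDF values, gradients from the 2D image loss can flow back to the per-vertex SDF values.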

Optimization task:

L = Limage + Lmask + λLreg

Limage: L1 norm on tone mapped colors

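For a color x, first apply tone mapping, i.e., convert it to x' with the formula given in the paper,
then compute the L1 loss between the tone-mapped rendered and reference colors.

This is on page 19 of the paper.
For background on tone mapping, see:
https://blog.csdn.net/weixin_34364135/article/details/94578662

Roughly, tone mapping maps high dynamic range (HDR) colors to low dynamic range (LDR), for example from 16-bit to 8-bit color. The mapping function is given in the paper.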
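As a rough sketch of the tone-mapped L1 loss: as far as I recall, the paper uses x' = Γ(log(x + 1)), where Γ is the sRGB transfer function with a = 0.055. Treat that formula as an assumption and check it against the paper. The function names and the averaging over samples are my own choices for illustration.

```python
import math


def srgb_transfer(x, a=0.055):
    """Standard sRGB transfer function (linear segment near zero, power law above)."""
    if x <= 0.0031308:
        return 12.92 * x
    return (1 + a) * x ** (1 / 2.4) - a


def tone_map(x):
    """Assumed tone map: compress HDR radiance with log(x + 1), then apply sRGB."""
    return srgb_transfer(math.log(x + 1.0))


def image_loss(rendered, reference):
    """Mean absolute difference between tone-mapped rendered and reference values."""
    diffs = [abs(tone_map(r) - tone_map(t)) for r, t in zip(rendered, reference)]
    return sum(diffs) / len(diffs)


# A bright HDR outlier contributes far less after tone mapping than in linear space.
print(image_loss([0.1, 100.0], [0.1, 50.0]))
```

The log compression keeps very bright pixels from dominating the loss, which is the point of tone mapping before taking the L1 difference.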

Lmask: squared L2

Presumably this is an L2 loss on the background segmentation mask.
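If that reading is right, the mask term could look like the sketch below: a squared L2 difference between the rendered coverage (alpha) and the given segmentation mask. The function name and the averaging over pixels are assumptions, not taken from the paper or the code.

```python
def mask_loss(rendered_alpha, target_mask):
    """Mean squared difference between rendered coverage and the target mask.

    Both arguments are flat sequences of per-pixel values in [0, 1].
    """
    n = len(rendered_alpha)
    return sum((a - m) ** 2 for a, m in zip(rendered_alpha, target_mask)) / n


print(mask_loss([0.5, 0.0], [1.0, 0.0]))  # 0.125
```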

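Lreg: penalizes sign changes between neighbors

The main text refers to Equation 11, but that equation is actually in the supplementary material. From the description, the idea is inspired by Eqn. 11 of Deep Marching Cubes, but this paper uses its own Equation 2.

Lreg is defined over every pair of vertices that share an edge of the tetrahedral grid: if their SDF signs differ, a sigmoid is applied to one value and the sign function to the other, and the cross-entropy between the two is taken.
- Note: according to the code, this sign function returns 1 if the value is greater than 0 and 0 otherwise, not the usual version that returns -1 for negative values.
The goal is to reduce floaters and internal geometry: "Intuitively, this reduces the number of sign flips and simplifies the surface, penalizing internal geometry or floaters."
Roughly, it penalizes sign changes along an edge of the tetrahedral grid.
The supplementary material includes an ablation study:
- Left: without this loss.
- Middle: with the smoothness loss from Deep Marching Cubes, i.e., the sum of the L1 norms of the SDF values over the grid.
- Right: with this paper's cross-entropy loss.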
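Written out, the regularizer described above looks roughly as follows. This is my reading of the paper's Equation 2 combined with the code, so the notation may not match the paper exactly: S_e is the set of grid edges whose endpoint SDF values s_i and s_j have different signs, σ is the sigmoid, H is the binary cross-entropy, and sign maps to {0, 1} as noted above.

```latex
L_{\mathrm{reg}} = \sum_{(i,j) \in \mathbb{S}_e}
    H\big(\sigma(s_i), \operatorname{sign}(s_j)\big)
  + H\big(\sigma(s_j), \operatorname{sign}(s_i)\big),
\qquad
H(p, t) = -\,t \log p - (1 - t) \log (1 - p)
```

Each endpoint is pushed toward the sign of its neighbor, so the loss is lowest when few edges cross the surface.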

The code is as follows (nvdiffrec/geometry/dmtet.py, line 148):

###############################################################################
# Regularizer
###############################################################################

import torch

def sdf_reg_loss(sdf, all_edges):
    # Gather the SDF values at both endpoints of every edge: shape (num_edges, 2).
    sdf_f1x6x2 = sdf[all_edges.reshape(-1)].reshape(-1,2)
    # Keep only the edges whose two endpoints have different signs.
    mask = torch.sign(sdf_f1x6x2[...,0]) != torch.sign(sdf_f1x6x2[...,1])
    sdf_f1x6x2 = sdf_f1x6x2[mask]
    # Symmetric cross-entropy: each endpoint acts once as the logit and once as the 0/1 target.
    sdf_diff = torch.nn.functional.binary_cross_entropy_with_logits(sdf_f1x6x2[...,0], (sdf_f1x6x2[...,1] > 0).float()) + \
               torch.nn.functional.binary_cross_entropy_with_logits(sdf_f1x6x2[...,1], (sdf_f1x6x2[...,0] > 0).float())
    return sdf_diff
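To check my understanding of this function, here is a dependency-free re-implementation on a tiny example. It is only a sketch: the helper names are hypothetical, the averaging mirrors the default mean reduction of binary_cross_entropy_with_logits, and returning 0.0 when no edge changes sign is my own choice.

```python
import math


def sign(x):
    """Three-valued sign (-1, 0, 1), used only to select sign-changing edges."""
    return (x > 0) - (x < 0)


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy between sigmoid(logit) and target."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def sdf_reg_loss_py(sdf, edges):
    """Pure-Python version of the regularizer for a list of (i, j) vertex index pairs."""
    flipped = [(sdf[i], sdf[j]) for i, j in edges if sign(sdf[i]) != sign(sdf[j])]
    if not flipped:
        return 0.0  # assumption: no sign-changing edges means no penalty
    term_a = sum(bce_with_logits(a, float(b > 0)) for a, b in flipped) / len(flipped)
    term_b = sum(bce_with_logits(b, float(a > 0)) for a, b in flipped) / len(flipped)
    return term_a + term_b


sdf = [-1.0, 2.0, 3.0]
edges = [(0, 1), (1, 2)]  # only edge (0, 1) crosses the surface
loss = sdf_reg_loss_py(sdf, edges)
print(loss)
```

Only the edge (0, 1) contributes here, and the 0/1 targets confirm the note above that the sign function maps negative values to 0 rather than -1.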

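Judging from the result figures, the main effect is probably not smoothing but rather the reduction of floaters and internal geometry.

Note: the SDF values are not supervised directly. Instead, Marching Tetrahedra is followed by differentiable rendering, and the supervision is in 2D. At the 3D level, doesn't this regularizer already resemble a binary classification of occupancy values rather than a regression of SDF values?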

9. Implementation

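Tetrahedral grid resolution: 128 (using 192k tetrahedra and 37k vertices).
The SDF value of each vertex is randomly initialized in [-0.1, 0.9], so that roughly 10% of the vertices are considered inside at the beginning of optimization. (Note what this implies: the SDF is not represented implicitly by a neural network; the per-vertex SDF values are optimized directly and explicitly. In other words, this is optimization of a single scene, not training of a model that generalizes.)
Learning rate: decays from 1 to 0.1 over 5000 iterations.
GPU: a single NVIDIA V100
Training time: one hour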
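Two of these settings are easy to sanity-check in a few lines. The sketch below assumes the initialization is uniform on [-0.1, 0.9] and that the learning rate falls off exponentially and is clamped after 5000 iterations; both are my assumptions, and the function names are hypothetical.

```python
import random


def init_sdf(num_vertices, seed=0):
    """Draw one SDF value per grid vertex uniformly from [-0.1, 0.9]."""
    rng = random.Random(seed)
    return [rng.uniform(-0.1, 0.9) for _ in range(num_vertices)]


def lr_at(iteration, start=1.0, end=0.1, total=5000):
    """Exponential decay from start to end over total iterations, clamped outside."""
    t = min(max(iteration, 0), total) / total
    return start * (end / start) ** t


sdf = init_sdf(37_000)
inside_fraction = sum(s < 0 for s in sdf) / len(sdf)
print(f"fraction of vertices initially inside: {inside_fraction:.3f}")  # close to 0.10
print(lr_at(0), lr_at(2500), lr_at(5000))
```

The negative part of the interval is one tenth of its length, which is where the "roughly 10% inside" figure comes from.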

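Other notes

Equation 10: Laplacian

It may be necessary to understand what the second pass and the first pass mean.
The sigma here is the vertex position minus the mean of its one-ring neighbors.
The idea is that the vertices in the second pass should not move much and should stay consistent with the first pass.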
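Under that reading, the Laplacian term can be sketched as follows: compute the uniform Laplacian (vertex minus the mean of its neighbors) for the current mesh and for the first-pass mesh, and penalize the squared difference. The names, the data layout, and the averaging over vertices are assumptions for illustration.

```python
def uniform_laplacian(verts, neighbors):
    """Per-vertex offset from the mean of its one-ring neighbors.

    verts: list of (x, y, z) tuples; neighbors: list of index lists, one per vertex.
    """
    deltas = []
    for i, v in enumerate(verts):
        nbrs = neighbors[i]
        mean = [sum(verts[j][k] for j in nbrs) / len(nbrs) for k in range(3)]
        deltas.append([v[k] - mean[k] for k in range(3)])
    return deltas


def laplacian_loss(verts, first_pass_verts, neighbors):
    """Mean squared difference between current and first-pass Laplacians."""
    d_cur = uniform_laplacian(verts, neighbors)
    d_ref = uniform_laplacian(first_pass_verts, neighbors)
    total = sum((a - b) ** 2 for dc, dr in zip(d_cur, d_ref) for a, b in zip(dc, dr))
    return total / len(verts)


# A single triangle: every vertex neighbors the other two.
neighbors = [[1, 2], [0, 2], [0, 1]]
first_pass = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(laplacian_loss(first_pass, first_pass, neighbors))  # 0.0
print(laplacian_loss(moved, first_pass, neighbors))
```

Since only differences of Laplacians are penalized, a rigid translation of the whole mesh costs nothing, while a single vertex moving relative to its neighbors is penalized.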


Paper reading: (nvdiffrec) Extracting Triangular 3D Models, Materials, and Lighting From Images
