VTM, the H.266/VVC Test Software

Introduction to VTM

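At its meeting in San Diego, USA, on April 10, 2018, JVET named the next-generation video coding standard Versatile Video Coding (VVC), formally launching the H.266/VVC standardization process.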

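Readers who have followed along since H.265 will know that the JEM (Joint Exploration Model) sat in between. JEM was effectively a transitional version: its tools are not guaranteed to be adopted into H.266/VVC and have to be re-evaluated first.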

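At the April meeting (the "J" meeting), the test software VTM1.0 was released; see J1002 for details. The wholesale reshuffle that had been predicted did not materialize: VTM1.0 keeps the HM coding framework and switches the block structure to a multi-type tree (QT+MTT: quadtree plus binary tree/ternary tree). What remains is to refine, extend, and tune the details of the individual modules inside that framework.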

Download: https://jvet.hhi.fraunhofer.de/svn/svn_VVCSoftware_VTM

Changes

Of what has been settled so far, the changes relative to H.265 are:

Removed:
• Special strong boundary smoothing for 32×32 luma block intra prediction
• Boundary smoothing across edges for intra prediction (a horizontal filter for vertical prediction and vice versa, and the first row and column with DC prediction)
• DST-VII style transform in 4×4 intra blocks
• Mode-dependent scan for intra blocks
• Quantization weighting matrices
• Residual sign bit hiding
• VPS and VPS VUI
• Dependent slices
• Tiles
• Wavefronts (entropy coding sync)

Because the changes above affect certain elements, the following are changed as well:
• Partitioning of a CU into multiple PUs (including asymmetric partitionings)
• Partitioning of a CU into multiple luma blocks for intra prediction (i.e., signalling of multiple luma intra prediction modes for a CU), except for implicit splits when the CU size is too large for the maximum transform size
• The coding unit syntax element part_mode
• Partitioning of a CU into multiple TUs, except for implicit splits when the CU size is too large for the maximum transform size
• Transforms that are applied across prediction block boundaries
• The syntax element split_transform_flag
• Non-aligned luma and chroma transform blocks
• All VPS and VPS VUI syntax
• SPS syntax elements
o log2_min_luma_transform_block_size_minus2 (always use 4x4 luma and corresponding chroma)
o log2_diff_max_min_luma_transform_block_size
o max_transform_hierarchy_depth_inter
o max_transform_hierarchy_depth_intra
o amp_enabled_flag

QT+MTT

VTM1.0 essentially changes only the coding structure; the other parts are largely unchanged. Here is a look at the current QT+MTT structure.

The QT structure of HM and the QT+BT structure of JEM were covered in an earlier post: https://blog.csdn.net/lin453701006/article/details/52753724.

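In QT+MTT, the quadtree split is the same as in HM. The binary-tree and ternary-tree splits are shown in the figure below: ternary-tree splitting is added on top of JEM's QT+BT, which makes partitioning more flexible. The resulting CUs can be square or rectangular.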
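The geometry of the five split types can be sketched in a few lines. This is a minimal illustration, not VTM code: the names `Split`, `Block`, and `splitBlock` are made up for this example, and it assumes the usual VVC convention that a binary split halves one dimension and a ternary split divides it in a 1:2:1 ratio.

```cpp
#include <vector>

// Illustrative names only; these are not VTM's own types.
enum class Split { Quad, BinaryHorz, BinaryVert, TernaryHorz, TernaryVert };

struct Block {
  int x, y, width, height;
};

// Return the sub-blocks produced by applying one split to a parent block.
// Assumption: binary splits halve one dimension, ternary splits use 1:2:1.
std::vector<Block> splitBlock(const Block& b, Split s) {
  const int hw = b.width / 2, hh = b.height / 2;  // half sizes
  const int qw = b.width / 4, qh = b.height / 4;  // quarter sizes
  switch (s) {
    case Split::Quad:  // four equal quadrants
      return {{b.x, b.y, hw, hh}, {b.x + hw, b.y, hw, hh},
              {b.x, b.y + hh, hw, hh}, {b.x + hw, b.y + hh, hw, hh}};
    case Split::BinaryHorz:  // horizontal split line: top and bottom halves
      return {{b.x, b.y, b.width, hh}, {b.x, b.y + hh, b.width, hh}};
    case Split::BinaryVert:  // vertical split line: left and right halves
      return {{b.x, b.y, hw, b.height}, {b.x + hw, b.y, hw, b.height}};
    case Split::TernaryHorz:  // quarter, half, quarter of the height
      return {{b.x, b.y, b.width, qh}, {b.x, b.y + qh, b.width, hh},
              {b.x, b.y + qh + hh, b.width, qh}};
    case Split::TernaryVert:  // quarter, half, quarter of the width
      return {{b.x, b.y, qw, b.height}, {b.x + qw, b.y, hw, b.height},
              {b.x + qw + hw, b.y, qw, b.height}};
  }
  return {};
}
```

For example, calling `splitBlock` on a 32×32 block with `Split::TernaryVert` yields three blocks of widths 8, 16, and 8, and applying `Split::BinaryVert` to a 32×16 rectangle gives two 16×16 squares, which shows how rectangular and square CUs both arise.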

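Splitting places no restriction on the shape of the parent block, so a rectangular block can also be split further by a binary or ternary tree. Some split combinations are nevertheless disallowed in order to prevent redundant partitionings, as shown in the figure below.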
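As one illustration of what a redundant partitioning looks like, consider a restriction known from the VVC design (stated here from general knowledge, since the figure is not available in this text): splitting the middle part of a ternary split with a binary split in the same direction produces four equal strips, which two successive binary splits already reach. The sketch below encodes only that one rule; `isRedundantSplit` and its parameters are hypothetical names, and VTM's actual checks may cover more cases.

```cpp
// Illustrative names only; these are not VTM's own types.
enum class Split { None, Quad, BinaryHorz, BinaryVert, TernaryHorz, TernaryVert };

// Hypothetical helper. parentSplit is the split that produced the current
// block; indexInParent is the block's 0-based position among its siblings.
// Returns true if applying `candidate` would duplicate a partitioning that
// is already reachable through binary splits alone.
bool isRedundantSplit(Split parentSplit, int indexInParent, Split candidate) {
  const bool isMiddleOfTernary = (indexInParent == 1);
  if (parentSplit == Split::TernaryVert && isMiddleOfTernary) {
    // W/4 | W/4 | W/4 | W/4 is also reachable as BT-vert followed by BT-vert.
    return candidate == Split::BinaryVert;
  }
  if (parentSplit == Split::TernaryHorz && isMiddleOfTernary) {
    return candidate == Split::BinaryHorz;
  }
  return false;
}
```

An encoder that skips such candidates saves both the rate-distortion search time and the bits that would otherwise be spent signalling two descriptions of the same partitioning.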

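The figure below shows an example of the QT+MTT split order: QT splitting comes first, and once QT splitting has finished, MTT (BT+TT) splitting is applied.
Two points to note:
1. A CU produced by an MTT split is never split by QT again.
2. Within MTT, BT and TT can be nested: a CU produced by a BT split can be split further by MTT, and a CU produced by an MTT split can in turn be split by BT.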
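The ordering rule in the two points above can be expressed as a check on a single root-to-leaf chain of splits. This is a sketch with invented names (`isValidSplitPath` is not a VTM function): a quadtree split is accepted only while no MTT split has appeared earlier in the chain, and BT/TT splits may follow each other in any order.

```cpp
#include <vector>

// Illustrative names only; these are not VTM's own types.
enum class Split { Quad, BinaryHorz, BinaryVert, TernaryHorz, TernaryVert };

// Every split other than Quad belongs to the multi-type tree.
inline bool isMtt(Split s) { return s != Split::Quad; }

// Validate the sequence of splits applied from the CTU down to one CU.
// Rule 1: once an MTT split has occurred, Quad is no longer allowed.
// Rule 2: BT and TT splits may be mixed freely after that point.
bool isValidSplitPath(const std::vector<Split>& path) {
  bool mttStarted = false;
  for (Split s : path) {
    if (isMtt(s)) {
      mttStarted = true;
    } else if (mttStarted) {
      return false;  // Quad after BT/TT violates rule 1
    }
  }
  return true;
}
```

A path such as Quad, Quad, BinaryVert, TernaryHorz, BinaryHorz passes, while Quad, BinaryVert, Quad is rejected.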

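The VTM coding structure no longer distinguishes between CU, PU, and TU: a CU obtained from partitioning is used directly for prediction and transform, and PUs and TUs are not partitioned separately.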
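The one exception noted in the list of changes is the implicit split applied when a CU is larger than the maximum transform size. A minimal sketch of that tiling follows, under stated assumptions: `Area` and `transformBlocksFor` are hypothetical names, and the maximum transform size is passed in as a parameter rather than taken from VTM's configuration.

```cpp
#include <algorithm>
#include <vector>

// Illustrative type; not VTM's own Area class.
struct Area {
  int x, y, width, height;
};

// Tile a CU into transform blocks no larger than maxTrSize in either
// dimension. When the CU already fits, the single returned block equals
// the CU itself, i.e. CU and transform block coincide.
std::vector<Area> transformBlocksFor(const Area& cu, int maxTrSize) {
  std::vector<Area> blocks;
  for (int y = 0; y < cu.height; y += maxTrSize) {
    for (int x = 0; x < cu.width; x += maxTrSize) {
      blocks.push_back({cu.x + x, cu.y + y,
                        std::min(maxTrSize, cu.width - x),
                        std::min(maxTrSize, cu.height - y)});
    }
  }
  return blocks;
}
```

With an assumed maximum of 64, a 128×128 CU is tiled into four 64×64 transform blocks without any split flag being signalled, whereas a 32×16 CU maps to exactly one 32×16 transform block.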

VTM Code Reading Guide

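1. If you have already read the HM code, you can skip this point. For beginners who have not, I suggest picking one concrete function to start with and then working outward one layer at a time until the whole framework is familiar. For example, start from the inter motion estimation function xMotionEstimation and work downward through the overall MV search algorithm, including integer-pel and fractional-pel search. Reading xMotionEstimation is also a way of studying the theory in depth, and it teaches more, and more thoroughly, than reading a book. By the time you finish xMotionEstimation you will be used to VTM's coding conventions and can step back out to the rest of the higher-level inter prediction code.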

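2. The biggest change in VTM relative to HM is in the CU, PU, and TU classes, together with the new CodingStructure class; see the post "BMS/VTM Code Study: CodingStructure and CodingUnit" for details. If you have read the HM code before, you will need to adjust how you work with CUs. In my experience, operating on CUs is more convenient in VTM.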

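3. VTM has already trimmed much of JEM's verbose code, which makes it far more pleasant to read. As in HM, new additions in VTM are marked with macro definitions, as in the snippet below, which makes newly added tools easy to locate.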

#if JVET_K0357_AMVR
  if( m_pImvTempCS && !slice.isIntra() )
  {
    const unsigned maxMEPart = tempCS->pcv->only2Nx2N ? 1 : NUMBER_OF_PART_SIZES;
    for( unsigned p = 0; p < maxMEPart; p++ )
    {
      tempCS->initSubStructure( *m_pImvTempCS[wIdx][p], partitioner.chType, partitioner.currArea(), false );
    }
  }
#endif
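The macro convention can be tried in isolation. The sketch below is not taken from VTM: `JVET_X0000_EXAMPLE_TOOL` is a made-up macro name that imitates the proposal-number style of `JVET_K0357_AMVR`, and `describeEncoderPath` is a hypothetical function that only reports which code path was compiled in.

```cpp
#include <string>

// Hypothetical adoption macro. Real VTM macros are named after the JVET
// proposal that introduced the tool, e.g. JVET_K0357_AMVR.
// Set to 0 to compile the tool out and recover the previous behaviour.
#define JVET_X0000_EXAMPLE_TOOL 1

// Report which code path was compiled in.
std::string describeEncoderPath() {
  std::string path = "baseline";
#if JVET_X0000_EXAMPLE_TOOL
  // Everything belonging to the new tool sits inside the guard, so a search
  // for the macro name lists every line the proposal touched.
  path += "+example_tool";
#endif
  return path;
}
```

Because each tool's code is bracketed by its own macro, searching the source tree for the macro name is usually the quickest way to find all the places an adoption changed.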