CVPR 2020 Open-Source Paper | Multi-Future Pedestrian Trajectory Prediction

©PaperWeekly Original · Author | Junwei Liang (梁俊卫)

Affiliation | PhD student, Carnegie Mellon University

Research area | Computer vision

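In this post I introduce our latest work, published at CVPR'20: The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction. Its topic is predicting multiple possible future paths for a pedestrian. Our dataset and code are fully open source, including a complete tutorial on recreating multi-future pedestrian trajectories in a 3D simulator. You are welcome to try it.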

Paper title: The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

Paper link: https://arxiv.org/abs/1912.06445

Code link: https://github.com/JunweiLiang/Multiverse

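Pedestrian future trajectory prediction: can you predict the future path of the pedestrian below?

In this post we study multi-future pedestrian trajectory prediction. As the example below shows, this person could head in several different directions: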

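Our new dataset: The Forking Paths Dataset

In real-world video we only ever see one outcome. In the example above, the person in the red box keeps walking straight ahead. In a parallel universe he might have headed in a different direction, but real video can never show us that.

To obtain a dataset that allows quantitative evaluation of multi-future trajectory prediction models, we created a new trajectory prediction dataset using a 3D simulator (CARLA [3]) built on the Unreal Engine 4 game engine.

In this dataset we reconstruct real-world scenes and dynamic events, then have annotators control agents to walk to preset destinations. The recorded paths reflect where real people might walk in the same situation.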

▲ Reconstructing real-world dynamic scenes in the 3D simulator

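Multiple human annotators each observe the scene for 4.8 seconds and then control the agent, in first-person or third-person view, to walk to the destination. This way we hope to capture genuine human reactions and the routes people might choose in the same scene.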

▲ Annotation interface

Here is a showcase of our dataset:

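In our setup, annotators first observe for 4.8 seconds (the yellow path in the figure below) and then control the agent to walk to the destination. The whole annotation session is limited to 10.4 seconds, and annotators are asked to redo it if the agent collides with another agent.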
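
To make the timing concrete, here is a minimal sketch of how a recorded trajectory could be split into the observed part and the part to be predicted. It assumes trajectories are sampled at 2.5 frames per second, a rate this post does not state, and the names `ASSUMED_FPS`, `seconds_to_frames`, and `split_trajectory` are hypothetical rather than taken from the released code.

```python
from typing import List, Tuple

Point = Tuple[float, float]

# Assumption: 2.5 sampled frames per second (not stated in this post).
ASSUMED_FPS = 2.5


def seconds_to_frames(seconds: float, fps: float = ASSUMED_FPS) -> int:
    """Convert a duration in seconds to a whole number of sampled frames."""
    return round(seconds * fps)


def split_trajectory(
    points: List[Point],
    observe_seconds: float = 4.8,
    fps: float = ASSUMED_FPS,
) -> Tuple[List[Point], List[Point]]:
    """Split a trajectory into the observed part and the part to be predicted."""
    n_obs = seconds_to_frames(observe_seconds, fps)
    return points[:n_obs], points[n_obs:]


# Hypothetical trajectory: 30 samples moving right along the x-axis.
trajectory = [(float(i), 0.0) for i in range(30)]
observed, future = split_trajectory(trajectory)
print(len(observed), len(future))
```

Under that assumed rate, the 4.8-second observation window is 12 frames and a 10.4-second window is 26 frames. A different sampling rate changes only the `fps` argument.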

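After annotation, we record data from multiple camera positions and angles in the 3D simulator. These include the roughly 45-degree view typical of surveillance video as well as top-down drone views. We can even vary the weather and lighting conditions.

The entire dataset, the code, and the 3D assets are open source; see our GitHub repo [4]. It includes a detailed tutorial on building this dataset, which readers interested in 3D vision and simulators may want to try.

▲ We provide a simple, easy-to-use visual scene editing tool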

Our new model: The Multiverse Model

We propose a multi-decoder framework that predicts both coarse and fine locations of the person using scene semantic segmentation features.
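
One way to picture the coarse and fine locations is to split a pixel coordinate into a grid cell index plus an offset from that cell's center. The sketch below is only an illustration: the 1920 x 1080 frame size, the 36 x 18 grid, and the function names are assumptions made for this example, not values or code stated in this post.

```python
from typing import Tuple

# Assumptions for illustration only: frame size and grid resolution.
FRAME_W, FRAME_H = 1920, 1080
GRID_W, GRID_H = 36, 18
CELL_W, CELL_H = FRAME_W / GRID_W, FRAME_H / GRID_H


def to_coarse_fine(x: float, y: float) -> Tuple[int, Tuple[float, float]]:
    """Split a pixel location into a grid cell index and an offset from the cell center."""
    col = min(max(int(x // CELL_W), 0), GRID_W - 1)
    row = min(max(int(y // CELL_H), 0), GRID_H - 1)
    center_x = (col + 0.5) * CELL_W
    center_y = (row + 0.5) * CELL_H
    return row * GRID_W + col, (x - center_x, y - center_y)


def from_coarse_fine(cell_index: int, offset: Tuple[float, float]) -> Tuple[float, float]:
    """Recover the pixel location from a cell index and its offset."""
    row, col = divmod(cell_index, GRID_W)
    center_x = (col + 0.5) * CELL_W
    center_y = (row + 0.5) * CELL_H
    return center_x + offset[0], center_y + offset[1]


cell, offset = to_coarse_fine(100.0, 50.0)
print(cell, offset, from_coarse_fine(cell, offset))
```

In this picture the coarse part is a classification target (which cell) and the fine part is a regression target (where inside the cell), so the two together recover the exact location.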

▲ The Multiverse Model for Multi-Future Trajectory Prediction

History Encoder computes representations from scene semantics
Coarse Location Decoder predicts multiple future grid location sequences by using beam search
Fine Location Decoder predicts exact future locations based on the grid predictions

Our model achieves state-of-the-art performance both in the single-future trajectory prediction experiment and in the proposed multi-future trajectory prediction task on the Forking Paths Dataset.
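
The beam search used by the Coarse Location Decoder can be sketched generically as follows. This is a toy illustration rather than the released implementation: `toy_step_log_probs` is a hypothetical stand-in for a learned decoder, and the 1 x 5 grid and its transition probabilities are made up for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple

Beam = Tuple[List[int], float]  # (grid cell sequence, cumulative log-probability)


def beam_search(
    step_log_probs: Callable[[Sequence[int]], Sequence[float]],
    start_cell: int,
    num_steps: int,
    beam_width: int,
) -> List[Beam]:
    """Keep the beam_width most likely grid cell sequences at every step."""
    beams: List[Beam] = [([start_cell], 0.0)]
    for _ in range(num_steps):
        candidates: List[Beam] = []
        for cells, score in beams:
            for cell, log_p in enumerate(step_log_probs(cells)):
                if log_p == -math.inf:
                    continue  # skip cells with zero probability
                candidates.append((cells + [cell], score + log_p))
        candidates.sort(key=lambda beam: beam[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def toy_step_log_probs(cells: Sequence[int]) -> List[float]:
    """Hypothetical decoder on a 1 x 5 grid: prefer moving right, then left, then staying."""
    current = cells[-1]
    moves = {current - 1: 0.3, current: 0.2, current + 1: 0.5}
    valid = {c: p for c, p in moves.items() if 0 <= c < 5}
    total = sum(valid.values())
    probs = [valid.get(c, 0.0) / total for c in range(5)]
    return [math.log(p) if p > 0 else -math.inf for p in probs]


futures = beam_search(toy_step_log_probs, start_cell=2, num_steps=2, beam_width=3)
for cells, score in futures:
    print(cells, round(math.exp(score), 3))
```

Each surviving beam is one hypothesized future as a sequence of grid cells. A fine-location step would then refine every cell in a sequence into exact coordinates.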

▲ Single-Future Trajectory Prediction. The numbers are displacement errors; lower is better. For more details see [1].

▲ Multi-Future Trajectory Prediction on the Forking Paths Dataset. The numbers are displacement errors; lower is better. For more details see [1].
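
This post does not define the displacement errors, so the following is a minimal sketch of the commonly used average and final displacement errors, plus a best-of-K variant for models that output several futures. The function names are hypothetical, and the exact metric definitions used in the experiments are given in [1] and may differ from this sketch.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def displacement_errors(pred: Sequence[Point], truth: Sequence[Point]) -> Tuple[float, float]:
    """Return (average, final) Euclidean error between two equal-length trajectories."""
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]


def best_of_k_errors(
    preds: Sequence[Sequence[Point]], truth: Sequence[Point]
) -> Tuple[float, float]:
    """Smallest average and final errors over K predicted trajectories."""
    errors: List[Tuple[float, float]] = [displacement_errors(p, truth) for p in preds]
    return min(e[0] for e in errors), min(e[1] for e in errors)


# Hypothetical ground truth and two predicted futures.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
drifting = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
exact = list(truth)

print(displacement_errors(drifting, truth))
print(best_of_k_errors([drifting, exact], truth))
```

With several ground-truth futures per scene, as in the Forking Paths Dataset, one option is to compute such a score against each ground-truth path and average the results.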

Qualitative analysis with the popular Social-GAN [2] model:

▲ Qualitative comparison. The left column is from the Social-GAN [2] model; the right column is from our Multiverse model. The yellow trajectory is the observed trajectory and the green ones are the multi-future ground-truth trajectories. The yellow-orange heatmaps are the model outputs.

Back to the earlier example: was your prediction right?

Project website:

https://next.cs.cmu.edu/multiverse/

References

[1] Liang, Junwei, Lu Jiang, Kevin Murphy, Ting Yu, and Alexander Hauptmann. “The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.

[2] Gupta, Agrim, Justin Johnson, Li Fei-Fei, Silvio Savarese, and Alexandre Alahi. “Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.

[3] http://carla.org/

[4] https://github.com/JunweiLiang/Multiverse
