Making sense of the concept of scene understanding

Scene understanding does not seem to have a very precise definition, so I collected a few descriptions for reference.

LSUN Challenge (Large-scale Scene Understanding Challenge)

INTRODUCTION
The PASCAL VOC and ImageNet ILSVRC challenges have enabled significant progress for object recognition in the past decade. Beginning with CVPR 2015, we borrowed this mechanism to speed up the progress for scene understanding via the LSUN workshop. Complementary to the object-centric ImageNet ILSVRC Challenge hosted at ICCV/ECCV every year, we propose to continue hosting this scene-centric challenge at CVPR every year. Our challenge will focus on major tasks in scene understanding, including scene object retrieval, outdoor scene segmentation, RGB-D 3D object detection and saliency prediction. Inspired by recent successes using big data, such as deep learning, we focus on providing benchmarks that are significantly bigger and more diverse than the existing ones, to support training these data-hungry algorithms. By providing a set of large-scale benchmarks in an annual challenge format, we expect significant progress to continue for scene understanding in the coming years. Given the experience of our previous workshops, we are updating all of our existing tasks and rolling out new tasks.
Link: http://lsun.cs.princeton.edu/2017/
Judging from this introduction, the main tasks that scene understanding focuses on are:

scene object retrieval
outdoor scene segmentation
RGB-D 3D object detection
saliency prediction

Survey: Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art

Paper link: https://arxiv.org/pdf/1704.05519.pdf
Chapter 10 of this survey describes scene understanding as follows:
One of the basic requirements of autonomous driving is to fully understand its surrounding area such as a complex traffic scene. The complex task of outdoor scene understanding involves several sub-tasks such as depth estimation, scene categorization, object detection and tracking, event categorization, and more. Each of these tasks describe particular aspect of a scene. It is beneficial to model some of these aspects jointly to exploit the relations between different elements of the scene and obtain a holistic understanding. The goal of most scene understanding models is to obtain a rich but compact representation of the scene including all its elements e.g., layout elements, traffic participants and the relations with respect to each other. Compared to reasoning in the 2D image domain, 3D reasoning plays a significant role in solving geometric scene understanding problems and results in a more informative representation of the scene in the form of 3D object models, layout elements and occlusion relationships. One specific challenge in scene understanding is the interpretation of urban and sub-urban traffic scenarios. Compared to highways and rural roads, urban scenarios comprise many independently moving traffic participants, more variability in the geometric layout of roads and crossroads, and an increased level of difficulty due to ambiguous visual features and illumination changes.
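So here, outdoor scene understanding (in the context of autonomous driving) comprises several sub-tasks:

depth estimation
scene categorization
object detection and tracking
event categorization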

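MIT open course on self-driving cars

The third lecture mentions that scene understanding is one of the major tasks autonomous driving has to solve (localization and mapping, scene understanding, motion planning, driver state).
Intuitively, it can be read as: Where is someone else?
The examples given there are mainly:
- object detection
- full driving-scene segmentation, for example SegNet
- deriving road-condition information from audio data, analyzing road surface texture, and so on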

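A slide deck by LeCun

I came across a slide deck by LeCun on deep learning and scene understanding.
It frames scene understanding roughly as:

object detection
semantic segmentation
scene parsing and labelling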

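A paper from China

From Acta Automatica Sinica:
At present there is no strict, unified definition of visual scene understanding. Drawing on the work of well-known international research groups at MIT, Carnegie Mellon, Stanford and other universities [2−4], visual scene understanding can be described as follows: on the basis of perceiving environmental data, it combines visual analysis with image processing and recognition techniques to mine the features and patterns in visual data from computational-statistical, behavioral-cognitive and semantic perspectives, and thereby achieves effective analysis, cognition and representation of the scene. In recent years, visual scene cognition and understanding systems built by combining data learning and mining, biological cognitive characteristics and statistical modeling methods.

I could not even read it through smoothly...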