Lyft 3D Object Detection
Preliminary plan:
1. Essential content: 3D object detection over semantic maps
1.1 Main task:
This competition is evaluated on the mean average precision at different intersection over union (IoU) thresholds. The IoU of a predicted 3D bounding volume A and a ground truth bounding volume B is calculated as:
IoU(A, B) = volume(A ∩ B) / volume(A ∪ B)
At each threshold value t, a precision value is calculated based on the number of true positives (TP), false negatives (FN), and false positives (FP) resulting from comparing the predicted objects to all ground truth objects:
Precision(t) = TP(t) / (TP(t) + FP(t) + FN(t))
The average precision of a single image is calculated as the mean of the above precision values over the set of IoU thresholds:
AP = (1 / |thresholds|) * Σ_t TP(t) / (TP(t) + FP(t) + FN(t))
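The metric above can be sketched in a few lines of Python. This is only an illustration of the formula, not the official evaluation code; the helper names and the 0.5–0.95 threshold range are my own assumptions, and matching predictions to ground truth (producing the `ious` list) is assumed to happen elsewhere.

```python
# Sketch of the metric: mean precision over IoU thresholds.
# `ious` holds the best-match IoU per predicted box, plus a None entry
# for every unmatched ground-truth box (a false negative).

def precision_at_threshold(ious, threshold):
    tp = sum(1 for v in ious if v is not None and v >= threshold)
    fp = sum(1 for v in ious if v is not None and v < threshold)
    fn = sum(1 for v in ious if v is None)
    total = tp + fp + fn
    return tp / total if total else 0.0

def mean_average_precision(ious, thresholds=None):
    # Assumed threshold range: 0.5 to 0.95 in steps of 0.05.
    if thresholds is None:
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    return sum(precision_at_threshold(ious, t) for t in thresholds) / len(thresholds)
```

For example, a single prediction whose best IoU is 0.6 counts as a TP only at the thresholds 0.5, 0.55, and 0.6, so its score is 3/10 under the assumed threshold range.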
3D Context
The difference between the 2D and 3D bounding volume contexts is small. In the 3D context we reduce the bounding volume to a ground bounding box and a height. The IoU is then the intersection of the ground bounding boxes multiplied by the intersection of the height differences, divided by the union of the bounding boxes.
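As a concrete illustration of this decomposition, here is a minimal sketch of the 3D IoU for axis-aligned boxes. It ignores yaw for simplicity (the real metric intersects rotated ground rectangles), and the function names and box layout are my own choices:

```python
def interval_overlap(a_min, a_max, b_min, b_max):
    # Length of the overlap between two 1D intervals, 0 if disjoint.
    return max(0.0, min(a_max, b_max) - max(a_min, b_min))

def iou_3d_axis_aligned(a, b):
    """a, b: (cx, cy, cz, w, l, h) center-sized boxes. Yaw is ignored."""
    ax, ay, az, aw, al, ah = a
    bx, by, bz, bw, bl, bh = b
    # Ground bounding box intersection (x-y rectangle overlap) ...
    inter_ground = (interval_overlap(ax - aw/2, ax + aw/2, bx - bw/2, bx + bw/2)
                    * interval_overlap(ay - al/2, ay + al/2, by - bl/2, by + bl/2))
    # ... times the height overlap along z ...
    inter = inter_ground * interval_overlap(az - ah/2, az + ah/2,
                                            bz - bh/2, bz + bh/2)
    # ... divided by the union of the two volumes.
    union = aw * al * ah + bw * bl * bh - inter
    return inter / union if union else 0.0
```

Identical boxes give an IoU of 1.0; a unit shift of a 2×2×2 box along x halves the ground overlap and yields 4 / (8 + 8 − 4) = 1/3.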
Submission File
The submission format requires a space delimited set of bounding volume parameters. For example:
97ce3ab08ccbc0baae0267cbf8d4da947e1f11ae1dbcb80c3f4408784cd9170c,1.0 2742.15 673.16 -18.65 1.834 4.609 1.648 2.619 car
indicates that sample 97ce3ab08ccbc0baae0267cbf8d4da947e1f11ae1dbcb80c3f4408784cd9170c has a bounding volume with a confidence of 1.0, center_x of 2742.15, center_y of 673.16, center_z of -18.65, width of 1.834, length of 4.609, height of 1.648, yaw of 2.619, and a class_name of car.
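Assembling rows in this format can be sketched as follows; the helper names are mine, and this only mirrors the space-delimited layout shown in the example above:

```python
def format_prediction(confidence, cx, cy, cz, w, l, h, yaw, class_name):
    # One bounding volume: 8 numbers followed by the class name.
    return f"{confidence} {cx} {cy} {cz} {w} {l} {h} {yaw} {class_name}"

def make_submission_row(sample_token, predictions):
    """predictions: list of 9-tuples (confidence, cx, cy, cz, w, l, h, yaw, name).
    Multiple volumes for one sample are concatenated with spaces."""
    return f"{sample_token},{' '.join(format_prediction(*p) for p in predictions)}"
```

Calling `make_submission_row` with the values from the example reproduces the sample line.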
1.2 Data overview
| Type  | Data                  | Quantity | Explain               |
|-------|-----------------------|----------|-----------------------|
| Train | train_data            |          |                       |
|       | train_images          | 158,757  | 22,680 * 7 = 158,760  |
|       | train_lidar           | 30,744   |                       |
|       | train_maps            |          |                       |
|       | train.csv             | 22,680   |                       |
| Test  | test_data             |          |                       |
|       | test_images           | 192,276  | 27,468 * 7            |
|       | test_lidar            | 27,468   |                       |
|       | test_maps             |          |                       |
|       | sample_submission.csv | 27,468   |                       |
- train_data.zip and test_data.zip - contain JSON files with multiple tables. The most important is sample_data.json, which contains the primary identifiers used in the competition, as well as links to key image / lidar information. Other tables include calibrated_sensor.json.
- train_images.zip and test_images.zip - contain .jpeg files corresponding to samples in sample_data.json
- train_lidar.zip and test_lidar.zip - contain lidar point cloud files corresponding to samples in sample_data.json
- train_maps.zip and test_maps.zip - contain maps of the entire sample area.
- train.csv - contains all sample_tokens in the train set, as well as annotations in the required format for all train set objects.
- sample_submission.csv - contains all sample_tokens in the test set, with empty predictions.
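A minimal sketch of reading sample_data.json into a token-keyed lookup. The schema follows the nuScenes-style convention used by the Lyft SDK; the exact field names should be verified against the actual files, and the function name is my own:

```python
import json
from pathlib import Path

def index_sample_data(json_dir):
    """Load sample_data.json from `json_dir` and key each record by its token.

    Each record is assumed to be a dict carrying the sample's identifiers
    and links to the corresponding image / lidar files.
    """
    records = json.loads((Path(json_dir) / "sample_data.json").read_text())
    return {rec["token"]: rec for rec in records}
```

With the lookup in hand, a sample_token from train.csv or sample_submission.csv can be resolved to its sensor files in one dictionary access.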
1.3 Available algorithms
Lidar only
Voxelnet:
- https://github.com/qianguih/voxelnet
- https://github.com/tsinghua-rll/VoxelNet-tensorflow
Complex YOLO
- Complex-YOLO: An Euler-Region-Proposal for Real-time 3D Object Detection on Point Clouds
Lidar with image
LaserNet
- https://arxiv.org/pdf/1904.11466.pdf
2. First submission:
Aim to submit a first version of the results within one to two weeks.
3. Study the main approaches; apply general methods and tricks to optimize the first submission.
3.1 Data format conversion
Training data:
[Done] Lyft to KITTI
[Done] KITTI to train/test split
[Done] KITTI to pickle
Test data:
[Done] Lyft to KITTI
- Input: lyft_3D_object_detection/data/3d-object-detection-for-autonomous-vehicles_2/ (test_data, test_images, test_lidar, test_maps)
- Tool: converting-lyft-dataset-to-kitty-format-test-set.ipynb
- Output: /ref_codes/lyft2KITTY/kitti_format_val/ (calib, image_2, label_2, velodyne)
How to convert only the Lyft test set to KITTI format:
- Prerequisite: know which folders need to be generated. The only difference from the training data is that no label folder is produced.
- Can the conversion run without labels? Yes, it can.
[TO CODE] KITTI to train/test split
- Can the validate pickle file be generated directly from the train folder?
[TO CODE] KITTI to pickle:
- Input: generating kitty/frustum_carpedcyc_val.pickle requires:
os.path.join(BASE_DIR, 'image_sets/val.txt')
the three folders under KITTI/object/testing/ other than label_2
- Output: at test time, only kitty/frustum_carpedcyc_val.pickle is needed
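The val.txt index required above can be generated by listing every frame id present in the converted KITTI-format folders. This is a sketch under my own assumptions: the folder layout follows the frustum-pointnets convention (a velodyne/ folder of .bin files named by 6-digit frame id), and the function name is hypothetical:

```python
import os

def write_val_index(kitti_root, out_path):
    """Write one frame id per line (e.g. '000000') to image_sets/val.txt,
    taking the ids from the velodyne folder of the KITTI-format data."""
    velo_dir = os.path.join(kitti_root, "velodyne")
    ids = sorted(os.path.splitext(f)[0] for f in os.listdir(velo_dir)
                 if f.endswith(".bin"))
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as fh:
        fh.write("\n".join(ids) + "\n")
    return ids
```

The resulting file can then be passed to the pickle-generation step in place of the original image_sets/val.txt.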
Changes:
1. Change the data source path: ROOT_DIR = '/media/sda1/projects/kaggle_competitions/lyft_3D_object_detection/ref_codes/lyft2KITTY/kitti_format_val'
Tools: kitti/prepare_data_lyft.py => adapted into kitti/prepare_val_data_lyft.py
2. extract_frustum_data(idx_filename, split, output_filename, viz=False, perturb_box2d=False, augmentX=1, type_whitelist=['Car']) needs no adjustment here; it is supplemented later in the code.
- How should the test-time results be handled?
4. Optimizations:
- Increase the number of epochs.
- The current model is U-Net, which dates from 2015; VoxelNet could be used instead.
- Complex YOLO is another option.
- All of the above predict from lidar data only; according to published research, combining lidar with images should give more accurate results.
- The external KITTI dataset should also be usable as additional training data; more data should yield higher accuracy.
- [Feasible on KITTI, not here] Could stereo image pairs further improve image-based training, e.g. 3D object detection trained on stereo-camera depth, combined with lidar data?
- Other neural network training tricks.
- Network architecture improvements.
Review other competitors' solutions, study the mainstream research directions, compile an optimization checklist, and try to optimize the code from multiple angles.
5. Accept Hubao's supervision throughout.
Supplement:
Downloading csv, image, and other files produced by a Kaggle notebook:
https://www.kaggle.com/getting-started/58426
Change your kernel's working directory (it's very important to change it, as you will not have write access to other directories) to 'kaggle/working' using the command below:
import os
os.chdir(r'kaggle/working')
Now save your dataframe or any other file in this directory:
df_name.to_csv(r'df_name.csv')
Then in a new cell run:
from IPython.display import FileLink
FileLink(r'df_name.csv')
A link will be generated; click on it to download the file.
Cautions:
- Change the working directory to 'kaggle/working' before saving the file and generating the link, otherwise it doesn't work (at least it didn't for me).
- A downloadable link can only be generated for files located under 'kaggle/working'.
References:
[1]. Lyft Dataset SDK: https://github.com/lyft/nuscenes-devkit
[2]. Lyft Dataset: https://level5.lyft.com/dataset/
[3]. KITTI Dataset: http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d
[4]. 3D object detection with deep learning
[5]. 3D Object detection using Deep Learning
[6]. Paper with code
[7]. Comparison of 3D Detection Techniques
[8]. https://paperswithcode.com/sota/3d-object-detection-on-kitti-cars-easy
https://github.com/charlesq34/frustum-pointnets
https://github.com/sshaoshuai/PointRCNN
https://github.com/zhixinwang/frustum-convnet