CLOCs is a camera-LiDAR fusion method for 3D object detection in autonomous driving. A 2D detection pipeline (with the camera image as input) and a 3D detection pipeline (with the LiDAR point cloud as input) run in parallel, and a lightweight fusion network is trained to fuse their predictions and refine the scores of the 3D detection results. Because CLOCs decouples the 2D and 3D pipelines within the fusion framework, either pipeline can be swapped for a different detector to balance accuracy against efficiency. Figure 1 shows an example CLOCs result.
Figure 1. CLOCs Example
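
To make the data flow concrete, the sketch below pairs 2D detections with 3D detections before any fusion takes place. It is an illustration under stated assumptions, not the CLOCs implementation: it assumes the 3D boxes have already been projected into the image plane as axis-aligned rectangles, and the names `iou_2d`, `pair_candidates`, and the sample boxes are hypothetical. The trained fusion network itself is not shown. Here only the overlapping pairs and their scores are collected, which is the kind of input such a network could consume.

```python
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2) in image pixels.
Box = Tuple[float, float, float, float]


def iou_2d(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned image boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pair_candidates(
    boxes_2d: List[Box],
    scores_2d: List[float],
    boxes_3d_projected: List[Box],
    scores_3d: List[float],
) -> List[Tuple[int, int, float, float, float]]:
    """Return (i, j, iou, score_2d, score_3d) for every overlapping pair.

    i indexes the 2D detections and j the 3D detections. Pairs with zero
    overlap are skipped, so the result stays sparse.
    """
    pairs = []
    for i, (b2, s2) in enumerate(zip(boxes_2d, scores_2d)):
        for j, (b3, s3) in enumerate(zip(boxes_3d_projected, scores_3d)):
            overlap = iou_2d(b2, b3)
            if overlap > 0.0:
                pairs.append((i, j, overlap, s2, s3))
    return pairs


# Made-up sample detections, for illustration only.
boxes_2d = [(0.0, 0.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)]
scores_2d = [0.9, 0.4]
boxes_3d_projected = [(5.0, 0.0, 15.0, 10.0), (100.0, 100.0, 110.0, 110.0)]
scores_3d = [0.6, 0.3]

pairs = pair_candidates(boxes_2d, scores_2d, boxes_3d_projected, scores_3d)
```

In this sample only the first 2D box and the first projected 3D box overlap, so `pairs` holds a single entry. Because the two detectors are only connected through their output boxes and scores, replacing either one leaves this pairing step unchanged.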