Intelligent Systems with Applications • Vol 22
DeLiVoTr: Deep and light-weight voxel transformer for 3D object detection
March 2024 • Gopi Krishna Erabati, Hélder Araújo
Image-based backbone (feature extraction) networks downsample feature maps not only to increase the receptive field but also to detect objects of various scales efficiently. Existing feature extraction networks for LiDAR-based 3D object detection follow the same downsampling strategy to increase the receptive field. However, such downsampling of LiDAR feature maps in large-scale autonomous driving scenarios hinders the detection of small objects, su…