Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection

arXiv (Cornell University) • July 2024 • Zhili Chen, Shuangjie Xu, Maosheng Ye, Zian Qian, Xiaoyi Zou, Dit‐Yan Yeung, Qifeng Chen

The Bird's-Eye-View (BEV) representation is a critical factor that directly impacts 3D object detection performance, but the traditional BEV grid representation induces quadratic computational cost as the spatial resolution grows. To address this limitation, we present a new camera-based 3D object detector with high-resolution vector representation: VectorFormer. The presented high-resolution vector representation is combined with the lower-resolution BEV representation to efficiently exploit 3D geometry from …
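The quadratic cost mentioned in the abstract can be illustrated with a minimal sketch that counts the cells of a dense square BEV grid. The function name `bev_cell_count`, the 100 m extent, and the cell sizes are hypothetical values chosen for illustration; they are not taken from the paper, and cell count is used here only as a rough proxy for compute and memory.

```python
def bev_cell_count(extent_m: float, cell_size_m: float) -> int:
    """Return the number of cells in a square BEV grid.

    extent_m: side length of the covered ground region in metres (assumed).
    cell_size_m: side length of one grid cell in metres (assumed).
    """
    cells_per_side = round(extent_m / cell_size_m)
    # A dense 2D grid stores one feature per cell, so the count is the
    # square of the per-side resolution.
    return cells_per_side * cells_per_side


if __name__ == "__main__":
    extent_m = 100.0  # hypothetical region size, not from the paper
    for cell_size_m in (1.0, 0.5, 0.25):
        cells = bev_cell_count(extent_m, cell_size_m)
        print(f"cell size {cell_size_m} m: {cells} cells")
```

Under these assumptions, each halving of the cell size doubles the resolution per side and quadruples the number of cells, which is the growth pattern the abstract attributes to the traditional BEV grid.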

Topics: Artificial Intelligence, Computer Vision, Computer Science, Object Detection, Image Resolution, Remote Sensing, Geography, Law, Politics