GMAFNet: Gated Mechanism Adaptive Fusion Network for 3D Semantic Segmentation of LiDAR Point Clouds
· 2025
· Open Access
· DOI: https://doi.org/10.3390/electronics14244917
· OA: W4417332465
Three-dimensional semantic segmentation plays a crucial role in advancing scene understanding in fields such as autonomous driving, drones, and robotics. Existing studies typically improve prediction accuracy by fusing data from vehicle-mounted cameras and LiDAR. However, current semantic segmentation methods face two main challenges: first, they often fuse 2D and 3D features directly, which introduces information redundancy during fusion; second, the feature extraction stage frequently loses fine-grained image features and point cloud geometric information. From the perspective of multimodal fusion, this paper proposes a point cloud semantic segmentation method based on a multimodal gated attention mechanism. The method comprises a feature extraction network and a gated attention fusion and segmentation network. The feature extraction network employs separate 2D image and 3D point cloud feature extraction branches to extract RGB image features and point cloud features, respectively; through feature extraction and global feature supplementation, it mitigates the loss of fine-grained image features and the deficiency of point cloud geometric structure. The gated attention fusion and segmentation network uses an attention mechanism to increase the network's focus on important categories such as vehicles and pedestrians, and then applies a dynamic gated attention mechanism to control the respective weights of the 2D and 3D features during fusion, addressing the information-redundancy problem in feature fusion. Finally, a 3D decoder performs point cloud semantic segmentation. The method is evaluated on the large-scale SemanticKITTI and nuScenes point cloud datasets.
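To make the dynamic gating step concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: the module name, feature dimensions, and the assumption that the 2D image features have already been projected onto the LiDAR points are all hypothetical. The idea it illustrates is the one stated in the abstract: a learned sigmoid gate produces per-point, per-channel weights that blend the 2D and 3D features, so redundant channels from either modality can be suppressed rather than summed blindly.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Hypothetical sketch of dynamic gated fusion: a learned gate decides,
    per point and per channel, how much of the 2D image feature versus the
    3D point feature to keep. Names and dimensions are illustrative."""

    def __init__(self, dim: int):
        super().__init__()
        # The gate is predicted from the concatenated 2D/3D features.
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.Sigmoid(),
        )

    def forward(self, feat_2d: torch.Tensor, feat_3d: torch.Tensor) -> torch.Tensor:
        # feat_2d, feat_3d: (N, dim) features for N LiDAR points; the 2D
        # features are assumed to have been projected onto the points.
        g = self.gate(torch.cat([feat_2d, feat_3d], dim=-1))  # (N, dim), values in [0, 1]
        # Convex combination: g weights the image branch, (1 - g) the
        # point-cloud branch, channel by channel.
        return g * feat_2d + (1.0 - g) * feat_3d


if __name__ == "__main__":
    fusion = GatedFusion(dim=64)
    f2d = torch.randn(1024, 64)  # per-point image features (projected)
    f3d = torch.randn(1024, 64)  # per-point geometric features
    fused = fusion(f2d, f3d)
    print(fused.shape)  # torch.Size([1024, 64])
```

In this sketch, a gate value near 1 keeps mostly the image feature for that channel, while a value near 0 falls back on the geometric feature, which is one simple way to realize the weight control between modalities that the paper describes.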