Enhancing the Accuracy of Monopole and Dipole Source Identification with Vision Transformer
· 2025
· Open Access
· DOI: https://doi.org/10.20944/preprints202509.2056.v1
· OA: W4414552592
Identifying mixed monopole and dipole sound sources in highly randomized acoustic environments is of interest in many industrial applications. The DAMAS–MS method is one of the few methods explicitly developed to address this problem. However, it consistently exhibits limited accuracy in identifying monopole sources, which leads to their underestimation in the final results. To overcome this limitation, this paper proposes a novel identification framework that integrates a vision transformer (ViT) with beamforming techniques. The framework uses preliminary beamforming results to construct input features by extracting the real and imaginary components of the cross-spectral matrix at target frequencies and incorporating spatial position encodings derived from estimated source locations. To ensure adaptability to varying source densities, multiple ViT sub-models are trained on representative scenarios. This strategy enables effective generalization across the target range and supports multi-label identification of monopole and dipole sources in varied configurations. Furthermore, anechoic chamber experiments with synthesized monopole and dipole emitters validate the method’s stability under single-frequency excitation. Compared with the DAMAS–MS method, the proposed method achieves significantly improved identification accuracy for monopole sources while maintaining comparable performance in dipole source identification, underscoring its potential for practical applications.
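The feature-construction step described in the abstract (real and imaginary components of the cross-spectral matrix at a target frequency) might look roughly like the following NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `cross_spectral_matrix` and `csm_to_features`, the snapshot-averaging estimator, and the two-channel layout are all hypothetical choices, and the spatial position encodings and the ViT itself are left out because the abstract does not specify them.

```python
import numpy as np


def cross_spectral_matrix(snapshots):
    """Estimate the cross-spectral matrix (CSM) at one frequency bin.

    snapshots: complex array of shape (n_snapshots, n_mics) holding the
    FFT coefficient of each microphone signal at the target frequency,
    one row per time block. (Assumed input layout, not from the paper.)
    """
    n_snapshots = snapshots.shape[0]
    # Average of the outer products p p^H over all snapshots:
    # C[i, j] = mean_k p_k[i] * conj(p_k[j])
    return snapshots.T @ snapshots.conj() / n_snapshots


def csm_to_features(csm):
    """Stack real and imaginary parts as a 2-channel real-valued array.

    Returns an array of shape (2, n_mics, n_mics), which could then be
    split into patches for a vision-transformer-style model.
    """
    return np.stack([csm.real, csm.imag], axis=0).astype(np.float32)


# Usage with synthetic data (random numbers, not measurements).
rng = np.random.default_rng(0)
demo_snapshots = rng.standard_normal((32, 8)) + 1j * rng.standard_normal((32, 8))
demo_features = csm_to_features(cross_spectral_matrix(demo_snapshots))
print(demo_features.shape)
```

Because the CSM is Hermitian, the real channel is symmetric and the imaginary channel is antisymmetric, so a practical pipeline could keep only one triangle of each; the sketch keeps the full matrices for simplicity.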