Efficient Multi-Scale Attention Module with Cross-Spatial Learning
· 2023
· Open Access
· DOI: https://doi.org/10.1109/icassp49357.2023.10096516
· OA: W4372347372
The remarkable effectiveness of channel or spatial attention mechanisms in producing more discernible feature representations has been illustrated in various computer vision tasks. However, modeling cross-channel relationships with channel dimensionality reduction may bring side effects when extracting deep visual representations. In this paper, a novel efficient multi-scale attention (EMA) module is proposed. Focusing on retaining the information of each channel while decreasing the computational overhead, we reshape part of the channels into the batch dimension and group the channel dimension into multiple sub-features, which makes the spatial semantic features well-distributed inside each feature group. Specifically, apart from encoding the global information to re-calibrate the channel-wise weights in each parallel branch, the output features of the two parallel branches are further aggregated by a cross-dimension interaction for capturing pixel-level pairwise relationships. We conduct extensive ablation studies and experiments on image classification and object detection tasks with popular benchmarks (e.g., CIFAR-100, ImageNet-1k, MS COCO, and VisDrone2019) to evaluate its performance.
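The core reshaping idea in the abstract — folding channel groups into the batch dimension so each sub-feature is re-calibrated independently, without channel dimensionality reduction — can be sketched as follows. This is a minimal NumPy illustration of the tensor layout only; the shapes, the group count `g`, and the global-pooling stand-in for EMA's parallel 1x1/3x3 branches and cross-dimension interaction are illustrative assumptions, not the paper's exact operations.

```python
import numpy as np

# Hypothetical shapes for illustration (batch, channels, height, width).
B, C, H, W = 2, 32, 8, 8
g = 8  # number of sub-feature groups (hyperparameter; must divide C)

x = np.random.rand(B, C, H, W).astype(np.float32)

# Fold the groups into the batch dimension:
# (B, C, H, W) -> (B*g, C//g, H, W).
# Each group of C//g channels is now treated as an independent sample,
# so per-channel information is retained with no dimensionality reduction.
xg = x.reshape(B * g, C // g, H, W)

# Stand-in for the per-group attention branches: global average pooling
# produces channel-wise weights that re-calibrate each sub-feature.
weights = xg.mean(axis=(2, 3), keepdims=True)  # (B*g, C//g, 1, 1)
out = xg * weights                             # re-calibrated groups

# Unfold back to the original layout: (B*g, C//g, H, W) -> (B, C, H, W).
y = out.reshape(B, C, H, W)
print(y.shape)  # -> (2, 32, 8, 8)
```

Because the grouped tensor is just a reshaped view, the per-group computation adds no channel-mixing parameters; the output shape matches the input, so such a module can be dropped into an existing backbone.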