ACM Transactions on Multimedia Computing, Communications, and Applications • Vol 20 • No 8
SigFormer: Sparse Signal-guided Transformer for Multi-modal Action Segmentation
April 2024 • Qi Liu, Xinchen Liu, K Liu, Xiaoyan Gu, Wu Liu
Multi-modal human action segmentation is a critical and challenging task with a wide range of applications. Nowadays, the majority of approaches concentrate on the fusion of dense signals (i.e., RGB, optical flow, and depth maps). However, the potential contributions of sparse IoT sensor signals, which can be crucial for achieving accurate recognition, have not been fully explored. To make up for this, we introduce a Sparse signal-guided Transformer (SigFormer) to combine both dense and sparse signals. We e…