arXiv (Cornell University)
The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024
October 2024 • Han Yang, Qing-Yuan Jiang, Huiyuan Mei, Yang Yang, Jinhui Tang
This report presents our method for Temporal Action Localisation (TAL), the task of identifying and classifying actions within specific time intervals of a video sequence. We augment the training set with samples from the Something-SomethingV2 dataset that carry overlapping labels, which improves the model's ability to generalize across action classes. For feature extraction, we use state-of-the-art models, including UMT and VideoMAEv2 for video features, and BEATs and …
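
The label-overlap augmentation mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how overlapping labels are matched, so this example assumes a simple case-insensitive, whitespace-normalized exact match; the `Clip` record, the `normalize` helper, and the function name `expand_with_overlapping_labels` are hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One annotated action segment (hypothetical record layout)."""
    video_id: str
    label: str
    start: float  # segment start, in seconds
    end: float    # segment end, in seconds


def normalize(label: str) -> str:
    """Lower-case and collapse whitespace so near-identical labels compare equal."""
    return " ".join(label.lower().split())


def expand_with_overlapping_labels(target, auxiliary):
    """Return the target clips plus auxiliary clips whose label also occurs in the target set.

    Assumption: labels "overlap" when they match exactly after normalization.
    """
    target_labels = {normalize(clip.label) for clip in target}
    extra = [clip for clip in auxiliary if normalize(clip.label) in target_labels]
    return list(target) + extra


# Toy data: only the auxiliary clip with a shared label is added.
target_clips = [Clip("t1", "Pouring something", 1.0, 3.5)]
auxiliary_clips = [
    Clip("a1", "pouring  something", 0.0, 2.0),
    Clip("a2", "Folding something", 0.0, 4.0),
]
expanded = expand_with_overlapping_labels(target_clips, auxiliary_clips)
```

In practice the matching rule would likely need a hand-built mapping between the two label vocabularies, since class names rarely agree word for word across datasets.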