arXiv (Cornell University)
Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection
January 2024 • Heqing Zou, Meng Shen, Yu‐Chen Hu, Chen Chen, Eng Siong Chng, Deepu Rajan
Audio-visual deepfake detection scrutinizes manipulations in public video using complementary multimodal cues. Current methods, which train on fused multimodal data for multimodal targets, face challenges due to uncertainties and inconsistencies in learned representations caused by independent modality manipulations in deepfake videos. To address this, we propose cross-modality and within-modality regularization to preserve modality distinctions during multimodal representation learning. Our approach includes an au…
Computer Science
Audiovisual
Artificial Intelligence