Matthew Walmer
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
We present a simple self-supervised method to enhance the performance of ViT features for dense downstream tasks. Our Lightweight Feature Transform (LiFT) is a straightforward and compact postprocessing network that can be applied to enhan…
Multi-entity Video Transformers for Fine-Grained Video Representation Learning
The area of temporally fine-grained video representation learning focuses on generating frame-by-frame representations for temporally dense tasks, such as fine-grained action phase classification and frame retrieval. In this work, we advan…
TIJO: Trigger Inversion with Joint Optimization for Defending Multimodal Backdoored Models
We present a Multimodal Backdoor Defense technique TIJO (Trigger Inversion using Joint Optimization). Recent work (arXiv:2112.07668) has demonstrated successful backdoor attacks on multimodal models for the Visual Question Answering task. Th…
Teaching Matters: Investigating the Role of Supervision in Vision Transformers
Vision Transformers (ViTs) have gained significant popularity in recent years and have proliferated into many applications. However, their behavior under different learning paradigms is not well explored. We compare ViTs trained through di…
Dual-Key Multimodal Backdoors for Visual Question Answering
The success of deep learning has enabled advances in multimodal tasks that require non-trivial fusion of multiple input domains. Although multimodal models have shown potential in many problems, their increased complexity makes them more v…
APRICOT: A Dataset of Physical Adversarial Attacks on Object Detection
Physical adversarial attacks threaten to fool object detection systems, but reproducible research on the real-world effectiveness of physical patches and how to defend against them requires a publicly available benchmark dataset. We presen…