Mido Assran
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
A major challenge for modern AI is to learn to understand the world and learn to act largely by observation. This paper explores a self-supervised approach that combines internet-scale video data with a small amount of interaction data (ro…
VEDIT: Latent Prediction Architecture For Procedural Video Representation Learning
Procedural video representation learning is an active research area where the objective is to learn an agent which can anticipate and forecast the future given the present video input, typically in conjunction with textual annotations. Pri…
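The snippet above describes learning to forecast the future from present video input. A minimal sketch of the latent-prediction idea is below: a predictor maps the embedding of observed clips toward the embedding of future clips, so training happens in latent space rather than pixel space. All names here are illustrative assumptions, not VEDIT's actual API or architecture.

```python
import numpy as np

def latent_prediction_loss(z_past, z_future, W):
    """Illustrative latent-space prediction loss (not VEDIT's exact form).

    z_past:   (n, d) embeddings of the observed video steps
    z_future: (n, d) embeddings of the future steps (held fixed as targets)
    W:        (d, d) linear predictor mapping past latents to future latents
    """
    pred = z_past @ W                       # predicted future latents
    return np.mean((pred - z_future) ** 2)  # regression in latent space, no pixels
```

In practice the predictor would be a learned network and the target embeddings would come from a frozen or slowly updated encoder; the point of the sketch is only that the objective compares representations, not reconstructed frames.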
RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data
Semi-supervised learning aims to train a model using limited labels. State-of-the-art semi-supervised methods for image classification such as PAWS rely on self-supervised representations learned with large-scale unlabeled but curated data…
A Closer Look at Codistillation for Distributed Training
Codistillation has been proposed as a mechanism to share knowledge among concurrently trained models by encouraging them to represent the same function through an auxiliary loss. This contrasts with the more commonly used fully-synchronous…
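The snippet above describes codistillation's auxiliary loss, which encourages concurrently trained models to represent the same function. A minimal sketch of one model's loss under that scheme is below, assuming a standard cross-entropy task loss plus a KL term toward the peer model's predictions; the peer's outputs are treated as fixed targets, and the weighting `alpha` is an illustrative assumption.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def codistillation_loss(logits_a, logits_b, labels, alpha=0.5):
    """Loss for model A: task cross-entropy plus a distillation term
    pulling A's predictions toward peer model B's. B's predictions act
    as fixed targets (no gradient would flow through them)."""
    p_a = softmax(logits_a)
    p_b = softmax(logits_b)  # peer predictions, held fixed
    n = len(labels)
    ce = -np.log(p_a[np.arange(n), labels] + 1e-12).mean()
    kl = (p_b * (np.log(p_b + 1e-12) - np.log(p_a + 1e-12))).sum(-1).mean()
    return ce + alpha * kl
```

Each worker computes this loss symmetrically against its peer, which is what distinguishes codistillation from fully synchronous gradient averaging: the models exchange predictions rather than parameters.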
Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations
We investigate a strategy for improving the efficiency of contrastive learning of visual representations by leveraging a small amount of supervised information during pre-training. We propose a semi-supervised loss, SuNCEt, based on noise-…
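The snippet above introduces SuNCEt, a noise-contrastive loss that injects a small amount of label information into contrastive pre-training. A simplified sketch in that spirit is below: each labeled embedding is pulled toward same-class embeddings and pushed away from the rest of the batch. This is an illustrative supervised noise-contrastive objective, not the paper's exact formulation; the temperature `tau` and the cosine-similarity choice are assumptions.

```python
import numpy as np

def suncet_style_loss(z, labels, tau=0.1):
    """Illustrative supervised noise-contrastive loss (SuNCEt-style sketch).

    z:      (n, d) embeddings of labeled examples
    labels: (n,)   integer class labels; each example is assumed to have
            at least one same-class partner in the batch
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = np.exp(z @ z.T / tau)
    np.fill_diagonal(sim, 0.0)                        # exclude self-pairs
    same = (labels[:, None] == labels[None, :]) & ~np.eye(len(labels), dtype=bool)
    pos = (sim * same).sum(axis=1)                    # same-class mass
    return -np.log(pos / sim.sum(axis=1) + 1e-12).mean()
```

Because the term only needs the small labeled subset, it can be added on top of an unsupervised contrastive loss during pre-training, which is the acceleration mechanism the abstract refers to.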