Clément Rambour
DAFTED: Decoupled Asymmetric Fusion of Tabular and Echocardiographic Data for Cardiac Hypertension Diagnosis Open
Multimodal data fusion is a key approach for enhancing diagnosis in medical applications. We propose an asymmetric fusion strategy starting from a primary modality and integrating secondary modalities by disentangling shared and modality-s…
RT-HCP: Dealing with Inference Delays and Sample Efficiency to Learn Directly on Robotic Platforms Open
Learning a controller directly on the robot requires extreme sample efficiency. Model-based reinforcement learning (RL) methods are the most sample efficient, but their inference time is often too long to meet the robot control f…
CLIPTTA: Robust Contrastive Vision-Language Test-Time Adaptation Open
Vision-language models (VLMs) like CLIP exhibit strong zero-shot capabilities but often fail to generalize under distribution shifts. Test-time adaptation (TTA) allows models to update at inference time without labeled data, typically via …
ViLU: Learning Vision-Language Uncertainties for Failure Prediction Open
Reliable Uncertainty Quantification (UQ) and failure prediction remain open challenges for Vision-Language Models (VLMs). We introduce ViLU, a new Vision-Language Uncertainty quantification framework that contextualizes uncertainty estimat…
Optimization of Rank Losses for Image Retrieval Open
In image retrieval, standard evaluation metrics rely on score ranking, e.g. average precision (AP), recall at k (R@k), normalized discounted cumulative gain (NDCG). In this work, we introduce a general framework for robust and decomposable…
GalLoP: Learning Global and Local Prompts for Vision-Language Models Open
Prompt learning has been widely adopted to efficiently adapt vision-language models (VLMs), e.g. CLIP, for few-shot image classification. Despite their success, most prompt learning methods trade off between classification accuracy and rob…
Learning a versatile representation of SAR data for regression and segmentation by leveraging self-supervised despeckling with MERLIN Open
International audience
Energy Correction Model in the Feature Space for Out-of-Distribution Detection Open
In this work, we study the out-of-distribution (OOD) detection problem using the feature space of a pre-trained deep classifier. We show that learning the density of in-distribution (ID) features with an energy-based model (E…
Optimization of Rank Losses for Image Retrieval Open
In image retrieval, standard evaluation metrics rely on score ranking, e.g. average precision (AP), recall at k (R@k), normalized discounted cumulative gain (NDCG). In this work, we introduce a general framework for robust and decomposable r…
Multivariate Emulation of Kilometer-Scale Numerical Weather Predictions with Generative Adversarial Networks: A Proof of Concept Open
Emulating numerical weather prediction (NWP) model outputs is important to compute large datasets of weather fields in an efficient way. The purpose of the present paper is to investigate the ability of generative adversarial networks (GAN…
Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks Open
Vision-language foundation models such as CLIP have shown impressive zero-shot performance on many tasks and datasets, especially thanks to their free-text inputs. However, they struggle to handle some downstream tasks, such as fine-graine…
VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing Open
Recently, diffusion-based generative models have achieved remarkable success in image generation and editing. However, existing diffusion-based video editing approaches lack the ability to offer precise control over generated content that…
Hybrid Energy Based Model in the Feature Space for Out-of-Distribution Detection Open
Out-of-distribution (OOD) detection is a critical requirement for the deployment of deep neural networks. This paper introduces the HEAT model, a new post-hoc OOD detection method estimating the density of in-distribution (ID) samples usin…
Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation Open
International audience
Full Contextual Attention for Multi-resolution Transformers in Semantic Segmentation Open
Transformers have proved to be very effective for visual recognition tasks. In particular, vision transformers construct compressed global representations through self-attention and learnable class tokens. Multi-resolution transformers hav…
Memory transformers for full context and high-resolution 3D Medical Segmentation Open
Transformer models achieve state-of-the-art results for image segmentation. However, achieving long-range attention, necessary to capture global context, with high-resolution 3D images is a fundamental challenge. This paper introduces the …
Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction Open
State-of-the-art methods for optical flow estimation rely on deep learning, which requires complex sequential training schemes to reach optimal performance on real-world data. In this work, we introduce the COMBO deep network that explicit…
Now you see me: finding the right observation space to learn diverse behaviours by reinforcement in games Open
National audience
Hierarchical Average Precision Training for Pertinent Image Retrieval Open
Image retrieval is commonly evaluated with Average Precision (AP) or Recall@k. Yet those metrics are limited to binary labels and do not take into account the severity of errors. This paper introduces a new hierarchical AP training method for …
Robust and Decomposable Average Precision for Image Retrieval Open
In image retrieval, standard evaluation metrics rely on score ranking, e.g. average precision (AP). In this paper, we introduce a method for robust and decomposable average precision (ROADMAP) addressing two major challenges for end-to-end…
3D Buildings Reconstruction with SAR Tomography Guided by Partial Footprints Information Open
International audience
Regularized SAR Tomography Approaches Open
International audience
Analysis of dense coregistration methods applied to optical and SAR time-series for ice flow estimations Open
International audience
Flood Detection in Time Series of Optical and SAR Images Open
In recent decades, Earth Observation has brought a number of new perspectives, from geosciences to human activity monitoring. As more data became available, Artificial Intelligence (AI) techniques led to very successful results for understandi…