Corentin Dancette
Curia: A Multi-Modal Foundation Model for Radiology
AI-assisted radiological interpretation is based on predominantly narrow, single-task models. This approach is impractical for covering the vast spectrum of imaging modalities, diseases, and radiological findings. Foundation models (FMs) h…
RAPS-3D: Efficient interactive segmentation for 3D radiological imaging
Promptable segmentation, introduced by the Segment Anything Model (SAM), is a promising approach for medical imaging, as it enables clinicians to guide and refine model predictions interactively. However, SAM's architecture is designed for…
A promptable CT foundation model for solid tumor evaluation
ONCOPILOT: A Promptable CT Foundation Model For Solid Tumor Evaluation
Carcinogenesis is a proteiform phenomenon, with tumors emerging in various locations and displaying complex, diverse shapes. At the crucial intersection of research and clinical practice, it demands precise and flexible assessment. However…
Artificial intelligence in interventional radiology: Current concepts and future trends
While artificial intelligence (AI) is already well established in diagnostic radiology, it is beginning to make its mark in interventional radiology. AI has the potential to dramatically change the daily practice of interventional radiolog…
Efficient Medical Question Answering with Knowledge-Augmented Question Generation
In the expanding field of language model applications, medical knowledge representation remains a significant challenge due to the specialized nature of the domain. Large language models, such as GPT-4, obtain reasonable scores on medical …
Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning
Following the success of Large Language Models (LLMs), Large Multimodal Models (LMMs), such as the Flamingo model and its subsequent competitors, have started to emerge as natural steps towards generalist agents. However, interacting with …
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Large Language Models (LLMs) have made the ambitious quest for generalist agents significantly far from being a fantasy. A key hurdle for building such general models is the diversity and heterogeneity of tasks and modalities. A promising …
Improving Selective Visual Question Answering by Learning from Your Peers
Despite advances in Visual Question Answering (VQA), the ability of models to assess their own correctness remains underexplored. Recent work has shown that VQA models, out-of-the-box, can have difficulties abstaining from answering when t…
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further align the network with the intended usage. Yet the imperfect…
Shortcut Learning in Visual Question Answering
This thesis focuses on the task of VQA, that is, visual question answering systems. We study how biases are learned in this task. Models tend to learn superficial correlations leading them…
eP-ALM: Efficient Perceptual Augmentation of Language Models
Large Language Models (LLMs) have so far impressed the world, with unprecedented capabilities that emerge in models at large scales. On the vision side, transformer models (i.e., ViT) are following the same trend, achieving the best perfor…
Dynamic Query Selection for Fast Visual Perceiver
Transformers have been matching deep convolutional networks for vision architectures in recent works. Most work is focused on getting the best results on large-scale benchmarks, and scaling laws seem to be the most successful strategy: big…
Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
We introduce an evaluation methodology for visual question answering (VQA) to better diagnose cases of shortcut learning. These cases happen when a model exploits spurious statistical regularities to produce correct answers but does not ac…
Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
Learning robust models that generalize well under changes in the data distribution is critical for real-world applications. To this end, there has been a growing surge of interest to learn simultaneously from multiple training domains - wh…
Learning Reasoning Mechanisms for Unbiased Question-based Counting
Overcoming Statistical Shortcuts for Open-ended Visual Counting
Machine learning models tend to over-rely on statistical shortcuts. These spurious correlations between parts of the input and the output labels do not hold in real-world settings. We target this issue on the recent open-ended visual cou…
RUBi: Reducing Unimodal Biases for Visual Question Answering
RUBi: Reducing Unimodal Biases in Visual Question Answering
Visual Question Answering (VQA) is the task of answering questions about an image. Some VQA models often exploit unimodal biases to provide the correct answer without using the image information. As a result, they suffer from a huge drop i…
A K-nearest neighbours approach to unsupervised spoken term discovery
Sampling Strategies in Siamese Networks for Unsupervised Speech Representation Learning
Recent studies have investigated siamese network architectures for learning invariant speech representations using same-different side information at the word level. Here we investigate systematically an often ignored component of siamese …