Gemma Roig
The Representational Alignment between Humans and Language Models is implicitly driven by a Concreteness Effect Open
The nouns of our language refer to either concrete entities (like a table) or abstract concepts (like justice or love), and cognitive psychology has established that concreteness influences how words are processed. Accordingly, understandi…
Net2Brain: a toolbox to compare artificial vision models with human brain responses Open
In cognitive neuroscience, the integration of deep neural networks (DNNs) with traditional neuroscientific analyses has significantly advanced our understanding of both biological neural processes and the functioning of DNNs. However, chal…
Follow the MEP: Scalable Neural Representations for Minimum-Energy Path Discovery in Molecular Systems Open
Characterizing conformational transitions in physical systems remains a fundamental challenge, as traditional sampling methods struggle with the high-dimensional nature of molecular systems and high-energy barriers between stable states. T…
Investigating the temporal dynamics and modelling of mid-level feature representations in humans Open
Scene perception is a key function of biological visual systems. According to the hierarchical processing view, scene perception in the human brain begins with low-level features, progresses to mid-level features, and ends with high-level …
Human Gaze Boosts Object-Centered Representation Learning Open
Recent self-supervised learning (SSL) models trained on human-like egocentric visual inputs substantially underperform on image recognition tasks compared to humans. These models train on raw, uniform visual inputs collected from head-moun…
Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers Open
Shortcut learning, i.e., a model's reliance on undesired features not directly relevant to the task, is a major challenge that severely limits the applications of machine learning algorithms, particularly when deploying them to assist in m…
The Algonauts Project 2025 Challenge: How the Human Brain Makes Sense of Multimodal Movies Open
There is growing symbiosis between artificial and biological intelligence sciences: neural principles inspire new intelligent machines, which are in turn used to advance our theoretical understanding of the brain. To promote further collab…
Evaluating Medical Image Segmentation Models Using Augmentation Open
Background: Medical image segmentation is an essential step in both clinical and research applications, and automated segmentation models—such as TotalSegmentator—have become ubiquitous. However, robust methods for validating the accuracy …
On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process Open
Knowledge distillation (KD) remains challenging due to the opaque nature of the knowledge transfer process from a Teacher to a Student, making it difficult to address certain issues related to KD. To address this, we proposed UniCAM, a nov…
One Hundred Neural Networks and Brains Watching Videos: Lessons from Alignment Open
What can we learn from comparing video models to human brains, arguably the most efficient and effective video processing systems in existence? Our work takes a step towards answering this question by performing the first large-scale bench…
BrainACTIV: Identifying visuo-semantic properties driving cortical selectivity using diffusion-based image manipulation Open
The human brain efficiently represents visual inputs through specialized neural populations that selectively respond to specific categories. Advancements in generative modeling have enabled data-driven discovery of neural selectivity using…
EMOKINE: A software package and computational framework for scaling up the creation of highly controlled emotional full-body movement datasets Open
EMOKINE is a software package and dataset creation suite for emotional full-body movement research in experimental psychology, affective neuroscience, and computer vision. A computational framework, comprehensive instructions, a pilot data…
Learning Object Semantic Similarity with Self-Supervision Open
Humans judge the similarity of two objects not just based on their visual appearance but also based on their semantic relatedness. However, it remains unclear how humans learn about semantic relationships between objects and categories. On…
Large Language Model-Informed X-ray Photoelectron Spectroscopy Data Analysis Open
X-ray photoelectron spectroscopy (XPS) remains a fundamental technique in materials science, offering invaluable insights into the chemical states and electronic structure of a material. However, the interpretation of XPS spectra can be co…
Visual features are processed before navigational affordances in the human brain Open
To navigate through their immediate environment humans process scene information rapidly. How does the cascade of neural processing elicited by scene viewing to facilitate navigational planning unfold over time? To investigate, we recorded…
Generative Adversarial Collaborations: A practical guide for conference organizers and participating scientists Open
Generative adversarial collaborations (GACs) are a form of formal teamwork between groups of scientists with diverging views. The goal of GACs is to identify and ultimately resolve the most important challenges, controversies, and exciting…
Optimized Financial Planning: Integrating Individual and Cooperative Budgeting Models with LLM Recommendations Open
In today’s complex economic environment, individuals and households alike grapple with the challenge of financial planning. This paper introduces novel methodologies for both individual and cooperative (household) financial budgeting. We f…
Caregiver Talk Shapes Toddler Vision: A Computational Study of Dyadic Play Open
Infants' ability to recognize and categorize objects develops gradually. The second year of life is marked by both the emergence of more semantic visual representations and a better understanding of word meaning. This suggests that languag…
Different Algorithms (Might) Uncover Different Patterns: A Brain-Age Prediction Case Study Open
Machine learning is a rapidly evolving field with a wide range of applications, including biological signal analysis, where novel algorithms often improve the state-of-the-art. However, robustness to algorithmic variability - measured b…
LLM Multimodal Traffic Accident Forecasting Open
With the rise in traffic congestion in urban centers, predicting accidents has become paramount for city planning and public safety. This work comprehensively studied the efficacy of modern deep learning (DL) methods in forecasting traffic…
Learning Class and Domain Augmentations for Single-Source Open-Domain Generalization Open
Single-source open-domain generalization (SS-ODG) addresses the challenge of labeled source domains with supervision during training and unlabeled novel target domains during testing. The target domain includes both known classes from the …
Multimodal Vision-Audio-Language Dataset Open
The Multimodal Vision-Audio-Language Dataset is a large-scale dataset for multimodal learning. It contains 2M video clips with corresponding audio and a textual description of the visual and auditory content. The dataset is an ensemble of …
Net2Brain: A Toolbox to compare artificial vision models with human brain responses Open
Several studies have demonstrated the potential of deep neural networks (DNNs) to serve as state-of-the-art computational models of the primate visual cortex. In the last decade, different implementations of DNNs (varying, for exa…
A large and rich EEG dataset for modeling human visual object recognition Open
The human brain achieves visual object recognition through multiple stages of transformations operating at a millisecond scale. To predict and explain these rapid transformations, computational neuroscientists employ machine learning model…
LLM Adaptive PID Control for B5G Truck Platooning Systems Open
This paper presents an exploration into the capabilities of an adaptive PID controller within the realm of truck platooning operations, situating the inquiry within the context of Cognitive Radio and AI-enhanced 5G and Beyond 5G (B5G) netw…