Irina Rish
Critical role of EEG signals in assessment of sex-specific insights in neurological diagnostics via machine learning approach
Early detection and diagnosis of neurological pathology are essential for timely treatment and intervention. While deep learning has shown promise in analyzing brain imaging data, the influence of sex-specific patterns in electroencephalog…
Influence Functions for Efficient Data Selection in Reasoning
Fine-tuning large language models (LLMs) on chain-of-thought (CoT) data shows that a small amount of high-quality data can outperform massive datasets. Yet, what constitutes "quality" remains ill-defined. Existing reasoning methods rely on…
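The abstract names influence functions as the data-selection tool. As a minimal illustration (my simplification, not the paper's method), influence-style scores are often approximated by the dot product of training-example and test-example gradients, dropping the inverse-Hessian term; the toy linear model and all names below are hypothetical:

```python
import numpy as np

# Toy linear model with squared-error loss: loss_i(w) = 0.5 * (x_i @ w - y_i)**2.
# First-order influence of a training point on a test point, approximated
# (TracIn-style) by the gradient dot product; the Hessian inverse is dropped.

def grad(w, x, y):
    """Gradient of the squared-error loss at a single example."""
    return (x @ w - y) * x

def influence_score(w, x_train, y_train, x_test, y_test):
    """Higher score => upweighting the training point reduces the test loss
    more, to first order."""
    return grad(w, x_test, y_test) @ grad(w, x_train, y_train)

rng = np.random.default_rng(0)
w = rng.normal(size=3)
x_test, y_test = rng.normal(size=3), 1.0

# Rank a small pool of candidate training points by influence.
pool = [(rng.normal(size=3), float(rng.normal())) for _ in range(5)]
scores = [influence_score(w, x, y, x_test, y_test) for x, y in pool]
ranking = np.argsort(scores)[::-1]  # most influential first
print(ranking)
```

Selecting the top-ranked examples is one way such scores could feed a "small but high-quality" fine-tuning set.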
Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?
AI agents are vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external content or tool outputs cause unintended or harmful behavior. Inspired by the well-established concept of firewalls, we show t…
Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients
Federated learning enables collaborative model training across numerous edge devices without requiring participants to share data; however, memory and communication constraints on these edge devices may preclude their participation in trai…
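Zeroth-order training sidesteps backpropagation memory by estimating gradients from loss evaluations alone, which is what makes low-resource clients plausible participants. A minimal sketch of the standard two-point random-direction estimator (my illustration, not the paper's exact algorithm; step sizes and counts are arbitrary):

```python
import numpy as np

# Two-point zeroth-order gradient estimator: only forward passes (loss
# evaluations) are needed, no backprop, so memory stays near inference cost.

def zo_gradient(loss_fn, w, eps=1e-3, rng=None):
    """Estimate grad loss_fn(w) along a single random direction u:
    g_hat = (L(w + eps*u) - L(w - eps*u)) / (2*eps) * u."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.normal(size=w.shape)
    scale = (loss_fn(w + eps * u) - loss_fn(w - eps * u)) / (2 * eps)
    return scale * u

# Usage: minimize a simple quadratic with zeroth-order SGD.
loss = lambda w: float(np.sum(w ** 2))
w = np.ones(4)
rng = np.random.default_rng(0)
for _ in range(500):
    w = w - 0.01 * zo_gradient(loss, w, rng=rng)
print(loss(w))
```

In a federated setting, each client could send back only the scalar `scale` for a shared random seed, which is where the communication savings come from.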
A Guide to Robust Generalization: The Impact of Architecture, Pre-training, and Optimization Strategy
Deep learning models operating in the image domain are vulnerable to small input perturbations. For years, robustness to such perturbations was pursued by training models from scratch (i.e., with random initializations) using specialized l…
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
Training large language models (LLMs) typically involves pre-training on massive corpora, only to restart the process entirely when new data becomes available. A more efficient and resource-conserving approach would be continual pre-traini…
GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
The rapid evolution of software libraries poses a considerable hurdle for code generation, necessitating continuous adaptation to frequent version updates while preserving backward compatibility. While existing code evolution benchmarks pr…
Spectra 1.1: Scaling Laws and Efficient Inference for Ternary Language Models
Large language models (LLMs) are increasingly used across research and industry applications, yet their inference efficiency remains a significant challenge. As the computational power of modern GPU architectures continuously improves, the…
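For readers unfamiliar with ternary models: each weight is restricted to {-1, 0, +1} times a shared scale, so matrix multiplies reduce to additions and subtractions plus one rescale. A hedged sketch of one common ternarization recipe (absmean scaling; this is my illustration, not the Spectra training procedure):

```python
import numpy as np

# Ternary weight quantization sketch: map each weight to {-1, 0, +1} times
# a per-matrix scale, enabling add/subtract-only matmuls at inference.

def ternarize(W):
    """Absmean ternarization: scale by mean |W|, then round and clip."""
    scale = np.mean(np.abs(W)) + 1e-12
    W_t = np.clip(np.round(W / scale), -1, 1)
    return W_t.astype(np.int8), scale

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
W_t, scale = ternarize(W)
W_hat = scale * W_t  # dequantized approximation of the original weights
print(W_t)
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))  # relative error
```

Note that, unlike post-training quantization, the papers above train in the ternary regime from the start; the sketch only shows the representation.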
Random Initialization Can't Catch Up: The Advantage of Language Model Transfer for Time Series Forecasting
Recent works have demonstrated the effectiveness of adapting pre-trained language models (LMs) for forecasting time series in the low-data regime. We build upon these findings by analyzing the effective transfer from language models to tim…
Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning
This work aims to understand how scaling improves language models, specifically in terms of training dynamics. We find that language models undergo loss deceleration early in training: an abrupt slowdown in the rate of loss improvement, re…
Artificial neural networks for magnetoencephalography: a review of an emerging field
Objective. Magnetoencephalography (MEG) is a cutting-edge neuroimaging technique that measures the intricate brain dynamics underlying cognitive processes with an unparalleled combination of high temporal and spatial precision. While MEG …
MEEGNet: An open source python library for the application of convolutional neural networks to MEG
Artificial Neural Networks (ANNs) are rapidly gaining traction in neuroscience, proving invaluable for decoding and modeling brain signals from techniques such as electroencephalography (EEG) and functional magnetic resonance imaging (fMRI…
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
The ever-growing availability of unlabeled data presents both opportunities and challenges for training artificial intelligence systems. While self-supervised learning (SSL) has emerged as a powerful paradigm for extracting meaningful repr…
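The title contrasts cosine decay with an "infinite" learning-rate schedule. A minimal sketch of the two shapes (phase lengths and peak/floor values below are illustrative, not the paper's settings): cosine decay commits to a total step budget up front, whereas a warmup-then-constant plateau with a short terminal anneal can keep absorbing new data indefinitely.

```python
import math

# Cosine decay needs the total horizon in advance; the "infinite" schedule
# (warmup -> open-ended constant plateau -> brief anneal) does not.

def cosine_lr(step, total, peak=3e-4, floor=3e-5):
    return floor + 0.5 * (peak - floor) * (1 + math.cos(math.pi * step / total))

def infinite_lr(step, warmup=100, cooldown_start=None, cooldown_len=100,
                peak=3e-4, floor=3e-5):
    if step < warmup:                       # linear warmup
        return peak * step / warmup
    if cooldown_start is None or step < cooldown_start:
        return peak                         # constant plateau, open-ended
    t = min((step - cooldown_start) / cooldown_len, 1.0)
    return peak + (floor - peak) * t        # linear anneal only at the end

# The plateau LR is independent of any total-step budget:
print(infinite_lr(50), infinite_lr(5000), infinite_lr(5050, cooldown_start=5000))
```

Resuming continual pre-training then means re-entering the plateau instead of re-warming from a fully decayed rate.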
Artificial Neural Networks for Magnetoencephalography: A review of an emerging field
Magnetoencephalography (MEG) is a cutting-edge neuroimaging technique that measures the intricate brain dynamics underlying cognitive processes with an unparalleled combination of high temporal and spatial precision. MEG data analytics has…
CHIRP: A Fine-Grained Benchmark for Open-Ended Response Evaluation in Vision-Language Models
The proliferation of Vision-Language Models (VLMs) in the past several years calls for rigorous and comprehensive evaluation methods and benchmarks. This work analyzes existing VLM evaluation techniques, including automated metrics, AI-bas…
Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference
Realtime environments change even as agents perform action inference and learning, thus requiring high interaction frequencies to effectively minimize regret. However, recent advances in machine learning involve larger neural networks with…
Critical Role of EEG Signals in Assessment of Sex-Specific Insights in Neurological Diagnostics via Machine Learning Approach
Early detection and diagnosis of pathology are essential for efficient treatment and therapeutic interventions. The emergence of Artificial Intelligence (AI) and deep machine learning techniques has demonstrated the promising capability o…
RedPajama: an Open Dataset for Training Large Language Models
Large language models are increasingly becoming a cornerstone technology in artificial intelligence, the sciences, and society as a whole, yet the optimal strategies for dataset composition and filtering remain largely elusive. Many of the…
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
In inverse reinforcement learning (IRL), an agent seeks to replicate expert demonstrations through interactions with the environment. Traditionally, IRL is treated as an adversarial game, where an adversary searches over reward models, and…
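Successor features summarize a policy by the discounted sum of state features it visits, so imitation can be cast as driving the agent's successor features toward the expert's, with no reward model or adversary. A toy sketch of that quantity (my simplification; the trajectories, features, and three-state setup below are hypothetical):

```python
import numpy as np

# Successor features: psi(pi) = E[sum_t gamma^t * phi(s_t)].
# Matching psi_agent to psi_expert reduces imitation to a feature-matching
# objective instead of an adversarial reward search.

def successor_features(trajectories, phi, gamma=0.9):
    """Average discounted feature sum over a set of state trajectories."""
    psis = []
    for traj in trajectories:
        psi = sum(gamma ** t * phi(s) for t, s in enumerate(traj))
        psis.append(psi)
    return np.mean(psis, axis=0)

phi = lambda s: np.eye(3)[s]           # one-hot features over 3 states
expert = [[0, 1, 2, 2], [0, 1, 2, 2]]  # expert always reaches state 2
agent  = [[0, 0, 1, 2], [0, 1, 1, 2]]  # agent dawdles in earlier states

psi_e = successor_features(expert, phi)
psi_a = successor_features(agent, phi)
gap = np.linalg.norm(psi_a - psi_e)    # matching objective to minimize
print(psi_e, gap)
```

Minimizing `gap` over the agent's policy is the non-adversarial objective the title alludes to; in practice the successor features would be learned, not estimated from enumerated trajectories.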
GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models
The rapid evolution of software libraries presents a significant challenge for code generation models, which must adapt to frequent version updates while maintaining compatibility with previous versions. Existing code completion benchmarks…
Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning
Decoder-only Transformers often struggle with complex reasoning tasks, particularly arithmetic reasoning requiring multiple sequential operations. In this work, we identify representation collapse in the model's intermediate layers as a ke…
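One standard way to discourage representation collapse is a variance-covariance regularizer on batches of hidden states. The sketch below is in that spirit but is my illustration, not the Seq-VCR loss; the hinge target and weighting are arbitrary:

```python
import numpy as np

# Variance-covariance penalty on a batch of intermediate representations:
# penalize low per-dimension variance (collapse) and high cross-dimension
# covariance (redundancy), VICReg-style.

def variance_covariance_penalty(H, var_target=1.0, eps=1e-4):
    """H: (batch, dim) matrix of hidden states."""
    H = H - H.mean(axis=0, keepdims=True)
    std = np.sqrt(H.var(axis=0) + eps)
    var_loss = np.mean(np.maximum(0.0, var_target - std))  # hinge on std
    cov = (H.T @ H) / (H.shape[0] - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = np.sum(off_diag ** 2) / H.shape[1]
    return var_loss + cov_loss

rng = np.random.default_rng(0)
spread = rng.normal(size=(32, 8))                       # healthy batch
collapsed = np.tile(rng.normal(size=(1, 8)), (32, 1))   # fully collapsed batch
print(variance_covariance_penalty(spread), variance_covariance_penalty(collapsed))
```

A collapsed batch (all rows identical) is heavily penalized by the variance term, which is the failure mode the paper targets in intermediate layers.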
Context is Key: A Benchmark for Forecasting with Essential Textual Information
Forecasting is a critical task in decision-making across numerous domains. While historical numerical data provide a start, they fail to convey the complete context for reliable and accurate predictions. Human forecasters frequently rely o…
VFA: Vision Frequency Analysis of Foundation Models and Human
Machine learning models often struggle with distribution shifts in real-world scenarios, whereas humans exhibit robust adaptation. Models that better align with human perception may achieve higher out-of-distribution generalization. In thi…
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale
Rapid advancements in GPU computational power have outpaced memory capacity and bandwidth growth, creating bottlenecks in Large Language Model (LLM) inference. Post-training quantization is the leading method for addressing memory-related b…
Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent
Understanding the mechanisms behind decisions taken by large foundation models in sequential decision making tasks is critical to ensuring that such systems operate transparently and safely. In this work, we perform exploratory analysis on…
Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they are becoming increasingly prevalent, ensuring their robustness against adversarial attacks is paramount. This work systemat…
Lost in Translation: The Algorithmic Gap Between LMs and the Brain
Language Models (LMs) have achieved impressive performance on various linguistic tasks, but their relationship to human language processing in the brain remains unclear. This paper examines the gaps and overlaps between LMs and the brain a…