Itamar Pres
Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
Language models are increasingly capable, yet still fail at the seemingly simple task of multi-digit multiplication. In this work, we study why, by reverse-engineering a model that successfully learns multiplication via implicit chain-…
Competition Dynamics Shape Algorithmic Phases of In-Context Learning
In-Context Learning (ICL) has significantly expanded the general-purpose nature of large language models, allowing them to adapt to novel tasks using only the provided context. This has motivated a series of papers that analyze tractable…
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
While alignment algorithms are now commonly used to tune pre-trained language models towards a user's preferences, we lack explanations for the underlying mechanisms by which models become "aligned", thus making it difficult to explain p…