arXiv (Cornell University)
Transformers Meet In-Context Learning: A Universal Approximation Theory
June 2025 • Yang Jiao, Yuting Wei, Yuxin Chen
Large language models are capable of in-context learning: the ability to perform new tasks at test time from a handful of input-output examples, without any parameter updates. We develop a universal approximation theory to elucidate how transformers enable in-context learning. For a general class of functions (each representing a distinct task), we demonstrate how to construct a transformer that, without any further weight updates, can make predictions with vanishingly small risk from only a few noisy in-context examples. Unli…
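To make the core idea concrete, the following is a minimal illustrative sketch (not the paper's construction): a single softmax-attention head with fixed weights, applied to a prompt of noisy in-context examples (x_i, y_i), reduces to a kernel-weighted average of the labels, so it predicts at a query point without any weight updates. The target function, noise level, and temperature below are hypothetical choices for illustration only.

```python
import numpy as np

# Illustrative sketch, assuming a scalar regression task: fixed softmax
# attention over in-context (x_i, y_i) pairs acts like a kernel-weighted
# label average, i.e. prediction from the prompt alone, no training.

rng = np.random.default_rng(0)

def target(x):
    # one hypothetical "task" drawn from a function class
    return np.sin(3 * x)

# Noisy in-context examples for this task.
n, noise = 32, 0.1
x_ctx = rng.uniform(-1, 1, size=n)
y_ctx = target(x_ctx) + noise * rng.normal(size=n)

def attention_predict(x_query, x_ctx, y_ctx, temperature=0.05):
    # Scores stand in for query-key inner products; the softmax over the
    # context tokens gives attention weights; the output is the
    # attention-weighted combination of the value (label) tokens.
    scores = -((x_query - x_ctx) ** 2) / temperature
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ y_ctx

x_query = 0.4
print("prediction:", attention_predict(x_query, x_ctx, y_ctx))
print("truth     :", target(x_query))
```

In this toy setting the prediction error shrinks as more in-context examples are supplied, which is the qualitative behavior the abstract's risk guarantee formalizes for a constructed transformer.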