Tal Linzen
To model human linguistic prediction, make LLMs less superhuman
When people listen to or read a sentence, they actively make predictions about upcoming words: words that are less predictable are generally read more slowly than predictable ones. The success of large language models (LLMs), which, like h…
Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics
Large language models (LLMs) struggle with cross-lingual knowledge transfer: they hallucinate when asked in one language about facts expressed in a different language during training. This work introduces a controlled setting to study the …
RELIC: Evaluating Compositional Instruction Following via Language Recognition
Large language models (LLMs) are increasingly expected to perform tasks based only on a specification of the task provided in context, without examples of inputs and outputs; this ability is referred to as instruction following. We introdu…
Bigger is not always better: The importance of human-scale language modeling for psycholinguistics
Neural network language models can learn a surprising amount about language by predicting upcoming words in a corpus. Recent language technologies work has demonstrated that large performance improvements can arise from simply increasing (…
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Pretraining language models on formal language can improve their acquisition of natural language. Which features of the formal language impart an inductive bias that leads to effective transfer? Drawing on insights from linguistics and com…
Rapid Word Learning Through Meta In-Context Learning
Humans can quickly learn a new word from a few illustrative examples, and then systematically and flexibly use it in novel contexts. Yet the abilities of current language models for few-shot word learning, and methods for improving these a…
BabyLM Turns 3: Call for papers for the 2025 BabyLM workshop
BabyLM aims to dissolve the boundaries between cognitive modeling and language modeling. We call for workshop papers and invite researchers to join the 3rd BabyLM competition. As in previous years, we call for participants in the data-ef…
Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
The BabyLM Challenge is a community effort to close the data-efficiency gap between human and computational language learners. Participants compete to optimize language model training on a fixed language data budget of 100 million words or…
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length
When comparing the linguistic capabilities of language models (LMs) with humans using LM probabilities, factors such as the length of the sequence and the unigram frequency of lexical items have a significant effect on LM probabilities in …
How Does Code Pretraining Affect Language Model Task Performance?
Large language models are increasingly trained on corpora containing both natural language and non-linguistic data like source code. Aside from aiding programming-related tasks, anecdotal evidence suggests that including code in pretrainin…
Testing learning hypotheses using neural networks by manipulating learning data
Although passivization is productive in English, it is not completely general -- some exceptions exist (e.g. *One hour was lasted by the meeting). How do English speakers learn these exceptions to an otherwise general pattern? Using neural…
[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
After last year's successful BabyLM Challenge, the competition will be hosted again in 2024/2025. The overarching goals of the challenge remain the same; however, some of the competition rules will differ. The big changes for this ye…
SPAWNing Structural Priming Predictions from a Cognitively Motivated Parser
Structural priming is a widely used psycholinguistic paradigm for studying human sentence representations. In this work we introduce SPAWN, a cognitively motivated parser that can generate quantitative priming predictions from contemporary the…
Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment
Do LMs infer the semantics of text from co-occurrence patterns in their training data? Merrill et al. (2022) argue that, in theory, sentence co-occurrence probabilities predicted by an optimal LM should reflect the entailment relationship …
Do Language Models’ Words Refer?
What do language models (LMs) do with language? They can produce sequences of (mostly) coherent strings closely resembling English. But do those sentences mean something, or are LMs simply babbling in a convincing simulacrum of language us…
Neural Networks as Cognitive Models of the Processing of Syntactic Constraints
Languages are governed by syntactic constraints—structural rules that determine which sentences are grammatical in the language. In English, one such constraint is subject-verb agreement, which dictates that the number of a verb must match…
In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer th…
A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models
A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises. Psychologists have documented several ways in which humans' inferences deviate from the rules of log…
The Impact of Depth on Compositional Generalization in Transformer Language Models
To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hyp…