Tal Linzen
To model human linguistic prediction, make LLMs less superhuman
When people listen to or read a sentence, they actively make predictions about upcoming words: words that are less predictable are generally read more slowly than predictable ones. The success of large language models (LLMs), which, like h…
Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics
Large language models (LLMs) struggle with cross-lingual knowledge transfer: they hallucinate when asked in one language about facts expressed in a different language during training. This work introduces a controlled setting to study the …
RELIC: Evaluating Compositional Instruction Following via Language Recognition
Large language models (LLMs) are increasingly expected to perform tasks based only on a specification of the task provided in context, without examples of inputs and outputs; this ability is referred to as instruction following. We introdu…
Bigger is not always better: The importance of human-scale language modeling for psycholinguistics
Neural network language models can learn a surprising amount about language by predicting upcoming words in a corpus. Recent language technologies work has demonstrated that large performance improvements can arise from simply increasing (…
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Pretraining language models on formal language can improve their acquisition of natural language. Which features of the formal language impart an inductive bias that leads to effective transfer? Drawing on insights from linguistics and com…
Rapid Word Learning Through Meta In-Context Learning
Humans can quickly learn a new word from a few illustrative examples, and then systematically and flexibly use it in novel contexts. Yet the abilities of current language models for few-shot word learning, and methods for improving these a…
BabyLM Turns 3: Call for papers for the 2025 BabyLM workshop
BabyLM aims to dissolve the boundaries between cognitive modeling and language modeling. We call for workshop papers and invite researchers to join the 3rd BabyLM competition. As in previous years, we call for participants in the data-ef…
Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
The BabyLM Challenge is a community effort to close the data-efficiency gap between human and computational language learners. Participants compete to optimize language model training on a fixed language data budget of 100 million words or…
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length
When comparing the linguistic capabilities of language models (LMs) with humans using LM probabilities, factors such as the length of the sequence and the unigram frequency of lexical items have a significant effect on LM probabilities in …
How Does Code Pretraining Affect Language Model Task Performance?
Large language models are increasingly trained on corpora containing both natural language and non-linguistic data like source code. Aside from aiding programming-related tasks, anecdotal evidence suggests that including code in pretrainin…
Testing learning hypotheses using neural networks by manipulating learning data
Although passivization is productive in English, it is not completely general -- some exceptions exist (e.g. *One hour was lasted by the meeting). How do English speakers learn these exceptions to an otherwise general pattern? Using neural…
[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus
After last year's successful BabyLM Challenge, the competition will be hosted again in 2024/2025. The overarching goals of the challenge remain the same; however, some of the competition rules will differ. The big changes for this ye…
SPAWNing Structural Priming Predictions from a Cognitively Motivated Parser
Structural priming is a widely used psycholinguistic paradigm for studying human sentence representations. In this work we introduce SPAWN, a cognitively motivated parser that can generate quantitative priming predictions from contemporary the…
Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment
Do LMs infer the semantics of text from co-occurrence patterns in their training data? Merrill et al. (2022) argue that, in theory, sentence co-occurrence probabilities predicted by an optimal LM should reflect the entailment relationship …
Do Language Models’ Words Refer?
What do language models (LMs) do with language? They can produce sequences of (mostly) coherent strings closely resembling English. But do those sentences mean something, or are LMs simply babbling in a convincing simulacrum of language us…
Neural Networks as Cognitive Models of the Processing of Syntactic Constraints
Languages are governed by syntactic constraints—structural rules that determine which sentences are grammatical in the language. In English, one such constraint is subject-verb agreement, which dictates that the number of a verb must match…
In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates. Do models guided via ICL infer th…
A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models
A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises. Psychologists have documented several ways in which humans' inferences deviate from the rules of log…
The Impact of Depth on Compositional Generalization in Transformer Language Models
To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hyp…