Valentin Hofmann
Large Language Models Discriminate Against Speakers of German Dialects
Dialects represent a significant component of human culture and are found across all regions of the world. In Germany, more than 40% of the population speaks a regional dialect (Adler and Hansen, 2022). However, despite cultural importance…
Fluid Language Model Benchmarking
Language model (LM) benchmarking faces several challenges: comprehensive evaluations are costly, benchmarks often fail to measure the intended capabilities, and evaluation quality can degrade due to labeling errors and benchmark saturation…
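Fluid benchmarking builds on item response theory. As a rough sketch of the adaptive idea, not the paper's implementation: under a two-parameter logistic (2PL) model, each item has a discrimination a and a difficulty b, and evaluation repeatedly administers the item with the highest Fisher information at the current ability estimate. All item parameters and numbers below are illustrative.

```python
import math

def p_correct(theta, a, b):
    # Two-parameter logistic (2PL) item response model: probability
    # that a model with ability theta answers the item correctly.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def most_informative_item(theta, items):
    # Fisher information of a 2PL item is a^2 * p * (1 - p);
    # adaptive testing asks the item that maximizes it at the
    # current ability estimate.
    def information(item):
        a, b = item
        p = p_correct(theta, a, b)
        return a * a * p * (1.0 - p)
    return max(items, key=information)

# Illustrative (discrimination, difficulty) pairs.
items = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.4), (1.0, 2.0)]
print(most_informative_item(0.3, items))  # (1.5, 0.4): near theta, high discrimination
```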
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
Developing large language models is expensive and involves making decisions with small experiments, typically by evaluating on large, multi-task evaluation suites. In this work, we analyze specific properties which make a benchmark more re…
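One way to make the framework concrete, as a hedged sketch rather than the paper's exact estimator: treat signal as how widely a benchmark separates final model scores, treat noise as how much a single model's score wobbles across late training checkpoints, and prefer benchmarks with a high ratio of the two. All numbers below are illustrative.

```python
import numpy as np

def signal_to_noise(final_scores, checkpoint_scores):
    """Illustrative signal-to-noise ratio for a benchmark.

    final_scores: one score per model (signal = how well the
        benchmark spreads models apart).
    checkpoint_scores: one model's scores across late training
        checkpoints (noise = step-to-step wobble).
    """
    signal = np.max(final_scores) - np.min(final_scores)  # spread across models
    noise = np.std(checkpoint_scores)                     # checkpoint variability
    return signal / noise

# Example: a benchmark that separates four models well and is stable.
models = [0.41, 0.48, 0.55, 0.62]           # final scores of four models
checkpoints = [0.547, 0.552, 0.549, 0.551]  # one model's last checkpoints
print(signal_to_noise(models, checkpoints))
```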
Derivational morphology reveals analogical generalization in large language models
What mechanisms underlie linguistic generalization in large language models (LLMs)? This question has attracted considerable attention, with most studies analyzing the extent to which the language skills of LLMs resemble rules. As of yet, …
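The contrast the paper draws is between rule-like and analogical generalization. As a toy sketch of the analogical alternative only, with a deliberately crude similarity measure and an invented three-word lexicon: a novel adjective takes the nominalizing suffix of its most similar stored neighbor rather than the output of a categorical rule.

```python
def analogical_suffix(novel_adj, lexicon):
    # Analogical generalization, schematically: pick the suffix used
    # by the most similar stored adjective. Similarity here is a toy
    # shared-suffix length, not a real phonological or embedding metric.
    def similarity(a, b):
        n = 0
        while n < min(len(a), len(b)) and a[-(n + 1)] == b[-(n + 1)]:
            n += 1
        return n
    nearest = max(lexicon, key=lambda w: similarity(novel_adj, w))
    return lexicon[nearest]

# Invented lexicon mapping adjectives to their nominalizing suffix.
lexicon = {"productive": "ity", "selective": "ity", "happy": "ness"}
print(analogical_suffix("fictive", lexicon))  # 'ity', by analogy to '-ive' words
```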
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance
Large language models (LLMs) are helping millions of users write texts about diverse issues, and in doing so expose users to different ideas and perspectives. This creates concerns about issue bias, where an LLM tends to present just one p…
Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Language is not monolithic. While benchmarks, including those designed for multiple languages, are often used as proxies to evaluate the performance of Large Language Models (LLMs), they tend to overlook the nuances of within-language vari…
AI generates covertly racist decisions about people based on their dialect
Hundreds of millions of people now interact with language models, with uses ranging from help with writing to informing hiring decisions. However, these language models are known to perpetuate systematic racial prejudices, making th…
Dialect prejudice predicts AI decisions about people's character, employability, and criminality
Hundreds of millions of people now interact with language models, with uses ranging from serving as a writing aid to informing hiring decisions. Yet these language models are known to perpetuate systematic racial prejudices, making their j…
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Much recent work seeks to evaluate values and opinions in large language models (LLMs) using multiple-choice surveys and questionnaires. Most of this work is motivated by concerns around real-world LLM applications. For example, politicall…
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Planning is a fundamental property of human intelligence. Reasoning about asynchronous plans is challenging since it requires sequential and parallel planning to optimize time costs. Can large language models (LLMs) succeed at this task? H…
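The time-optimal part of this task is essentially a critical-path computation: independent steps run in parallel, and a step can start only once all of its prerequisites have finished. A minimal sketch with invented task names and durations:

```python
from functools import lru_cache

def min_completion_time(durations, deps):
    """Earliest finish time of an asynchronous plan: steps with no
    dependency run in parallel, and each step starts once all its
    dependencies finish (longest path in the dependency DAG)."""
    @lru_cache(maxsize=None)
    def finish(step):
        start = max((finish(d) for d in deps.get(step, ())), default=0)
        return start + durations[step]
    return max(finish(s) for s in durations)

# Illustrative plan: boiling and chopping happen in parallel.
durations = {"boil_water": 5, "chop_veg": 7, "cook": 10}
deps = {"cook": ("boil_water", "chop_veg")}
print(min_completion_time(durations, deps))  # 17: chop (7), then cook (10)
```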
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or …
Geographic Adaptation of Pretrained Language Models
While pretrained language models (PLMs) have been shown to possess a plethora of linguistic knowledge, the existing body of research has largely neglected extralinguistic knowledge, which is generally difficult to obtain by pretraining on …
Paloma: A Benchmark for Evaluating Language Model Fit
Evaluations of language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains--varying distributions of language. We introduce Perplexity Analysis for …
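The shift Paloma argues for can be stated in a few lines: report perplexity, i.e. exp of the mean per-token negative log likelihood, separately for each domain rather than on one monolithic held-out split. A minimal bookkeeping sketch; domain names and numbers are illustrative:

```python
import math
from collections import defaultdict

def perplexity_by_domain(records):
    """records: (domain, token_count, total_negative_log_likelihood) tuples.

    Perplexity is exp of the average per-token negative log likelihood,
    aggregated and reported per domain instead of over one mixed split.
    """
    nll = defaultdict(float)
    tokens = defaultdict(int)
    for domain, n_tokens, total_nll in records:
        nll[domain] += total_nll
        tokens[domain] += n_tokens
    return {d: math.exp(nll[d] / tokens[d]) for d in nll}

# Illustrative numbers only.
records = [("reddit", 1000, 3200.0), ("wikipedia", 800, 2000.0)]
print(perplexity_by_domain(records))  # {'reddit': ~24.5, 'wikipedia': ~12.2}
```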
Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model
Leonie Weissweiler, Valentin Hofmann, Anjali Kantharuban, Anna Cai, Ritam Dutt, Amey Hengle, Anubha Kabra, Atharva Kulkarni, Abhishek Vijayakumar, Haofei Yu, Hinrich Schuetze, Kemal Oflazer, David Mortensen. Proceedings of the 2023 Confere…
Large language models (LLMs) have recently reached an impressive level of linguistic capability, prompting comparisons with human language skills. However, there have been relatively few systematic inquiries into the linguistic capabilitie…
Explaining pretrained language models' understanding of linguistic structures using construction grammar
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasizing the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of languag…
Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology
We propose a fully unsupervised method to detect bias in contextualized embeddings. The method leverages the assortative information latently encoded by social networks and combines orthogonality regularization, structured sparsity learnin…
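Of the ingredients the abstract names, orthogonality regularization is the easiest to illustrate in isolation. A generic sketch, not the paper's full method: penalize a learned projection W for deviating from orthonormal columns by measuring how far W^T W is from the identity.

```python
import torch

def orthogonality_penalty(W):
    # Orthogonality regularization: push the columns of a learned
    # projection toward an orthonormal set by penalizing the Frobenius
    # distance of the Gram matrix W^T W from the identity.
    k = W.shape[1]
    gram = W.T @ W
    return torch.norm(gram - torch.eye(k)) ** 2

# Illustrative: an 8-dimensional projection over 768-d embeddings.
W = torch.randn(768, 8, requires_grad=True)
loss = orthogonality_penalty(W)
loss.backward()  # the penalty would be added to a task loss in practice
print(loss.item())
```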
The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of languag…
The Reddit Politosphere: A Large-Scale Text and Network Resource of Online Political Discourse
We introduce the Reddit Politosphere, a large-scale resource of online political discourse covering more than 600 political discussion groups over a period of 12 years. It is to the best of our knowledge the largest and ideologically most …
CaMEL: Case Marker Extraction without Labels
We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages. We propose a first model for CaMEL that uses a massively multiling…
Modeling Ideological Salience and Framing in Polarized Online Groups with Graph Neural Networks and Structured Sparsity
The increasing polarization of online political discourse calls for computational tools that automatically detect and monitor ideological divides in social media. We introduce a minimally supervised method that leverages the network struct…
An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers
We introduce FLOTA (Few Longest Token Approximation), a simple yet effective method to improve the tokenization of pretrained language models (PLMs). FLOTA uses the vocabulary of a standard tokenizer but tries to preserve the morphological…
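A simplified sketch of the longest-token idea, not the exact FLOTA algorithm from the paper: repeatedly carve out the longest vocabulary substring so that morphologically meaningful pieces survive where a strict left-to-right pass might split them badly. The vocabulary and the recursion cap below are toy choices.

```python
def flota_like_tokenize(word, vocab, max_pieces=4):
    """Greedy sketch in the spirit of FLOTA (not the paper's exact
    algorithm): take the longest vocabulary substring first, then
    recurse on the leftover prefix and suffix."""
    if not word:
        return []
    if max_pieces == 1:
        return [word]
    # Find the longest substring of `word` present in the vocabulary.
    for length in range(len(word), 0, -1):
        for start in range(len(word) - length + 1):
            piece = word[start:start + length]
            if piece in vocab:
                left = flota_like_tokenize(word[:start], vocab, max_pieces - 1)
                right = flota_like_tokenize(word[start + length:], vocab, max_pieces - 1)
                return left + [piece] + right
    return [word]  # no vocabulary piece found: keep the word whole

# Toy vocabulary; a left-to-right tokenizer might instead split off
# 'und' or another morphologically meaningless prefix.
vocab = {"un", "desir", "desirable", "able"}
print(flota_like_tokenize("undesirable", vocab))  # ['un', 'desirable']
```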