Valentin Hofmann
Large Language Models Discriminate Against Speakers of German Dialects
Dialects represent a significant component of human culture and are found across all regions of the world. In Germany, more than 40% of the population speaks a regional dialect (Adler and Hansen, 2022). However, despite cultural importance…
Fluid Language Model Benchmarking
Language model (LM) benchmarking faces several challenges: comprehensive evaluations are costly, benchmarks often fail to measure the intended capabilities, and evaluation quality can degrade due to labeling errors and benchmark saturation…
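Fluid benchmarking builds on item response theory. As a rough sketch of the adaptive idea, not the paper's implementation: under a two-parameter logistic (2PL) model, each item has a discrimination a and a difficulty b, and evaluation repeatedly administers the item with the highest Fisher information at the current ability estimate. All item parameters and numbers below are illustrative.

```python
import math

def p_correct(theta, a, b):
    # Two-parameter logistic (2PL) item response model: probability
    # that a model with ability theta answers the item correctly.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def most_informative_item(theta, items):
    # Fisher information of a 2PL item is a^2 * p * (1 - p);
    # adaptive testing asks the item that maximizes it at the
    # current ability estimate.
    def information(item):
        a, b = item
        p = p_correct(theta, a, b)
        return a * a * p * (1.0 - p)
    return max(items, key=information)

# Illustrative (discrimination, difficulty) pairs.
items = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.4), (1.0, 2.0)]
print(most_informative_item(0.3, items))  # (1.5, 0.4): near theta, high discrimination
```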
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
Developing large language models is expensive and involves making decisions with small experiments, typically by evaluating on large, multi-task evaluation suites. In this work, we analyze specific properties which make a benchmark more re…
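One way to make the framework concrete, as a hedged sketch rather than the paper's exact estimator: treat signal as how widely a benchmark separates final model scores, treat noise as how much a single model's score wobbles across late training checkpoints, and prefer benchmarks with a high ratio of the two. All numbers below are illustrative.

```python
import numpy as np

def signal_to_noise(final_scores, checkpoint_scores):
    """Illustrative signal-to-noise ratio for a benchmark.

    final_scores: one score per model (signal = how well the
        benchmark spreads models apart).
    checkpoint_scores: one model's scores across late training
        checkpoints (noise = step-to-step wobble).
    """
    signal = np.max(final_scores) - np.min(final_scores)  # spread across models
    noise = np.std(checkpoint_scores)                     # checkpoint variability
    return signal / noise

# Example: a benchmark that separates four models well and is stable.
models = [0.41, 0.48, 0.55, 0.62]           # final scores of four models
checkpoints = [0.547, 0.552, 0.549, 0.551]  # one model's last checkpoints
print(signal_to_noise(models, checkpoints))
```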
Derivational morphology reveals analogical generalization in large language models
What mechanisms underlie linguistic generalization in large language models (LLMs)? This question has attracted considerable attention, with most studies analyzing the extent to which the language skills of LLMs resemble rules. As of yet, …
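The contrast the paper draws is between rule-like and analogical generalization. As a toy sketch of the analogical alternative only, with a deliberately crude similarity measure and an invented three-word lexicon: a novel adjective takes the nominalizing suffix of its most similar stored neighbor rather than the output of a categorical rule.

```python
def analogical_suffix(novel_adj, lexicon):
    # Analogical generalization, schematically: pick the suffix used
    # by the most similar stored adjective. Similarity here is a toy
    # shared-suffix length, not a real phonological or embedding metric.
    def similarity(a, b):
        n = 0
        while n < min(len(a), len(b)) and a[-(n + 1)] == b[-(n + 1)]:
            n += 1
        return n
    nearest = max(lexicon, key=lambda w: similarity(novel_adj, w))
    return lexicon[nearest]

# Invented lexicon mapping adjectives to their nominalizing suffix.
lexicon = {"productive": "ity", "selective": "ity", "happy": "ness"}
print(analogical_suffix("fictive", lexicon))  # 'ity', by analogy to '-ive' words
```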
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance
Large language models (LLMs) are helping millions of users write texts about diverse issues, and in doing so expose users to different ideas and perspectives. This creates concerns about issue bias, where an LLM tends to present just one p…
Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Language is not monolithic. While benchmarks, including those designed for multiple languages, are often used as proxies to evaluate the performance of Large Language Models (LLMs), they tend to overlook the nuances of within-language vari…
AI generates covertly racist decisions about people based on their dialect
Hundreds of millions of people now interact with language models, with uses ranging from help with writing to informing hiring decisions. However, these language models are known to perpetuate systematic racial prejudices, making th…
Dialect prejudice predicts AI decisions about people's character, employability, and criminality
Hundreds of millions of people now interact with language models, with uses ranging from serving as a writing aid to informing hiring decisions. Yet these language models are known to perpetuate systematic racial prejudices, making their j…
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Much recent work seeks to evaluate values and opinions in large language models (LLMs) using multiple-choice surveys and questionnaires. Most of this work is motivated by concerns around real-world LLM applications. For example, politicall…
Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Planning is a fundamental property of human intelligence. Reasoning about asynchronous plans is challenging since it requires sequential and parallel planning to optimize time costs. Can large language models (LLMs) succeed at this task? H…
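The time-optimal part of this task is essentially a critical-path computation: independent steps run in parallel, and a step can start only once all of its prerequisites have finished. A minimal sketch with invented task names and durations:

```python
from functools import lru_cache

def min_completion_time(durations, deps):
    """Earliest finish time of an asynchronous plan: steps with no
    dependency run in parallel, and each step starts once all its
    dependencies finish (longest path in the dependency DAG)."""
    @lru_cache(maxsize=None)
    def finish(step):
        start = max((finish(d) for d in deps.get(step, ())), default=0)
        return start + durations[step]
    return max(finish(s) for s in durations)

# Illustrative plan: boiling and chopping happen in parallel.
durations = {"boil_water": 5, "chop_veg": 7, "cook": 10}
deps = {"cook": ("boil_water", "chop_veg")}
print(min_completion_time(durations, deps))  # 17: chop (7), then cook (10)
```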
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or …
Geographic Adaptation of Pretrained Language Models
While pretrained language models (PLMs) have been shown to possess a plethora of linguistic knowledge, the existing body of research has largely neglected extralinguistic knowledge, which is generally difficult to obtain by pretraining on …
Paloma: A Benchmark for Evaluating Language Model Fit
Evaluations of language models (LMs) commonly report perplexity on monolithic data held out from training. Implicitly or explicitly, this data is composed of domains--varying distributions of language. We introduce Perplexity Analysis for …
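The shift Paloma argues for can be stated in a few lines: report perplexity, i.e. exp of the mean per-token negative log likelihood, separately for each domain rather than on one monolithic held-out split. A minimal bookkeeping sketch; domain names and numbers are illustrative:

```python
import math
from collections import defaultdict

def perplexity_by_domain(records):
    """records: (domain, token_count, total_negative_log_likelihood) tuples.

    Perplexity is exp of the average per-token negative log likelihood,
    aggregated and reported per domain instead of over one mixed split.
    """
    nll = defaultdict(float)
    tokens = defaultdict(int)
    for domain, n_tokens, total_nll in records:
        nll[domain] += total_nll
        tokens[domain] += n_tokens
    return {d: math.exp(nll[d] / tokens[d]) for d in nll}

# Illustrative numbers only.
records = [("reddit", 1000, 3200.0), ("wikipedia", 800, 2000.0)]
print(perplexity_by_domain(records))  # {'reddit': ~24.5, 'wikipedia': ~12.2}
```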
Counting the Bugs in ChatGPT's Wugs: A Multilingual Investigation into the Morphological Capabilities of a Large Language Model
Leonie Weissweiler, Valentin Hofmann, Anjali Kantharuban, Anna Cai, Ritam Dutt, Amey Hengle, Anubha Kabra, Atharva Kulkarni, Abhishek Vijayakumar, Haofei Yu, Hinrich Schuetze, Kemal Oflazer, David Mortensen. Proceedings of the 2023 Confere…
Large language models (LLMs) have recently reached an impressive level of linguistic capability, prompting comparisons with human language skills. However, there have been relatively few systematic inquiries into the linguistic capabilitie…
Explaining pretrained language models' understanding of linguistic structures using construction grammar
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasizing the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of languag…
Unsupervised Detection of Contextualized Embedding Bias with Application to Ideology
We propose a fully unsupervised method to detect bias in contextualized embeddings. The method leverages the assortative information latently encoded by social networks and combines orthogonality regularization, structured sparsity learnin…
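Of the ingredients the abstract names, orthogonality regularization is the easiest to illustrate in isolation. A generic sketch, not the paper's full method: penalize a learned projection W for deviating from orthonormal columns by measuring how far W^T W is from the identity.

```python
import torch

def orthogonality_penalty(W):
    # Orthogonality regularization: push the columns of a learned
    # projection toward an orthonormal set by penalizing the Frobenius
    # distance of the Gram matrix W^T W from the identity.
    k = W.shape[1]
    gram = W.T @ W
    return torch.norm(gram - torch.eye(k)) ** 2

# Illustrative: an 8-dimensional projection over 768-d embeddings.
W = torch.randn(768, 8, requires_grad=True)
loss = orthogonality_penalty(W)
loss.backward()  # the penalty would be added to a task loss in practice
print(loss.item())
```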
The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasising the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of languag…
The Reddit Politosphere: A Large-Scale Text and Network Resource of Online Political Discourse
We introduce the Reddit Politosphere, a large-scale resource of online political discourse covering more than 600 political discussion groups over a period of 12 years. It is to the best of our knowledge the largest and ideologically most …
CaMEL: Case Marker Extraction without Labels
We introduce CaMEL (Case Marker Extraction without Labels), a novel and challenging task in computational morphology that is especially relevant for low-resource languages. We propose a first model for CaMEL that uses a massively multiling…
Modeling Ideological Salience and Framing in Polarized Online Groups with Graph Neural Networks and Structured Sparsity
The increasing polarization of online political discourse calls for computational tools that automatically detect and monitor ideological divides in social media. We introduce a minimally supervised method that leverages the network struct…
An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers
We introduce FLOTA (Few Longest Token Approximation), a simple yet effective method to improve the tokenization of pretrained language models (PLMs). FLOTA uses the vocabulary of a standard tokenizer but tries to preserve the morphological…
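A simplified sketch of the longest-token idea, not the exact FLOTA algorithm from the paper: repeatedly carve out the longest vocabulary substring so that morphologically meaningful pieces survive where a strict left-to-right pass might split them badly. The vocabulary and the recursion cap below are toy choices.

```python
def flota_like_tokenize(word, vocab, max_pieces=4):
    """Greedy sketch in the spirit of FLOTA (not the paper's exact
    algorithm): take the longest vocabulary substring first, then
    recurse on the leftover prefix and suffix."""
    if not word:
        return []
    if max_pieces == 1:
        return [word]
    # Find the longest substring of `word` present in the vocabulary.
    for length in range(len(word), 0, -1):
        for start in range(len(word) - length + 1):
            piece = word[start:start + length]
            if piece in vocab:
                left = flota_like_tokenize(word[:start], vocab, max_pieces - 1)
                right = flota_like_tokenize(word[start + length:], vocab, max_pieces - 1)
                return left + [piece] + right
    return [word]  # no vocabulary piece found: keep the word whole

# Toy vocabulary; a left-to-right tokenizer might instead split off
# 'und' or another morphologically meaningless prefix.
vocab = {"un", "desir", "desirable", "able"}
print(flota_like_tokenize("undesirable", vocab))  # ['un', 'desirable']
```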