Massimo Poesio
LeWiDi-2025 at NLPerspectives: The Third Edition of the Learning with Disagreements Shared Task
Many researchers have reached the conclusion that AI models should be trained to be aware of the possibility of variation and disagreement in human judgments, and evaluated as per their ability to recognize such variation. The LEWIDI serie…
Can LLMs Detect Ambiguous Plural Reference? An Analysis of Split-Antecedent and Mereological Reference
Our goal is to study how LLMs represent and interpret plural reference in ambiguous and unambiguous contexts. We ask the following research questions: (1) Do LLMs exhibit human-like preferences in representing plural reference? (2) Are LLM…
Improving LLMs' Learning for Coreference Resolution
Coreference Resolution (CR) is crucial for many NLP tasks, but existing LLMs struggle with hallucination and under-performance. In this paper, we investigate the limitations of existing LLM-based approaches to CR, specifically the Question-…
Automated Detection of Referential Features in Schizophrenic Speech Using Large Language Models
Cross-linguistic studies have demonstrated that individuals with schizophrenia—particularly those exhibiting formal thought disorder (FTD)—show distinctive distributions of noun phrases (NPs) in spontaneous speech. NPs (e.g., the picture; …
ERP Signatures of Prediction Error in the Comprehension of Ambiguous Plural Pronouns in German
The referent of a plural pronoun can be ambiguous, posing comprehension challenges. Behavioural studies show an ambiguity advantage, with reduced processing cost for an ambiguous pronoun. This raises the question: “How does ERP evidence fo…
Talking-to-Build: How LLM-Assisted Interface Shapes Player Performance and Experience in Minecraft
With large language models (LLMs) on the rise, in-game interactions are shifting from rigid commands to natural conversations. However, the impacts of LLMs on player performance and game experience remain underexplored. This work explores …
Assessing the Reliability of LLMs Annotations in the Context of Demographic Bias and Model Explanation
Understanding the sources of variability in annotations is crucial for developing fair NLP systems, especially for tasks like sexism detection where demographic bias is a concern. This study investigates the extent to which annotator demog…
Referential ambiguity and clarification requests: comparing human and LLM behaviour
In this work we examine LLMs' ability to ask clarification questions in task-oriented dialogues that follow the asynchronous instruction-giver/instruction-follower format. We present a new corpus that combines two existing annotations of t…
MDC-R: The Minecraft Dialogue Corpus with Reference
We introduce the Minecraft Dialogue Corpus with Reference (MDC-R). MDC-R is a new language resource that supplements the original Minecraft Dialogue Corpus (MDC) with expert annotations of anaphoric and deictic reference. MDC's task-orient…
Ancient Script Image Recognition and Processing: A Review
Ancient scripts, e.g., Egyptian hieroglyphs, Oracle Bone Inscriptions, and Ancient Greek inscriptions, serve as vital carriers of human civilization, embedding invaluable historical and cultural information. Automating ancient script image…
Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains
With the growth of the Internet, buying habits have changed, and customers have become more dependent on the online opinions of other customers to guide their purchases. Identifying fake reviews thus became an important area for Natural La…
Family lexicon: Using language models to encode memories of personally familiar and famous people and places in the brain
Knowledge about personally familiar people and places is extremely rich and varied, involving pieces of semantic information connected in unpredictable ways through past autobiographical memories. In this work, we investigate whether we ca…
Understanding The Effect Of Temperature On Alignment With Human Opinions
With the increasing capabilities of LLMs, recent studies focus on understanding whose opinions are represented by them and how to effectively extract aligned opinion distributions. We conducted an empirical analysis of three straightforwar…
A Survey of Coreference and Zeros Resolution for Arabic
Coreference resolution is the task of resolving mentions that refer to the same entity into clusters. The area and its tasks are crucial in natural language processing applications. Extensive surveys of this task have been conducted for En…
ClarQ-LLM: A Benchmark for Models Clarifying and Requesting Information in Task-Oriented Dialog
We introduce ClarQ-LLM, an evaluation framework consisting of bilingual English-Chinese conversation tasks, conversational agents and evaluation metrics, designed to serve as a strong benchmark for assessing agents' ability to ask clarific…
A LLM Benchmark based on the Minecraft Builder Dialog Agent Task
In this work we propose adapting the Minecraft builder task into an LLM benchmark suitable for evaluating LLM ability in spatially orientated tasks, and informing builder agent design. Previous works have proposed corpora with varying co…
The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation
Large Language Models (LLMs) have emerged as powerful support tools across various natural language tasks and a range of application domains. Recent studies focus on exploring their capabilities for data annotation. This paper provides a c…
Integrating knowledge bases to improve coreference and bridging resolution for the chemical domain
Resolving coreference and bridging relations in chemical patents is important for better understanding the precise chemical process, where chemical domain knowledge is very critical. We proposed an approach incorporating external knowledge…
Extending Activation Steering to Broad Skills and Multiple Behaviours
Current large language models have dangerous capabilities, which are likely to become more problematic in the future. Activation steering techniques can be used to reduce risks from these capabilities. In this paper, we investigate the eff…
Large Language Models as Minecraft Agents
In this work we examine the use of Large Language Models (LLMs) in the challenging setting of acting as a Minecraft agent. We apply and evaluate LLMs in the builder and architect settings, introduce clarification questions and examining th…
Polysemy—Evidence from Linguistics, Behavioral Science, and Contextualized Language Models
Polysemy is the type of lexical ambiguity where a word has multiple distinct but related interpretations. In the past decade, it has been the subject of a great many studies across multiple disciplines including linguistics, psychology, ne…
Modeling Brain Representations of Words' Concreteness in Context Using GPT‐2 and Human Ratings
The meaning of most words in language depends on their context. Understanding how the human brain extracts contextualized meaning, and identifying where in the brain this takes place, remain important scientific challenges. But technologic…