Explanipedia

LeWiDi-2025 at NLPerspectives: The Third Edition of the Learning with Disagreements Shared Task

Elisa Leonardelli, Siyao Peng, Giulia Rizzi, Valerio Basile, Elisabetta Fersini, et al. · 2025

Many researchers have reached the conclusion that AI models should be trained to be aware of the possibility of variation and disagreement in human judgments, and evaluated as per their ability to recognize such variation. The LEWIDI serie…

Can LLMs Detect Ambiguous Plural Reference? An Analysis of Split-Antecedent and Mereological Reference

Dang Hoang Anh, Rick Nouwen, Massimo Poesio · 2025

Our goal is to study how LLMs represent and interpret plural reference in ambiguous and unambiguous contexts. We ask the following research questions: (1) Do LLMs exhibit human-like preferences in representing plural reference? (2) Are LLM…

Improving LLMs' Learning for Coreference Resolution

Yujian Gan, Yuan Liang, Yanni Lin, Juntao Yu, Massimo Poesio · 2025

Coreference Resolution (CR) is crucial for many NLP tasks, but existing LLMs struggle with hallucination and under-performance. In this paper, we investigate the limitations of existing LLM-based approaches to CR, specifically the Question-…

Automated Detection of Referential Features in Schizophrenic Speech Using Large Language Models

Derya Çokal, Melike Filizer, Martín Villalba, Douglas Turkington, Nicol Ferrier, et al. · 2025

Cross-linguistic studies have demonstrated that individuals with schizophrenia—particularly those exhibiting formal thought disorder (FTD)—show distinctive distributions of noun phrases (NPs) in spontaneous speech. NPs (e.g., the picture; …

ERP Signatures of Prediction Error in the Comprehension of Ambiguous Plural Pronouns in German

Derya Çokal, Moshe Phillip, Martín Villalba, Massimo Poesio, et al. · 2025

Computer science Psychology Philosophy

The referent of a plural pronoun can be ambiguous, posing comprehension challenges. Behavioural studies show an ambiguity advantage, with reduced processing cost for an ambiguous pronoun. This raises the question: “How does ERP evidence fo…

Talking-to-Build: How LLM-Assisted Interface Shapes Player Performance and Experience in Minecraft

Xin Sun, Lei Wang, Yue Li, Jie Li, Massimo Poesio, et al. · 2025

With large language models (LLMs) on the rise, in-game interactions are shifting from rigid commands to natural conversations. However, the impacts of LLMs on player performance and game experience remain underexplored. This work explores …

Assessing the Reliability of LLMs Annotations in the Context of Demographic Bias and Model Explanation

Hadi Mohammadi, Tina Shahedi, Pablo Mosteiro, Massimo Poesio, Ayoub Bagheri, et al. · 2025

Understanding the sources of variability in annotations is crucial for developing fair NLP systems, especially for tasks like sexism detection where demographic bias is a concern. This study investigates the extent to which annotator demog…

Referential ambiguity and clarification requests: comparing human and LLM behaviour

Chris Madge, Matthew Purver, Massimo Poesio · 2025

In this work we examine LLMs' ability to ask clarification questions in task-oriented dialogues that follow the asynchronous instruction-giver/instruction-follower format. We present a new corpus that combines two existing annotations of t…

MDC-R: The Minecraft Dialogue Corpus with Reference

Maris Camilleri, Jianbo Shao, Prashant Jayannavar, Massimo Poesio · 2025

We introduce the Minecraft Dialogue Corpus with Reference (MDC-R). MDC-R is a new language resource that supplements the original Minecraft Dialogue Corpus (MDC) with expert annotations of anaphoric and deictic reference. MDC's task-orient…

Ancient Script Image Recognition and Processing: A Review

Xiaolei Diao, Rite Bo, Yanling Xiao, Lida Shi, Zhihan Zhou, et al. · 2025

Ancient scripts, e.g., Egyptian hieroglyphs, Oracle Bone Inscriptions, and Ancient Greek inscriptions, serve as vital carriers of human civilization, embedding invaluable historical and cultural information. Automating ancient script image…

Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains

Ming Liu, Massimo Poesio · 2025

With the growth of the Internet, buying habits have changed, and customers have become more dependent on the online opinions of other customers to guide their purchases. Identifying fake reviews thus became an important area for Natural La…

Family lexicon: Using language models to encode memories of personally familiar and famous people and places in the brain

Andrea Bruera, Massimo Poesio · 2024

Psychology Computer science Philosophy

Knowledge about personally familiar people and places is extremely rich and varied, involving pieces of semantic information connected in unpredictable ways through past autobiographical memories. In this work, we investigate whether we ca…

Understanding The Effect Of Temperature On Alignment With Human Opinions

Maja Pavlovic, Massimo Poesio · 2024

Business Political science

With the increasing capabilities of LLMs, recent studies focus on understanding whose opinions are represented by them and how to effectively extract aligned opinion distributions. We conducted an empirical analysis of three straightforwar…

A Survey of Coreference and Zeros Resolution for Arabic

Abdulrahman Aloraini, Juntao Yu, Wateen Aliady, Massimo Poesio · 2024

Computer science Philosophy

Coreference resolution is the task of resolving mentions that refer to the same entity into clusters. The area and its tasks are crucial in natural language processing applications. Extensive surveys of this task have been conducted for En…

ClarQ-LLM: A Benchmark for Models Clarifying and Requesting Information in Task-Oriented Dialog

Yujian Gan, Changling Li, Jinxia Xie, Luou Wen, Matthew Purver, et al. · 2024

Computer science Psychology Engineering

We introduce ClarQ-LLM, an evaluation framework consisting of bilingual English-Chinese conversation tasks, conversational agents and evaluation metrics, designed to serve as a strong benchmark for assessing agents' ability to ask clarific…

A LLM Benchmark based on the Minecraft Builder Dialog Agent Task

Chris Madge, Massimo Poesio · 2024

Computer science Engineering Geography

In this work we propose adapting the Minecraft builder task into an LLM benchmark suitable for evaluating LLM ability in spatially orientated tasks, and informing builder agent design. Previous works have proposed corpora with varying co…

The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation

Maja Pavlovic, Massimo Poesio · 2024

Computer science Political science

Large Language Models (LLMs) have emerged as powerful support tools across various natural language tasks and a range of application domains. Recent studies focus on exploring their capabilities for data annotation. This paper provides a c…

Integrating knowledge bases to improve coreference and bridging resolution for the chemical domain

Pengcheng Lu, Massimo Poesio · 2024

Computer science Mathematics

Resolving coreference and bridging relations in chemical patents is important for better understanding the precise chemical process, where chemical domain knowledge is very critical. We proposed an approach incorporating external knowledge…

Extending Activation Steering to Broad Skills and Multiple Behaviours

Teun van der Weij, Massimo Poesio, Nandi Schoots · 2024

Psychology

Current large language models have dangerous capabilities, which are likely to become more problematic in the future. Activation steering techniques can be used to reduce risks from these capabilities. In this paper, we investigate the eff…

Large Language Models as Minecraft Agents

Chris Madge, Massimo Poesio · 2024

Computer science Philosophy

In this work we examine the use of Large Language Models (LLMs) in the challenging setting of acting as a Minecraft agent. We apply and evaluate LLMs in the builder and architect settings, introduce clarification questions and examine th…

Polysemy—Evidence from Linguistics, Behavioral Science, and Contextualized Language Models

Janosch Haber, Massimo Poesio · 2023

Computer science Psychology Philosophy

Polysemy is the type of lexical ambiguity where a word has multiple distinct but related interpretations. In the past decade, it has been the subject of a great many studies across multiple disciplines including linguistics, psychology, ne…

Modeling Brain Representations of Words' Concreteness in Context Using GPT‐2 and Human Ratings

Andrea Bruera, Yuan Tao, Andrew J. Anderson, Derya Çokal, Janosch Haber, et al. · 2023

Psychology Computer science Philosophy

The meaning of most words in language depends on their context. Understanding how the human brain extracts contextualized meaning, and identifying where in the brain this takes place, remain important scientific challenges. But technologic…
