Oliver Lemon
Improving Cooperation in Collaborative Embodied AI
The integration of Large Language Models (LLMs) into multiagent systems has opened new possibilities for collaborative reasoning and cooperation with AI agents. This paper explores different prompting methods and evaluates their effectiven…
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Interaction between learner and feedback-giver has come into focus recently for post-training of Large Language Models (LLMs), through the use of reward models that judge the appropriateness of a model's response. In this paper, we investi…
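The visible abstract describes post-training driven by a feedback-giver: a reward model judges the appropriateness of a learner's responses. A minimal sketch of such an interaction loop is below; `learner_respond` and `reward_model_score` are hypothetical stand-ins, not the paper's actual components.

```python
# Minimal learner/feedback-giver loop: roll out responses, attach a scalar
# reward, and collect (prompt, response, reward) triples of the kind that
# post-training methods such as rejection sampling or RLHF consume.
from dataclasses import dataclass

@dataclass
class Turn:
    prompt: str
    response: str
    reward: float

def learner_respond(prompt: str) -> str:
    # Placeholder for a call to the learner LLM.
    return f"(response to: {prompt})"

def reward_model_score(prompt: str, response: str) -> float:
    # Placeholder for a learned reward model judging appropriateness.
    return float(len(response) > 0)

def collect_interaction_data(prompts: list[str]) -> list[Turn]:
    """One interaction episode per prompt, scored by the reward model."""
    turns = []
    for prompt in prompts:
        response = learner_respond(prompt)
        reward = reward_model_score(prompt, response)
        turns.append(Turn(prompt, response, reward))
    return turns

if __name__ == "__main__":
    for t in collect_interaction_data(["Name a primary colour."]):
        print(t)
```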
NLP Verification: Towards a General Methodology for Certifying Robustness
Machine learning has exhibited substantial success in the field of natural language processing (NLP). For example, large language models have empirically proven to be capable of producing text of high complexity and cohesion. However, at t…
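Certifying robustness typically means proving that no perturbation within some bound can flip a classifier's prediction. The toy sketch below uses interval bound propagation (IBP), one standard certification technique, on a random two-layer ReLU network over an embedding; it illustrates the shape of such a certificate, not the methodology the paper proposes.

```python
# Toy interval bound propagation (IBP) over a two-layer ReLU classifier on a
# sentence embedding. Weights are random: this shows what a robustness
# certificate checks, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), np.zeros(16)
W2, b2 = rng.normal(size=(2, 16)), np.zeros(2)

def ibp_bounds(lo, hi, W, b):
    """Propagate elementwise input bounds [lo, hi] through y = W @ x + b."""
    center, radius = (lo + hi) / 2, (hi - lo) / 2
    c = W @ center + b
    r = np.abs(W) @ radius
    return c - r, c + r

def certified_class(x, eps):
    """Return the predicted class if it provably cannot change under any
    perturbation with ||delta||_inf <= eps, else None (two-class case)."""
    lo, hi = x - eps, x + eps
    lo, hi = ibp_bounds(lo, hi, W1, b1)
    lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)   # ReLU is monotone
    lo, hi = ibp_bounds(lo, hi, W2, b2)
    pred = int(np.argmax(W2 @ np.maximum(W1 @ x + b1, 0) + b2))
    other = 1 - pred
    # Certified iff the worst-case score of `pred` still beats the
    # best-case score of the other class.
    return pred if lo[pred] > hi[other] else None

x = rng.normal(size=8)
print(certified_class(x, eps=0.01))  # class index, or None if not certifiable
```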
Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests
We examine three evaluation paradigms: standard benchmarks (e.g., MMLU and BBH), interactive games (e.g., Signalling Games or Taboo), and cognitive tests (e.g., for working memory or theory of mind). First, we investigate which of the form…
Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling
This study explores replacing Transformers in Visual Language Models (VLMs) with Mamba, a recent structured state space model (SSM) that demonstrates promising performance in sequence modeling. We test models up to 3B parameters under cont…
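The core architectural swap being compared can be illustrated by feeding one multimodal token sequence to interchangeable sequence mixers. In the toy PyTorch sketch below, `LinearRecurrence` is a crude gated-recurrence stand-in for an SSM such as Mamba, not an implementation of it, and neither module is the paper's model.

```python
# The same multimodal sequence (visual prefix + text embeddings) is mixed
# either by a Transformer layer or by a recurrent stand-in for an SSM.
import torch
import torch.nn as nn

class LinearRecurrence(nn.Module):
    """h_t = sigmoid(a) * h_{t-1} + x_t, applied channelwise: a minimal
    state-space-flavoured mixer with O(length) sequential cost."""
    def __init__(self, dim):
        super().__init__()
        self.a = nn.Parameter(torch.zeros(dim))
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                  # x: (batch, seq, dim)
        decay = torch.sigmoid(self.a)
        h = torch.zeros_like(x[:, 0])
        outs = []
        for t in range(x.size(1)):
            h = decay * h + x[:, t]
            outs.append(h)
        return self.proj(torch.stack(outs, dim=1))

def run_vlm_mixer(visual_prefix, text_emb, mixer):
    """Prepend visual tokens to text tokens and mix the joint sequence."""
    seq = torch.cat([visual_prefix, text_emb], dim=1)
    return mixer(seq)

dim = 32
visual = torch.randn(2, 8, dim)            # e.g. 8 visual prompt tokens
text = torch.randn(2, 20, dim)             # 20 text-token embeddings
transformer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
ssm_like = LinearRecurrence(dim)
print(run_vlm_mixer(visual, text, transformer).shape)  # (2, 28, 32)
print(run_vlm_mixer(visual, text, ssm_like).shape)     # (2, 28, 32)
```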
AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding
AI personal assistants deployed via robots or wearables require embodied understanding to collaborate with humans effectively. However, current Vision-Language Models (VLMs) primarily focus on third-person view videos, neglecting the richn…
Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers
An effective method for combining frozen large language models (LLMs) and visual encoders involves a resampler module that creates a 'visual prompt' which is provided to the LLM, along with the textual prompt. While this approach has enable…
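The resampler setup the abstract describes can be sketched as a set of learned queries cross-attending to frozen visual features, with the output prepended to the text embeddings. The Perceiver-style toy below is illustrative only, not the pretrained resamplers the paper probes.

```python
# Minimal Perceiver-style resampler: learned queries cross-attend to frozen
# image features; the result becomes a fixed-length "visual prompt" that is
# prepended to the text embeddings before they reach the frozen LLM.
import torch
import torch.nn as nn

class Resampler(nn.Module):
    def __init__(self, dim, num_queries=8, num_heads=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, visual_feats):       # (batch, num_patches, dim)
        q = self.queries.expand(visual_feats.size(0), -1, -1)
        visual_prompt, _ = self.attn(q, visual_feats, visual_feats)
        return visual_prompt                # (batch, num_queries, dim)

dim = 64
resampler = Resampler(dim)
visual_feats = torch.randn(2, 196, dim)    # frozen ViT patch features
text_emb = torch.randn(2, 12, dim)         # text-token embeddings
llm_input = torch.cat([resampler(visual_feats), text_emb], dim=1)
print(llm_input.shape)                     # (2, 20, 64), fed to the frozen LLM
```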
Visually Grounded Language Learning: A Review of Language Games, Datasets, Tasks, and Models
In recent years, several machine learning models have been proposed. They are trained with a language modelling objective on large-scale text-only data. With such pretraining, they can achieve impressive results on many Natural Language Un…
RECANTFormer: Referring Expression Comprehension with Varying Numbers of Targets
The Generalized Referring Expression Comprehension (GREC) task extends classic REC by generating image bounding boxes for objects referred to in natural language expressions, which may indicate zero, one, or multiple targets. This generali…
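Handling a variable number of targets changes the evaluation logic: a prediction must be scored against zero, one, or many gold boxes. The sketch below uses greedy IoU matching and treats the zero-target case explicitly; it is generic illustration, not the official GREC metric.

```python
# Greedy IoU matching of predicted boxes against a variable-size gold set,
# with the zero-target case handled explicitly.

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def match_boxes(pred, gold, thresh=0.5):
    """Return (true positives, false positives, false negatives).
    An expression with no referent is correct iff `pred` is empty."""
    if not gold:
        return (0, len(pred), 0)
    unmatched = list(gold)
    tp = 0
    for p in pred:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= thresh:
            unmatched.remove(best)
            tp += 1
    return (tp, len(pred) - tp, len(unmatched))

print(match_boxes([(0, 0, 10, 10)], [(1, 1, 10, 10)]))  # (1, 0, 0)
print(match_boxes([(0, 0, 10, 10)], []))                # zero targets: (0, 1, 0)
```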
Keynote presentation - Conversations with robots and AIs - Can foundation models support human wellbeing?
Drawing on examples from several research projects at the National Robotarium and Alana AI, including SPRING (social robots for elder care), RES-Q+ (spoken dialogue to support stroke patients), and RNIB (assistive visual dialogue for parti…
Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models, including 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation. To tackle th…
Georgios Pantazopoulos, Malvina Nikandrou, Amit Parekh, Bhathiya Hemanthage, Arash Eshghi, Ioannis Konstas, Verena Rieser, Oliver Lemon, Alessandro Suglia. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Process…
Detecting Agreement in Multi-party Conversational AI
Today, conversational systems are expected to handle conversations in multi-party settings, especially within Socially Assistive Robots (SARs). However, practical usability remains difficult as there are additional challenges to overcome, …
Detecting agreement in multi-party dialogue: evaluating speaker diarisation versus a procedural baseline to enhance user engagement
Conversational agents participating in multi-party interactions face significant challenges in dialogue state tracking, since the identity of the speaker adds significant contextual meaning. It is common to utilise diarisation models to id…
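The glue step between a diarisation model and transcription can be sketched as assigning each ASR segment the speaker whose diarised turn overlaps it most in time, so the dialogue state tracker knows who said each utterance. The data structures below are hypothetical, not the paper's pipeline.

```python
# Attribute speakers to ASR segments by maximal temporal overlap with
# diarised speaker turns.
from dataclasses import dataclass

@dataclass
class Segment:
    start: float
    end: float
    text: str = ""
    speaker: str = ""

def overlap(a: Segment, b: Segment) -> float:
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))

def attribute_speakers(asr: list[Segment], diarised: list[Segment]) -> list[Segment]:
    """Label each ASR segment with the most-overlapping diarised speaker."""
    for seg in asr:
        best = max(diarised, key=lambda d: overlap(seg, d), default=None)
        seg.speaker = best.speaker if best and overlap(seg, best) > 0 else "unknown"
    return asr

asr = [Segment(0.0, 2.1, "I'd like an appointment"), Segment(2.3, 3.0, "Me too")]
turns = [Segment(0.0, 2.2, speaker="patient"), Segment(2.2, 3.5, speaker="companion")]
for seg in attribute_speakers(asr, turns):
    print(seg.speaker, "->", seg.text)
```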
Building for Speech: Designing the Next Generation of Social Robots for Audio Interaction
There have been incredible advancements in robotics and spoken dialogue systems (SDSs) over the past few years, yet we still don't find social robots in public spaces like train stations, shopping malls, or hospital waiting rooms. In this …
FurChat: An Embodied Conversational Agent using LLMs, Combining Open and Closed-Domain Dialogue with Facial Expressions
We demonstrate an embodied conversational agent that can function as a receptionist and generate a mixture of open and closed-domain dialogue along with facial expressions, by using a large language model (LLM) to develop an engaging conve…
Neeraj Cherakara, Finny Varghese, Sheena Shabana, Nivan Nelson, Abhiram Karukayil, Rohith Kulothungan, Mohammed Afil Farhan, Birthe Nesset, Meriam Moujahid, Tanvi Dinkar, Verena Rieser, Oliver Lemon. Proceedings of the 24th Meeting of the …
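One simple way to pair generated dialogue with facial expressions is to have the LLM emit an inline expression tag that is stripped before speech synthesis. The tag convention below is an assumption for illustration, not FurChat's actual output protocol.

```python
# Split an LLM reply like "[smile] Hello there!" into an expression tag for
# the robot's face controller and an utterance for text-to-speech. The tag
# format is a hypothetical convention, not FurChat's real protocol.
import re

KNOWN_EXPRESSIONS = {"smile", "neutral", "surprise", "sad"}

def parse_reply(raw: str) -> tuple[str, str]:
    """Return (expression, utterance); fall back to a neutral face if the
    tag is missing or unknown."""
    m = re.match(r"\s*\[(\w+)\]\s*(.*)", raw, flags=re.DOTALL)
    if m and m.group(1).lower() in KNOWN_EXPRESSIONS:
        return m.group(1).lower(), m.group(2)
    return "neutral", raw.strip()

expression, utterance = parse_reply("[smile] Welcome to the lab, how can I help?")
print(expression)   # -> sent to the face controller
print(utterance)    # -> sent to text-to-speech
```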
Multi-party Goal Tracking with LLMs: Comparing Pre-training, Fine-tuning, and Prompt Engineering
This paper evaluates the extent to which current Large Language Models (LLMs) can capture task-oriented multi-party conversations (MPCs). We have recorded and transcribed 29 MPCs between patients, their companions, and a social robot in a …
Angus Addlesee, Weronika Sieińska, Nancie Gunson, Daniel Hernandez Garcia, Christian Dondrup, Oliver Lemon. Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue. 2023.
SimpleMTOD: A Simple Language Model for Multimodal Task-Oriented Dialogue with Symbolic Scene Representation
SimpleMTOD is a simple language model which recasts several sub-tasks in multimodal task-oriented dialogues as sequence prediction tasks. SimpleMTOD is built on a large-scale transformer-based auto-regressive architecture, which has alread…
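Recasting sub-tasks as sequence prediction hinges on serialising the symbolic scene into the same token stream as the dialogue context. The sketch below invents a token format for illustration; it is not SimpleMTOD's actual scheme.

```python
# Serialise a symbolic scene (object indices, types, attributes) into a flat
# token string that an autoregressive LM can consume alongside the dialogue.
# The token format here is invented for illustration.

def serialise_scene(objects: list[dict]) -> str:
    """Turn [{'id': 3, 'type': 'jacket', 'color': 'red'}, ...] into
    '<SOO> <3> red jacket <EOO> ...' style tokens."""
    parts = []
    for obj in objects:
        attrs = " ".join(str(v) for k, v in sorted(obj.items()) if k != "id")
        parts.append(f"<SOO> <{obj['id']}> {attrs} <EOO>")
    return " ".join(parts)

scene = [
    {"id": 3, "type": "jacket", "color": "red"},
    {"id": 7, "type": "shelf", "color": "white"},
]
context = serialise_scene(scene) + " <USER> do you have that in blue?"
print(context)  # a single token stream: one input for sequence prediction
```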
Identifying Challenges and Opportunities for Intelligent Data-Driven Health Interfaces to Support Ongoing Care
This workshop will explore future work in the area of intelligent, conversational, data-driven health interfaces both from patients’ and health care professionals’ perspectives. We aim to bring together a diverse set of experts and stakeho…