Phil Blunsom
Aya Vision: Advancing the Frontier of Multilingual Multimodality
Building multimodal language models is fundamentally challenging: it requires aligning vision and language modalities, curating high-quality instruction data, and avoiding the degradation of existing text-only capabilities once vision is i…
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Complex multi-step reasoning tasks, such as solving mathematical problems, remain challenging for large language models (LLMs). While outcome supervision is commonly used, process supervision via process reward models (PRMs) provides inter…
Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
We introduce the Aya Expanse model family, a new generation of 8B and 32B parameter multilingual language models, aiming to address the critical challenge of developing highly performant multilingual models that match or surpass the cap…
Separations in the Representational Capabilities of Transformers and Recurrent Architectures
Transformer architectures have been widely adopted in foundation models. Due to their high inference costs, there is renewed interest in exploring the potential of efficient recurrent architectures (RNNs). In this paper, we analyze the dif…
Improving Reward Models with Synthetic Critiques
Reward models (RMs) play a critical role in aligning language models through the process of reinforcement learning from human feedback. RMs are trained to predict a score reflecting human preference, which requires significant time and cos…
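For context, reward models of this kind are commonly trained with a pairwise (Bradley-Terry-style) preference objective. The sketch below shows that standard formulation only; the paper's actual training setup with synthetic critiques may differ.

```latex
% Standard pairwise reward-model objective (shown for context; the paper's
% setup with synthetic critiques may differ). Given a prompt x, a preferred
% response y^+ and a rejected response y^-, the reward model r_\theta is
% trained to score y^+ above y^-:
\[
  \mathcal{L}(\theta) = -\,\mathbb{E}_{(x,\,y^{+},\,y^{-})}
  \left[ \log \sigma\!\left( r_\theta(x, y^{+}) - r_\theta(x, y^{-}) \right) \right]
\]
```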
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Recent breakthroughs in large language models (LLMs) have centered around a handful of data-rich languages. What does it take to broaden access to breakthroughs beyond first-class citizen languages? Our work introduces Aya, a massively mul…
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
In order to understand the in-context learning phenomenon, recent works have adopted a stylized experimental framework and demonstrated that Transformers can learn gradient-based learning algorithms for various classes of real-valued funct…
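As an illustration of this stylized framework, the hypothetical sketch below builds an in-context prompt of (input, label) pairs for a sparse Boolean target (a 2-bit parity) and asks for the label of a held-out query. The target function and prompt format are assumptions for illustration, not the paper's exact setup.

```python
# Hypothetical in-context learning setup for a discrete (Boolean) function:
# the model sees (input, label) demonstrations and must predict the label of
# a query input. The target here is a 2-sparse parity over 8-bit inputs.
import random

random.seed(0)
n_bits, n_examples = 8, 16
relevant = random.sample(range(n_bits), 2)   # the "sparse" support

def target(x):
    # parity over the two relevant coordinates; all other bits are irrelevant
    return (x[relevant[0]] + x[relevant[1]]) % 2

def sample_input():
    return [random.randint(0, 1) for _ in range(n_bits)]

# Serialize demonstrations into one text prompt, one (x, f(x)) pair per line.
demos = [sample_input() for _ in range(n_examples)]
prompt_lines = [f"{''.join(map(str, x))} -> {target(x)}" for x in demos]
query = sample_input()
prompt = "\n".join(prompt_lines) + f"\n{''.join(map(str, query))} -> "

print(prompt)                       # fed to the model; its next token is the prediction
print("ground truth:", target(query))
```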
Human Feedback is not Gold Standard
Human feedback has become the de facto standard for evaluating the performance of Large Language Models, and is increasingly being used as a training objective. However, it is not clear which properties of a generated output this single `p…
Structural Transfer Learning in NL-to-Bash Semantic Parsers
Large-scale pre-training has made progress in many fields of natural language processing, though little is understood about the design of pre-training datasets. We propose a methodology for obtaining a quantitative understanding of structu…
Intriguing Properties of Quantization at Scale
Emergent properties have been widely adopted as a term to describe behavior not present in smaller models but observed in larger models. Recent work suggests that the trade-off incurred by quantization is also an emergent property, with sh…
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
Vision-and-language (V&L) models pretrained on large-scale multimodal data have demonstrated strong performance on various tasks such as image captioning and visual question answering (VQA). The quality of such models is commonly assessed …
Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
Despite the widespread success of Transformers on NLP tasks, recent works have found that they struggle to model several formal languages when compared to recurrent models. This raises the question of why Transformers perform well in pract…
Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play
The task of context-dependent text-to-SQL aims to convert multi-turn user utterances to formal SQL queries. This is a challenging task due to both the scarcity of training data from which to learn complex contextual dependencies and to gen…
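To make the task concrete, here is a small hypothetical episode (not taken from the paper's data) showing how a follow-up turn depends on the earlier context: resolving "those" in the second utterance requires carrying the first turn's filters into the new SQL query.

```python
# A hypothetical multi-turn text-to-SQL episode illustrating contextual
# dependency: the gold SQL for turn 2 must inherit turn 1's constraints.
episode = [
    {
        "utterance": "Show me all flights from Denver to Boston.",
        "sql": "SELECT * FROM flights "
               "WHERE origin = 'Denver' AND destination = 'Boston';",
    },
    {
        "utterance": "Which of those leave before 10am?",
        "sql": "SELECT * FROM flights "
               "WHERE origin = 'Denver' AND destination = 'Boston' "
               "AND departure_time < '10:00';",
    },
]

for turn, t in enumerate(episode, 1):
    print(f"Turn {turn}: {t['utterance']}\n  {t['sql']}")
```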
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
Knowledge and language understanding of models evaluated through question answering (QA) have usually been studied on static snapshots of knowledge, such as Wikipedia. However, our world is dynamic, evolves over time, and our models' knowledge…
Revisiting the Compositional Generalization Abilities of Neural Sequence Models
Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionall…
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
We introduce Transformer Grammars (TGs), a novel class of Transformer language models that combine (i) the expressive power, scalability, and strong performance of Transformers and (ii) recursive syntactic compositions, which here are impl…
Relational Memory Augmented Language Models
We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation triples and retrieve relevant relations for a given context to improve text gener…
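A minimal sketch of the idea, assuming a toy knowledge graph stored as (head, relation, tail) triples and simple word-overlap retrieval; the paper's retriever and the way retrieved relations condition the language model are more involved.

```python
# Toy knowledge graph as (head, relation, tail) triples, with a simple
# word-overlap retriever that picks relations relevant to the current context.
triples = [
    ("Ada Lovelace", "occupation", "mathematician"),
    ("Ada Lovelace", "collaborated_with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
]

def retrieve(context, k=2):
    """Return the k triples whose entities overlap most with the context."""
    words = set(context.lower().split())
    def score(triple):
        head, _, tail = triple
        return (len(words & set(head.lower().split()))
                + len(words & set(tail.lower().split())))
    return sorted(triples, key=score, reverse=True)[:k]

context = "Ada Lovelace worked closely with"
relevant = retrieve(context)
# The retrieved triples would then be serialized and prepended to the
# language model's input to condition generation on the knowledge graph.
print(relevant)
```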
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Xiang Lorraine Li, Adhiguna Kuncoro, Jordan Hoffmann, Cyprien de Masson d'Autume, Phil Blunsom, Aida Nematzadeh. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.
A Systematic Investigation of Commonsense Understanding in Large Language Models
Large language models have shown impressive performance on many natural language processing (NLP) tasks in a zero-shot setting. We ask whether these models exhibit commonsense understanding -- a critical component of NLP applications -- by…
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Language models (LMs) trained on large amounts of data have shown impressive performance on many NLP tasks under the zero-shot and few-shot setup. Here we aim to better understand the extent to which such models learn commonsense knowledge…
Pretraining the Noisy Channel Model for Task-Oriented Dialogue
Direct decoding for task-oriented dialogue is known to suffer from the explaining-away effect, manifested in models that prefer short and generic responses. Here we argue for the use of Bayes' theorem to factorize the dialogue task into tw…
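The abstract is truncated, but the standard noisy-channel decomposition it alludes to factorizes the probability of a response r given the dialogue context c into a channel model and a response prior via Bayes' theorem. The sketch below shows that generic form, which may differ in detail from the paper's exact parameterization.

```latex
% Generic noisy-channel factorization for dialogue (a standard form, shown
% for context; the paper's exact decomposition may differ):
\[
  p(r \mid c) \;\propto\;
  \underbrace{p(c \mid r)}_{\text{channel model}}\;
  \underbrace{p(r)}_{\text{response prior / language model}}
\]
```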
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Our world is open-ended, non-stationary, and constantly evolving; thus what we talk about and how we talk about it change over time. This inherent dynamic nature of language contrasts with the current static language modelling paradigm, wh…