Phil Blunsom
Aya Vision: Advancing the Frontier of Multilingual Multimodality
Building multimodal language models is fundamentally challenging: it requires aligning vision and language modalities, curating high-quality instruction data, and avoiding the degradation of existing text-only capabilities once vision is i…
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Complex multi-step reasoning tasks, such as solving mathematical problems, remain challenging for large language models (LLMs). While outcome supervision is commonly used, process supervision via process reward models (PRMs) provides inter…
Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier
We introduce the Aya Expanse model family, a new generation of 8B and 32B parameter multilingual language models, aiming to address the critical challenge of developing highly performant multilingual models that match or surpass the cap…
Separations in the Representational Capabilities of Transformers and Recurrent Architectures
Transformer architectures have been widely adopted in foundation models. Due to their high inference costs, there is renewed interest in exploring the potential of efficient recurrent architectures (RNNs). In this paper, we analyze the dif…
Improving Reward Models with Synthetic Critiques
Reward models (RMs) play a critical role in aligning language models through the process of reinforcement learning from human feedback. RMs are trained to predict a score reflecting human preference, which requires significant time and cos…
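For context, reward models of this kind are commonly trained with a pairwise (Bradley-Terry-style) preference objective. The sketch below shows that standard formulation only; the paper's actual training setup with synthetic critiques may differ.

```latex
% Standard pairwise reward-model objective (shown for context; the paper's
% setup with synthetic critiques may differ). Given a prompt x, a preferred
% response y^+ and a rejected response y^-, the reward model r_\theta is
% trained to score y^+ above y^-:
\[
  \mathcal{L}(\theta) = -\,\mathbb{E}_{(x,\,y^{+},\,y^{-})}
  \left[ \log \sigma\!\left( r_\theta(x, y^{+}) - r_\theta(x, y^{-}) \right) \right]
\]
```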
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Recent breakthroughs in large language models (LLMs) have centered around a handful of data-rich languages. What does it take to broaden access to breakthroughs beyond first-class citizen languages? Our work introduces Aya, a massively mul…
Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
In order to understand the in-context learning phenomenon, recent works have adopted a stylized experimental framework and demonstrated that Transformers can learn gradient-based learning algorithms for various classes of real-valued funct…
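As an illustration of this stylized framework, the hypothetical sketch below builds an in-context prompt of (input, label) pairs for a sparse Boolean target (a 2-bit parity) and asks for the label of a held-out query. The target function and prompt format are assumptions for illustration, not the paper's exact setup.

```python
# Hypothetical in-context learning setup for a discrete (Boolean) function:
# the model sees (input, label) demonstrations and must predict the label of
# a query input. The target here is a 2-sparse parity over 8-bit inputs.
import random

random.seed(0)
n_bits, n_examples = 8, 16
relevant = random.sample(range(n_bits), 2)   # the "sparse" support

def target(x):
    # parity over the two relevant coordinates; all other bits are irrelevant
    return (x[relevant[0]] + x[relevant[1]]) % 2

def sample_input():
    return [random.randint(0, 1) for _ in range(n_bits)]

# Serialize demonstrations into one text prompt, one (x, f(x)) pair per line.
demos = [sample_input() for _ in range(n_examples)]
prompt_lines = [f"{''.join(map(str, x))} -> {target(x)}" for x in demos]
query = sample_input()
prompt = "\n".join(prompt_lines) + f"\n{''.join(map(str, query))} -> "

print(prompt)                       # fed to the model; its next token is the prediction
print("ground truth:", target(query))
```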
Human Feedback is not Gold Standard
Human feedback has become the de facto standard for evaluating the performance of Large Language Models, and is increasingly being used as a training objective. However, it is not clear which properties of a generated output this single `p…
Structural Transfer Learning in NL-to-Bash Semantic Parsers
Large-scale pre-training has made progress in many fields of natural language processing, though little is understood about the design of pre-training datasets. We propose a methodology for obtaining a quantitative understanding of structu…
Intriguing Properties of Quantization at Scale
Emergent properties have been widely adopted as a term to describe behavior not present in smaller models but observed in larger models. Recent work suggests that the trade-off incurred by quantization is also an emergent property, with sh…
Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization
Vision-and-language (V&L) models pretrained on large-scale multimodal data have demonstrated strong performance on various tasks such as image captioning and visual question answering (VQA). The quality of such models is commonly assessed …
Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
Despite the widespread success of Transformers on NLP tasks, recent works have found that they struggle to model several formal languages when compared to recurrent models. This raises the question of why Transformers perform well in pract…
Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play
The task of context-dependent text-to-SQL aims to convert multi-turn user utterances to formal SQL queries. This is a challenging task due to both the scarcity of training data from which to learn complex contextual dependencies and to gen…
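To make the task concrete, here is a small hypothetical episode (not taken from the paper's data) showing how a follow-up turn depends on the earlier context: resolving "those" in the second utterance requires carrying the first turn's filters into the new SQL query.

```python
# A hypothetical multi-turn text-to-SQL episode illustrating contextual
# dependency: the gold SQL for turn 2 must inherit turn 1's constraints.
episode = [
    {
        "utterance": "Show me all flights from Denver to Boston.",
        "sql": "SELECT * FROM flights "
               "WHERE origin = 'Denver' AND destination = 'Boston';",
    },
    {
        "utterance": "Which of those leave before 10am?",
        "sql": "SELECT * FROM flights "
               "WHERE origin = 'Denver' AND destination = 'Boston' "
               "AND departure_time < '10:00';",
    },
]

for turn, t in enumerate(episode, 1):
    print(f"Turn {turn}: {t['utterance']}\n  {t['sql']}")
```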
StreamingQA: A Benchmark for Adaptation to New Knowledge over Time in Question Answering Models
Knowledge and language understanding of models evaluated through question answering (QA) have usually been studied on static snapshots of knowledge, such as Wikipedia. However, our world is dynamic, evolves over time, and our models' knowledge…
Revisiting the Compositional Generalization Abilities of Neural Sequence Models
Compositional generalization is a fundamental trait in humans, allowing us to effortlessly combine known phrases to form novel sentences. Recent works have claimed that standard seq-to-seq models severely lack the ability to compositionall…
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
We introduce Transformer Grammars (TGs), a novel class of Transformer language models that combine (i) the expressive power, scalability, and strong performance of Transformers and (ii) recursive syntactic compositions, which here are impl…
Relational Memory Augmented Language Models
We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation triples and retrieve relevant relations for a given context to improve text gener…
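A minimal sketch of the idea, assuming a toy knowledge graph stored as (head, relation, tail) triples and simple word-overlap retrieval; the paper's retriever and the way retrieved relations condition the language model are more involved.

```python
# Toy knowledge graph as (head, relation, tail) triples, with a simple
# word-overlap retriever that picks relations relevant to the current context.
triples = [
    ("Ada Lovelace", "occupation", "mathematician"),
    ("Ada Lovelace", "collaborated_with", "Charles Babbage"),
    ("Charles Babbage", "designed", "Analytical Engine"),
]

def retrieve(context, k=2):
    """Return the k triples whose entities overlap most with the context."""
    words = set(context.lower().split())
    def score(triple):
        head, _, tail = triple
        return (len(words & set(head.lower().split()))
                + len(words & set(tail.lower().split())))
    return sorted(triples, key=score, reverse=True)[:k]

context = "Ada Lovelace worked closely with"
relevant = retrieve(context)
# The retrieved triples would then be serialized and prepended to the
# language model's input to condition generation on the knowledge graph.
print(relevant)
```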
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Xiang Lorraine Li, Adhiguna Kuncoro, Jordan Hoffmann, Cyprien de Masson d'Autume, Phil Blunsom, Aida Nematzadeh. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.
A Systematic Investigation of Commonsense Understanding in Large Language Models
Large language models have shown impressive performance on many natural language processing (NLP) tasks in a zero-shot setting. We ask whether these models exhibit commonsense understanding -- a critical component of NLP applications -- by…
A Systematic Investigation of Commonsense Knowledge in Large Language Models
Language models (LMs) trained on large amounts of data have shown impressive performance on many NLP tasks under the zero-shot and few-shot setup. Here we aim to better understand the extent to which such models learn commonsense knowledge…
Pretraining the Noisy Channel Model for Task-Oriented Dialogue
Direct decoding for task-oriented dialogue is known to suffer from the explaining-away effect, manifested in models that prefer short and generic responses. Here we argue for the use of Bayes' theorem to factorize the dialogue task into tw…
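The abstract is truncated, but the standard noisy-channel decomposition it alludes to factorizes the probability of a response r given the dialogue context c into a channel model and a response prior via Bayes' theorem. The sketch below shows that generic form, which may differ in detail from the paper's exact parameterization.

```latex
% Generic noisy-channel factorization for dialogue (a standard form, shown
% for context; the paper's exact decomposition may differ):
\[
  p(r \mid c) \;\propto\;
  \underbrace{p(c \mid r)}_{\text{channel model}}\;
  \underbrace{p(r)}_{\text{response prior / language model}}
\]
```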
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Our world is open-ended, non-stationary, and constantly evolving; thus what we talk about and how we talk about it change over time. This inherent dynamic nature of language contrasts with the current static language modelling paradigm, wh…