Timo Schick
FairPair: A Robust Evaluation of Biases in Language Models through Paired Perturbations
The accurate evaluation of differential treatment in language models to specific groups is critical to ensuring a positive and safe user experience. An ideal evaluation should have the properties of being robust, extendable to new groups o…
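The abstract above describes evaluating bias through paired perturbations. A minimal sketch of that idea, assuming a simple whole-word term-swap scheme (the term pairs and prompt below are illustrative, not the paper's data or code):

```python
import re

# Build prompt pairs that differ only in the demographic terms, so any
# difference in the model's continuations can be attributed to those terms.
# TERM_PAIRS is an illustrative assumption, not FairPair's actual lexicon.
TERM_PAIRS = [("John", "Jane"), ("he", "she"), ("his", "her")]

def perturb(prompt: str) -> str:
    """Swap each left-hand term for its counterpart (whole words only)."""
    for a, b in TERM_PAIRS:
        prompt = re.sub(rf"\b{a}\b", b, prompt)
    return prompt

base = "John is a doctor and he is proud of his work."
print(perturb(base))  # Jane is a doctor and she is proud of her work.
```

Both prompts in a pair would then be fed to the model and their continuations compared.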
Improving Wikipedia verifiability with AI
Verifiability is a core content policy of Wikipedia: claims need to be backed by citations. Maintaining and improving the quality of Wikipedia references is an important challenge and there is a pressing need for better tools to assist hum…
Evaluation of Faithfulness Using the Longest Supported Subsequence
As increasingly sophisticated language models emerge, their trustworthiness becomes a pivotal issue, especially in tasks such as summarization and question-answering. Ensuring their responses are contextually grounded and faithful is chall…
Self-Alignment with Instruction Backtranslation
We present a scalable method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions. Our approach, named instruction backtranslation, starts with a languag…
Active Learning Principles for In-Context Learning with Large Language Models
The remarkable advancements in large language models (LLMs) have significantly enhanced predictive performance in few-shot learning settings. By using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively gr…
LongForm: Effective Instruction Tuning with Reverse Instructions
Instruction tuning enables language models to more effectively generalize and better follow user intent. However, obtaining instruction data is costly and challenging. Prior work employs methods such as expensive human annotation, crowd-so…
Augmented Language Models: a Survey
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in c…
Toolformer: Language Models Can Teach Themselves to Use Tools
Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup,…
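Toolformer lets a model emit inline API calls that are executed and whose results are spliced back into the text. A minimal sketch of that execution step, assuming a bracketed `[Tool(args)]` call format and a toy tool registry (both are illustrative assumptions, not the paper's implementation):

```python
import re

# Toy registry of callable tools; Toolformer's actual tools include a
# calculator, QA system, search engine, and more.
TOOLS = {
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # toy, unsafe-eval calculator
}

# Matches an embedded call like "[Calculator(2+3)]".
CALL_RE = re.compile(r"\[(\w+)\(([^)]*)\)\]")

def execute_calls(text: str) -> str:
    """Execute each embedded tool call and splice its result back in."""
    def run(match: re.Match) -> str:
        tool, args = match.group(1), match.group(2)
        result = TOOLS[tool](args)
        return f"[{tool}({args}) -> {result}]"
    return CALL_RE.sub(run, text)

print(execute_calls("Out of 1400 participants, 400 (or [Calculator(400/1400)]) passed."))
```

During training, the paper filters such self-generated calls by whether the spliced-in result reduces the loss on subsequent tokens; the sketch above covers only the inference-time execution.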
Semantic-Oriented Unlabeled Priming for Large-Scale Language Models
Due to the high costs associated with finetuning large language models, various recent works propose to adapt them to specific tasks without any parameter updates through in-context learning. Unfortunately, for in-context learning there is …
MEAL: Stable and Active Learning for Few-Shot Prompting
Few-shot classification has made great strides due to foundation models that, through priming and prompting, are highly effective few-shot learners. However, this approach has high variance both across different sets of few shots (data se…
Task-aware Retrieval with Instructions
We study the problem of retrieval with instructions, where users provide explicit descriptions of their intent along with their queries to guide a retrieval system. Our solution is a general-purpose task-aware retrieval system, trained usi…
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Instruction tuning enables pretrained language models to perform new tasks from inference-time natural language descriptions. These approaches rely on vast amounts of human supervision in the form of crowdsourced datasets or user interacti…
EditEval: An Instruction-Based Benchmark for Text Improvements
Evaluation of text generation to date has primarily focused on content created sequentially, rather than improvements on a piece of text. Writing, however, is naturally an iterative and incremental process that requires expertise in differ…
PEER: A Collaborative Language Model
Textual content is often the output of a collaborative writing process: We start with an initial draft, ask for suggestions, and repeatedly make changes. Agnostic of this process, today's language models are trained to generate only the fi…
Atlas: Few-shot Learning with Retrieval Augmented Language Models
Large language models have shown impressive few-shot results on a wide range of tasks. However, when knowledge is key for such results, as is the case for tasks such as question answering and fact checking, massive parameter counts to stor…
Leveraging QA Datasets to Improve Generative Data Augmentation
The ability of generative language models (GLMs) to generate text has improved considerably in the last few years, enabling their use for generative data augmentation. In this work, we propose CONDA, an approach to further improve GLMs' ab…
True Few-Shot Learning with Prompts—A Real-World Perspective
Prompt-based approaches excel at few-shot learning. However, Perez et al. (2021) recently cast doubt on their performance as they had difficulty getting good results in a “true” few-shot setting in which prompts and hyperparameters cannot …
CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment
Pretrained language models (PLMs) have achieved superhuman performance on many benchmarks, creating a need for harder tasks. We introduce CoDA21 (Context Definition Alignment), a challenging benchmark that measures natural language underst…
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
When trained on large, unfiltered crawls from the internet, language models pick up and reproduce all kinds of undesirable biases that can be found in the data: they often generate racist, sexist, violent or otherwise toxic language. As la…