Kenneth Heafield
Efficient Methods for Natural Language Processing: A Survey
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources inc…
Scaling neural machine translation to 200 languages
The development of neural techniques has opened up new avenues for research in machine translation. Today, neural machine translation (NMT) systems can leverage highly multilingual capacities and even perform zero-shot translation, deliver…
Code-Switched Language Identification is Harder Than You Think
Code switching (CS) is a very common phenomenon in written and spoken communication but one that is handled poorly by many natural language processing applications. Looking to the application of building CS corpora, we explore CS language …
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
Foundational large language models (LLMs) can be instruction-tuned to perform open-domain question answering, facilitating applications like chat assistants. While such efforts are often carried out in a single language, we empirically ana…
Iterative Translation Refinement with Large Language Models
We propose iteratively prompting a large language model to self-correct a translation, with inspiration from their strong language understanding and translation capability as well as a human-like translation approach. Interestingly, multi-…
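The refinement loop described here is simple enough to sketch. Below is a minimal, hypothetical Python version: `llm` is a stand-in for any text-completion API, and the prompts are illustrative rather than the paper's exact wording.

# Minimal sketch of iterative translation refinement with an LLM.
# `llm` is a hypothetical stand-in for a text-completion API; the
# prompts are illustrative, not the paper's exact wording.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def iterative_refine(source: str, src_lang: str, tgt_lang: str, rounds: int = 3) -> str:
    draft = llm(f"Translate this {src_lang} text into {tgt_lang}:\n{source}")
    for _ in range(rounds):
        revised = llm(
            f"Source ({src_lang}): {source}\n"
            f"Current {tgt_lang} translation: {draft}\n"
            "Improve this translation. Reply with the revised translation only."
        )
        if revised.strip() == draft.strip():
            break  # the model stopped changing its output; treat as converged
        draft = revised
    return draft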
An Open Dataset and Model for Language Identification
Language identification (LID) is a fundamental step in many natural language processing pipelines. However, current LID systems are far from perfect, particularly on lower-resource languages. We present a LID model which achieves a macro-a…
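The released model is fastText-based, so using it amounts to loading the binary and calling predict. A short sketch, assuming the model file has already been downloaded (the file name below is an assumption, not a guaranteed path):

# Sketch: language identification with a downloaded fastText LID model.
# The file name is an assumption; substitute the actual model path.
import fasttext

model = fasttext.load_model("lid-model.bin")

labels, probs = model.predict("Bonjour tout le monde", k=3)  # top-3 candidates
for label, prob in zip(labels, probs):
    print(label.replace("__label__", ""), round(float(prob), 3))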
Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)
This paper investigates the performance of massively multilingual neural machine translation (NMT) systems in translating Yorùbá greetings (ẹ kú), which are a big part of Yorùbá language and culture, into English. To evaluate these mode…
Cheating to Identify Hard Problems for Neural Machine Translation
We identify hard problems for neural machine translation models by analyzing progressively higher-scoring translations generated by letting models cheat to various degrees. If a system cheats and still gets something wrong, that suggests i…
No Language Left Behind: Scaling Human-Centered Machine Translation
Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of langua…
Exploring Diversity in Back Translation for Low-Resource Machine Translation
Back translation is one of the most widely used methods for improving the performance of neural machine translation systems. Recent research has sought to enhance the effectiveness of this method by increasing the 'diversity' of the genera…
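One common lever for that diversity is the generation procedure of the reverse model: beam search is deterministic, while sampling yields varied synthetic sources. A hedged sketch using Hugging Face Transformers (the model name is an example only, not necessarily what the paper used):

# Sketch: beam search vs. sampling when generating back-translations.
# The model name is an example; any target->source NMT model works.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-de-en"  # example reverse (target->source) model
tok = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

batch = tok(["Das ist ein Beispielsatz."], return_tensors="pt", padding=True)

beam = model.generate(**batch, num_beams=5)  # deterministic, low diversity
sampled = model.generate(**batch, do_sample=True, top_k=50,
                         num_return_sequences=4)  # noisier, more diverse

print(tok.batch_decode(beam, skip_special_tokens=True))
print(tok.batch_decode(sampled, skip_special_tokens=True))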
Cheat Codes to Quantify Missing Source Information in Neural Machine Translation
This paper describes a method to quantify the amount of information H(t|s) added by the target sentence t that is not present in the source s in a neural machine translation system. We do this by providing the model the target sentence in …
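In symbols, the quantity being estimated is the conditional entropy of the target given the source. A standard model-based estimate (a sketch of the idea, not the paper's exact cheating formulation) is the per-token cross-entropy under the model:

H(t \mid s) \approx -\frac{1}{|t|} \sum_{i=1}^{|t|} \log p_\theta\!\left(t_i \mid t_{<i}, s\right)

The "cheat codes" then measure, roughly, how much compressed information about t must be supplied alongside s to close the gap.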
Direct simultaneous speech to speech translation
We present the first direct simultaneous speech-to-speech translation (Simul-S2ST) model, with the ability to start generating translation in the target speech before consuming the full source speech content and independently from intermed…
TranslateLocally: Blazing-fast translation running on the local CPU
Every day, millions of people sacrifice their privacy and browsing habits in exchange for online machine translation. Companies and governments with confidentiality requirements often ban online translation or pay a premium to disable logg…
The University of Edinburgh's English-German and English-Hausa Submissions to the WMT21 News Translation Task
This paper presents the University of Edinburgh's constrained submissions of English-German and English-Hausa systems to the WMT 2021 shared task on news translation. We build En-De systems in three stages: corpus filtering, back-translatio…
Gender bias amplification during Speed-Quality optimization in Neural Machine Translation
Is bias amplified when neural machine translation (NMT) models are optimized for speed and evaluated on generic test sets using BLEU? We investigate architectures and techniques commonly used to speed up decoding in Transformer-based model…
Fully Synthetic Data Improves Neural Machine Translation with Knowledge Distillation
This paper explores augmenting monolingual data for knowledge distillation in neural machine translation. Source language monolingual text can be incorporated as a forward translation. Interestingly, we find the best way to incorporate tar…
Exploring Monolingual Data for Neural Machine Translation with Knowledge Distillation
We explore two types of monolingual data that can be included in knowledge distillation training for neural machine translation (NMT). The first is source-side monolingual data. The second is target-side monolingual data that is used …
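Both entries above share the same data-generation step, which is easy to sketch: a strong teacher forward-translates monolingual text, and the student trains on the resulting synthetic pairs. Here `teacher_translate` is a hypothetical stand-in for the teacher system:

# Sketch: building knowledge-distillation training pairs from
# source-side monolingual text. `teacher_translate` is hypothetical.

def teacher_translate(src: str) -> str:
    raise NotImplementedError("plug in the teacher NMT system here")

def make_distillation_pairs(mono_source_lines):
    # The teacher labels monolingual source sentences; the student
    # later trains on these synthetic (source, target) pairs.
    for src in mono_source_lines:
        yield src, teacher_translate(src)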
Speed-optimized, Compact Student Models that Distill Knowledge from a Larger Teacher Model: the UEDIN-CUNI Submission to the WMT 2020 News Translation Task
We describe the joint submission of the University of Edinburgh and Charles University, Prague, to the Czech/English track in the WMT 2020 Shared Task on News Translation. Our fast and compact student models distill knowledge from a larger…
Approaching Neural Chinese Word Segmentation as a Low-Resource Machine Translation Task
Chinese word segmentation has entered the deep learning era, which greatly reduces the hassle of feature engineering. Recently, some researchers attempted to treat it as character-level translation, which further simplified model designing,…
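Casting segmentation as translation is mostly a data-formatting exercise: the source is the unsegmented character sequence and the target repeats the characters with boundary markers. A sketch (the exact marker format is illustrative, not the paper's):

# Sketch: Chinese word segmentation as character-level translation.
# Source = space-separated characters; target = characters with a
# boundary marker between words. The format is illustrative.

def to_translation_pair(segmented_words):
    source = " ".join(ch for word in segmented_words for ch in word)
    target = " | ".join(" ".join(word) for word in segmented_words)
    return source, target

src, tgt = to_translation_pair(["我们", "喜欢", "翻译"])
print(src)  # 我 们 喜 欢 翻 译
print(tgt)  # 我 们 | 喜 欢 | 翻 译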
The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020
We present Sockeye 2, a modernized and streamlined version of the Sockeye neural machine translation (NMT) toolkit. New features include a simplified code base through the use of MXNet's Gluon API, a focus on state-of-the-art model archite…
In Neural Machine Translation, What Does Transfer Learning Transfer?
Transfer learning improves quality for low-resource machine translation, but it is unclear what exactly it transfers. We perform several ablation studies that limit information transfer, then measure the quality impact across three languag…
Parallel Sentence Mining by Constrained Decoding
We present a novel method to extract parallel sentences from two monolingual corpora, using neural machine translation. Our method relies on translating sentences in one corpus, but constraining the decoding by a prefix tree built on the o…
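The constraint structure is a prefix tree (trie) over one corpus: at each decoding step, the only tokens allowed are those that extend a sentence present in that corpus. A minimal sketch of the trie side (integration with an actual NMT decoder is omitted):

# Sketch: a prefix tree over target-side sentences, used to restrict
# which tokens a decoder may generate next. Token-level, whitespace split.

def build_trie(sentences):
    root = {}
    for sent in sentences:
        node = root
        for tok in sent.split():
            node = node.setdefault(tok, {})
        node["<eos>"] = {}  # mark a complete sentence
    return root

def allowed_next_tokens(trie, prefix):
    node = trie
    for tok in prefix:
        if tok not in node:
            return []  # prefix leaves the corpus: dead end
        node = node[tok]
    return list(node.keys())

trie = build_trie(["the cat sat", "the dog ran"])
print(allowed_next_tokens(trie, ["the"]))  # ['cat', 'dog']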
ParaCrawl: Web-Scale Acquisition of Parallel Corpora
We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software. We empirically compare alternative methods and publish benchmark data sets for sentence alignment and sentence …