MizAR 60 for Mizar 50
As a present to Mizar on its 50th anniversary, we develop an AI/TP system that automatically proves about 60% of the Mizar theorems in the hammer setting. We also automatically prove 75% of the Mizar theorems when the automated provers are…
Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. …
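As a concrete illustration of the attention mechanism the abstract refers to, here is a minimal NumPy sketch of scaled dot-product attention over a single head; the shapes and names are illustrative and not taken from the paper's code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # attention distribution per query
    return weights @ V                              # weighted sum of value vectors

# toy example: 3 queries attending over 4 key/value positions of dimension 8
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # -> (3, 8)
```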
Cross-lingual Language Model Pretraining
Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We p…
Understanding Back-Translation at Scale
An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences. This work broadens the understanding of back-translation and in…
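A minimal sketch of the back-translation recipe described above, assuming a hypothetical `backward_model.translate` (any target-to-source MT system) and plain Python lists of sentences; this illustrates the general idea rather than the paper's implementation.

```python
def back_translate(monolingual_target, backward_model):
    """Turn target-side monolingual sentences into synthetic (source, target) training pairs."""
    synthetic_pairs = []
    for tgt in monolingual_target:
        # backward_model is a hypothetical target->source translator; the paper compares
        # ways of generating this output, such as sampling versus (noised) beam search
        synthetic_src = backward_model.translate(tgt)
        synthetic_pairs.append((synthetic_src, tgt))  # the target side stays human-written
    return synthetic_pairs

# The forward (source->target) model is then trained on real_pairs + synthetic_pairs.
```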
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation. In this work we evaluate the contribution made by individual attention heads to the overall performance of the…
Multilingual Denoising Pre-training for Neural Machine Translation
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART—a sequence-to-sequence denoising auto-encoder pre-trained on …
Phrase-Based & Neural Unsupervised Machine Translation
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of la…
Non-Autoregressive Neural Machine Translation
Existing approaches to neural machine translation condition each output word on previously generated outputs. We introduce a model that avoids this autoregressive property and produces its outputs in parallel, allowing an order of magnitud…
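To make the contrast concrete, here is a hypothetical decoding sketch (the `model` methods are invented for illustration): an autoregressive decoder emits one token per forward pass, while a non-autoregressive decoder predicts a target length and then emits every position in a single parallel pass.

```python
def autoregressive_decode(model, src, max_len=100):
    tokens = ["<bos>"]
    for _ in range(max_len):                        # one model call per output token
        nxt = model.predict_next(src, tokens)
        if nxt == "<eos>":
            break
        tokens.append(nxt)
    return tokens[1:]

def non_autoregressive_decode(model, src):
    length = model.predict_length(src)              # target length chosen up front
    return model.predict_all_positions(src, length)  # all tokens emitted in one parallel pass
```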
A Structured Review of the Validity of BLEU
The BLEU metric has been widely used in NLP for over 15 years to evaluate NLP systems, especially in machine translation and natural language generation. I present a structured review of the evidence on whether BLEU is a valid evaluation t…
Unsupervised Statistical Machine Translation
Meta-Learning for Low-Resource Neural Machine Translation
In this paper, we propose to extend the recently introduced model-agnostic meta-learning algorithm (MAML; Finn et al., 2017) for low-resource neural machine translation (NMT). We frame low-resource translation as a meta-learning problem w…
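Roughly, MAML learns an initialization from which a few gradient steps on a new (low-resource) task already give a good model. Below is a first-order toy sketch in NumPy, with a synthetic quadratic loss standing in for per-language translation losses; everything here is illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
tasks = [rng.normal(size=2) for _ in range(5)]        # each task: a 2-d target vector

def loss_and_grad(theta, target):
    diff = theta - target                             # toy per-task loss ||theta - target||^2
    return diff @ diff, 2 * diff

theta = np.zeros(2)                                   # meta-parameters (shared initialization)
alpha, beta = 0.1, 0.05                               # inner- and outer-loop learning rates
for _ in range(200):
    meta_grad = np.zeros_like(theta)
    for target in tasks:
        _, g = loss_and_grad(theta, target)
        adapted = theta - alpha * g                   # inner loop: quick adaptation to one task
        _, g_adapted = loss_and_grad(adapted, target)
        meta_grad += g_adapted                        # first-order MAML meta-gradient
    theta -= beta * meta_grad / len(tasks)            # outer loop: update the shared initialization
```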
Facebook FAIR’s WMT19 News Translation Task Submission
This paper describes Facebook FAIR’s submission to the WMT19 shared news translation task. We participate in four language directions: English-German and English-Russian, in both directions. Following our submission from last year, our base…
Self-Attention with Relative Position Representations
Relying entirely on an attention mechanism, the Transformer introduced by Vaswani et al. (2017) achieves state-of-the-art results for machine translation. In contrast to recurrent and convolutional neural networks, it does not explicitly m…
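The mechanism behind the title, as it is usually presented, adds learned relative-position embeddings to the keys and values when position i attends to position j; the following is a hedged reconstruction of that standard formulation, not quoted from the paper.

```latex
e_{ij} = \frac{x_i W^Q \,\bigl(x_j W^K + a^K_{ij}\bigr)^{\top}}{\sqrt{d_z}},
\qquad
\alpha_{ij} = \operatorname{softmax}_j(e_{ij}),
\qquad
z_i = \sum_{j} \alpha_{ij}\,\bigl(x_j W^V + a^V_{ij}\bigr)
```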
Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder
In this paper, we present our first attempts at building a multilingual Neural Machine Translation framework under a unified approach. We are then able to employ attention-based NMT for many-to-many multilingual translation tasks. Our appr…
Unsupervised Pretraining for Sequence to Sequence Learning
This work presents a general unsupervised learning method to improve the accuracy of sequence to sequence (seq2seq) models. In our method, the weights of the encoder and decoder of a seq2seq model are initialized with the pretrained weight…
Iterative Back-Translation for Neural Machine Translation
We present iterative back-translation, a method for generating increasingly better synthetic parallel data from monolingual data to train neural machine translation systems. Our proposed method is very simple yet effective and highly appli…
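A hypothetical sketch of the iterative loop the abstract describes, in which the two translation directions alternately generate synthetic data for each other; `train_nmt` and `.translate` are placeholders, not an API from the paper.

```python
def iterative_back_translation(real_pairs, mono_src, mono_tgt, train_nmt, rounds=3):
    """real_pairs: list of (src, tgt); mono_src / mono_tgt: monolingual sentence lists."""
    fwd = train_nmt(real_pairs)                                # source -> target model
    bwd = train_nmt([(t, s) for s, t in real_pairs])           # target -> source model
    for _ in range(rounds):
        synth_fwd = [(bwd.translate(t), t) for t in mono_tgt]  # synthetic source, real target
        fwd = train_nmt(real_pairs + synth_fwd)                # retrain the forward model
        synth_bwd = [(fwd.translate(s), s) for s in mono_src]  # synthetic target, real source
        bwd = train_nmt([(t, s) for s, t in real_pairs] + synth_bwd)
    return fwd, bwd
```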
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
Neural machine translation (NMT) aims at solving machine translation (MT) problems using neural networks and has exhibited promising results in recent years. However, most of the existing NMT models are shallow and there is still a perform…
Rapid Adaptation of Neural Machine Translation to New Languages
This paper examines the problem of adapting neural machine translation systems to new, low-resourced languages (LRLs) as effectively and rapidly as possible. We propose methods based on starting with massively multilingual "seed models", w…
Graph Transformer for Graph-to-Sequence Learning
The dominant graph-to-sequence transduction models employ graph neural networks for graph representation learning, where the structural information is reflected by the receptive field of neurons. Unlike graph neural networks that restrict …
Improving Neural Machine Translation Models with Monolingual Data
Neural Machine Translation (NMT) has obtained state-of-the-art performance for several language pairs, while only using parallel data for training. Target-side monolingual data plays an important role in boosting fluency for phrase-based s…
Instance Weighting for Neural Machine Translation Domain Adaptation
Instance weighting has been widely applied to phrase-based machine translation domain adaptation. However, it is challenging to apply it to Neural Machine Translation (NMT) directly, because NMT is not a linear model. In this paper, two …
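In loss terms, instance weighting for NMT is commonly realized by scaling each sentence pair's negative log-likelihood with a domain-relevance weight; here is a minimal NumPy sketch of that generic idea (not necessarily the specific methods proposed in the paper).

```python
import numpy as np

def weighted_nll(sentence_log_probs, weights):
    """Instance-weighted objective: each pair's negative log-likelihood is scaled by its weight."""
    return -np.sum(weights * sentence_log_probs) / np.sum(weights)

log_probs = np.array([-12.3, -8.1, -20.5])  # summed token log-likelihoods of three sentence pairs
weights = np.array([1.0, 0.2, 0.7])         # e.g. in-domain pairs weighted higher than out-of-domain
print(weighted_nll(log_probs, weights))
```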
Transfer Learning for Low-Resource Neural Machine Translation
The encoder-decoder framework for neural machine translation (NMT) has been shown effective in large data scenarios, but is much less effective for low-resource languages. We present a transfer learning method that significantly improves B…
BLEU is Not Suitable for the Evaluation of Text Simplification
BLEU is widely considered to be an informative metric for text-to-text generation, including Text Simplification (TS). TS includes both lexical and structural aspects. In this paper we show that BLEU is not suitable for the evaluation of s…
Beyond BLEU: Training Neural Machine Translation with Semantic Similarity
While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can significantly improve fi…
Generating Chinese Classical Poems with Statistical Machine Translation Models
This paper describes a statistical approach to generation of Chinese classical poetry and proposes a novel method to automatically evaluate poems. The system accepts a set of keywords representing the writing intents from a writer and gene…
An Effective Approach to Unsupervised Machine Translation
While machine translation has traditionally relied on large amounts of parallel corpora, a recent research line has managed to train both Neural Machine Translation (NMT) and Statistical Machine Translation (SMT) systems using monolingual …
Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
Non-autoregressive translation (NAT) models, which remove the dependence on previous target tokens from the inputs of the decoder, achieve significant inference speedup but at the cost of inferior accuracy compared to autoregressive tran…