MizAR 60 for Mizar 50
As a present to Mizar on its 50th anniversary, we develop an AI/TP system that automatically proves about 60% of the Mizar theorems in the hammer setting. We also automatically prove 75% of the Mizar theorems when the automated provers are…
Transformer-Based Feature Learning for Algorithm Selection in Combinatorial Optimisation
Given a combinatorial optimisation problem, there are typically multiple ways of modelling it for presentation to an automated solver. Choosing the right combination of model and target solver can have a significant impact on the effective…
Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations (Short Paper)
This research focuses on assessing the ability of large language models (LLMs) in representing geometries and their spatial relations. We utilize LLMs including GPT-2 and BERT to encode the well-known text (WKT) format of geometries and th…
Enriching Word Vectors with Subword Information
Continuous word representations, trained on large unlabeled corpora are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, by assigning a distinct vector to eac…
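Illustrative sketch (not code from the paper above): the subword idea can be seen by decomposing a word into character n-grams and averaging their vectors, so morphologically related words share representation. The n-gram range, dimensionality, and random vectors below are assumptions chosen only for the demonstration.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in boundary markers, as in subword models."""
    w = f"<{word}>"
    return [w[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(w) - n + 1)]

# Toy n-gram embedding table: in a real model these vectors are learned, not random.
rng = np.random.default_rng(0)
dim = 8
table = {}

def ngram_vector(ng):
    if ng not in table:
        table[ng] = rng.normal(size=dim)
    return table[ng]

def word_vector(word):
    """A word vector is built from the vectors of its subword n-grams."""
    grams = char_ngrams(word)
    return np.mean([ngram_vector(g) for g in grams], axis=0)

print(char_ngrams("where")[:5])    # ['<wh', 'whe', 'her', 'ere', 're>']
print(word_vector("where").shape)  # (8,)
```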
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers, Iryna Gurevych. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has…
A Simple Framework for Contrastive Learning of Visual Representations
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. …
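Illustrative sketch of the NT-Xent contrastive objective that SimCLR-style methods optimize, not code from the paper itself; it assumes two augmented views per image are already encoded, and the batch size, embedding dimension, and temperature are arbitrary.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch of paired views z1, z2 with shape (N, D)."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit length
    sim = z @ z.t() / temperature                        # (2N, 2N) scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-similarity
    # The positive for example i is its other augmented view: i + N (or i - N).
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(4, 16), torch.randn(4, 16)  # toy encoder outputs for 4 images
print(nt_xent_loss(z1, z2).item())
```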
Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. …
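The central operation in this architecture is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch with toy shapes chosen only for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)                 # (..., L_q, L_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))  # numerically stable softmax
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                                             # (..., L_q, d_v)

Q = np.random.randn(2, 5, 64)   # toy batch: 2 sequences, 5 query positions, d_k = 64
K = np.random.randn(2, 7, 64)   # 7 key/value positions
V = np.random.randn(2, 7, 64)
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 5, 64)
```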
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Motivation Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows. With the progress in natural language processing (NLP), extracting valuable information from biomedical literature ha…
SQuAD: 100,000+ Questions for Machine Comprehension of Text
We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text f…
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to …
Learning Transferable Visual Models From Natural Language Supervision
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify an…
Kaldi Speech Recognition Toolkit
We describe the design of Kaldi, a free, open-source toolkit for speech recognition research. Kaldi provides a speech recognition system based on finite-state transducers (using the freely available OpenFst), together with detailed docume…
Hierarchical Attention Networks for Document Classification
Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, Eduard Hovy. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.
Reading digits in natural images with unsupervised feature learning
Detecting and reading text from natural images is a hard computer vision task that is central to a variety of emerging applications. Related problems like document character recognition have been widely studied by computer vision and machi…
LLM-Supported Manufacturing Mapping Generation
In large manufacturing companies, such as Bosch, that operate thousands of production lines with each comprising up to dozens of production machines and other equipment, even simple inventory questions such as of location and quantities of…
Representation Learning with Contrastive Predictive Coding
While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose…
Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation
Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are biased and should not be used without clear understanding of the biases, and corresponding identification of chance or base case levels of the s…
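For reference, the basic measures the paper analyzes follow directly from confusion-matrix counts; a small worked sketch, with counts made up purely for the example:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Worked example: 40 true positives, 10 false positives, 20 false negatives.
p, r, f = precision_recall_f1(tp=40, fp=10, fn=20)
print(p, r, f)  # 0.8, 0.666..., 0.727...
```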
Neural Architectures for Named Entity Recognition
Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine tran…
Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these m…
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerg…
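A chain-of-thought prompt simply includes a worked, step-by-step rationale in the few-shot exemplar so the model continues in the same style. The sketch below is illustrative: the wording is not taken verbatim from the paper, and the `llm.generate` call is a hypothetical client, not a specific API.

```python
# Illustrative chain-of-thought prompt: the exemplar answer shows intermediate
# reasoning steps before the final answer, nudging the model to do the same.
cot_prompt = """Q: A cafeteria had 23 apples. It used 20 and bought 6 more. How many apples are there now?
A: The cafeteria started with 23 apples. After using 20 it had 23 - 20 = 3.
After buying 6 more it had 3 + 6 = 9. The answer is 9.

Q: Roger has 5 tennis balls. He buys 2 cans with 3 balls each. How many balls does he have now?
A:"""

# The prompt would then be sent to a language model (hypothetical client code):
# answer = llm.generate(cot_prompt)
print(cot_prompt)
```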
Get To The Point: Summarization with Pointer-Generator Networks
Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two…
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Human ability to understand language is general, flexible, and robust. In contrast, most NLU models above the word level are designed for a specific task and struggle with out-of-domain data. If we aspire to develop models with understandi…
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
This article surveys and organizes research works in a new paradigm in natural language processing, which we dub “prompt-based learning.” Unlike traditional supervised learning, which trains a model to take in an input x and predict an out…
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context
Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond …
Language Models are Few-Shot Learners
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires…
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks. However, their ability to access and precisely manipulate knowl…