Natural language understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Human ability to understand language is general, flexible, and robust. In contrast, most NLU models above the word level are designed for a specific task and struggle with out-of-domain data. If we aspire to develop models with understandi…
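As a concrete illustration (mine, not the paper's), the nine GLUE tasks are commonly pulled through the Hugging Face `datasets` library; the sketch below loads SST-2 and inspects one example.

```python
# Minimal sketch: loading one GLUE task (SST-2) via the Hugging Face `datasets`
# library. Illustrative tooling only; this is a common access path for the
# benchmark, not code from the paper itself.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")           # train / validation / test splits
print(sst2["train"][0])                       # {'sentence': ..., 'label': 0, 'idx': 0}
print(sst2["train"].features["label"].names)  # ['negative', 'positive']
```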
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has gi…
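To make the text-to-text framing concrete, here is a minimal sketch (assuming the `transformers` library and the public `t5-small` checkpoint) in which different tasks are selected purely by a text prefix:

```python
# T5's interface in miniature: every task is "text in, text out", chosen by a
# natural-language prefix, following the released T5 usage conventions.
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

for prompt in [
    "translate English to German: The house is wonderful.",
    "summarize: studies have shown that owning a dog is good for you.",
]:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=40)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```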
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
This article surveys and organizes research works in a new paradigm in natural language processing, which we dub “prompt-based learning.” Unlike traditional supervised learning, which trains a model to take in an input x and predict an out…
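For a minimal example of the paradigm (the template and label words below are my illustrative choices, not the survey's), sentiment classification can be reformulated as a cloze task scored by a masked language model:

```python
# Prompt-based learning in miniature: wrap the input in a cloze template and let
# a masked LM score two "verbalizer" tokens standing in for the class labels.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
review = "I watched the film twice in one day."
prompt = f"{review} Overall, it was a [MASK] movie."

for cand in fill(prompt, targets=["great", "terrible"]):
    print(cand["token_str"], round(cand["score"], 4))
```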
PaLM: Scaling Language Modeling with Pathways
Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model t…
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language. We extend the popular BERT architecture to a multi-modal two-stream model, processing…
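The two-stream design can be sketched with a single co-attentional step, in which each modality's queries attend over the other modality's keys and values; the layer below is a simplified stand-in for ViLBERT's co-attentional block, not the released implementation.

```python
# Simplified co-attention in the spirit of ViLBERT's two-stream design: each
# stream's queries attend over the other stream's keys and values. Dimensions
# are illustrative; the real model adds residuals, feed-forward blocks, etc.
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.txt_from_img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img_from_txt = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, txt, img):
        txt_out, _ = self.txt_from_img(txt, img, img)  # text queries image
        img_out, _ = self.img_from_txt(img, txt, txt)  # image queries text
        return txt_out, img_out

txt = torch.randn(2, 12, 256)   # 12 text tokens
img = torch.randn(2, 36, 256)   # 36 image region features
t, i = CoAttention()(txt, img)
print(t.shape, i.shape)
```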
TinyBERT: Distilling BERT for Natural Language Understanding
Language model pre-training, such as BERT, has significantly improved the performance of many natural language processing tasks. However, pre-trained language models are usually computationally expensive, so it is difficult to efficiently…
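As a hedged sketch of the underlying idea: TinyBERT's full objective also matches embeddings, per-layer hidden states and attention matrices between teacher and student; the snippet below shows only the standard prediction-layer distillation term (soft teacher targets plus hard labels).

```python
# Generic knowledge-distillation loss: KL between temperature-scaled teacher and
# student distributions, mixed with ordinary cross-entropy on gold labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                   # rescale gradients for temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(8, 3, requires_grad=True)   # e.g. 3-way NLI logits
teacher = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
print(distillation_loss(student, teacher, labels))
```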
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts
The remarkable success of pretrained language models has motivated the study of what kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the-blanks problems (e.g., cloze tests) is a natural approach for…
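A heavily simplified, single-step sketch of the gradient-guided trigger search: candidate trigger tokens are ranked by a first-order approximation, the dot product of each vocabulary embedding with the gradient of the label log-likelihood at the trigger position. The template, trigger index and label word below are illustrative assumptions; the paper iterates this search over many steps.

```python
# One step of AutoPrompt-style trigger search (simplified).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

enc = tok("the movie was great . the it was [MASK] .", return_tensors="pt")
trigger_pos = 6                                   # the second "the" (assumed index)
mask_pos = enc.input_ids[0].tolist().index(tok.mask_token_id)
label_id = tok.convert_tokens_to_ids("good")      # assumed verbalizer for "positive"

embeds = model.get_input_embeddings()(enc.input_ids).detach().requires_grad_(True)
logits = model(inputs_embeds=embeds, attention_mask=enc.attention_mask).logits
logits[0, mask_pos].log_softmax(-1)[label_id].backward()

with torch.no_grad():                             # rank every vocab item as a replacement
    scores = model.get_input_embeddings().weight @ embeds.grad[0, trigger_pos]
print(tok.convert_ids_to_tokens(scores.topk(5).indices.tolist()))
```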
Multi-Task Deep Neural Networks for Natural Language Understanding
In this paper, we present a Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks. MT-DNN not only leverages large amounts of cross-task data, but also benefits from…
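The structural idea can be sketched as a shared encoder feeding per-task heads; the toy module below is my illustration of that pattern (the linear "encoder" stands in for BERT, and the task names and label counts are invented).

```python
# Minimal multi-task pattern in the spirit of MT-DNN: one shared trunk,
# one output head per task, so cross-task data updates the shared weights.
import torch
import torch.nn as nn

TASKS = {"nli": 3, "sentiment": 2}

class MultiTaskModel(nn.Module):
    def __init__(self, in_dim=300, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({t: nn.Linear(hidden, n) for t, n in TASKS.items()})

    def forward(self, x, task):
        return self.heads[task](self.encoder(x))

model = MultiTaskModel()
x = torch.randn(4, 300)
print(model(x, "nli").shape, model(x, "sentiment").shape)
```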
Unified Language Model Pre-training for Natural Language Understanding and Generation
This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional…
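The three objectives differ only in the self-attention mask that decides which positions may see which; the sketch below builds the three mask patterns (1 = may attend) with illustrative lengths.

```python
# UniLM's three modes as attention masks over a source+target sequence.
import torch

n_src, n_tgt = 4, 3
n = n_src + n_tgt

bidirectional = torch.ones(n, n)               # NLU-style: full context both ways
unidirectional = torch.tril(torch.ones(n, n))  # LM-style: left context only

seq2seq = torch.zeros(n, n)
seq2seq[:, :n_src] = 1                         # every position sees the whole source
seq2seq[n_src:, n_src:] = torch.tril(torch.ones(n_tgt, n_tgt))  # target is causal
print(seq2seq)
```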
Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data
The success of the large neural language models on many NLP tasks is exciting. However, we find that these successes sometimes lead to hype in which these models are being described as “understanding” language or capturing “meaning”. In th…
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store k…
ChatGPT and Open-AI Models: A Preliminary Review
According to numerous reports, ChatGPT represents a significant breakthrough in the field of artificial intelligence. ChatGPT is a pre-trained AI model designed to engage in natural language conversations, utilizing sophisticated technique…
Attention in Natural Language Processing
Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview …
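For reference, the common core behind most of the surveyed variants is scaled dot-product attention, softmax(QKᵀ/√d)V; a plain NumPy rendering follows.

```python
# Scaled dot-product attention: weight values by normalized query-key similarity.
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # query-key similarities
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)                  # row-wise softmax
    return w @ V                                   # weighted sum of values

Q, K, V = np.random.randn(5, 16), np.random.randn(7, 16), np.random.randn(7, 16)
print(attention(Q, K, V).shape)                    # (5, 16)
```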
Multitask Prompted Training Enables Zero-Shot Task Generalization
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language mod…
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language. However, a s…
AllenNLP: A Deep Semantic Natural Language Processing Platform
This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily. It is …
GPT (Generative Pre-Trained Transformer): A Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges, and Future Directions
The Generative Pre-trained Transformer (GPT) represents a notable breakthrough in the domain of natural language processing, which is propelling us toward the development of machines that can understand and communicate using language in a …
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
Spurred by advancements in scale, large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot—i.e., without adaptation on downstream data. Recently, the debut of Chat…
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks. In this paper we propose a new model architecture DeBERTa (Decoding-enhanced BERT with disent…
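A hedged sketch of the disentangled attention score: content and relative position get separate projections, and each score sums content-to-content, content-to-position and position-to-content terms, scaled by √(3d). Distance bucketing below is simplified relative to the paper.

```python
# Disentangled attention scores in the spirit of DeBERTa (simplified).
import math
import torch

n, d, k = 6, 16, 4                       # seq length, head dim, max relative distance
H = torch.randn(n, d)                    # content vectors
P = torch.randn(2 * k, d)                # relative position embeddings
Wq, Wk = torch.randn(d, d), torch.randn(d, d)
Wqr, Wkr = torch.randn(d, d), torch.randn(d, d)

Qc, Kc = H @ Wq, H @ Wk                  # content projections
Qr, Kr = P @ Wqr, P @ Wkr                # position projections

idx = torch.arange(n)
delta = (idx[:, None] - idx[None, :]).clamp(-k, k - 1) + k   # bucketed distances

c2c = Qc @ Kc.T                                        # content -> content
c2p = torch.gather(Qc @ Kr.T, 1, delta)                # content -> position
p2c = torch.gather(Kc @ Qr.T, 1, delta).T              # position -> content

A = (c2c + c2p + p2c) / math.sqrt(3 * d)
print(A.shape)                                         # torch.Size([6, 6])
```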
SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis by Means of Context Embeddings
With the recent development of deep learning, research in AI has gained new vigor and prominence. While machine learning has succeeded in revitalizing many research fields, such as computer vision, speech recognition, and medical diagnosis…
Shared computational principles for language processing in humans and deep language models
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models generate approp…
Argument Mining: A Survey
Argument mining is the automatic identification and extraction of the structure of inference and reasoning expressed as arguments presented in natural language. Understanding argumentative structure makes it possible to determine not only …
Semantics-Aware BERT for Language Understanding
The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of successes especially in various machine reading comprehension and natural language inference tas…
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
In the natural language processing literature, neural networks are becoming increasingly deep and complex. The recent poster child of this trend is the deep language representation model, which includes BERT, ELMo, and GPT. These develop…
Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison
Word Sense Disambiguation is a longstanding task in Natural Language Processing, lying at the core of human language understanding. However, the evaluation of automatic systems has been problematic, mainly due to the lack of a reliable eva…
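As an illustration of the task itself (not of the paper's evaluation framework), the classic Lesk baseline in NLTK picks the WordNet sense whose gloss best overlaps the context:

```python
# Lesk-style WSD via NLTK: disambiguate "bank" in a financial context.
import nltk
nltk.download("wordnet", quiet=True)   # one-time corpus download
from nltk.wsd import lesk

sense = lesk("I went to the bank to deposit my paycheck".split(), "bank", "n")
print(sense, "-", sense.definition() if sense else "no sense found")
```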
Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces
This paper presents the machine learning architecture of the Snips Voice Platform, a software solution to perform Spoken Language Understanding on microprocessors typical of IoT devices. The embedded inference is fast and accurate while en…