Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
· 2019
· Open Access
· DOI: https://doi.org/10.48550/arxiv.1910.10683
· OA: W4288089799
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
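To make the text-to-text framing concrete, the sketch below casts two different tasks as plain input strings with natural-language task prefixes and feeds them to one of the released T5 checkpoints through the Hugging Face Transformers library. The checkpoint name "t5-small", the library choice, and the generation settings are illustrative assumptions, not details stated in this abstract; it is a minimal usage sketch rather than the authors' training setup.

    # Minimal sketch: every task is expressed as "input text -> output text",
    # with the task signalled by a prefix on the input string.
    # Assumption: the Hugging Face Transformers port of the released T5
    # checkpoints is used; "t5-small" is an illustrative choice.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    examples = [
        # Translation, cast as text-to-text via a task prefix.
        "translate English to German: The house is wonderful.",
        # Summarization, cast the same way with a different prefix.
        "summarize: Transfer learning, where a model is first pre-trained on a "
        "data-rich task before being fine-tuned on a downstream task, has "
        "emerged as a powerful technique in natural language processing.",
    ]

    for text in examples:
        inputs = tokenizer(text, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=40)
        print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Because both tasks share the same string-in, string-out interface, the same model, loss, and decoding procedure can be reused across summarization, question answering, classification, and the other tasks the study covers.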