Fine-tuning
Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning?
Training a deep convolutional neural network (CNN) from scratch is difficult because it requires a large amount of labeled training data and a great deal of expertise to ensure proper convergence. A promising alternative is to fine-tune a …
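As a rough illustration of the fine-tuning alternative described above, the sketch below loads an ImageNet-pretrained ResNet-18 (assuming a recent torchvision where weights are selected via the weights enum), swaps in a new classifier head for a hypothetical 3-class task, and fine-tunes all layers with a small learning rate; nothing here is taken from the paper itself.

```python
import torch
from torchvision import models

# Start from ImageNet-pretrained weights instead of a random initialization.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Replace the classifier head for a hypothetical 3-class imaging task.
model.fc = torch.nn.Linear(model.fc.in_features, 3)

# Fine-tune all layers, typically with a much smaller learning rate than
# would be used when training the same network from scratch.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```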
Parameter-efficient fine-tuning of large-scale pre-trained language models
With the prevalence of pre-trained language models (PLMs) and the pre-training–fine-tuning paradigm, it has been continuously shown that larger models tend to yield better performance. However, as PLMs scale up, fine-tuning and storing all…
P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks
Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training. However, in the context of NLU, prior work reveals that prompt tuning does not perform we…
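A toy PyTorch sketch of the idea summarized above: a small stack of learnable prompt vectors is prepended to the input embeddings while the language model itself stays frozen. The embedding table, transformer body, and prompt length are stand-ins for illustration, not any particular pretrained model.

```python
import torch
from torch import nn

class SoftPromptWrapper(nn.Module):
    """Learnable continuous prompts prepended to the inputs of a frozen LM."""
    def __init__(self, lm_embed, lm_body, prompt_len=20):
        super().__init__()
        self.embed, self.body = lm_embed, lm_body
        d = lm_embed.embedding_dim
        self.prompt = nn.Parameter(torch.randn(prompt_len, d) * 0.02)
        for p in list(self.embed.parameters()) + list(self.body.parameters()):
            p.requires_grad = False          # only the prompt is trained

    def forward(self, input_ids):
        tok = self.embed(input_ids)                         # (B, T, d)
        prompt = self.prompt.expand(tok.size(0), -1, -1)    # (B, P, d)
        return self.body(torch.cat([prompt, tok], dim=1))   # (B, P+T, d)

# Stand-ins for a real pretrained LM's embedding table and transformer body.
embed = nn.Embedding(1000, 64)
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(64, 4, batch_first=True), num_layers=2
)
model = SoftPromptWrapper(embed, body)
print(model(torch.randint(0, 1000, (2, 10))).shape)   # torch.Size([2, 30, 64])
```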
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset of them) are being modified. We show that with small-to-medium training data, applying BitFit on pre-trained BERT models is competitive wit…
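A minimal PyTorch sketch of the bias-only recipe summarized above; the toy encoder, parameter-name test, and learning rate stand in for a real pretrained BERT setup and are not taken from the paper.

```python
import torch
from torch import nn

# Toy transformer encoder standing in for a pretrained masked language model.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)

# BitFit-style selection: train only the bias terms, freeze everything else.
for name, param in model.named_parameters():
    param.requires_grad = "bias" in name

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"bias parameters being tuned: {trainable} of {total}")
```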
Fine-Tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally
Intense interest in applying convolutional neural networks (CNNs) in biomedical image analysis is widespread, but its success is impeded by the lack of large annotated datasets in biomedical imaging. Annotating biomedical images is not on…
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks. However, it modifies all the language model parameters and therefore necessitates storing a full copy for each task. In this paper, w…
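A much-simplified sketch of the prefix idea: trainable vectors are prepended to the keys and values seen by a frozen attention layer. In the paper the prefixes are injected after the key/value projections and reparameterized through an MLP; here they simply pass through the frozen projections to keep the toy short, and all sizes are illustrative.

```python
import torch
from torch import nn

class PrefixAttention(nn.Module):
    """Toy self-attention block with trainable key/value prefixes."""
    def __init__(self, d_model=64, n_heads=4, prefix_len=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Only these prefix parameters are trained; the attention stays frozen.
        self.prefix_k = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        for p in self.attn.parameters():
            p.requires_grad = False

    def forward(self, x):                       # x: (batch, seq, d_model)
        b = x.size(0)
        k = torch.cat([self.prefix_k.expand(b, -1, -1), x], dim=1)
        v = torch.cat([self.prefix_v.expand(b, -1, -1), x], dim=1)
        out, _ = self.attn(x, k, v)             # queries attend to prefix + tokens
        return out

layer = PrefixAttention()
print(layer(torch.randn(2, 16, 64)).shape)      # torch.Size([2, 16, 64])
```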
Towards a Unified View of Parameter-Efficient Transfer Learning
Fine-tuning large pre-trained language models on downstream tasks has become the de-facto learning paradigm in NLP. However, conventional approaches fine-tune all the parameters of the pre-trained model, which becomes prohibitive as the mo…
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training. However, in the context of NLU, prior work reveals that prompt tuning does not perform we…
An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
In this paper, we propose a novel domain adaptation method named "mixed fine tuning" for neural machine translation (NMT). We combine two existing approaches namely fine tuning and multi domain NMT. We first train an NMT model on an out-of…
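A schematic of the two-stage recipe described above, with dummy tensors standing in for the out-of-domain and in-domain corpora and a linear layer standing in for the NMT model; the oversampling factor, epochs, and learning rates are illustrative only.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

# Stand-ins for real corpora: out-of-domain (large) and in-domain (small).
out_domain = TensorDataset(torch.randn(1000, 32), torch.randint(0, 10, (1000,)))
in_domain  = TensorDataset(torch.randn(100, 32),  torch.randint(0, 10, (100,)))

model = nn.Linear(32, 10)            # stand-in for the translation model
loss_fn = nn.CrossEntropyLoss()

def run_epochs(dataset, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Stage 1: train on out-of-domain data only.
run_epochs(out_domain, epochs=3, lr=1e-3)

# Stage 2 ("mixed fine tuning"): continue on in-domain data mixed with
# out-of-domain data, oversampling the small in-domain corpus.
mixed = ConcatDataset([out_domain] + [in_domain] * 10)
run_epochs(mixed, epochs=3, lr=1e-4)
```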
Transfer Learning With Adaptive Fine-Tuning
When applying deep learning approaches, the key factors for a successful application are sufficient datasets with reliable ground truth, which are generally not easy to obtain, especially in the field of medicine. In recent years…
Revisiting Few-sample BERT Fine-tuning
This paper is a study of fine-tuning of BERT contextual representations, with focus on commonly observed instabilities in few-sample scenarios. We identify several factors that cause this instability: the common use of a non-standard optim…
An efficient deep learning model to categorize brain tumor using reconstruction and fine-tuning
Brain tumors are among the most fatal and devastating diseases, often resulting in significantly reduced life expectancy. An accurate diagnosis of brain tumors is crucial to devise treatment plans that can extend the lives of affected indi…
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
When transferring a pretrained model to a downstream task, two popular methods are full fine-tuning (updating all the model parameters) and linear probing (updating only the last linear layer -- the "head"). It is well known that fine-tuni…
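A small sketch contrasting the two transfer strategies named above, using stand-in modules rather than any particular pretrained network; the layer sizes and learning rates are illustrative.

```python
import torch
from torch import nn

# Stand-in for a pretrained feature extractor plus a freshly initialized head.
backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 64))
head = nn.Linear(64, 5)

# Linear probing: freeze the pretrained features, train only the head.
for p in backbone.parameters():
    p.requires_grad = False
probe_optim = torch.optim.SGD(head.parameters(), lr=1e-2)

# Full fine-tuning: every parameter is trainable (usually with a smaller LR).
for p in backbone.parameters():
    p.requires_grad = True
ft_optim = torch.optim.SGD(
    list(backbone.parameters()) + list(head.parameters()), lr=1e-3
)
```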
Partial Is Better Than All: Revisiting Fine-tuning Strategy for Few-shot Learning
The goal of few-shot learning is to learn a classifier that can recognize unseen classes from limited support data with labels. A common practice for this task is to train a model on the base set first and then transfer to novel classes th…
Making Pre-trained Language Models Better Few-shot Learners
The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a…
No more fine-tuning? an experimental evaluation of prompt tuning in code intelligence
Pre-trained models have been shown effective in many code intelligence tasks. These models are pre-trained on large-scale unlabeled corpora and then fine-tuned on downstream tasks. However, as the inputs to pre-training and downstream ta…
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Recent pretrained language models extend from millions to billions of parameters. Thus the need to fine-tune an extremely large pretrained model with a limited training corpus arises in various downstream tasks. In this paper, we propose a…
Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models
As pre-trained language models (PLMs) have become the fundamental infrastructure for various NLP tasks and researchers have readily adopted the pretraining-finetuning paradigm, evidence from emerging research has continuously…
Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise
The growing importance of massive datasets used for deep learning makes robustness to label noise a critical property for classifiers to have. Sources of label noise include automatic labeling, non-expert labeling, and label corruption by …
Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation
We study the power of cross-attention in the Transformer architecture within the context of transfer learning for machine translation, and extend the findings of studies into cross-attention when training from scratch. We conduct a series …
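A toy sketch of tuning only the cross-attention parameters of an encoder-decoder transformer, here PyTorch's nn.Transformer, whose decoder layers expose cross-attention as multihead_attn; the actual study works with pretrained translation models, so this illustrates the selection mechanism only.

```python
import torch
from torch import nn

# Toy encoder-decoder; in nn.TransformerDecoderLayer the encoder-decoder
# (cross) attention is the `multihead_attn` submodule.
model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)

# Freeze everything, then unfreeze only the cross-attention parameters.
for p in model.parameters():
    p.requires_grad = False
for layer in model.decoder.layers:
    for p in layer.multihead_attn.parameters():
        p.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"training {trainable} of {total} parameters")
```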
On the Effectiveness of Parameter-Efficient Fine-Tuning
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP tasks. However, fine-tuning the whole model is parameter inefficient as it always yields an entirely new model for each task. Currently, man…
The Power of Scale for Parameter-Efficient Prompt Tuning
In this work, we explore "prompt tuning", a simple yet effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks. Unlike the discrete text prompts used by GPT-3, soft prompts a…
Efficient Few-Shot Learning Without Prompts
Recent few-shot methods, such as parameter-efficient fine-tuning (PEFT) and pattern exploiting training (PET), have achieved impressive results in label-scarce settings. However, they are difficult to employ since they are subject to high …
Structural tuning of heterogeneous molecular catalysts for electrochemical energy conversion
The structural tuning of first and second coordination spheres of heterogeneous molecular catalysts is reviewed.
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Large models represent a groundbreaking advancement in multiple application fields, enabling remarkable achievements across various tasks. However, their unprecedented scale comes with significant computational costs. These models, often c…
BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression
The slow speed of BERT has motivated much research on accelerating its inference, and the early exiting idea has been proposed to make trade-offs between model quality and efficiency. This paper aims to address two weaknesses of previous w…
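A toy sketch of the general early-exiting idea referenced above (not BERxiT's specific fine-tuning strategy): an internal classifier sits after each layer, and inference stops at the first sufficiently confident prediction. All modules, sizes, and the threshold are illustrative.

```python
import torch
from torch import nn

class EarlyExitEncoder(nn.Module):
    """Toy early-exiting encoder with an internal classifier per layer."""
    def __init__(self, d=64, n_layers=4, n_classes=3, threshold=0.9):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d, 4, batch_first=True)
            for _ in range(n_layers)
        )
        self.exits = nn.ModuleList(nn.Linear(d, n_classes) for _ in range(n_layers))
        self.threshold = threshold

    def forward(self, x):                        # x: (1, seq_len, d)
        for layer, exit_head in zip(self.layers, self.exits):
            x = layer(x)
            probs = exit_head(x.mean(dim=1)).softmax(dim=-1)
            if probs.max() >= self.threshold:    # confident enough: stop here
                return probs
        return probs                             # otherwise use the last exit

model = EarlyExitEncoder()
with torch.no_grad():
    print(model(torch.randn(1, 16, 64)))
```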
The fate of the Higgs vacuum
We have recently suggested that tiny black holes can act as nucleation seeds for the decay of the metastable Higgs vacuum. Previous results applied only to the nucleation of thin-wall bubbles, and covered a very small region of parameter s…
Emerging trends: A gentle introduction to fine-tuning
The previous Emerging Trends article (Church et al., 2021, Natural Language Engineering 27(5), 631–645) introduced deep nets to poets. Poets is an imperfect metaphor, intended as a gesture toward inclusion. The future for deep nets will…
Analytic approach to non-slow-roll inflation
Brief periods of non-slow-roll evolution during inflation can produce interesting observable consequences, such as primordial black holes or an inflationary gravitational wave spectrum enhanced at small scales. We develop a model-independen…
FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer
Recent work has explored the potential to adapt a pre-trained vision transformer (ViT) by updating only a few parameters so as to improve storage efficiency, called parameter-efficient transfer learning (PETL). Current PETL methods have sh…
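A heavily simplified sketch of the factorized-update idea: a frozen pretrained weight plus a small trainable low-rank correction for a single linear layer. FacT itself factorizes and shares the updates across the whole ViT, so this stands in for the general principle only; the rank and initialization are illustrative.

```python
import torch
from torch import nn

class FactorizedLinear(nn.Module):
    """Frozen pretrained weight plus a trainable low-rank correction A @ B.
    A starts at zero, so the layer initially matches the pretrained one."""
    def __init__(self, pretrained: nn.Linear, rank: int = 4):
        super().__init__()
        self.weight = nn.Parameter(pretrained.weight.detach(), requires_grad=False)
        self.bias = nn.Parameter(pretrained.bias.detach(), requires_grad=False)
        out_features, in_features = self.weight.shape
        self.A = nn.Parameter(torch.zeros(out_features, rank))
        self.B = nn.Parameter(torch.randn(rank, in_features) * 0.02)

    def forward(self, x):
        return x @ (self.weight + self.A @ self.B).T + self.bias

layer = FactorizedLinear(nn.Linear(64, 64))
print(layer(torch.randn(2, 64)).shape)   # torch.Size([2, 64])
```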