Fine-Tuning Language Models via Epistemic Neural Networks

arXiv (Cornell University) • November 2022 • Ian Osband, Seyed Mohammad Asghari, Benjamin Van Roy, Nat McAleese, John Aslanides, Geoffrey Irving

Language models often pre-train on large unsupervised text corpora, then fine-tune on additional task-specific data. However, typical fine-tuning schemes do not prioritize the examples that they tune on. We show that, if you can prioritize informative training data, you can achieve better performance while using fewer labels. To do this, we augment a language model with an epinet: a small additional network that helps to estimate model uncertainty and forms an epistemic neural network (ENN). ENNs are neura…
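The idea of prioritizing informative examples can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy bilinear epinet, the Gaussian epistemic index z, the variance-based priority score, and the names toy_enn_probs, priority_score, and select_for_labeling are all hypothetical. The sketch samples several indices, measures how much the predicted class probabilities disagree across them, and picks the pool examples with the largest disagreement for labeling.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_enn_probs(base_logits, features, epinet_w, z):
    """Class probabilities for one example under one epistemic index z.

    Hypothetical toy epinet: a correction that is bilinear in z and the
    features is added to the base model's logits. epinet_w[i][d][c] is an
    illustrative weight tensor, not the paper's architecture.
    """
    correction = [
        sum(
            z[i] * features[d] * epinet_w[i][d][c]
            for i in range(len(z))
            for d in range(len(features))
        )
        for c in range(len(base_logits))
    ]
    return softmax([b + k for b, k in zip(base_logits, correction)])


def priority_score(base_logits, features, epinet_w, num_samples, rng):
    """Variance of predicted probabilities across sampled indices, summed over classes."""
    index_dim = len(epinet_w)
    samples = []
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(index_dim)]
        samples.append(toy_enn_probs(base_logits, features, epinet_w, z))
    score = 0.0
    for c in range(len(base_logits)):
        column = [s[c] for s in samples]
        mean = sum(column) / num_samples
        score += sum((p - mean) ** 2 for p in column) / num_samples
    return score


def select_for_labeling(scores, batch_size):
    """Indices of the batch_size highest-scoring pool examples."""
    order = sorted(range(len(scores)), key=lambda n: scores[n], reverse=True)
    return order[:batch_size]


# Toy pool: 6 examples, 4 features, 3 classes, index dimension 2.
rng = random.Random(0)
epinet_w = [[[rng.gauss(0, 1) for _ in range(3)] for _ in range(4)] for _ in range(2)]
# Example 0 has all-zero features, so the toy epinet never changes its prediction.
pool_features = [[0.0] * 4] + [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
pool_logits = [[0.0, 0.0, 0.0] for _ in range(6)]

scores = [
    priority_score(logits, feats, epinet_w, 50, rng)
    for logits, feats in zip(pool_logits, pool_features)
]
selected = select_for_labeling(scores, batch_size=3)
```

In this toy setup, the example whose prediction does not change with the index receives a score of essentially zero and is never selected, while examples on which the sampled predictions disagree are labeled first. A real system would replace the toy functions with a language model's logits and a trained epinet.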

Computer Science; Artificial Intelligence; Heuristic; Machine Learning; Deep Learning; Generative Grammar; Training, Validation, And Test Data Sets; Management; Economics