John Kirchenbauer
A Fictional Q&A Dataset for Studying Memorization and Knowledge Acquisition
When language models are trained on textual data, they acquire both knowledge about the structure of language and knowledge of facts about the world. At inference time, their knowledge of facts can be leveraged to solve interesting …
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
Large language models (LLMs) are typically trained on enormous quantities of unlicensed text, a practice that has led to scrutiny due to possible intellectual property infringement and ethical concerns. Training LLMs on openly licensed tex…
Zero-Shot Vision Encoder Grafting via LLM Surrogates
Vision language models (VLMs) typically pair a modestly sized vision encoder with a large language model (LLM), e.g., Llama-70B, making the decoder the primary computational burden during training. To reduce costs, a potentially promising st…
When Can You Get Away with Low Memory Adam?
Adam is the go-to optimizer for training modern machine learning models, but it requires additional memory to maintain the moving averages of the gradients and their squares. While various low-memory optimizers have been proposed that some…
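As the abstract notes, Adam keeps moving averages of the gradients and their squares; a minimal NumPy sketch (not the paper's low-memory method) makes the memory cost concrete: two extra buffers, each the size of the parameters.

```python
# Plain Adam in NumPy, illustrating the two extra buffers -- the moving
# averages of the gradients (m) and of their squares (v) -- that roughly
# double optimizer memory relative to the parameters alone.
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad            # first-moment moving average
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment moving average
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

p = np.random.randn(1000)
m, v = np.zeros_like(p), np.zeros_like(p)   # 2x the parameter memory
for t in range(1, 101):
    g = 2 * p                                # gradient of a toy quadratic loss
    p, m, v = adam_step(p, g, m, v, t)
```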
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters requires tens of thousands of GPUs and a highly scalable software stack. In this work, we present a novel four-dimensional hybrid p…
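The abstract is truncated before naming the four parallelism axes, so as a hedged illustration only, this sketch maps a flat GPU rank onto coordinates in an assumed (data, tensor, pipeline, context) mesh; the axis names and mesh shape are assumptions, not the paper's.

```python
# Hedged illustration of 4D hybrid parallelism bookkeeping: converting a
# flat GPU rank into coordinates along four assumed parallelism axes.

def rank_to_coords(rank, mesh=(2, 2, 2, 2)):
    """Map a flat rank to (data, tensor, pipeline, context) coordinates."""
    coords = []
    for dim in reversed(mesh):
        coords.append(rank % dim)
        rank //= dim
    return tuple(reversed(coords))

mesh = (2, 2, 2, 2)              # 16 GPUs arranged as a 2x2x2x2 mesh
for r in range(8):
    print(r, rank_to_coords(r, mesh))
```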
Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs
There is growing demand for performing inference with hundreds of thousands of input tokens on trained transformer models. Inference at this extreme scale demands significant computational resources, hindering the application of transforme…
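A minimal sketch of the kind of sparsity such methods exploit (not necessarily this paper's algorithm): score all cached keys but attend only over the top-k, so most of the million-token KV cache never enters the softmax.

```python
# Top-k sparse attention over a long cached context, in NumPy.
import numpy as np

def topk_attention(q, K, V, k=64):
    """q: (d,), K/V: (T, d). Attend only over the k highest-scoring keys."""
    scores = K @ q / np.sqrt(q.shape[-1])        # (T,)
    idx = np.argpartition(scores, -k)[-k:]       # indices of the top-k keys
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()                                  # softmax over the sparse set
    return w @ V[idx]                             # (d,)

T, d = 100_000, 64                               # long cached context
q = np.random.randn(d).astype(np.float32)
K = np.random.randn(T, d).astype(np.float32)
V = np.random.randn(T, d).astype(np.float32)
out = topk_attention(q, K, V, k=64)
```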
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Scaling laws are typically fit using a family of models with a narrow range of frozen hyperparameter choices. In this work we study scaling laws using multiple architectural shapes and hyperparameter choices, highlighting their impact on r…
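As a hedged illustration of the kind of fit scaling-law studies perform, here is a saturating power law L(N) = a·N^(−b) + c fit with SciPy; the functional form and the synthetic data are assumptions, not the paper's results.

```python
# Fitting a saturating power law to (model size, loss) pairs.
import numpy as np
from scipy.optimize import curve_fit

def power_law(N, a, b, c):
    return a * N ** (-b) + c

N = np.array([1e7, 3e7, 1e8, 3e8, 1e9, 3e9])          # model sizes (toy)
L = power_law(N, 400.0, 0.3, 1.8) + np.random.normal(0, 0.01, N.size)

params, _ = curve_fit(power_law, N, L, p0=(100.0, 0.5, 1.0))
print("fitted (a, b, c):", params)
```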
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This …
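A minimal PyTorch sketch of the idea as the abstract states it: one weight-tied block iterated a variable number of times, so test-time compute scales with the number of unrolled steps. Dimensions and the block's internals are assumptions, not the paper's architecture.

```python
# A weight-tied block unrolled to arbitrary depth at test time.
import torch
import torch.nn as nn

class RecurrentDepthBlock(nn.Module):
    def __init__(self, d=256):
        super().__init__()
        self.core = nn.Sequential(nn.Linear(2 * d, d), nn.GELU(), nn.Linear(d, d))

    def forward(self, x, num_steps=8):
        s = torch.zeros_like(x)                  # latent reasoning state
        for _ in range(num_steps):               # same weights every iteration
            s = s + self.core(torch.cat([s, x], dim=-1))
        return s

block = RecurrentDepthBlock()
x = torch.randn(4, 256)
cheap = block(x, num_steps=4)                    # less test-time compute
deep = block(x, num_steps=32)                    # more test-time compute
```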
GenQA: Generating Millions of Instructions from a Handful of Prompts
Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about finetuning at scale, such as curricula and learning rate cooldown schedules, th…
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. Durin…
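A hedged sketch of a goldfish-style objective: drop a pseudorandom subset of token positions from the next-token loss so no sequence is ever fully supervised. The hash-based mask below is illustrative; the paper's exact masking rule may differ.

```python
# Goldfish-style loss: exclude ~1/k of positions from the next-token loss.
import torch
import torch.nn.functional as F

def goldfish_loss(logits, targets, k=4):
    """logits: (B, T, V), targets: (B, T). Ignore ~1/k of positions."""
    # Deterministic pseudorandom mask keyed on token ids, so the same text
    # is masked the same way on every epoch (an assumption of this sketch).
    drop = (targets * 2654435761 % k) == 0
    masked = targets.masked_fill(drop, -100)     # -100 = ignore_index
    return F.cross_entropy(logits.transpose(1, 2), masked, ignore_index=-100)

B, T, V = 2, 16, 1000
loss = goldfish_loss(torch.randn(B, T, V), torch.randint(0, V, (B, T)))
```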
OPTune: Efficient Online Preference Tuning
Reinforcement learning with human feedback (RLHF) is critical for aligning Large Language Models (LLMs) with human preferences. Compared to the widely studied offline version of RLHF, e.g., direct preference optimization (DPO), recent…
Transformers Can Do Arithmetic with the Right Embeddings
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit within a large span of digits. We mend this problem by adding an embedding to ea…
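A hedged sketch of the embedding idea: compute, for each token, its digit's position within its number and look that index up in an extra embedding table added to the usual token embeddings. The indexing convention here (counting from the first digit) is an assumption of this sketch.

```python
# Per-digit position-within-number indices feeding an extra embedding table.
import torch
import torch.nn as nn

def digit_positions(tokens):
    """tokens: list of str. 1-based position within a run of digits;
    0 for non-digit tokens."""
    pos, run = [], 0
    for t in tokens:
        run = run + 1 if t.isdigit() else 0
        pos.append(run)
    return torch.tensor(pos)

tokens = list("123+4567=")
idx = digit_positions(tokens)            # [1, 2, 3, 0, 1, 2, 3, 4, 0]
digit_emb = nn.Embedding(32, 256)        # added to the usual token embeddings
extra = digit_emb(idx)                   # (9, 256)
```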
LMD3: Language Model Data Density Dependence
We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate tha…
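One hedged way to realize "training data density estimation" at the example level is kernel density estimation over embeddings, as sketched below; the embedding source, kernel, and bandwidth are assumptions, not necessarily the paper's choices.

```python
# Kernel density estimation over embeddings as a per-example density score.
import numpy as np
from sklearn.neighbors import KernelDensity

train_emb = np.random.randn(5000, 64)      # stand-in training-set embeddings
test_emb = np.random.randn(10, 64)         # stand-in test-example embeddings

kde = KernelDensity(kernel="gaussian", bandwidth=1.0).fit(train_emb)
log_density = kde.score_samples(test_emb)  # higher = better-supported example
```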
NEFTune: Noisy Embeddings Improve Instruction Finetuning
We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on Alpa…
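The abstract states the augmentation directly, so a short sketch follows: add uniform noise to the embedding vectors during finetuning, scaled by alpha / sqrt(L·d) for sequence length L and embedding dimension d, with alpha a tunable hyperparameter.

```python
# NEFTune-style embedding noise, applied during training only.
import torch

def neftune_noise(embeds, alpha=5.0):
    """embeds: (B, L, d). Returns noised embeddings for the forward pass."""
    B, L, d = embeds.shape
    scale = alpha / (L * d) ** 0.5
    noise = torch.empty_like(embeds).uniform_(-1, 1) * scale
    return embeds + noise

embeds = torch.randn(2, 128, 4096)
noised = neftune_noise(embeds)    # feed to the model in place of embeds
```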
Baseline Defenses for Adversarial Attacks Against Aligned Language Models
As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing fro…
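One baseline discussed in this line of work is perplexity filtering: optimizer-generated jailbreak suffixes tend to look like gibberish, so abnormally high prompt perplexity is a usable signal. A minimal sketch, with the language model stubbed out and the threshold an assumption:

```python
# Perplexity filtering: flag prompts with abnormally high perplexity.
import math

def perplexity(token_logprobs):
    """token_logprobs: per-token log-probabilities of the prompt under an LM."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def is_suspicious(token_logprobs, threshold=1000.0):
    return perplexity(token_logprobs) > threshold   # threshold is assumed

# Toy usage: an ordinary prompt vs. a gibberish-like adversarial suffix.
print(is_suspicious([-2.1, -1.8, -2.5, -2.0]))      # False
print(is_suspicious([-9.5, -8.7, -9.9, -9.1]))      # True
```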
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models
With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative. For example, a company deploying a client-facing chatbot must ensure that …
On the Reliability of Watermarks for Large Language Models
As LLMs become commonplace, machine-generated text has the potential to flood the internet with spam, social media bots, and valueless content. Watermarking is a simple and effective strategy for mitigating such harms by enabling the detec…
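Detection in this watermarking line of work reduces to a one-proportion z-test: count the tokens that fall in the pseudorandom "green list" and compare against the fraction gamma expected by chance. A minimal sketch:

```python
# Watermark detection as a one-proportion z-score.
import math

def watermark_z_score(green_count, total_tokens, gamma=0.25):
    """gamma: expected green-list fraction in unwatermarked text."""
    expected = gamma * total_tokens
    var = total_tokens * gamma * (1 - gamma)
    return (green_count - expected) / math.sqrt(var)

z = watermark_z_score(green_count=90, total_tokens=200, gamma=0.25)
print(f"z = {z:.2f}")   # ~6.5 here; large z is strong evidence of a watermark
```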
Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this paper, we introduce a novel technique called Tree-Ring Watermarking that robustly f…
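A heavily simplified sketch of the idea: plant a key pattern in the Fourier transform of the diffusion model's initial noise, where it is roughly invariant to common image transformations; detection inverts the sampler to recover the noise and tests for the pattern. The ring mask and key value below are illustrative, not the paper's exact construction.

```python
# Planting a ring-shaped key in the Fourier transform of initial noise.
import numpy as np

def embed_tree_ring(noise, radius=10, width=2, key_value=50.0):
    """noise: (H, W) initial Gaussian noise. Writes a ring in Fourier space."""
    H, W = noise.shape
    f = np.fft.fftshift(np.fft.fft2(noise))
    yy, xx = np.ogrid[:H, :W]
    r = np.sqrt((yy - H / 2) ** 2 + (xx - W / 2) ** 2)
    ring = (r > radius - width) & (r < radius + width)
    f[ring] = key_value                                  # plant the key
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

z = embed_tree_ring(np.random.randn(64, 64))
# Detection (sketch): invert the diffusion sampler to recover z, take its
# FFT, and test whether the ring region matches the key.
```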
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prom…
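A heavily hedged sketch of the gradient-based discrete prompt idea: maintain a continuous prompt, project it to the nearest vocabulary embeddings for the forward pass, and push the gradients from the projected prompt back into the continuous copy (straight-through). The loss and embedding table below are stand-ins.

```python
# Soft prompt optimized through a hard (projected) prompt.
import torch

vocab = torch.randn(1000, 64)                       # stand-in embedding table
soft = torch.randn(8, 64, requires_grad=True)       # 8 learnable prompt vectors
opt = torch.optim.Adam([soft], lr=0.1)

for step in range(100):
    # Project each soft vector to its nearest vocabulary embedding.
    dists = torch.cdist(soft, vocab)                # (8, 1000)
    hard = vocab[dists.argmin(dim=1)]
    # Straight-through: forward with the hard prompt, backprop into soft.
    prompt = soft + (hard - soft).detach()
    loss = prompt.pow(2).mean()                     # stand-in task loss
    opt.zero_grad(); loss.backward(); opt.step()

tokens = torch.cdist(soft, vocab).argmin(dim=1)     # final discrete prompt ids
```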
A Watermark for Large Language Models
Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to humans but algorithmically detectable from a short span of tokens. We propose a water…
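A minimal sketch of the soft green-list scheme this paper proposes: before sampling each token, pseudorandomly partition the vocabulary using the previous token as a seed and add a bias delta to the "green" logits. Vocabulary size, gamma, and delta below are illustrative values.

```python
# Soft green-list watermarking of next-token logits.
import torch

def watermarked_logits(logits, prev_token, gamma=0.25, delta=2.0):
    """logits: (V,). Boost a pseudorandom gamma-fraction of the vocabulary."""
    V = logits.shape[0]
    g = torch.Generator().manual_seed(int(prev_token))   # keyed on context
    perm = torch.randperm(V, generator=g)
    green = perm[: int(gamma * V)]                       # the green list
    out = logits.clone()
    out[green] += delta                                  # soft watermark bias
    return out

logits = torch.randn(50_000)
biased = watermarked_logits(logits, prev_token=42)
next_token = torch.distributions.Categorical(logits=biased).sample()
```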
What is Your Metric Telling You? Evaluating Classifier Calibration under Context-Specific Definitions of Reliability
Classifier calibration has received recent attention from the machine learning community due both to its practical utility in facilitating decision making and to the observation that modern neural network classifiers are poorly calibr…
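As a hedged sketch of one metric such evaluations interrogate, here is expected calibration error (ECE): the bin-mass-weighted gap between confidence and accuracy. The synthetic data is constructed to be well calibrated, so the score should be near zero.

```python
# Expected calibration error over equal-width confidence bins.
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """conf: predicted confidences in [0, 1]; correct: 0/1 outcomes."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap                 # weight by bin mass
    return ece

conf = np.random.rand(10_000)
correct = (np.random.rand(10_000) < conf).astype(float)  # well-calibrated toy
print(expected_calibration_error(conf, correct))         # ~0 for this data
```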