John Kirchenbauer
A Fictional Q&A Dataset for Studying Memorization and Knowledge Acquisition
When language models are trained on textual data, they acquire both knowledge about the structure of language and knowledge of facts about the world. At inference time, their knowledge of facts can be leveraged to solve interesting …
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
Large language models (LLMs) are typically trained on enormous quantities of unlicensed text, a practice that has led to scrutiny due to possible intellectual property infringement and ethical concerns. Training LLMs on openly licensed tex…
Zero-Shot Vision Encoder Grafting via LLM Surrogates
Vision language models (VLMs) typically pair a modestly sized vision encoder with a large language model (LLM), e.g., Llama-70B, making the decoder the primary computational burden during training. To reduce costs, a potentially promising st…
When Can You Get Away with Low Memory Adam?
Adam is the go-to optimizer for training modern machine learning models, but it requires additional memory to maintain the moving averages of the gradients and their squares. While various low-memory optimizers have been proposed that some…
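As the abstract notes, Adam keeps moving averages of the gradients and their squares; a minimal NumPy sketch (not the paper's low-memory method) makes the memory cost concrete: two extra buffers, each the size of the parameters.

```python
# Plain Adam in NumPy, illustrating the two extra buffers -- the moving
# averages of the gradients (m) and of their squares (v) -- that roughly
# double optimizer memory relative to the parameters alone.
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad            # first-moment moving average
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment moving average
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

p = np.random.randn(1000)
m, v = np.zeros_like(p), np.zeros_like(p)   # 2x the parameter memory
for t in range(1, 101):
    g = 2 * p                                # gradient of a toy quadratic loss
    p, m, v = adam_step(p, g, m, v, t)
```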
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters requires tens of thousands of GPUs and a highly scalable software stack. In this work, we present a novel four-dimensional hybrid p…
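The abstract is truncated before naming the four parallelism axes, so as a hedged illustration only, this sketch maps a flat GPU rank onto coordinates in an assumed (data, tensor, pipeline, context) mesh; the axis names and mesh shape are assumptions, not the paper's.

```python
# Hedged illustration of 4D hybrid parallelism bookkeeping: converting a
# flat GPU rank into coordinates along four assumed parallelism axes.

def rank_to_coords(rank, mesh=(2, 2, 2, 2)):
    """Map a flat rank to (data, tensor, pipeline, context) coordinates."""
    coords = []
    for dim in reversed(mesh):
        coords.append(rank % dim)
        rank //= dim
    return tuple(reversed(coords))

mesh = (2, 2, 2, 2)              # 16 GPUs arranged as a 2x2x2x2 mesh
for r in range(8):
    print(r, rank_to_coords(r, mesh))
```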
Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs
There is growing demand for performing inference with hundreds of thousands of input tokens on trained transformer models. Inference at this extreme scale demands significant computational resources, hindering the application of transforme…
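A minimal sketch of the kind of sparsity such methods exploit (not necessarily this paper's algorithm): score all cached keys but attend only over the top-k, so most of the million-token KV cache never enters the softmax.

```python
# Top-k sparse attention over a long cached context, in NumPy.
import numpy as np

def topk_attention(q, K, V, k=64):
    """q: (d,), K/V: (T, d). Attend only over the k highest-scoring keys."""
    scores = K @ q / np.sqrt(q.shape[-1])        # (T,)
    idx = np.argpartition(scores, -k)[-k:]       # indices of the top-k keys
    w = np.exp(scores[idx] - scores[idx].max())
    w /= w.sum()                                  # softmax over the sparse set
    return w @ V[idx]                             # (d,)

T, d = 100_000, 64                               # long cached context
q = np.random.randn(d).astype(np.float32)
K = np.random.randn(T, d).astype(np.float32)
V = np.random.randn(T, d).astype(np.float32)
out = topk_attention(q, K, V, k=64)
```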
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Scaling laws are typically fit using a family of models with a narrow range of frozen hyperparameter choices. In this work we study scaling laws using multiple architectural shapes and hyperparameter choices, highlighting their impact on r…
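As a hedged illustration of the kind of fit scaling-law studies perform, here is a saturating power law L(N) = a·N^(−b) + c fit with SciPy; the functional form and the synthetic data are assumptions, not the paper's results.

```python
# Fitting a saturating power law to (model size, loss) pairs.
import numpy as np
from scipy.optimize import curve_fit

def power_law(N, a, b, c):
    return a * N ** (-b) + c

N = np.array([1e7, 3e7, 1e8, 3e8, 1e9, 3e9])          # model sizes (toy)
L = power_law(N, 400.0, 0.3, 1.8) + np.random.normal(0, 0.01, N.size)

params, _ = curve_fit(power_law, N, L, p0=(100.0, 0.5, 1.0))
print("fitted (a, b, c):", params)
```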
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This …
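A minimal PyTorch sketch of the idea as the abstract states it: one weight-tied block iterated a variable number of times, so test-time compute scales with the number of unrolled steps. Dimensions and the block's internals are assumptions, not the paper's architecture.

```python
# A weight-tied block unrolled to arbitrary depth at test time.
import torch
import torch.nn as nn

class RecurrentDepthBlock(nn.Module):
    def __init__(self, d=256):
        super().__init__()
        self.core = nn.Sequential(nn.Linear(2 * d, d), nn.GELU(), nn.Linear(d, d))

    def forward(self, x, num_steps=8):
        s = torch.zeros_like(x)                  # latent reasoning state
        for _ in range(num_steps):               # same weights every iteration
            s = s + self.core(torch.cat([s, x], dim=-1))
        return s

block = RecurrentDepthBlock()
x = torch.randn(4, 256)
cheap = block(x, num_steps=4)                    # less test-time compute
deep = block(x, num_steps=32)                    # more test-time compute
```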
GenQA: Generating Millions of Instructions from a Handful of Prompts
Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about finetuning at scale, such as curricula and learning rate cooldown schedules, th…
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. Durin…
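A hedged sketch of a goldfish-style objective: drop a pseudorandom subset of token positions from the next-token loss so no sequence is ever fully supervised. The hash-based mask below is illustrative; the paper's exact masking rule may differ.

```python
# Goldfish-style loss: exclude ~1/k of positions from the next-token loss.
import torch
import torch.nn.functional as F

def goldfish_loss(logits, targets, k=4):
    """logits: (B, T, V), targets: (B, T). Ignore ~1/k of positions."""
    # Deterministic pseudorandom mask keyed on token ids, so the same text
    # is masked the same way on every epoch (an assumption of this sketch).
    drop = (targets * 2654435761 % k) == 0
    masked = targets.masked_fill(drop, -100)     # -100 = ignore_index
    return F.cross_entropy(logits.transpose(1, 2), masked, ignore_index=-100)

B, T, V = 2, 16, 1000
loss = goldfish_loss(torch.randn(B, T, V), torch.randint(0, V, (B, T)))
```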
OPTune: Efficient Online Preference Tuning
Reinforcement learning with human feedback (RLHF) is critical for aligning Large Language Models (LLMs) with human preferences. Compared to the widely studied offline version of RLHF, e.g., direct preference optimization (DPO), recent…
Transformers Can Do Arithmetic with the Right Embeddings
The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit within a large span of digits. We mend this problem by adding an embedding to ea…
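A hedged sketch of the embedding idea: compute, for each token, its digit's position within its number and look that index up in an extra embedding table added to the usual token embeddings. The indexing convention here (counting from the first digit) is an assumption of this sketch.

```python
# Per-digit position-within-number indices feeding an extra embedding table.
import torch
import torch.nn as nn

def digit_positions(tokens):
    """tokens: list of str. 1-based position within a run of digits;
    0 for non-digit tokens."""
    pos, run = [], 0
    for t in tokens:
        run = run + 1 if t.isdigit() else 0
        pos.append(run)
    return torch.tensor(pos)

tokens = list("123+4567=")
idx = digit_positions(tokens)            # [1, 2, 3, 0, 1, 2, 3, 4, 0]
digit_emb = nn.Embedding(32, 256)        # added to the usual token embeddings
extra = digit_emb(idx)                   # (9, 256)
```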
LMD3: Language Model Data Density Dependence
We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate tha…
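One hedged way to realize "training data density estimation" at the example level is kernel density estimation over embeddings, as sketched below; the embedding source, kernel, and bandwidth are assumptions, not necessarily the paper's choices.

```python
# Kernel density estimation over embeddings as a per-example density score.
import numpy as np
from sklearn.neighbors import KernelDensity

train_emb = np.random.randn(5000, 64)      # stand-in training-set embeddings
test_emb = np.random.randn(10, 64)         # stand-in test-example embeddings

kde = KernelDensity(kernel="gaussian", bandwidth=1.0).fit(train_emb)
log_density = kde.score_samples(test_emb)  # higher = better-supported example
```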
NEFTune: Noisy Embeddings Improve Instruction Finetuning
We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on Alpa…
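The abstract states the augmentation directly, so a short sketch follows: add uniform noise to the embedding vectors during finetuning, scaled by alpha / sqrt(L·d) for sequence length L and embedding dimension d, with alpha a tunable hyperparameter.

```python
# NEFTune-style embedding noise, applied during training only.
import torch

def neftune_noise(embeds, alpha=5.0):
    """embeds: (B, L, d). Returns noised embeddings for the forward pass."""
    B, L, d = embeds.shape
    scale = alpha / (L * d) ** 0.5
    noise = torch.empty_like(embeds).uniform_(-1, 1) * scale
    return embeds + noise

embeds = torch.randn(2, 128, 4096)
noised = neftune_noise(embeds)    # feed to the model in place of embeds
```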
Baseline Defenses for Adversarial Attacks Against Aligned Language Models
As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing fro…
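One baseline discussed in this line of work is perplexity filtering: optimizer-generated jailbreak suffixes tend to look like gibberish, so abnormally high prompt perplexity is a usable signal. A minimal sketch, with the language model stubbed out and the threshold an assumption:

```python
# Perplexity filtering: flag prompts with abnormally high perplexity.
import math

def perplexity(token_logprobs):
    """token_logprobs: per-token log-probabilities of the prompt under an LM."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def is_suspicious(token_logprobs, threshold=1000.0):
    return perplexity(token_logprobs) > threshold   # threshold is assumed

# Toy usage: an ordinary prompt vs. a gibberish-like adversarial suffix.
print(is_suspicious([-2.1, -1.8, -2.5, -2.0]))      # False
print(is_suspicious([-9.5, -8.7, -9.9, -9.1]))      # True
```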
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models
With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative. For example, a company deploying a client-facing chatbot must ensure that …
On the Reliability of Watermarks for Large Language Models
As LLMs become commonplace, machine-generated text has the potential to flood the internet with spam, social media bots, and valueless content. Watermarking is a simple and effective strategy for mitigating such harms by enabling the detec…
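Detection in this watermarking line of work reduces to a one-proportion z-test: count the tokens that fall in the pseudorandom "green list" and compare against the fraction gamma expected by chance. A minimal sketch:

```python
# Watermark detection as a one-proportion z-score.
import math

def watermark_z_score(green_count, total_tokens, gamma=0.25):
    """gamma: expected green-list fraction in unwatermarked text."""
    expected = gamma * total_tokens
    var = total_tokens * gamma * (1 - gamma)
    return (green_count - expected) / math.sqrt(var)

z = watermark_z_score(green_count=90, total_tokens=200, gamma=0.25)
print(f"z = {z:.2f}")   # ~6.5 here; large z is strong evidence of a watermark
```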
Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this paper, we introduce a novel technique called Tree-Ring Watermarking that robustly f…
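A heavily simplified sketch of the idea: plant a key pattern in the Fourier transform of the diffusion model's initial noise, where it is roughly invariant to common image transformations; detection inverts the sampler to recover the noise and tests for the pattern. The ring mask and key value below are illustrative, not the paper's exact construction.

```python
# Planting a ring-shaped key in the Fourier transform of initial noise.
import numpy as np

def embed_tree_ring(noise, radius=10, width=2, key_value=50.0):
    """noise: (H, W) initial Gaussian noise. Writes a ring in Fourier space."""
    H, W = noise.shape
    f = np.fft.fftshift(np.fft.fft2(noise))
    yy, xx = np.ogrid[:H, :W]
    r = np.sqrt((yy - H / 2) ** 2 + (xx - W / 2) ** 2)
    ring = (r > radius - width) & (r < radius + width)
    f[ring] = key_value                                  # plant the key
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

z = embed_tree_ring(np.random.randn(64, 64))
# Detection (sketch): invert the diffusion sampler to recover z, take its
# FFT, and test whether the ring region matches the key.
```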
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prom…
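A heavily hedged sketch of the gradient-based discrete prompt idea: maintain a continuous prompt, project it to the nearest vocabulary embeddings for the forward pass, and push the gradients from the projected prompt back into the continuous copy (straight-through). The loss and embedding table below are stand-ins.

```python
# Soft prompt optimized through a hard (projected) prompt.
import torch

vocab = torch.randn(1000, 64)                       # stand-in embedding table
soft = torch.randn(8, 64, requires_grad=True)       # 8 learnable prompt vectors
opt = torch.optim.Adam([soft], lr=0.1)

for step in range(100):
    # Project each soft vector to its nearest vocabulary embedding.
    dists = torch.cdist(soft, vocab)                # (8, 1000)
    hard = vocab[dists.argmin(dim=1)]
    # Straight-through: forward with the hard prompt, backprop into soft.
    prompt = soft + (hard - soft).detach()
    loss = prompt.pow(2).mean()                     # stand-in task loss
    opt.zero_grad(); loss.backward(); opt.step()

tokens = torch.cdist(soft, vocab).argmin(dim=1)     # final discrete prompt ids
```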
A Watermark for Large Language Models
Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to humans but algorithmically detectable from a short span of tokens. We propose a water…
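A minimal sketch of the soft green-list scheme this paper proposes: before sampling each token, pseudorandomly partition the vocabulary using the previous token as a seed and add a bias delta to the "green" logits. Vocabulary size, gamma, and delta below are illustrative values.

```python
# Soft green-list watermarking of next-token logits.
import torch

def watermarked_logits(logits, prev_token, gamma=0.25, delta=2.0):
    """logits: (V,). Boost a pseudorandom gamma-fraction of the vocabulary."""
    V = logits.shape[0]
    g = torch.Generator().manual_seed(int(prev_token))   # keyed on context
    perm = torch.randperm(V, generator=g)
    green = perm[: int(gamma * V)]                       # the green list
    out = logits.clone()
    out[green] += delta                                  # soft watermark bias
    return out

logits = torch.randn(50_000)
biased = watermarked_logits(logits, prev_token=42)
next_token = torch.distributions.Categorical(logits=biased).sample()
```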
What is Your Metric Telling You? Evaluating Classifier Calibration under Context-Specific Definitions of Reliability
Classifier calibration has received recent attention from the machine learning community due both to its practical utility in facilitating decision making and to the observation that modern neural network classifiers are poorly calibr…
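As a hedged sketch of one metric such evaluations interrogate, here is expected calibration error (ECE): the bin-mass-weighted gap between confidence and accuracy. The synthetic data is constructed to be well calibrated, so the score should be near zero.

```python
# Expected calibration error over equal-width confidence bins.
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """conf: predicted confidences in [0, 1]; correct: 0/1 outcomes."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - conf[mask].mean())
            ece += mask.mean() * gap                 # weight by bin mass
    return ece

conf = np.random.rand(10_000)
correct = (np.random.rand(10_000) < conf).astype(float)  # well-calibrated toy
print(expected_calibration_error(conf, correct))         # ~0 for this data
```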