arXiv (Cornell University)
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
July 2024 • Boyuan Chen, Diego Martí Monsó, Yilun Du, Max Simchowitz, Russ Tedrake, Vincent Sitzmann
This paper presents Diffusion Forcing, a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels. We apply Diffusion Forcing to sequence generative modeling by training a causal next-token prediction model to generate one or several future tokens without fully diffusing past ones. Our approach is shown to combine the strengths of next-token prediction models, such as variable-length generation, with the strengths of full-sequence diffusion models,…
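The key departure from standard full-sequence diffusion is that each token carries its own noise level. A minimal, illustrative sketch of this corruption step (an assumption of this summary, not the paper's actual implementation; the toy variance schedule and dimensions are hypothetical):

```python
import numpy as np

# Toy sketch of Diffusion Forcing's core idea: every token in a
# sequence is corrupted with an INDEPENDENT noise level, rather than
# one shared level for the whole sequence.

rng = np.random.default_rng(0)

T, D, K = 8, 4, 10                   # sequence length, token dim, num noise levels
betas = np.linspace(1e-4, 0.2, K)    # toy variance schedule (assumption)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal-retention factors

x = rng.standard_normal((T, D))      # clean token sequence

# Independent per-token noise levels -- full-sequence diffusion would
# instead draw a single k shared by all T tokens.
k = rng.integers(0, K, size=T)

eps = rng.standard_normal((T, D))
scale = np.sqrt(alpha_bar[k])[:, None]
sigma = np.sqrt(1.0 - alpha_bar[k])[:, None]
z = scale * x + sigma * eps          # per-token corrupted sequence

# A causal denoiser would then be trained to recover x (or eps) at
# position t from z[:t+1] and the noise levels k[:t+1], so past tokens
# can stay clean while future ones remain fully noised.
print(z.shape, k.shape)
```

Because past tokens can be kept at noise level zero while future tokens are diffused, the same model supports next-token-style rollout and full-sequence-style denoising.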
Topics: Computer Science, Algorithms, Mathematics, Mathematical Analysis, Physics, Chemistry, Biochemistry