Jonathan Lorraine
Score Distillation Sampling for Audio: Source Separation, Synthesis, and Beyond
We introduce Audio-SDS, a generalization of Score Distillation Sampling (SDS) to text-conditioned audio diffusion models. While SDS was initially designed for text-to-3D generation using image diffusion, its core idea of distilling a power…
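At its core, SDS backpropagates a pretrained diffusion model's denoising residual through a differentiable renderer. Below is a rough sketch under toy assumptions: the "renderer" is the identity and the denoiser function is a placeholder standing in for a pretrained text-conditioned model, so the hypothetical names here are illustrative rather than the paper's code.

    import numpy as np

    rng = np.random.default_rng(0)

    def denoiser(x_t, sigma, prompt):
        # Hypothetical stand-in for a pretrained diffusion model's noise
        # prediction eps_hat(x_t; prompt, sigma); it just nudges the sample
        # toward zero so the example runs end to end.
        return x_t * 0.1

    def sds_grad(theta, prompt, sigma=0.5, weight=1.0):
        # Render the parameters into the modality the diffusion model sees.
        # In this toy the renderer is the identity, so d(render)/d(theta) = I
        # and the SDS gradient reduces to weight * (eps_hat - eps).
        x = theta
        eps = rng.standard_normal(x.shape)
        x_t = x + sigma * eps                   # forward-diffuse the rendering
        eps_hat = denoiser(x_t, sigma, prompt)  # predicted noise
        return weight * (eps_hat - eps)

    theta = rng.standard_normal(16)             # toy parameters being distilled
    for _ in range(100):
        theta -= 1e-2 * sds_grad(theta, "a dog barking")

In the audio setting, the rendering step would instead map synthesizer, separation, or source parameters to the waveform or spectrogram representation the diffusion model was trained on.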
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
This work explores expanding the capabilities of large language models (LLMs) pretrained on text to generate 3D meshes within a unified model. This offers key advantages of (1) leveraging spatial knowledge already embedded in LLMs, derived…
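The enabling observation is that a mesh can be written as plain text an LLM can read and emit. A minimal sketch of that idea, assuming an OBJ-style serialization with coordinates quantized to a small integer grid (the paper's exact tokenization may differ):

    def mesh_to_text(vertices, faces, bins=64):
        # Quantize vertex coordinates (assumed in [-1, 1]) to a small integer
        # grid so the mesh fits a short token budget, then emit OBJ-style lines.
        lines = []
        for x, y, z in vertices:
            q = [int(round((c + 1) / 2 * (bins - 1))) for c in (x, y, z)]
            lines.append("v {} {} {}".format(*q))
        for a, b, c in faces:
            lines.append(f"f {a + 1} {b + 1} {c + 1}")  # OBJ faces are 1-indexed
        return "\n".join(lines)

    # A single triangle, expressed as text an LLM could ingest or generate.
    print(mesh_to_text([(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.8, 0.0)],
                       [(0, 1, 2)]))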
Multi-student Diffusion Distillation for Better One-step Generators
Diffusion models achieve high-quality sample generation at the cost of a lengthy multistep inference procedure. To overcome this, diffusion distillation techniques produce student generators capable of matching or surpassing the teacher in…
Scalable Nested Optimization for Deep Learning
Gradient-based optimization has been critical to the success of machine learning, updating a single set of parameters to minimize a single loss. A growing number of applications rely on a generalization of this, where we have a bilevel or …
Improving Hyperparameter Optimization with Checkpointed Model Weights
When training deep learning models, the performance depends largely on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat this as a…
Training Data Attribution via Approximate Unrolled Differentiation
Many training data attribution (TDA) methods aim to estimate how a model's behavior would change if one or more data points were removed from the training set. Methods based on implicit differentiation, such as influence functions, can be …
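As a rough illustration of unrolled differentiation for attribution (generic forward-mode differentiation through an SGD unroll, not the paper's specific approximation), the sketch below trains a one-parameter regression with per-example weights and carries dw/d(example weight) forward, so the validation loss's sensitivity to each training example falls out at the end.

    import numpy as np

    # Toy data: 1-parameter linear regression, y ~ w * x.
    x = np.array([1.0, 2.0, 3.0])
    y = np.array([1.1, 1.9, 3.2])
    x_val, y_val = 2.5, 2.4
    eps = np.ones_like(x)          # per-example weights (all 1 at the nominal run)
    eta, steps = 0.05, 50

    # Unroll SGD on the eps-weighted training loss while carrying dw/deps_i
    # forward (forward-mode differentiation through the unroll).
    w = 0.0
    dw_deps = np.zeros_like(x)
    for _ in range(steps):
        resid = w * x - y                      # per-example residuals
        grad = np.sum(eps * resid * x)         # d(train loss)/dw
        # Differentiate the update w <- w - eta * grad w.r.t. each eps_i.
        dgrad_dw = np.sum(eps * x * x)
        dw_deps = dw_deps * (1 - eta * dgrad_dw) - eta * resid * x
        w -= eta * grad

    # Attribution: how the validation loss would respond to up/down-weighting
    # each training example.
    dval_dw = (w * x_val - y_val) * x_val
    print("influence estimates:", dval_dw * dw_deps)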
LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis
Recent text-to-3D generation approaches produce impressive 3D results but require time-consuming optimization that can take up to an hour per prompt. Amortized methods like ATT3D optimize multiple prompts simultaneously to improve efficien…
Graph Metanetworks for Processing Diverse Neural Architectures
Neural networks efficiently encode learned information within their parameters. Consequently, many tasks can be unified by treating neural networks themselves as input data. When doing so, recent studies demonstrated the importance of acco…
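A minimal sketch of the underlying representation, assuming the simplest possible construction (neurons as nodes, weights as directed edge features); the paper's graph construction also handles biases, normalization layers, and other architecture components.

    import numpy as np

    def mlp_to_graph(weight_matrices):
        # Nodes are neurons (one per input/hidden/output unit); each weight
        # becomes a directed edge carrying the weight value as its feature.
        sizes = [weight_matrices[0].shape[1]] + [W.shape[0] for W in weight_matrices]
        offsets = np.cumsum([0] + sizes)           # node-id offset of each layer
        edges, feats = [], []
        for layer, W in enumerate(weight_matrices):
            out_dim, in_dim = W.shape
            for i in range(out_dim):
                for j in range(in_dim):
                    edges.append((offsets[layer] + j, offsets[layer + 1] + i))
                    feats.append(W[i, j])
        return int(offsets[-1]), np.array(edges), np.array(feats)

    # A tiny 2-3-1 MLP becomes 6 neuron nodes and 9 weight edges, which a graph
    # neural network could then process as ordinary graph-structured input.
    rng = np.random.default_rng(0)
    n_nodes, edge_index, edge_feat = mlp_to_graph([rng.standard_normal((3, 2)),
                                                   rng.standard_normal((1, 3))])
    print(n_nodes, edge_index.shape, edge_feat.shape)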
Using Large Language Models for Hyperparameter Optimization
This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on…
ATT3D: Amortized Text-to-3D Object Synthesis
Text-to-3D modelling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently achieved high-quality results but requires a lengthy, per-prompt optimi…
On Implicit Bias in Overparameterized Bilevel Optimization
Many problems in machine learning involve bilevel optimization (BLO), including hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems consist of two nested sub-problems, called the outer and inner problems,…
Task Selection for AutoML System Evaluation
Our goal is to assess if AutoML system changes - i.e., to the search space or hyperparameter optimization - will improve the final model's performance on production tasks. However, we cannot test the changes on production tasks. Instead, w…
Lyapunov Exponents for Diversity in Differentiable Games
Ridge Rider (RR) is an algorithm for finding diverse solutions to optimization problems by following eigenvectors of the Hessian ("ridges"). RR is designed for conservative gradient systems (i.e., settings involving a single loss function)…
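A toy sketch of the ridge-following idea: branch from a saddle along Hessian eigenvectors computed with numpy, then descend, and observe that different branches can land in different solutions. The actual RR algorithm, and its Lyapunov-exponent extension to games, track ridges far more carefully than this.

    import numpy as np

    def loss(p):
        # A toy loss with a saddle at the origin and multiple descent directions.
        x, y = p
        return 0.25 * (x ** 2 - 1) ** 2 + 0.5 * y ** 2 - 0.1 * x * y

    def grad(p):
        x, y = p
        return np.array([x * (x ** 2 - 1) - 0.1 * y, y - 0.1 * x])

    def hessian(p):
        x, y = p
        return np.array([[3 * x ** 2 - 1, -0.1],
                         [-0.1, 1.0]])

    # Start near the saddle, branch along each Hessian eigenvector ("ridge"),
    # then run gradient descent from each branch point.
    start = np.array([0.0, 0.0])
    eigvals, eigvecs = np.linalg.eigh(hessian(start))
    for k in range(len(eigvals)):
        for sign in (+1.0, -1.0):
            p = start + 0.5 * sign * eigvecs[:, k]
            for _ in range(200):
                p -= 0.05 * grad(p)
            print(f"ridge {k}, sign {sign:+}: solution {np.round(p, 3)}")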
Input Convex Gradient Networks
The gradients of convex functions are expressive models of non-trivial vector fields. For example, Brenier's theorem yields that the optimal transport map between any two measures on Euclidean space under the squared distance is realized a…
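As a small illustration of why such gradient maps are interesting (not the paper's architecture, which parameterizes the gradient field directly), the sketch below evaluates the analytic gradient of a simple convex potential and checks the monotonicity that convexity guarantees.

    import numpy as np

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    # Convex potential f(x) = 0.5 * ||A x||^2 + logsumexp(W x): a quadratic plus
    # a convex log-sum-exp term, so its gradient is a monotone vector field.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    W = rng.standard_normal((4, 3))

    def grad_f(x):
        # Analytic gradient of the potential: A^T A x + W^T softmax(W x).
        return A.T @ (A @ x) + W.T @ softmax(W @ x)

    # Monotonicity check: (grad_f(x) - grad_f(y)) . (x - y) >= 0 for convex f.
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    print(np.dot(grad_f(x) - grad_f(y), x - y))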
Meta-Learning to Improve Pre-Training
Pre-training (PT) followed by fine-tuning (FT) is an effective method for training neural networks, and has led to significant performance improvements in many domains. PT can incorporate various design choices such as task and data reweig…
Complex Momentum for Learning in Games
We generalize gradient descent with momentum for learning in differentiable games to have complex-valued momentum. We give theoretical motivation for our method by proving convergence on bilinear zero-sum games for simultaneous and alterna…
Complex Momentum for Optimization in Games
We generalize gradient descent with momentum for optimization in differentiable games to have complex-valued momentum. We give theoretical motivation for our method by proving convergence on bilinear zero-sum games for simultaneous and alt…
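One plausible reading of the update, sketched on the bilinear zero-sum game min_x max_y x*y with simultaneous steps: the momentum buffer is complex-valued and only its real part is applied to the real parameters. The constants below are illustrative, not the paper's recommended settings, and plain real momentum would cycle or diverge on this game.

    import numpy as np

    # Bilinear zero-sum game: x minimizes f(x, y) = x * y, y maximizes it.
    x, y = 1.0, 1.0
    mx, my = 0j, 0j                 # complex momentum buffers
    alpha, beta = 0.3, 0.9j         # step size; purely imaginary momentum

    for step in range(301):
        gx, gy = y, -x              # simultaneous gradients for each player
        mx = beta * mx - gx
        my = beta * my - gy
        x += alpha * mx.real        # parameters stay real: apply the real part
        y += alpha * my.real
        if step % 100 == 0:
            print(step, round(abs(x) + abs(y), 4))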
Optimizing Millions of Hyperparameters by Implicit Differentiation
We propose an algorithm for inexpensive gradient-based hyperparameter optimization that combines the implicit function theorem (IFT) with efficient inverse Hessian approximations. We present results about the relationship between the IFT a…
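A minimal sketch of the core computation on a one-parameter toy problem with a weight-decay hyperparameter, where the implicit-function-theorem hypergradient and a truncated Neumann-series approximation of the inverse Hessian can be checked against the exact answer. The function names below are illustrative, not the paper's code.

    import numpy as np

    # Toy problem: train loss L_T(w, lam) = 0.5*(w - 2)^2 + 0.5*exp(lam)*w^2
    # (exp(lam) is a weight-decay strength), val loss L_V(w) = 0.5*(w - 1)^2.
    lam = 0.5
    d, v = 2.0, 1.0

    def train_grad_w(w, lam):      # dL_T/dw
        return (w - d) + np.exp(lam) * w

    def train_hess_ww(w, lam):     # d^2 L_T / dw^2
        return 1.0 + np.exp(lam)

    def train_grad_wlam(w, lam):   # d^2 L_T / (dw dlam)
        return np.exp(lam) * w

    def val_grad_w(w):             # dL_V/dw
        return w - v

    # Inner optimization: for this toy the optimum is available in closed form.
    w_star = d / (1.0 + np.exp(lam))

    # IFT hypergradient: dL_V/dlam = - dL_V/dw * H^{-1} * d^2 L_T/(dw dlam),
    # with H^{-1} approximated by a truncated Neumann series
    # H^{-1} ~= eta * sum_j (1 - eta*H)^j.
    H = train_hess_ww(w_star, lam)
    eta, terms = 0.1, 50
    inv_H_approx = eta * sum((1 - eta * H) ** j for j in range(terms))
    hypergrad = -val_grad_w(w_star) * inv_H_approx * train_grad_wlam(w_star, lam)

    print("Neumann estimate:", hypergrad,
          " exact:", -val_grad_w(w_star) / H * train_grad_wlam(w_star, lam))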
Understanding Neural Architecture Search Techniques
Automatic methods for generating state-of-the-art neural network architectures without human experts have generated significant attention recently. This is because of the potential to remove human experts from the design loop which can red…
Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions
Hyperparameter optimization can be formulated as a bilevel optimization problem, where the optimal parameters on the training set depend on the hyperparameters. We aim to adapt regularization hyperparameters for neural networks by fitting …
Stochastic Hyperparameter Optimization through Hypernetworks
Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of weights and hyperparam…
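A rough sketch of that collapse, assuming the simplest possible setup: a linear "hypernetwork" w(lam) = a + b*lam is fit to the training loss at hyperparameters sampled near the current lam, alternating with validation-driven updates to lam. It reuses the toy problem from the implicit-differentiation sketch above, whose best hyperparameter is lam = 0; constants are illustrative and untuned.

    import numpy as np

    # Toy nested problem: train loss L_T(w, lam) = 0.5*(w - 2)^2 + 0.5*exp(lam)*w^2,
    # val loss L_V(w) = 0.5*(w - 1)^2.
    rng = np.random.default_rng(0)
    a, b = 0.0, 0.0        # linear "hypernetwork" w(lam) = a + b*lam
    lam = 1.0
    inner_lr, outer_lr, sigma = 0.05, 0.3, 0.5

    for outer in range(300):
        # Fit the hypernetwork on training loss at hyperparameters sampled near lam.
        for _ in range(10):
            lam_s = lam + sigma * rng.standard_normal()
            w = a + b * lam_s
            dLdw = (w - 2.0) + np.exp(lam_s) * w
            a -= inner_lr * dLdw            # chain rule: dw/da = 1
            b -= inner_lr * dLdw * lam_s    # chain rule: dw/db = lam_s
        # Update the hyperparameter on validation loss through the hypernetwork.
        w = a + b * lam
        lam -= outer_lr * (w - 1.0) * b     # dL_V/dlam = dL_V/dw * dw/dlam
        if outer % 100 == 0:
            print(outer, round(lam, 3))     # lam should drift toward 0 here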