Sitan Chen
Quantum Probe Tomography
Characterizing quantum many-body systems is a fundamental problem across physics, chemistry, and materials science. While significant progress has been made, many existing Hamiltonian learning protocols demand digital quantum control over …
Efficient Pauli Channel Estimation with Logarithmic Quantum Memory
In this work, we consider one of the prototypical tasks for characterizing the structure of noise in quantum devices: estimating eigenvalues of an n-qubit Pauli noise channel. Prior work [Chen et al., Phys. Rev. A 105, 032435 (2022)] has proved …
Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions
In recent years, masked diffusion models (MDMs) have emerged as a promising alternative approach for generative modeling over discrete domains. Compared to autoregressive models (ARMs), MDMs trade off complexity at training time with flexi…
Blink of an eye: a simple theory for feature localization in generative models
Large language models can exhibit unexpected behavior in the blink of an eye. In a recent computer use demo, a language model switched from coding to Googling pictures of Yellowstone, and these sudden shifts in behavior have also been obse…
Gradient dynamics for low-rank fine-tuning beyond kernels
LoRA has emerged as one of the de facto methods for fine-tuning foundation models with low computational cost and memory footprint. The idea is to only train a low-rank perturbation to the weights of a pre-trained model, given supervised d…
Unrolled denoising networks provably learn optimal Bayesian inference
Much of Bayesian inference centers around the design of estimators for inverse problems which are optimal assuming the data comes from a known prior. But what do these optimality guarantees mean if the prior is unknown? In recent years, al…
What does guidance do? A fine-grained analysis in a simple setting
The use of guidance in diffusion models was originally motivated by the premise that the guidance-modified score is that of the data distribution tilted by a conditional likelihood raised to some power. In this work we clarify this misconc…
Predicting quantum channels over general product distributions
We investigate the problem of predicting the output behavior of unknown quantum channels. Given query access to an $n$-qubit channel $E$ and an observable $O$, we aim to learn the mapping \begin{equation*} ρ\mapsto \mathrm{Tr}(O E[ρ]) \end…
Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel
Sampling algorithms play an important role in controlling the quality and runtime of diffusion model inference. In recent years, a number of works~\cite{chen2023sampling,chen2023ode,benton2023error,lee2022convergence} have proposed schemes…
Optimal tradeoffs for estimating Pauli observables
We revisit the problem of Pauli shadow tomography: given copies of an unknown $n$-qubit quantum state $ρ$, estimate $\text{tr}(Pρ)$ for some set of Pauli operators $P$ to within additive error $ε$. This has been a popular testbed for explo…
Learning general Gaussian mixtures with efficient score matching
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. We make no separation assumptions on the underlying mixture components: we only require that the covariance matrices have bounded condition number and that the m…
Critical windows: non-asymptotic theory for feature emergence in diffusion models
We develop theory to understand an intriguing property of diffusion models for image generation that we term critical windows. Empirically, it has been observed that there are narrow time intervals in sampling during which particular featu…
An optimal tradeoff between entanglement and copy complexity for state tomography
There has been significant interest in understanding how practical constraints on contemporary quantum devices impact the complexity of quantum learning. For the classic question of tomography, recent work tightly characterized the copy co…
Provably learning a multi-head attention layer
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. Given a sequence length $k$, attention matrices $\mathbf{Θ}_1,\ldots,\mathbf{Θ}_m\in\mathbb{R}…
Learning to Predict Arbitrary Quantum Processes
We present an efficient machine-learning (ML) algorithm for predicting any unknown quantum process E over n qubits. For a wide range of distributions D on arbitrary n-qubit states, we show that this ML algorithm can learn to predict any lo…
Efficient Pauli channel estimation with logarithmic quantum memory
Here we revisit one of the prototypical tasks for characterizing the structure of noise in quantum devices: estimating every eigenvalue of an $n$-qubit Pauli noise channel to error $ε$. Prior work [14] proved no-go theorems for this task i…
A faster and simpler algorithm for learning shallow networks
We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this p…
Learning Mixtures of Gaussians Using the DDPM Objective
Recent works have shown that diffusion models can learn essentially any distribution provided one can perform score estimation. Yet it remains poorly understood under what settings score estimation is possible, let alone when practical gra…
Learning Polynomial Transformations via Generalized Tensor Decompositions
We consider the problem of learning high-dimensional polynomial transformations of Gaussians. Given samples of the form f(x), where x ∼ N(0, I_r) is hidden and f: ℝ^r → ℝ^d is a function where every output coordinate is a low-degree polynomial, …
Learning Narrow One-Hidden-Layer ReLU Networks
We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions. We give the first polynomial-time algorithm that succeeds whenever $k$ is a …
Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers
We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling. Several recent works have analyzed stochastic samplers using tools like Girsanov's theorem and a chain rule variant of the…
Flowers: precious food and medicine resources
Flower plants are popular all over the world and important sources of ornamental plants, bioactive molecules and nutrients. Flowers have a wide range of biological activities and beneficial pharmacological effects. Flowers and their active…
Learning to predict arbitrary quantum processes
We present an efficient machine learning (ML) algorithm for predicting any unknown quantum process $\mathcal{E}$ over $n$ qubits. For a wide range of distributions $\mathcal{D}$ on arbitrary $n$-qubit states, we show that this ML algorithm…
The Complexity of NISQ
The recent proliferation of NISQ devices has made it imperative to understand their computational power. In this work, we define and study the complexity class $\textsf{NISQ}$, which is intended to encapsulate problems that can be efficie…
When Does Adaptivity Help for Quantum State Learning?
We consider the classic question of state tomography: given copies of an unknown quantum state $ρ\in\mathbb{C}^{d\times d}$, output $\widehat{ρ}$ which is close to $ρ$ in some sense, e.g. trace distance or fidelity. When one is allowed to ma…
Quantum advantage in learning from experiments
Quantum technology promises to revolutionize how we learn about the physical world. An experiment that processes quantum data with a quantum computer could have substantial advantages over conventional experiments in which quantum states a…
Kalman filtering with adversarial corruptions
Here we revisit the classic problem of linear quadratic estimation, i.e. estimating the trajectory of a linear dynamical system from noisy measurements. The celebrated Kalman filter gives an optimal estimator when the measurement noise is …