Yi-An Ma
Zephyrus: An Agentic Framework for Weather Science
Foundation models for weather science are pre-trained on vast amounts of structured numerical data and outperform traditional weather forecasting systems. However, these models lack language-based reasoning capabilities, limiting their uti…
Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion
Continuous diffusion models have demonstrated remarkable performance in data generation across various domains, yet their efficiency remains constrained by two critical limitations: (1) the local adjacency structure of the forward Markov p…
Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference
We prove that, given a mean-field location-scale variational family, black-box variational inference (BBVI) with the reparametrization gradient converges at a rate that is nearly independent of explicit dimension dependence. Specifically, …
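The reparametrization gradient at the heart of BBVI can be illustrated with a minimal sketch: a mean-field Gaussian (location-scale) family fit to a toy Gaussian target. The target, step size, batch size, and iteration count are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
b = np.linspace(-1.0, 1.0, d)               # mean of a toy Gaussian target

def grad_log_p(theta):
    # target N(b, I): log p(theta) = -0.5 * ||theta - b||^2 + const
    return -(theta - b)

# mean-field location-scale family: theta = mu + sigma * eps, eps ~ N(0, I)
mu, log_sigma = np.zeros(d), np.zeros(d)
lr = 0.05
for _ in range(1000):
    eps = rng.standard_normal((64, d))
    sigma = np.exp(log_sigma)
    theta = mu + sigma * eps                # reparametrized samples
    g = grad_log_p(theta)
    mu += lr * g.mean(axis=0)               # pathwise gradient w.r.t. mu
    # pathwise gradient w.r.t. log sigma, plus analytic entropy gradient (=1)
    log_sigma += lr * ((g * eps).mean(axis=0) * sigma + 1.0)

print(np.round(mu, 1))                      # ≈ b
print(np.round(np.exp(log_sigma), 1))       # ≈ 1
```

Because the sample is written as a deterministic transform of parameter-free noise, the gradient flows through the sample itself, which is what the paper's dimension-dependence analysis concerns.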
Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees
Consistency models have recently emerged as a compelling alternative to traditional SDE-based diffusion models. They offer a significant acceleration in generation by producing high-quality samples in very few steps. Despite their empirica…
seeBias: A Comprehensive Tool for Assessing and Visualizing AI Fairness
Fairness in artificial intelligence (AI) prediction models is increasingly emphasized to support responsible adoption in high-stakes domains such as health care and criminal justice. Guidelines and implementation frameworks highlight the i…
Purifying Approximate Differential Privacy with Randomized Post-processing
We propose a framework to convert $(\varepsilon, \delta)$-approximate Differential Privacy (DP) mechanisms into $(\varepsilon', 0)$-pure DP mechanisms under certain conditions, a process we call "purification." This algorithmic technique leve…
Discovering Latent Causal Graphs from Spatiotemporal Data
Many important phenomena in scientific fields like climate, neuroscience, and epidemiology are naturally represented as spatiotemporal gridded data with complex interactions. Inferring causal relationships from these data is a challenging …
ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models
The use of Large Language Models (LLMs) in climate science has recently gained significant attention. However, a critical issue remains: the lack of a comprehensive evaluation framework capable of assessing the quality and scientific valid…
Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk
We consider the problem of sampling from a $d$-dimensional log-concave distribution $\pi(\theta) \propto \exp(-f(\theta))$ for $L$-Lipschitz $f$, constrained to a convex body with an efficiently computable self-concordant barrier function, contained i…
A Skewness-Based Criterion for Addressing Heteroscedastic Noise in Causal Discovery
Real-world data often violates the equal-variance assumption (homoscedasticity), making it essential to account for heteroscedastic noise in causal discovery. In this work, we explore heteroscedastic symmetric noise models (HSNMs), where t…
Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization
Online black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle in a sample-efficient way. While prior studies focus on forward approaches such as Gaussian Processes (GPs) to learn a su…
Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation
"Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configuratio…
Demystifying SGD with Doubly Stochastic Gradients
Optimization objectives in the form of a sum of intractable expectations are rising in importance (e.g., diffusion models, variational autoencoders, and many more), a setting also known as "finite sum with infinite data." For these problem…
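The "finite sum with infinite data" setting can be sketched as a finite sum over terms, each itself an expectation, optimized by subsampling both the terms and fresh Monte Carlo noise at every step. The quadratic objective and all constants below are toy assumptions chosen so the minimizer is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 3
A = rng.standard_normal((n, d))         # per-term data

# F(x) = (1/n) sum_i E_xi[ 0.5 * ||x - A_i - xi||^2 ],  xi ~ N(0, I)
# Closed form: grad F(x) = x - mean_i(A_i), so argmin F = A.mean(axis=0).
x = np.zeros(d)
lr = 0.1
for t in range(3000):
    i = rng.integers(n, size=8)          # subsample terms of the finite sum
    xi = rng.standard_normal((8, d))     # fresh Monte Carlo noise per term
    g = (x - A[i] - xi).mean(axis=0)     # doubly stochastic gradient
    x -= lr / (1 + 0.01 * t) * g         # decaying step size

print(np.round(x, 2))                    # ≈ A.mean(axis=0)
```

Both sources of randomness (term subsampling and per-term Monte Carlo noise) contribute to the gradient variance, which is exactly the interaction the paper's analysis disentangles.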
Faster Sampling via Stochastic Gradient Proximal Sampler
Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems. However, the proximal sampler, which exhibits much faster convergence than L…
Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference
To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs. In this paper, we view such approaches as decomposing the en…
Multi-Fidelity Residual Neural Processes for Scalable Surrogate Modeling
Multi-fidelity surrogate modeling aims to learn an accurate surrogate at the highest fidelity level by combining data from multiple sources. Traditional methods relying on Gaussian processes can hardly scale to high-dimensional data. Deep …
Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints
Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focu…
Learning Granger Causality from Instance-wise Self-attentive Hawkes Processes
We address the problem of learning Granger causality from asynchronous, interdependent, multi-type event sequences. In particular, we are interested in discovering instance-level causal structures in an unsupervised manner. Instance-level …
Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo
To sample from a general target distribution $p_*\propto e^{-f_*}$ beyond the isoperimetric condition, Huang et al. (2023) proposed to perform sampling through reverse diffusion, giving rise to Diffusion-based Monte Carlo (DMC). Specifical…
A Gradient-Based Optimization Method Using the Koopman Operator
In this paper, we propose a novel approach to solving optimization problems by reformulating the optimization problem into a dynamical system, followed by the adaptive spectral Koopman (ASK) method. The Koopman operator, employed in our ap…
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Recent studies in reinforcement learning (RL) have made significant progress by leveraging function approximation to alleviate the sample complexity hurdle for better performance. Despite the success, existing provably efficient algorithms…
Tractable MCMC for Private Learning with Pure and Gaussian Differential Privacy
Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\vare…
Discovering Mixtures of Structural Causal Models from Time Series Data
Discovering causal relationships from time series data is significant in fields such as finance, climate science, and neuroscience. However, contemporary techniques rely on the simplifying assumption that data originates from the same caus…
Design and performance of the field cage for the XENONnT experiment
The precision in reconstructing events detected in a dual-phase time projection chamber depends on a homogeneous and well-understood electric field within the liquid target. In the XENONnT TPC the field homogeneity is achieved through a d…
Deep Bayesian Active Learning for Accelerating Stochastic Simulation
Stochastic simulations such as large-scale, spatiotemporal, age-structured epidemic models are computationally expensive at fine-grained resolution. While deep surrogate models can speed up the simulations, doing so for stochastic simulati…
Optimization on Pareto sets: On a theory of multi-objective optimization
In multi-objective optimization, a single decision vector must balance the trade-offs between many objectives. Solutions achieving an optimal trade-off are said to be Pareto optimal: these are decision vectors for which improving any one o…
Linear Convergence of Black-Box Variational Inference: Should We Stick the Landing?
We prove that black-box variational inference (BBVI) with control variates, particularly the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification…
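The sticking-the-landing idea can be illustrated in a toy setting: under perfect variational family specification (here q and p are the same Gaussian), dropping the zero-mean score term from the reparametrized gradient leaves an estimator whose per-sample value, and hence variance, is exactly zero at the optimum. The Gaussian target and all names are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(2)
d, S = 4, 10_000
b = np.array([0.5, -1.0, 2.0, 0.0])     # target N(b, I)

mu, sigma = b.copy(), np.ones(d)        # perfect specification: q == p
eps = rng.standard_normal((S, d))
theta = mu + sigma * eps                # reparametrized samples

grad_log_p = -(theta - b)               # score of the target
grad_log_q = -(theta - mu) / sigma**2   # score of q, parameters "stopped"

g_std = grad_log_p                      # plain reparam. gradient w.r.t. mu
g_stl = grad_log_p - grad_log_q         # STL: keep path term, drop score term

print(g_std.var(axis=0))                # ≈ 1 per coordinate
print(g_stl.var(axis=0))                # exactly 0
```

The per-sample cancellation of the two scores at the optimum is what makes the geometric-rate analysis of the STL estimator possible; away from the optimum the score term is merely zero in expectation.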
Reverse Diffusion Monte Carlo
We propose a Monte Carlo sampler from the reverse diffusion process. Unlike the practice of diffusion models, where the intermediary updates -- the score functions -- are learned with a neural network, we transform the score matching probl…
Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning
Thompson sampling (TS) is widely used in sequential decision making due to its ease of use and appealing empirical performance. However, many existing analytical and empirical results for TS rely on restrictive assumptions on reward distri…
A Central Limit Theorem for Algorithmic Estimator of Saddle Point
In this work, we study the asymptotic randomness of an algorithmic estimator of the saddle point of a globally convex-concave and locally strongly-convex strongly-concave objective. Specifically, we show that the averaged iterates of a Sto…