Andrea Montanari
Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks
Understanding the inductive bias and generalization properties of large overparametrized machine learning models requires characterizing the dynamics of the training algorithm. We study the learning dynamics of large two-layer neural netw…
Local minima of the empirical risk in high dimension: General theorems and convex examples
We consider a general model for high-dimensional empirical risk minimization whereby the data $\mathbf{x}_i$ are $d$-dimensional isotropic Gaussian vectors, the model is parametrized by $\mathbf{\Theta}\in\mathbb{R}^{d\times k}$, and the loss dep…
Provably Efficient Posterior Sampling for Sparse Linear Regression via Measure Decomposition
We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbol{\theta}$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbol{\theta}+\boldsymbol{\varepsilon}$. In general, such …
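The observation model is easy to simulate. Below is a minimal sketch in which a Laplace prior stands in for a generic sparsity-promoting prior (the paper's actual prior and sampling algorithm are not reproduced here); all sizes and the helper log_posterior are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 100, 200, 10                      # k-sparse signal; sizes illustrative
    theta = np.zeros(d)
    theta[rng.choice(d, size=k, replace=False)] = rng.normal(size=k)
    X = rng.normal(size=(n, d))
    sigma = 0.5
    y = X @ theta + sigma * rng.normal(size=n)  # y = X theta + noise

    def log_posterior(th, lam=1.0):
        # Gaussian likelihood plus iid Laplace prior (Bayesian-lasso stand-in);
        # the hard problem the paper addresses is sampling from such a density.
        return -0.5 * np.sum((y - X @ th) ** 2) / sigma**2 - lam * np.sum(np.abs(th))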
Which exceptional low-dimensional projections of a Gaussian point cloud can be found in polynomial time?
Given $d$-dimensional standard Gaussian vectors $\boldsymbol{x}_1,\dots, \boldsymbol{x}_n$, we consider the set of all empirical distributions of its $m$-dimensional projections, for $m$ a fixed constant. Diaconis and Freedman (1984) prove…
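As a quick illustration of the Diaconis-Freedman phenomenon the abstract alludes to: a typical (random) one-dimensional projection of a Gaussian point cloud is itself close to standard Gaussian, and the paper asks which atypical projections can be found efficiently. A minimal sketch, with illustrative sizes:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 1000, 500
    X = rng.normal(size=(n, d))        # n standard Gaussian vectors in R^d

    u = rng.normal(size=d)
    u /= np.linalg.norm(u)             # random unit direction
    proj = X @ u                       # empirical distribution of the projection
    print(proj.mean(), proj.std())     # close to 0 and 1 for typical directions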
On Smale's 17th problem over the reals
We consider the problem of efficiently solving a system of $n$ non-linear equations in ${\mathbb R}^d$. Addressing Smale's 17th problem stated in 1998, we consider a setting whereby the $n$ equations are random homogeneous polynomials of a…
Sampling from Spherical Spin Glasses in Total Variation via Algorithmic Stochastic Localization
We consider the problem of algorithmically sampling from the Gibbs measure of a mixed $p$-spin spherical spin glass. We give a polynomial-time algorithm that samples from the Gibbs measure up to vanishing total variation error, for any mod…
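For orientation, one common way to write the mixed $p$-spin spherical Hamiltonian (a standard convention, not quoted from the paper) is

\[
H_N(\boldsymbol{\sigma}) \;=\; \sum_{p\ge 2} \frac{\gamma_p}{N^{(p-1)/2}} \sum_{i_1,\dots,i_p=1}^{N} G_{i_1\cdots i_p}\,\sigma_{i_1}\cdots\sigma_{i_p}, \qquad \boldsymbol{\sigma}\in\mathbb{S}^{N-1}(\sqrt{N}),
\]

with i.i.d. standard Gaussian coefficients $G_{i_1\cdots i_p}$, so that $\mathbb{E}[H_N(\boldsymbol{\sigma})H_N(\boldsymbol{\sigma}')] = N\,\xi(\langle\boldsymbol{\sigma},\boldsymbol{\sigma}'\rangle/N)$ for $\xi(t)=\sum_{p\ge2}\gamma_p^2 t^p$; the Gibbs measure is then $\mu_\beta(\mathrm{d}\boldsymbol{\sigma})\propto e^{\beta H_N(\boldsymbol{\sigma})}\,\mu_0(\mathrm{d}\boldsymbol{\sigma})$.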
Scaling laws for learning with real and surrogate data
Collecting large quantities of high-quality data can be prohibitively expensive or impractical, and is a bottleneck in machine learning. One may instead augment a small set of $n$ data points from the target distribution with data from more a…
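One natural baseline this setup suggests is weighted empirical risk minimization that mixes the two data sources. A minimal sketch with ridge regression; the function name, the mixing weight alpha, and all sizes are illustrative assumptions, not the paper's estimator.

    import numpy as np

    def weighted_ridge(X_real, y_real, X_surr, y_surr, alpha, lam=1e-3):
        # Weight real points by alpha and surrogate points by 1 - alpha,
        # then solve the weighted ridge normal equations.
        w = np.concatenate([np.full(len(y_real), alpha / len(y_real)),
                            np.full(len(y_surr), (1 - alpha) / len(y_surr))])
        X = np.vstack([X_real, X_surr])
        y = np.concatenate([y_real, y_surr])
        Xw = X * w[:, None]
        d = X.shape[1]
        return np.linalg.solve(Xw.T @ X + lam * np.eye(d), Xw.T @ y)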
Optimization of random cost functions and statistical physics
This is the text of my report presented at the 29th Solvay Conference on Physics on "The Structure and Dynamics of Disordered Systems," held in Brussels from October 19 to 21, 2023. I consider the problem of minimizing a random energy func…
Discovery of sparse, reliable omic biomarkers with Stabl
Adoption of high-content omic technologies in clinical studies, coupled with computational methods, has yielded an abundance of candidate biomarkers. However, translating such findings into bona fide clinical biomarkers remains challenging…
Sampling from Mean-Field Gibbs Measures via Diffusion Processes
We consider Ising mixed $p$-spin glasses at high temperature and without external field, and study the problem of sampling from the Gibbs distribution $\mu$ in polynomial time. We develop a new sampling algorithm with complexity of the same …
Universality of max-margin classifiers
Maximum margin binary classification is one of the most fundamental algorithms in machine learning, yet the role of featurization maps and the high-dimensional asymptotics of the misclassification error for non-Gaussian features are still …
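Max-margin classification can be probed without a QP solver: on linearly separable data, gradient descent on the logistic loss is known to converge in direction to the max-margin solution. A minimal numpy sketch on synthetic Gaussian features (sizes and step size illustrative; this is not the paper's featurization setting):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 50
    w_star = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = np.sign(X @ w_star)            # linearly separable labels

    # Gradient descent on the logistic loss; the normalized iterate
    # approaches the max-margin (hard-SVM) direction on separable data.
    w = np.zeros(d)
    for _ in range(20000):
        margins = y * (X @ w)
        grad = -(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= 0.5 * grad
    w_hat = w / np.linalg.norm(w)
    print("min margin:", (y * (X @ w_hat)).min())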
Towards a statistical theory of data selection under weak supervision
Given a sample of size $N$, it is often useful to select a subsample of smaller size $n<N$…
Six Lectures on Linearized Neural Networks
In these six lectures, we examine what can be learnt about the behavior of multi-layer neural networks from the analysis of linear models. We first recall the correspondence between neural networks and linear models via the so-called lazy …
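The lazy-training correspondence mentioned here comes from the first-order Taylor expansion of the network around its initialization:

\[
f(\boldsymbol{x};\boldsymbol{\theta}) \;\approx\; f(\boldsymbol{x};\boldsymbol{\theta}_0) + \langle \nabla_{\boldsymbol{\theta}} f(\boldsymbol{x};\boldsymbol{\theta}_0),\, \boldsymbol{\theta}-\boldsymbol{\theta}_0\rangle,
\]

so that training in this regime reduces to a linear model with (random) feature map $\boldsymbol{x}\mapsto \nabla_{\boldsymbol{\theta}} f(\boldsymbol{x};\boldsymbol{\theta}_0)$.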
Shattering in Pure Spherical Spin Glasses
We prove the existence of a shattered phase within the replica-symmetric phase of the pure spherical $p$-spin models for $p$ sufficiently large. In this phase, we construct a decomposition of the sphere into well-separated small clusters, …
Adversarial examples in random neural networks with general activations
A substantial body of empirical work documents the lack of robustness in deep learning models to adversarial examples. Recent theoretical work proved that adversarial examples are ubiquitous in two-layer networks with sub-exponential widt…
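A gradient-direction attack makes the setting concrete. The sketch below builds a random two-layer tanh network and perturbs an input along its input gradient to push the output toward zero and across the decision boundary; the widths, the step size eps, and the attack itself are illustrative, not the construction analyzed in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    d, m = 100, 400
    W = rng.normal(size=(m, d)) / np.sqrt(d)            # random first layer
    a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)    # random second layer

    def f(x):                    # two-layer network with tanh activation
        return a @ np.tanh(W @ x)

    def grad_f(x):               # gradient of the output w.r.t. the input
        return W.T @ (a * (1 - np.tanh(W @ x) ** 2))

    x = rng.normal(size=d)
    g = grad_f(x)
    eps = 0.5
    x_adv = x - eps * np.sign(f(x)) * g / np.linalg.norm(g)  # step against sign(f)
    print(f(x), f(x_adv))        # a small input change often flips the sign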
Solving systems of Random Equations via First and Second-Order Optimization Algorithms
Gradient-based (a.k.a. "first-order") optimization algorithms are routinely used to solve large-scale non-convex problems. Yet, it is generally hard to predict their effectiveness. In order to gain insight into this question, we revisit th…
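As a concrete instance, the sketch below runs (second-order) Gauss-Newton on the least-squares residual of a random system of quadratic equations; from a random start the iteration may or may not find a root, which is exactly the kind of question these analyses address. The quadric model $\boldsymbol{x}^\top A_i \boldsymbol{x}=1$ and all sizes are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 30, 30
    A = rng.normal(size=(n, d, d))
    A = (A + A.transpose(0, 2, 1)) / 2           # symmetric random quadratic forms

    def F(x):                                    # residuals F_i(x) = x^T A_i x - 1
        return np.einsum('ijk,j,k->i', A, x, x) - 1.0

    def J(x):                                    # Jacobian: row i is 2 A_i x
        return 2 * np.einsum('ijk,k->ij', A, x)

    x = rng.normal(size=d)
    for _ in range(100):                         # Gauss-Newton on 0.5 * ||F(x)||^2
        x -= np.linalg.lstsq(J(x), F(x), rcond=None)[0]
    print(np.linalg.norm(F(x)))                  # near 0 only if a root was found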
Sampling, Diffusions, and Stochastic Localization
Diffusions are a successful technique for sampling from high-dimensional distributions. The target distribution can be either explicitly given or learnt from a collection of samples. These methods implement a diffusion process whose endpoint is a samp…
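In one common formulation used in this line of work, stochastic localization observes the target $\boldsymbol{\theta}\sim\mu$ through a Gaussian channel of increasing signal-to-noise ratio:

\[
\boldsymbol{y}_t = t\,\boldsymbol{\theta} + \boldsymbol{B}_t, \qquad \mathrm{d}\boldsymbol{y}_t = \boldsymbol{m}(\boldsymbol{y}_t,t)\,\mathrm{d}t + \mathrm{d}\boldsymbol{B}_t, \qquad \boldsymbol{m}(\boldsymbol{y},t) = \mathbb{E}[\boldsymbol{\theta}\mid \boldsymbol{y}_t=\boldsymbol{y}].
\]

As $t\to\infty$, $\boldsymbol{y}_t/t$ converges to a sample from $\mu$, so replacing the posterior mean $\boldsymbol{m}$ with an efficiently computable estimate yields a sampling algorithm.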
Local algorithms for maximum cut and minimum bisection on locally treelike regular graphs of large degree
Given a graph $G$ of degree $k$ over $n$ vertices, we consider the problem of computing a near maximum cut or a near minimum bisection in polynomial time. For graphs of girth $2L$, we develop a local message passing algorithm whose complexity is $O(nkL)$, and tha…
Posterior Sampling in High Dimension via Diffusion Processes
Sampling from the posterior is a key technical problem in Bayesian statistics. Rigorous guarantees are difficult to obtain for Markov Chain Monte Carlo algorithms of common use. In this paper, we study an alternative class of algorithms ba…
Stabl: sparse and reliable biomarker discovery in predictive modeling of high-dimensional omic data
High-content omic technologies coupled with sparsity-promoting regularization methods (SRM) have transformed the biomarker discovery process. However, the translation of computational results into a clinical use-case scenario remains chall…
Learning time-scales in two-layers neural networks
Gradient-based learning in multi-layer neural networks displays a number of striking features. In particular, the decrease rate of empirical risk is non-monotone even after averaging over large batches. Long plateaus in which one observes …
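The plateau phenomenon is easy to reproduce. A minimal sketch: online SGD on a two-layer ReLU network fitting a single-index target (the target, all sizes, and the learning rate are illustrative assumptions); the recorded risk typically decreases in stages separated by long plateaus.

    import numpy as np

    rng = np.random.default_rng(0)
    d, m, lr = 20, 32, 0.1
    w_star = np.zeros(d); w_star[0] = 1.0        # single-index target direction

    def target(x):
        return np.maximum(x @ w_star, 0.0)       # y = relu(<w*, x>)

    W = rng.normal(size=(m, d)) / np.sqrt(d)     # first layer
    a = rng.normal(size=m) / np.sqrt(m)          # second layer

    risks = []
    for step in range(20000):                    # online SGD, batch size 1
        x = rng.normal(size=d)
        h = np.maximum(W @ x, 0.0)
        err = a @ h - target(x)
        grad_a = err * h
        grad_W = err * np.outer(a * (h > 0), x)
        a -= lr * grad_a
        W -= lr * grad_W
        if step % 1000 == 0:                     # monitor risk on a fresh batch
            Xv = rng.normal(size=(512, d))
            pred = np.maximum(Xv @ W.T, 0.0) @ a
            risks.append(0.5 * np.mean((pred - target(Xv)) ** 2))
    print(risks)                                 # stage-wise decrease with plateaus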
Compressing Tabular Data via Latent Variable Estimation
Data used for analytics and machine learning often take the form of tables with categorical entries. We introduce a family of lossless compression algorithms for such data that proceed in four steps: $(i)$ Estimate latent variables associa…
Nonnegative Matrix Factorization Via Archetypal Analysis
Given a collection of data points, nonnegative matrix factorization (NMF) suggests expressing them as convex combinations of a small set of “archetypes” with nonnegative entries. This decomposition is unique only if the true archetypes are…
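A minimal sketch of archetypal analysis in the formulation the abstract describes: factor $X \approx A\,B\,X$ with row-stochastic $A$ and $B$, so the archetypes $Z = BX$ are convex combinations of data points and each point is a convex combination of archetypes. Alternating projected gradient with a fixed step size (illustrative; a line search would be more robust):

    import numpy as np

    def proj_simplex(v):
        # Euclidean projection of each row of v onto the probability simplex.
        u = np.sort(v, axis=1)[:, ::-1]
        css = np.cumsum(u, axis=1) - 1.0
        ind = np.arange(1, v.shape[1] + 1)
        rho = (u - css / ind > 0).sum(axis=1)
        tau = css[np.arange(len(v)), rho - 1] / rho
        return np.maximum(v - tau[:, None], 0.0)

    def archetypal(X, k, iters=500, lr=1e-2):
        # Minimize 0.5 * ||A @ B @ X - X||^2 over row-stochastic A, B.
        n, d = X.shape
        rng = np.random.default_rng(0)
        A = proj_simplex(rng.random((n, k)))
        B = proj_simplex(rng.random((k, n)))
        for _ in range(iters):
            Z = B @ X                            # current archetypes
            R = A @ Z - X                        # residual
            A = proj_simplex(A - lr * R @ Z.T)   # projected gradient steps
            B = proj_simplex(B - lr * A.T @ R @ X.T)
        return A, B @ X

    X = np.random.default_rng(1).random((200, 2))
    A, Z = archetypal(X, k=4)                    # Z holds the 4 archetypes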
Equivalence of Approximate Message Passing and Low-Degree Polynomials in Rank-One Matrix Estimation
We consider the problem of estimating an unknown parameter vector ${\boldsymbol \theta}\in{\mathbb R}^n$, given noisy observations ${\boldsymbol Y} = {\boldsymbol \theta}{\boldsymbol \theta}^{\top}/\sqrt{n}+{\boldsymbol Z}$ of the rank-one matrix ${\bold…
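For orientation, a schematic AMP iteration for this model with a Rademacher signal: a matrix-vector product plus an Onsager memory correction. The signal strength lam, the tanh denoiser (whose state-evolution calibration is omitted), and all sizes are illustrative assumptions, not the paper's exact setup.

    import numpy as np

    rng = np.random.default_rng(0)
    n, lam, T = 3000, 2.0, 15
    theta = rng.choice([-1.0, 1.0], size=n)          # Rademacher signal (assumption)
    Z = rng.normal(size=(n, n)); Z = (Z + Z.T) / np.sqrt(2)
    A = lam * np.outer(theta, theta) / n + Z / np.sqrt(n)

    # Schematic AMP: matrix-vector step plus Onsager term.
    x = 0.1 * rng.normal(size=n)
    m_prev = np.zeros(n)
    for _ in range(T):
        m = np.tanh(x)
        b = np.mean(1 - m ** 2)                      # Onsager coefficient
        x = A @ m - b * m_prev
        m_prev = m
    print(abs(np.tanh(x) @ theta) / n)               # overlap with the signal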
Fundamental Limits of Low-Rank Matrix Estimation with Diverging Aspect Ratios
We consider the problem of estimating the factors of a low-rank $n \times d$ matrix, when this is corrupted by additive Gaussian noise. A special example of our setting corresponds to clustering mixtures of Gaussians with equal (known) cov…
Dimension free ridge regression
Random matrix theory has become a widely useful tool in high-dimensional statistics and theoretical machine learning. However, random matrix theory is largely focused on the proportional asymptotics in which the number of columns grows pro…
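The estimator in question has a one-line closed form; the paper's contribution concerns its risk beyond the proportional regime. A minimal sketch with $d \gg n$ (all sizes illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, lam = 200, 1000, 1.0                  # d >> n is allowed
    beta = rng.normal(size=d) / np.sqrt(d)
    X = rng.normal(size=(n, d))
    y = X @ beta + 0.1 * rng.normal(size=n)

    # Ridge estimator: beta_hat = (X^T X + lam I)^{-1} X^T y
    beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    print(np.linalg.norm(beta_hat - beta))      # estimation error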
Optimization of random high-dimensional functions: Structure and algorithms
Replica symmetry breaking postulates that near optima of spin glass Hamiltonians have an ultrametric structure. Namely, near optima can be associated to leaves of a tree, and the Euclidean distance between them corresponds to the distance …
Overparametrized linear dimensionality reductions: From projection pursuit to two-layer neural networks
Given a cloud of $n$ data points in $\mathbb{R}^d$, consider all projections onto $m$-dimensional subspaces of $\mathbb{R}^d$ and, for each such projection, the empirical distribution of the projected points. What does this collection of p…
A Friendly Tutorial on Mean-Field Spin Glass Techniques for Non-Physicists
This tutorial is based on lecture notes written for a class taught in the Statistics Department at Stanford in the Winter Quarter of 2017. The objective was to provide a working knowledge of some of the techniques developed over the last 4…