Ali Kavis
Online Learning-guided Learning Rate Adaptation via Gradient Alignment
The performance of an optimizer on large-scale deep learning models depends critically on fine-tuning the learning rate, often requiring an extensive grid search over base learning rates, schedules, and other hyperparameters. In this paper…
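Since the abstract is truncated, the sketch below is only a hypothetical illustration of the general idea of steering a learning rate from the alignment of successive gradients (in the spirit of hypergradient-style updates); the function name, the multiplicative rule, and all hyperparameters are assumptions, not the method proposed in the paper.

```python
import numpy as np

def sgd_with_alignment_lr(grad_fn, x0, lr0=0.1, beta=0.02, steps=100):
    """Toy SGD loop whose scalar learning rate is nudged up or down
    according to the cosine of the angle between consecutive gradients.
    Illustrative sketch only, not the paper's algorithm."""
    x, lr = np.asarray(x0, dtype=float), lr0
    g_prev = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)
        # Increase the learning rate when consecutive gradients agree,
        # decrease it when they point in opposing directions.
        align = np.dot(g, g_prev) / (np.linalg.norm(g) * np.linalg.norm(g_prev) + 1e-12)
        lr *= np.exp(beta * align)
        x = x - lr * g
        g_prev = g
    return x, lr

# Example: minimize the quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
x_hat, lr_final = sgd_with_alignment_lr(lambda x: x, x0=np.ones(5))
```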
Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting
Fine-tuning a pre-trained model on a downstream task often degrades its original capabilities, a phenomenon known as "catastrophic forgetting". This is especially an issue when one does not have access to the data and recipe used to develo…
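As a rough, hypothetical illustration of upweighting low-loss ("easy") samples during fine-tuning, the snippet below reweights per-sample losses with a softmax over their negated values; the weighting scheme and temperature are assumptions and may differ from the paper's recipe.

```python
import torch
import torch.nn.functional as F

def easy_upweighted_loss(logits, targets, temperature=1.0):
    """Weight each sample's loss by a softmax over *negative* per-sample
    losses, so that lower-loss ('easy') samples receive larger weights.
    Illustrative only; not necessarily the paper's exact weighting."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.softmax(-per_sample.detach() / temperature, dim=0)
    return (weights * per_sample).sum()

# Usage inside a fine-tuning step:
#   loss = easy_upweighted_loss(model(batch_x), batch_y)
#   loss.backward(); optimizer.step()
```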
Understanding Self-Supervised Learning via Gaussian Mixture Models
Self-supervised learning attempts to learn representations from unlabeled data; it does so via a loss function that encourages the embedding of a point to be close to that of its augmentations. This simple idea performs remarkably well, y…
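To make the loss the abstract alludes to concrete, here is a minimal, generic alignment-style self-supervised objective that pulls a point's embedding toward that of its augmentation; it is an illustration of the general idea, not the specific model analyzed in the paper.

```python
import torch
import torch.nn.functional as F

def alignment_loss(encoder, x, augment):
    """Encourage the embedding of x to lie close to the embedding of
    augment(x). A generic self-supervised alignment term, for illustration."""
    z1 = F.normalize(encoder(x), dim=-1)
    z2 = F.normalize(encoder(augment(x)), dim=-1)
    return (1 - (z1 * z2).sum(dim=-1)).mean()  # mean cosine distance
```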
Adaptive and Optimal Second-order Optimistic Methods for Minimax Optimization
We propose adaptive, line search-free second-order methods with optimal rate of convergence for solving convex-concave min-max problems. By means of an adaptive step size, our algorithms feature a simple update rule that requires solving o…
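For orientation, the snippet below sketches the classical first-order optimistic gradient method for a convex-concave min-max problem; it is only a simpler relative of the second-order optimistic methods studied in the paper and uses a fixed step size rather than the paper's adaptive rule.

```python
import numpy as np

def optimistic_gda(grad_x, grad_y, x0, y0, eta=0.1, steps=500):
    """Optimistic gradient descent-ascent for min_x max_y f(x, y):
    each update uses 2*g_t - g_{t-1} as a cheap prediction of the next
    gradient. First-order illustration with a fixed step size."""
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    gx_prev, gy_prev = grad_x(x, y), grad_y(x, y)
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = x - eta * (2 * gx - gx_prev)
        y = y + eta * (2 * gy - gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y

# Example: bilinear saddle point f(x, y) = x.T @ y.
x, y = optimistic_gda(lambda x, y: y, lambda x, y: x, np.ones(3), np.ones(3))
```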
Universal Gradient Methods for Stochastic Convex Optimization
We develop universal gradient methods for Stochastic Convex Optimization (SCO). Our algorithms automatically adapt not only to the oracle's noise but also to the Hölder smoothness of the objective function without a priori knowledge of the…
Advancing the lower bounds: An accelerated, stochastic, second-order method with optimal adaptation to inexactness
We present a new accelerated stochastic second-order method that is robust to both gradient and Hessian inexactness, which occurs typically in machine learning. We establish theoretical lower bounds and prove that our algorithm achieves op…
Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions. Our algorithm achieves $O(\sigma/\sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$, and impro…
Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization
We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011, McMahan & Streete…
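As a rough sketch of the two ingredients the abstract mentions, the code below combines a SPIDER-style recursive variance-reduced gradient estimator with an AdaGrad-norm step size; the exact step-size formula and epoch length used by AdaSpider may differ from what is shown here.

```python
import numpy as np

def adaspider_sketch(grad_i, full_grad, n, x0, epochs=10, q=50, eta0=1.0, eps=1e-8):
    """Sketch: SPIDER-style recursive gradient estimator combined with an
    AdaGrad-norm step size, for f(x) = (1/n) * sum_i f_i(x).
    grad_i(i, x): gradient of the i-th component; full_grad(x): full gradient.
    The precise step-size rule of AdaSpider may differ from this sketch."""
    x = np.asarray(x0, dtype=float)
    x_prev, v, accum = x.copy(), full_grad(x), 0.0
    for t in range(epochs * q):
        if t % q == 0:
            v = full_grad(x)                          # periodic full-gradient refresh
        else:
            i = np.random.randint(n)
            v = grad_i(i, x) - grad_i(i, x_prev) + v  # recursive SPIDER update
        accum += np.linalg.norm(v) ** 2               # AdaGrad-style accumulation
        x_prev, x = x, x - (eta0 / np.sqrt(eps + accum)) * v
    return x
```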
High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
In this paper, we propose a new, simplified high probability analysis of AdaGrad for smooth, non-convex problems. More specifically, we focus on a particular accelerated gradient (AGD) template (Lan, 2020), through which we recover the ori…
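For reference, here is a minimal implementation of the scalar ("norm") AdaGrad step size that this line of analysis typically targets; it is the plain SGD variant, not the accelerated (AGD) template discussed in the paper, and the hyperparameter names are illustrative.

```python
import numpy as np

def adagrad_norm_sgd(stoch_grad, x0, eta=1.0, eps=1e-8, steps=1000):
    """SGD with the scalar AdaGrad step size
        eta_t = eta / sqrt(eps + sum_{s<=t} ||g_s||^2),
    which requires no knowledge of smoothness or noise parameters."""
    x = np.asarray(x0, dtype=float)
    accum = 0.0
    for _ in range(steps):
        g = stoch_grad(x)
        accum += np.linalg.norm(g) ** 2
        x = x - (eta / np.sqrt(eps + accum)) * g
    return x

# Example: noisy quadratic with additive Gaussian gradient noise.
rng = np.random.default_rng(0)
x_hat = adagrad_norm_sgd(lambda x: x + 0.1 * rng.standard_normal(x.shape), np.ones(10))
```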
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
This paper analyzes the trajectories of stochastic gradient descent (SGD) to help understand the algorithm’s convergence properties in non-convex problems. We first show that the sequence of iterates generated by SGD remains bounded and co…
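For concreteness, the vanilla SGD iteration under the classical Robbins-Monro step-size conditions (step sizes sum to infinity, their squares sum to a finite value) is shown below; this is the standard setting in which almost-sure convergence results are usually stated, though the paper's assumptions may be more general.

```python
import numpy as np

def sgd(stoch_grad, x0, gamma0=0.5, steps=10_000):
    """Plain SGD with gamma_t = gamma0 / (t + 1), which satisfies the
    Robbins-Monro conditions: sum_t gamma_t = inf, sum_t gamma_t^2 < inf."""
    x = np.asarray(x0, dtype=float)
    for t in range(steps):
        x = x - (gamma0 / (t + 1)) * stoch_grad(x)
    return x
```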
Sifting through the Noise: Universal First-Order Methods for Stochastic Variational Inequalities
We examine a flexible algorithmic framework for solving monotone variational inequalities in the presence of randomness and uncertainty. The proposed template encompasses a wide range of popular first-order methods, including dual averagin…
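One of the first-order methods such a template typically covers is the extra-gradient method; a minimal deterministic version for a monotone operator F is sketched below (fixed step size, no adaptivity), purely for orientation.

```python
import numpy as np

def extragradient(F, project, x0, eta=0.1, steps=1000):
    """Extra-gradient method for the monotone variational inequality:
    find x* with <F(x*), x - x*> >= 0 for all feasible x.
    'project' maps a point back onto the feasible set."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x_half = project(x - eta * F(x))   # exploratory (leading) step
        x = project(x - eta * F(x_half))   # corrected update
    return x

# Example: rotation field F(z) = (z2, -z1) arising from the saddle f(x, y) = x*y.
sol = extragradient(lambda z: np.array([z[1], -z[0]]), lambda z: z, np.array([1.0, 1.0]))
```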
STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization
In this work we investigate stochastic non-convex optimization problems where the objective is an expectation over smooth loss functions, and the goal is to find an approximate stationary point. The most popular approach to handling such p…
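To fix ideas, here is a sketch of the STORM-style recursive momentum estimator that STORM+ builds on; in STORM+ both the momentum parameter and the step size are chosen adaptively from observed quantities, whereas the sketch below keeps them fixed for simplicity.

```python
import numpy as np

def storm_style_sgd(stoch_grad, sample, x0, eta=0.05, a=0.1, steps=1000):
    """SGD with a STORM-style variance-reduced momentum estimator:
        d_t = g(x_t; xi_t) + (1 - a) * (d_{t-1} - g(x_{t-1}; xi_t)),
    where both gradient evaluations in the correction share the sample xi_t.
    STORM+ sets a and eta adaptively; they are fixed here for simplicity."""
    x = np.asarray(x0, dtype=float)
    xi = sample()
    d = stoch_grad(x, xi)
    for _ in range(steps):
        x_prev, x = x, x - eta * d
        xi = sample()
        d = stoch_grad(x, xi) + (1 - a) * (d - stoch_grad(x_prev, xi))
    return x

# Example: noisy quadratic; xi is a fixed noise draw shared by both evaluations.
rng = np.random.default_rng(0)
x_hat = storm_style_sgd(lambda x, xi: x + 0.1 * xi,
                        lambda: rng.standard_normal(10),
                        np.ones(10))
```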
Double-Loop Unadjusted Langevin Algorithm
A well-known first-order method for sampling from log-concave probability distributions is the Unadjusted Langevin Algorithm (ULA). This work proposes a new annealing step-size schedule for ULA, which allows us to prove new convergence guaran…
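For reference, the basic ULA iteration for sampling from a density proportional to exp(-U(x)) is shown below with a fixed step size; the paper's contribution is an annealed (decreasing) step-size schedule within a double-loop structure, which is not reproduced here.

```python
import numpy as np

def ula(grad_U, x0, gamma=1e-2, steps=10_000, rng=None):
    """Unadjusted Langevin Algorithm for sampling from p(x) ~ exp(-U(x)):
        x_{k+1} = x_k - gamma * grad_U(x_k) + sqrt(2 * gamma) * N(0, I).
    Fixed step size shown; the paper studies an annealed schedule."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(steps):
        x = x - gamma * grad_U(x) + np.sqrt(2 * gamma) * rng.standard_normal(x.shape)
        samples.append(x.copy())
    return np.array(samples)

# Example: standard Gaussian target, U(x) = ||x||^2 / 2, so grad_U(x) = x.
chain = ula(lambda x: x, np.zeros(2))
```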
UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization
We propose a novel adaptive, accelerated algorithm for the stochastic constrained convex optimization setting. Our method, which is inspired by the Mirror-Prox method, simultaneously achieves the optimal rates for smooth/non-smooth …
Efficient learning of smooth probability functions from Bernoulli tests with guarantees
We study the fundamental problem of learning an unknown, smooth probability function via pointwise Bernoulli tests. We provide a scalable algorithm for efficiently solving this problem with rigorous guarantees. In particular, we prove the …