Sholom Schechtman
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
We analyze the implicit bias of constant step stochastic subgradient descent (SGD). We consider the setting of binary classification with homogeneous neural networks - a large class of deep neural networks with ReLU-type activation functio…
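As an illustration of the setting only (not the paper's analysis), here is a minimal numpy sketch of constant-step stochastic subgradient descent on a toy binary-classification task with a two-layer ReLU network, which is 2-homogeneous in its parameters; the architecture, loss, and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with labels in {-1, +1}.
n, d, h = 200, 5, 16
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=n))

# Two-layer ReLU network f(x) = w2^T relu(W1 x): 2-homogeneous in (W1, w2).
W1 = 0.1 * rng.normal(size=(h, d))
w2 = 0.1 * rng.normal(size=h)

def subgrad(W1, w2, x, label):
    """One subgradient of the logistic loss log(1 + exp(-y f(x)));
    relu'(0) is taken to be 0, a valid subgradient choice."""
    z = W1 @ x                                # pre-activations
    a = np.maximum(z, 0.0)                    # ReLU
    f = w2 @ a                                # network output
    s = -label / (1.0 + np.exp(label * f))    # d loss / d f
    gw2 = s * a
    gW1 = np.outer(s * w2 * (z > 0), x)
    return gW1, gw2

step = 1e-2                                    # constant step size
for _ in range(5000):
    i = rng.integers(n)                        # sample one example (SGD)
    gW1, gw2 = subgrad(W1, w2, X[i], y[i])
    W1 -= step * gW1
    w2 -= step * gw2

margins = y * np.array([w2 @ np.maximum(W1 @ x, 0.0) for x in X])
print("min margin:", margins.min(), "train accuracy:", (margins > 0).mean())
```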
The gradient's limit of a definable family of functions admits a variational stratification
It is well-known that the convergence of a family of smooth functions does not imply the convergence of its gradients. In this work, we show that if the family is definable in an o-minimal structure (for instance semialgebraic, subanalytic…
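A classical one-dimensional illustration of the first sentence (this particular family is not definable in any o-minimal structure, so it falls outside the paper's setting):

```latex
f_n(x) = \frac{\sin(nx)}{n}, \qquad
\sup_{x \in \mathbb{R}} |f_n(x)| = \frac{1}{n} \xrightarrow[n \to \infty]{} 0,
\qquad \text{yet } f_n'(x) = \cos(nx) \text{ does not converge.}
```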
Stochastic Subgradient Descent Escapes Active Strict Saddles on Weakly Convex Functions
In nonsmooth stochastic optimization, we establish the nonconvergence of stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold $M$, wher…
SignSVRG: fixing SignSGD via variance reduction
We consider the problem of unconstrained minimization of finite sums of functions. We propose a simple, yet practical way to incorporate variance reduction techniques into SignSGD, guaranteeing convergence that is similar to the full sign…
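One natural way to combine an SVRG-style control variate with sign updates is sketched below; this combination is an assumption about the general idea and not necessarily the exact update rule of the paper.

```python
import numpy as np

def signsvrg_sketch(grads, x0, step=1e-3, epochs=30, inner=None, seed=0):
    """Sketch: an SVRG-style variance-reduced gradient estimate fed into a
    sign step. `grads` is a list of per-example gradient functions g_i(x);
    the full gradient is their average."""
    rng = np.random.default_rng(seed)
    n = len(grads)
    inner = inner or n
    x = x0.copy()
    for _ in range(epochs):
        x_ref = x.copy()
        full = sum(g(x_ref) for g in grads) / n       # snapshot full gradient
        for _ in range(inner):
            i = rng.integers(n)
            v = grads[i](x) - grads[i](x_ref) + full  # variance-reduced estimate
            x -= step * np.sign(v)                    # sign step
    return x

# Tiny least-squares instance: f(x) = (1/n) sum_i 0.5 * (a_i @ x - b_i)^2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
grads = [lambda x, a=a_i, bi=b_i: a * (a @ x - bi) for a_i, b_i in zip(A, b)]
x = signsvrg_sketch(grads, np.zeros(5))
print("objective:", 0.5 * np.mean((A @ x - b) ** 2))
```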
Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold
We consider the problem of minimizing a non-convex function over a smooth manifold $\mathcal{M}$. We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM) which only requires computing a projection onto a…
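For context only, and not the paper's ODCGM (the snippet above is cut off before specifying what ODCGM projects onto), a standard projected-gradient baseline on the Stiefel manifold $\{X : X^\top X = I\}$ uses the SVD-based nearest-point projection:

```python
import numpy as np

def project_stiefel(Y):
    """Nearest point of {X : X^T X = I} in Frobenius norm: the polar
    factor obtained from the thin SVD of Y."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

# Baseline projected-gradient method for min_X f(X) s.t. X^T X = I,
# illustrated on f(X) = -trace(X^T A X), whose minimizers span the
# top eigenvectors of the symmetric matrix A.
rng = np.random.default_rng(0)
n, p = 20, 3
B = rng.normal(size=(n, n))
A = B + B.T
X = project_stiefel(rng.normal(size=(n, p)))

step = 1e-2
for _ in range(2000):
    grad = -2.0 * A @ X                    # Euclidean gradient of -trace(X^T A X)
    X = project_stiefel(X - step * grad)   # gradient step, then project back

print("feasibility error:", np.linalg.norm(X.T @ X - np.eye(p)))
print("objective:", -np.trace(X.T @ A @ X))
```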
AskewSGD: An Annealed interval-constrained Optimisation method to train Quantized Neural Networks
In this paper, we develop a new algorithm, Annealed Skewed SGD - AskewSGD - for training deep neural networks (DNNs) with quantized weights. First, we formulate the training of quantized neural networks (QNNs) as a smoothed sequence of int…
First-Order Constrained Optimization: Non-smooth Dynamical System Viewpoint
In a recent paper, Muehlebach and Jordan (2021a) proposed a novel algorithm for constrained optimization that uses original ideas from nonsmooth dynamical systems. In this work, we extend Muehlebach and Jordan (2021a) in several important…
Some Problems in Nonconvex Stochastic Optimization
The subject of this thesis is the analysis of several stochastic algorithms in a nonconvex setting. The aim is to prove and characterize their convergence. First, we study a smooth optimization problem, analyzing a family of adaptive algor…
Stochastic Subgradient Descent on a Generic Definable Function Converges to a Minimizer
It was previously shown by Davis and Drusvyatskiy that every Clarke critical point of a generic, semialgebraic (and more generally definable in an o-minimal structure), weakly convex function lies on an active manifold and is either a …
Convergence of constant step stochastic gradient descent for non-smooth non-convex functions
This paper studies the asymptotic behavior of the constant step Stochastic Gradient Descent for the minimization of an unknown function $F$, defined as the expectation of a non-convex, non-smooth, locally Lipschitz random function. As the g…
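A minimal sketch of the constant-step scheme on a concrete instance: the sample loss below (a one-dimensional robust phase-retrieval flavour) is an illustrative choice of non-convex, non-smooth, locally Lipschitz random function, not taken from the paper.

```python
import numpy as np

# Constant-step stochastic subgradient descent on F(x) = E[f(x, a, b)]
# with the nonsmooth, nonconvex sample loss f(x, a, b) = | |a*x| - b |.
rng = np.random.default_rng(0)
x_true = 2.0

def sample_subgrad(x):
    a = rng.normal()
    b = abs(a * x_true)                       # noiseless measurement
    r = abs(a * x) - b
    # One element of the Clarke subdifferential (sign(0) taken as 0).
    return np.sign(r) * a * np.sign(a * x)

gamma = 1e-2                                  # constant step size
x = 0.5
iterates = []
for _ in range(20000):
    x -= gamma * sample_subgrad(x)
    iterates.append(x)

# With a constant step, the iterates settle in a small band around a
# critical point ({-2, 0, 2} here) rather than converging exactly.
print("last iterate:", x)
print("spread over final 1000 iterates:", np.ptp(iterates[-1000:]))
```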