Alexander Gasnikov
Training-Free Out-Of-Distribution Segmentation With Foundation Models
Detecting unknown objects in semantic segmentation is crucial for safety-critical applications such as autonomous driving. Large vision foundation models, including DINOv2, InternImage, and CLIP, have advanced visual representation learnin…
Modeling skiers flows via Wardrope equilibrium in closed capacitated networks
We propose an equilibrium model of ski resorts where users are assigned to cycles in a closed network. As queues form on lifts with limited capacity, we derive an efficient way to find waiting times via convex optimization. The equilibrium…
Sign Operator for Coping with Heavy-Tailed Noise in Non-Convex Optimization: High Probability Bounds Under $(L_0, L_1)$-Smoothness
In recent years, non-convex optimization problems have increasingly been described by the generalized $(L_0, L_1)$-smoothness assumption rather than the standard one. Meanwhile, severely corrupted data used in these problems has increased the demand for m…
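The core idea behind the sign operator can be illustrated with a minimal sketch: the update uses only the sign of each stochastic gradient coordinate, so a single heavy-tailed outlier cannot inflate the step. The toy objective, noise distribution, and step size below are illustrative assumptions, not the paper's algorithm or its $(L_0, L_1)$-smoothness setting.

```python
import numpy as np

def sign_sgd(grad_fn, x0, step=0.01, iters=2000, rng=None):
    """Minimal signSGD sketch: descend along the sign of the noisy
    gradient, so the step magnitude is fixed regardless of how large
    a heavy-tailed gradient sample happens to be."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - step * np.sign(grad_fn(x, rng))
    return x

# Toy quadratic f(x) = ||x||^2 / 2 with Student-t gradient noise;
# df < 2 means the noise has infinite variance (heavy tails).
def noisy_grad(x, rng):
    return x + rng.standard_t(df=1.5, size=x.shape)

x = sign_sgd(noisy_grad, x0=np.array([5.0, -3.0]))
```

Despite the infinite-variance noise, the iterate settles into a small neighborhood of the minimizer at the origin, which is the qualitative behavior the abstract's high-probability bounds formalize.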
Power of Generalized Smoothness in Stochastic Convex Optimization: First- and Zero-Order Algorithms
This paper is devoted to the study of stochastic optimization problems under the generalized smoothness assumption. By considering the unbiased gradient oracle in Stochastic Gradient Descent, we provide strategies to achieve in bounds the …
Decentralised convex optimisation with probability-proportional-to-size quantization
Communication is one of the bottlenecks of distributed optimisation and learning. To overcome this bottleneck, we propose a novel quantization method that transforms a vector into a sample of components' indices drawn from a categorical di…
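The quantization idea the abstract describes, transmitting sampled coordinate indices instead of the full vector, can be sketched as follows. The exact encoding in the paper may differ; this illustrates sampling indices from a categorical distribution with probabilities proportional to coordinate magnitude, and the importance-weighted reconstruction that makes the estimate unbiased.

```python
import numpy as np

def pps_quantize(v, k, rng):
    """Sketch of probability-proportional-to-size quantization: the
    'message' is k sampled coordinate indices; reconstruction divides
    by the sampling probability so the estimator is unbiased."""
    p = np.abs(v) / np.abs(v).sum()          # categorical distribution over indices
    idx = rng.choice(len(v), size=k, p=p)    # transmitted indices
    v_hat = np.zeros_like(v, dtype=float)
    np.add.at(v_hat, idx, v[idx] / (k * p[idx]))  # importance-weighted estimate
    return v_hat

rng = np.random.default_rng(0)
v = np.array([4.0, -2.0, 1.0, 0.5])
# Unbiasedness check: average many independent reconstructions.
est = np.mean([pps_quantize(v, k=2, rng=rng) for _ in range(10000)], axis=0)
```

Each individual `v_hat` is very sparse (at most `k` nonzeros), which is what saves communication, while the average over many reconstructions recovers `v`.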
Ruppert-Polyak averaging for Stochastic Order Oracle
Black-box optimization, a rapidly growing field, faces challenges due to limited knowledge of the objective function's internal mechanisms. One promising approach to address this is the Stochastic Order Oracle Concept. This concept, simila…
Accelerated Bregman gradient methods for relatively smooth and relatively Lipschitz continuous minimization problems
In this paper, we propose some accelerated methods for solving optimization problems under the condition of relatively smooth and relatively Lipschitz continuous functions with an inexact oracle. We consider the problem of minimizing the c…
Accelerated zero-order SGD under high-order smoothness and overparameterized regime
We present a novel gradient-free algorithm to solve a convex stochastic optimization problem, such as those encountered in medicine, physics, and machine learning (e.g., adversarial multi-armed bandit problem), where the objective function…
OPTAMI: Global Superlinear Convergence of High-order Methods
Second-order methods for convex optimization outperform first-order methods in terms of theoretical iteration convergence, achieving rates up to $O(k^{-5})$ for highly-smooth functions. However, their practical performance and applications…
Nesterov's method of dichotomy via Order Oracle: The problem of optimizing a two-variable function on a square
The challenges of black box optimization arise due to imprecise responses and limited output information. This article describes new results on optimizing multivariable functions using an Order Oracle, which provides access only to the ord…
Local SGD for Near-Quadratic Problems: Improving Convergence under Unconstrained Noise Conditions
Distributed optimization plays an important role in modern large-scale machine learning and data processing systems by optimizing the utilization of computational resources. One of the classical and popular approaches is Local Stochastic G…
Average-case optimization analysis for distributed consensus algorithms on regular graphs
The consensus problem in distributed computing involves a network of agents aiming to compute the average of their initial vectors through local communication, represented by an undirected graph. This paper focuses on the study of this …
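The consensus iteration itself is simple to sketch: each node repeatedly replaces its value with a weighted average of its own and its neighbors' values (the update $x \leftarrow Wx$ for a mixing matrix $W$). The 4-cycle graph and mixing weight below are illustrative choices, not the paper's setting.

```python
import numpy as np

def consensus_round(x, neighbors, w):
    """One synchronous gossip step x <- W x: every node mixes its value
    with its neighbors' values. On a connected regular graph with a
    suitable weight w, iterating this converges to the global average."""
    return np.array([(1 - w * len(nb)) * x[i] + w * sum(x[j] for j in nb)
                     for i, nb in enumerate(neighbors)])

# 4-cycle, a 2-regular graph; each node starts with a private value.
neighbors = [[1, 3], [0, 2], [1, 3], [0, 2]]
x = np.array([1.0, 5.0, 3.0, 7.0])
for _ in range(100):
    x = consensus_round(x, neighbors, w=0.3)
```

After enough rounds every entry of `x` equals the average of the initial values (here 4.0); the convergence rate is governed by the spectral gap of $W$, which is exactly the quantity an average-case analysis on regular graphs studies.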
An Equilibrium Dynamic Traffic Assignment Model with Linear Programming Formulation
In this paper, we consider a dynamic equilibrium transportation problem. There is a fixed number of cars moving from origin to destination areas. Preferences for arrival times are expressed as a cost of arriving before or after the preferr…
Exploring Applications of State Space Models and Advanced Training Techniques in Sequential Recommendations: A Comparative Study on Efficiency and Performance
Recommender systems aim to estimate the dynamically changing user preferences and sequential dependencies between historical user behaviour and metadata. Although transformer-based models have proven to be effective in sequential recommend…
Decentralized Optimization with Coupled Constraints
We consider the decentralized minimization of a separable objective $\sum_{i=1}^{n} f_i(x_i)$, where the variables are coupled through an affine constraint $\sum_{i=1}^n\left(\mathbf{A}_i x_i - b_i\right) = 0$. We assume that the functions…
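A classical way to handle this problem structure, sketched here purely for illustration (the paper's actual algorithm may differ), is dual decomposition: each node minimizes its own Lagrangian term locally, and only the residual of the coupled constraint is used to update a shared multiplier $y$. The quadratic objectives and identity matrices $\mathbf{A}_i = \mathbf{I}$ below are toy assumptions chosen so the local step has a closed form.

```python
import numpy as np

# Toy instance of  min sum_i f_i(x_i)  s.t.  sum_i (A_i x_i - b_i) = 0
# with f_i(x) = ||x - c_i||^2 / 2 and A_i = I, solved by dual ascent.
c = np.array([[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]])   # local targets c_i
b = np.array([[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]])   # local offsets b_i
y = np.zeros(2)                                       # dual variable for the coupling
for _ in range(200):
    x = c - y                        # closed-form local argmin of f_i(x_i) + <y, x_i>
    residual = (x - b).sum(axis=0)   # violation of the coupled constraint
    y = y + 0.2 * residual           # multiplier (dual ascent) step
```

The dual step needs only the summed residual, which is exactly the kind of quantity a decentralized method can accumulate by communication; at convergence the residual vanishes and the local solutions are primal-feasible.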
Stochastic Frank-Wolfe: Unified Analysis and Zoo of Special Cases
The Conditional Gradient (or Frank-Wolfe) method is one of the most well-known methods for solving constrained optimization problems appearing in various machine learning tasks. The simplicity of iteration and applicability to many practic…
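The simplicity the abstract mentions is visible in a textbook (deterministic) sketch of the method; the stochastic variants unified in the paper build on this same iteration. Each step needs only a linear minimization over the feasible set, which over the probability simplex is solved by a one-hot vertex, so iterates stay feasible with no projection. The objective and step-size schedule below are standard illustrative choices.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, iters=2000):
    """Frank-Wolfe (conditional gradient) over the probability simplex:
    the linear subproblem min_s <grad, s> is solved by putting all mass
    on the coordinate with the smallest gradient entry."""
    x = np.asarray(x0, dtype=float)
    for t in range(iters):
        g = grad(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0            # vertex solving the linear subproblem
        gamma = 2.0 / (t + 2.0)          # classic O(1/t) step-size schedule
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Minimize ||x - p||^2 over the simplex, with p itself in the simplex.
p = np.array([0.2, 0.5, 0.3])
x = frank_wolfe_simplex(lambda x: 2.0 * (x - p), np.array([1.0, 0.0, 0.0]))
```

Every iterate is a convex combination of simplex vertices, so feasibility is automatic, which is the property that makes the method attractive for constrained machine learning problems.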
Accelerated Stochastic Gradient Method with Applications to Consensus Problem in Markov-Varying Networks
Stochastic optimization is a vital field in the realm of mathematical optimization, finding applications in diverse areas ranging from operations research to machine learning. In this paper, we introduce a novel first-order optimization al…
Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed
Methods with adaptive stepsizes, such as AdaGrad and Adam, are essential for training modern Deep Learning models, especially Large Language Models. For the latter, the noise in the stochastic gradients is typically heavy-tailed. Gradi…
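The mechanism can be sketched with a clipped AdaGrad-Norm-style loop; this is an illustrative assumption of how clipping composes with a norm-adaptive stepsize, not the paper's exact method or guarantees. Clipping rescales any gradient whose norm exceeds a threshold, so rare huge heavy-tailed samples cannot dominate either the step or the accumulated normalizer.

```python
import numpy as np

def clip(g, lam):
    """Rescale g so its Euclidean norm never exceeds lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else g * (lam / norm)

def clipped_adagrad_norm(grad_fn, x0, lam=1.0, eta=1.0, iters=3000, rng=None):
    """AdaGrad-Norm-style sketch: one global accumulator of squared
    (clipped) gradient norms sets a shrinking stepsize eta / sqrt(acc)."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    acc = 1e-8
    for _ in range(iters):
        g = clip(grad_fn(x, rng), lam)
        acc += np.linalg.norm(g) ** 2
        x = x - (eta / np.sqrt(acc)) * g
    return x

# Toy quadratic with infinite-variance Student-t gradient noise.
def noisy_grad(x, rng):
    return x + rng.standard_t(df=1.5, size=x.shape)

x = clipped_adagrad_norm(noisy_grad, np.array([3.0, -2.0]))
```

Without the `clip` call, a single enormous noise sample could both take a huge step and permanently inflate `acc`, stalling all later progress; with it, the iterate approaches the minimizer at the origin.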
Local Methods with Adaptivity via Scaling
The rapid development of machine learning and deep learning has introduced increasingly complex optimization challenges that must be addressed. Indeed, training modern, advanced models has become difficult to implement without leveraging m…
Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks
We consider the task of minimizing the sum of convex functions stored in a decentralized manner across the nodes of a communication network. This problem is relatively well-studied in the scenario when the objective functions are smooth, o…
Exploring Jacobian Inexactness in Second-Order Methods for Variational Inequalities: Lower Bounds, Optimal Algorithms and Quasi-Newton Approximations
Variational inequalities represent a broad class of problems, including minimization and min-max problems, commonly found in machine learning. Existing second-order and high-order methods for variational inequalities require precise comput…
Higher Degree Inexact Model for Optimization problems
In this paper, we propose a new concept: the inexact higher-degree $(δ, L, q)$-model of a function, which generalizes the inexact $(δ, L)$-model, the $(δ, L)$-oracle, and the $(δ, L)$-oracle of degree $q \in [0,2)$. Some examples we…
Accelerated Methods with Compression for Horizontal and Vertical Federated Learning
Distributed optimization algorithms have emerged as a superior approach to solving machine learning problems. To accommodate the diverse ways in which data can be stored across devices, these methods must be adaptable to a wide range of…
Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning
We propose a novel architecture and method of explainable classification with Concept Bottleneck Models (CBMs). While SOTA approaches to the image classification task work as black boxes, there is a growing demand for models that would provide…
Extragradient Sliding for Composite Non-Monotone Variational Inequalities
Variational inequalities offer a versatile and straightforward approach to analyzing a broad range of equilibrium problems in both theoretical and practical fields. In this paper, we consider a composite generally non-monotone variational …
Optimal Flow Matching: Learning Straight Trajectories in Just One Step
Over the last several years, there has been a boom in the development of Flow Matching (FM) methods for generative modeling. One intriguing property pursued by the community is the ability to learn flows with straight trajectories which real…