Explanipedia

Gradient Descent Provably Optimizes Over-parameterized Neural Networks Open

Simon S. Du, Xiyu Zhai, Barnabás Póczos, Aarti Singh · 2018

One of the mysteries in the success of neural networks is that randomly initialized first-order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth. This paper demystifies…

Fair Resource Allocation in Federated Learning Open

Tian Li, Maziar Sanjabi, Ahmad Beirami, Virginia Smith · 2019

Computer science Mathematics Psychology

Federated learning involves training statistical models in massive, heterogeneous networks. Naively minimizing an aggregate loss function in such a network may disproportionately advantage or disadvantage some of the devices. In this work,…

Parallel and Distributed Methods for Constrained Nonconvex Optimization—Part I: Theory Open

Gesualdo Scutari, Francisco Facchinei, Lorenzo Lampariello · 2016

Computer science Mathematics Psychology

In Part I of this paper, we proposed and analyzed a novel algorithmic framework for the minimization of a nonconvex (smooth) objective function, subject to nonconvex constraints, based on inner convex approximations. This Part II is dev…

Relatively Smooth Convex Optimization by First-Order Methods, and Applications Open

Haihao Lu, Robert M. Freund, Yurii Nesterov · 2018

Mathematics Economics Biology

The usual approach to developing and analyzing first-order methods for smooth convex optimization assumes that the gradient of the objective function is uniformly smooth with some Lipschitz constant $L$. However, in many settings the diffe…

SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient Open

Lam M. Nguyen, Jie Liu, Katya Scheinberg, Martin Takáč · 2017

Computer science Mathematics Philosophy

In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its practical variant SARAH+, as a novel approach to the finite-sum minimization problems. Different from the vanilla SGD and other modern stochasti…

A Decentralized Proximal-Gradient Method With Network Independent Step-Sizes and Separated Convergence Rates Open

Zhi Li, Wei Shi, Ming Yan · 2019

Mathematics Computer science Economics

This paper proposes a novel proximal-gradient algorithm for a decentralized optimization problem with a composite objective containing smooth and non-smooth terms. Specifically, the smooth and nonsmooth terms are dealt with by gradient …

q-Hermite Hadamard inequalities and quantum estimates for midpoint type inequalities via convex and quasi-convex functions Open

Necmettin Alp, Mehmet Zeki Sarıkaya, Mehmet Kunt, İmdat Işcan · 2016

Mathematics Biology

In this paper, we prove the correct q-Hermite–Hadamard inequality, some new q-Hermite–Hadamard inequalities, and generalized q-Hermite–Hadamard inequality. By using the left hand part of the correct q-Hermite–Hadamard inequality, we have a…

BAS-ADAM: an ADAM based approach to improve the performance of beetle antennae search optimizer Open

Ameer Hamza Khan, Xinwei Cao, Shuai Li, Vasilios N. Katsikis, Liefa Liao · 2020

Computer science Mathematics Economics

In this paper, we propose enhancements to the Beetle Antennae Search (BAS) algorithm, called BAS-ADAM, to smoothen the convergence behavior and avoid trapping in local minima for a highly non-convex objective function. We achieve this by ada…

Asynchronous Distributed ADMM for Large-Scale Optimization—Part I: Algorithm and Convergence Analysis Open

Tsung‐Hui Chang, Mingyi Hong, Wei-Cheng Liao, Xiangfeng Wang · 2016

Computer science Mathematics Economics

Aiming at solving large-scale learning problems, this paper studies distributed optimization methods based on the alternating direction method of multipliers (ADMM). By formulating the learning problem as a consensus problem, the ADMM c…

On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning Open

Bolin Gao, Lacra Pavel · 2017

Mathematics Computer science Engineering

In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax …

Convergence Rate of Distributed ADMM over Networks Open

Ali Makhdoumi, Asuman Ozdaglar · 2019

Mathematics Computer science Physics

We propose a new distributed algorithm based on the alternating direction method of multipliers (ADMM) to minimize the sum of locally known convex functions using communication over a network. This optimization problem emerges in many applications…

Dynamic Control of Agents Playing Aggregative Games With Coupling Constraints Open

Sergio Grammatico · 2017

Computer science Mathematics Economics

We address the problem of controlling a population of noncooperative heterogeneous agents, each with a convex cost function depending on the average population state, and all sharing a convex constraint, toward an aggregative equilibrium. We ass…

LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning Open

Tianyi Chen, Georgios B. Giannakis, Tao Sun, Wotao Yin · 2018

Computer science Mathematics Philosophy

This paper presents a new class of gradient methods for distributed machine learning that adaptively skip the gradient calculations to learn with reduced communication and computation. Simple rules are designed to detect slowly-varying …

Optimal algorithms for smooth and strongly convex distributed optimization in networks Open

Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié · 2017

Mathematics Computer science Biology

In this paper, we determine the optimal convergence rates for strongly convex and smooth distributed optimization in two settings: centralized and decentralized communications over a network. For centralized (i.e. master/slave) algorithms,…

On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport Open

Lénaïc Chizat, Francis Bach · 2018

Computer science Mathematics Economics

Many tasks in machine learning and signal processing can be solved by minimizing a convex function of a measure. This includes sparse spikes deconvolution or training a neural network with a single hidden layer. For these problems, we s…

Matrix Completion has No Spurious Local Minimum Open

Rong Ge, Jason D. Lee, Tengyu Ma · 2016

Computer science Mathematics Physics

Matrix completion is a basic machine learning problem that has wide applications, especially in collaborative filtering and recommender systems. Simple non-convex optimization algorithms are popular and effective in practice. Despite recen…

Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication Open

Anastasiia Koloskova, Sebastian U. Stich, Martin Jaggi · 2019

Computer science Mathematics Psychology

We consider decentralized stochastic optimization with the objective function (e.g. data samples for machine learning task) being distributed over $n$ machines that can only communicate to their neighbors on a fixed communication graph. To…

THE SHARP BOUND FOR THE HANKEL DETERMINANT OF THE THIRD KIND FOR CONVEX FUNCTIONS Open

Bogumiła Kowalczyk, Adam Lecko, Young Jae Sim · 2018

Mathematics

We prove the sharp inequality $|H_{3,1}(f)|\leq 4/135$ for convex functions, that is, for analytic functions $f$ with $a_{n}:=f^{(n)}(0)/n!,~n\in \mathbb{N}$ , such that $$\begin{eqnarray}Re\bigg\{1+\frac{zf^{\prime \prime }(z)}{f^{\prime …

An Effective Optimization Method for Machine Learning Based on ADAM Open

Dokkyun Yi, Jae-Hyun Ahn, Sangmin Ji · 2020

Computer science Mathematics Biology

A machine is taught by finding the minimum value of the cost function which is induced by learning data. Unfortunately, as the amount of learning increases, the non-linear activation function in the artificial neural network (ANN), the comp…

On Analog Gradient Descent Learning Over Multiple Access Fading Channels Open

Tomer Sery, Kobi Cohen · 2020

Computer science Mathematics Economics

We consider a distributed learning problem over multiple access channel (MAC) using a large wireless network. The computation is made by the network edge and is based on received data from a large number of distributed nodes which trans…

New Jensen and Hermite–Hadamard type inequalities for h-convex interval-valued functions Open

Dafang Zhao, Tianqing An, Guoju Ye, Wei Liu · 2018

Mathematics Biology

In this paper, we introduce the h-convex concept for interval-valued functions. By using the h-convex concept, we present new Jensen and Hermite–Hadamard type inequalities for interval-valued functions. Our inequalities generalize some kno…

A Class of Prediction-Correction Methods for Time-Varying Convex Optimization Open

Andrea Simonetto, Aryan Mokhtari, Alec Koppel, Geert Leus, Alejandro Ribeiro · 2016

Mathematics Computer science Biology

This paper considers unconstrained convex optimization problems with time-varying objective functions. We propose algorithms with a discrete time-sampling scheme to find and track the solution trajectory based on prediction and correction …

Stochastic Successive Convex Approximation for Non-Convex Constrained Stochastic Optimization Open

An Liu, Vincent K. N. Lau, Borna Kananian · 2019

Mathematics Computer science

This paper proposes a constrained stochastic successive convex approximation (CSSCA) algorithm to find a stationary point for a general non-convex stochastic optimization problem, whose objective and constraint functions are non-convex and…

Coefficient estimates of new classes of q-starlike and q-convex functions of complex order Open

T. M. Seoudy, M. K. Aouf · 2016

Mathematics Economics

We introduce new classes of q-starlike and q-convex functions of complex order involving the q-derivative operator defined in the open unit disc. Furthermore, we find estimates on the coefficients for second and third coefficients of the…

Some new inequalities of Hermite-Hadamard type for s-convex functions with applications Open

Muhammad Adil Khan, Yu‐Ming Chu, Tahir Ullah Khan, Jamroz Khan · 2017

Mathematics Biology

In this paper, we present several new and generalized Hermite-Hadamard type inequalities for s-convex as well as s-concave functions via classical and Riemann-Liouville fractional integrals. As applications, we provide new error estimation…

Differentially Private Empirical Risk Minimization Revisited: Faster and More General Open

Di Wang, Minwei Ye, Jinhui Xu · 2018

Mathematics Computer science Biology

In this paper we study the differentially private Empirical Risk Minimization (ERM) problem in different settings. For smooth (strongly) convex loss function with or without (non)-smooth regularization, we give algorithms that achieve e…

Robust Federated Learning With Noisy Communication Open

Fan Ang, Li Chen, Nan Zhao, Yunfei Chen, Weidong Wang, et al. · 2020

Computer science Mathematics Engineering

Federated learning is a communication-efficient training process that alternates between local training at the edge devices and averaging of the updated local model at the center server. Nevertheless, it is impractical to achieve perfect ac…

Third Hankel Determinants for Subclasses of Univalent Functions Open

Paweł Zaprawa · 2016

Mathematics Economics

The main aim of this paper is to discuss the third Hankel determinants for three classes: $S^*$ of starlike functions, $\mathcal{K}$ of convex functions and $\mathcal{R}$ of functions whose derivative has a positive real part. More…

Variance Reduction for Faster Non-Convex Optimization Open

Zeyuan Allen-Zhu, Elad Hazan · 2016

Mathematics Computer science

We consider the fundamental problem in non-convex optimization of efficiently reaching a stationary point. In contrast to the convex case, in the long history of this basic problem, the only known theoretical results on first-order non-con…

Convex function ≈ Convex function