Anirbit Mukherjee
Noisy PDE Training Requires Bigger PINNs
Physics-Informed Neural Networks (PINNs) are increasingly used to approximate solutions of partial differential equations (PDEs), especially in high dimensions. In real-world applications, data samples are noisy, so it is important to know…
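For concreteness, a minimal sketch of the kind of training objective involved is given below: a PDE-residual term on collocation points combined with a data-fit term on noisy measurements. The 1D heat equation, the exact solution used to synthesize the noisy samples, and all hyperparameters are illustrative assumptions, not the paper's setting.

```python
# Sketch: PINN training with noisy supervising data (assumed 1D heat equation u_t = u_xx).
import torch, torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def residual(x, t):
    x = x.clone().requires_grad_(True); t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

# Noisy measurements of an (assumed) exact solution sin(pi*x)*exp(-pi^2*t).
x_d, t_d = torch.rand(256, 1), torch.rand(256, 1)
u_noisy = torch.sin(torch.pi * x_d) * torch.exp(-torch.pi**2 * t_d) + 0.05 * torch.randn(256, 1)

for step in range(2000):
    x_c, t_c = torch.rand(512, 1), torch.rand(512, 1)          # collocation points
    physics = residual(x_c, t_c).pow(2).mean()
    data_fit = (net(torch.cat([x_d, t_d], dim=1)) - u_noisy).pow(2).mean()
    loss = physics + data_fit
    opt.zero_grad(); loss.backward(); opt.step()
```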
Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
In this work, we establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size and for any data, and we give non-asymptotic convergence rates for it. We achieve this by showing that under Total Variation …
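As a concrete reference point, the sketch below implements one Langevin Monte-Carlo step on a regularized depth-2 empirical risk; the tanh activation, step size, inverse temperature and regularization constant are illustrative assumptions rather than the paper's exact setting.

```python
# Sketch of one LMC step: theta <- theta - eta * grad + sqrt(2*eta/beta) * Gaussian noise.
import torch

def lmc_step(params, loss_fn, eta=1e-3, beta=1e4):
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(-eta * g + torch.randn_like(p) * (2 * eta / beta) ** 0.5)
    return loss.item()

# Depth-2 net x -> a^T sigma(W x), with an assumed smooth activation and l2 regularization.
W = torch.randn(16, 4, requires_grad=True)
a = torch.randn(16, requires_grad=True)
X, y = torch.randn(128, 4), torch.randn(128)

def loss_fn(params):
    W, a = params
    pred = torch.tanh(X @ W.T) @ a
    return ((pred - y) ** 2).mean() + 1e-3 * (W.pow(2).sum() + a.pow(2).sum())

for _ in range(100):
    lmc_step([W, a], loss_fn)
```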
Global convergence of SGD on two layer neural nets
In this note, we consider appropriately regularized $\ell_2$-empirical risk of depth $2$ nets with any number of gates and show bounds on how the empirical loss evolves for Stochastic Gradient Descent (SGD) iterates on it, for arbitrary …
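A minimal sketch of the setup described above, SGD on a regularized squared-loss empirical risk of a depth-2 net, is given below; the sigmoid activation, the fixed outer layer and all constants are illustrative assumptions.

```python
# Sketch: minibatch SGD on a regularized squared-loss risk of a depth-2 net.
import torch

n, d, width = 256, 8, 32
X, y = torch.randn(n, d), torch.randn(n)
W = torch.randn(width, d, requires_grad=True)    # inner layer (trained)
a = torch.randn(width) / width                   # outer layer (held fixed, for simplicity)
opt = torch.optim.SGD([W], lr=1e-2)
lam = 1e-3                                       # regularization strength (assumed)

for step in range(1000):
    idx = torch.randint(0, n, (16,))             # minibatch
    pred = torch.sigmoid(X[idx] @ W.T) @ a
    loss = ((pred - y[idx]) ** 2).mean() + lam * W.pow(2).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```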
Improving PINNs By Algebraic Inclusion of Boundary and Initial Conditions
"AI for Science" aims to solve fundamental scientific problems using AI techniques. As most physical phenomena can be described as Partial Differential Equations (PDEs) , approximating their solutions using neural networks has evolved as a…
Investigating the ability of PINNs to solve Burgers' PDE near finite-time blowup
Physics Informed Neural Networks (PINNs) have been achieving ever newer feats of solving complicated Partial Differential Equations (PDEs) numerically while offering an attractive trade-off between accuracy and speed of inference. A partic…
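For reference, the sketch below computes the PINN residual of the inviscid Burgers' equation $u_t + u\,u_x = 0$, whose solutions can blow up in finite time; the network architecture, 1D setting and collocation sampling are illustrative assumptions.

```python
# Sketch: PINN residual for u_t + u * u_x = 0 via automatic differentiation.
import torch, torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))

def burgers_residual(x, t):
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    return u_t + u * u_x                 # squared and averaged to form the physics loss

x, t = torch.rand(512, 1), torch.rand(512, 1)
physics_loss = burgers_residual(x, t).pow(2).mean()
```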
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
We present and analyze a novel regularized form of the gradient clipping algorithm, proving that it converges to global minima of the loss surface of deep neural networks under the squared loss, provided that the layers are of sufficient w…
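The sketch below shows one plausible reading of such a regularized clipping step: norm-based gradient clipping whose scaling factor is bounded below, so the effective step size can never collapse to zero. The exact functional form, and the constants eta, gamma and delta, are assumptions and may differ from the paper's algorithm.

```python
# Sketch: gradient clipping with a floor ("regularization") on the scaling factor.
# Assumes .grad has already been populated by a backward pass.
import torch

def delta_regularized_clip_step(params, eta=0.1, gamma=1.0, delta=1e-2):
    grads = [p.grad for p in params]
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = eta * max(min(1.0, gamma / (total_norm.item() + 1e-12)), delta)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(-scale * g)
```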
LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class
In this work, we instantiate a novel perturbation-based multi-class explanation framework, LIPEx (Locally Interpretable Probabilistic Explanation). We demonstrate that LIPEx not only locally replicates the probability distributions output …
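The core mechanic can be sketched as follows: fit a small interpretable surrogate that reproduces the classifier's entire output distribution (not just the top-class score) on perturbations around the input. The linear-softmax surrogate, KL objective and Gaussian perturbations below are illustrative assumptions, not necessarily LIPEx's exact construction.

```python
# Sketch: a local, multi-class surrogate fit to the black box's full probability output.
import torch, torch.nn as nn

def fit_local_surrogate(black_box, x0, n_perturb=500, sigma=0.1, steps=300):
    """black_box: assumed to map a batch of inputs to class logits; x0: a 1D feature vector."""
    d = x0.numel()
    Z = x0 + sigma * torch.randn(n_perturb, d)            # local perturbations around x0
    with torch.no_grad():
        P = black_box(Z).softmax(dim=1)                    # full target distributions
    surrogate = nn.Linear(d, P.shape[1])                   # its weight matrix is the explanation
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
    for _ in range(steps):
        Q = surrogate(Z).log_softmax(dim=1)
        loss = nn.functional.kl_div(Q, P, reduction="batchmean")   # match whole distribution
        opt.zero_grad(); loss.backward(); opt.step()
    return surrogate
```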
Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
In this note, we demonstrate a first-of-its-kind provable convergence of SGD to the global minima of appropriately regularized logistic empirical risk of depth $2$ nets -- for arbitrary data and with any number of gates with adequately smo…
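For concreteness, one way to write the kind of objective described here is a regularized logistic empirical risk of a depth-2 net, as below; the Frobenius-norm regularizer and the parameterization are assumptions, and the note's exact form may differ.

```latex
\tilde{L}(\mathbf{W}, \mathbf{a}) \;=\; \frac{1}{n}\sum_{i=1}^{n} \log\!\Big(1 + e^{-y_i \, f(\mathbf{x}_i)}\Big) \;+\; \lambda \,\lVert \mathbf{W} \rVert_F^2,
\qquad
f(\mathbf{x}) \;=\; \sum_{k=1}^{p} a_k \, \sigma(\mathbf{w}_k^{\top} \mathbf{x}).
```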
Size Lowerbounds for Deep Operator Networks
Deep Operator Networks are an increasingly popular paradigm for solving regression in infinite dimensions and hence solve families of PDEs in one shot. In this work, we aim to establish a first-of-its-kind data-dependent lower bound on the …
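For readers unfamiliar with the architecture, a minimal sketch of a Deep Operator Network (DeepONet) is given below: a branch net encodes the input function sampled at sensor points, a trunk net encodes the query location, and the output is their inner product. The widths, depths and sensor count are hypothetical.

```python
# Sketch of the DeepONet architecture: G(u)(y) ~ sum_k branch_k(u) * trunk_k(y).
import torch, torch.nn as nn

class DeepONet(nn.Module):
    def __init__(self, m_sensors=100, q_dim=1, p=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(m_sensors, 128), nn.ReLU(), nn.Linear(128, p))
        self.trunk = nn.Sequential(nn.Linear(q_dim, 128), nn.ReLU(), nn.Linear(128, p))

    def forward(self, u_sensors, y):
        return (self.branch(u_sensors) * self.trunk(y)).sum(dim=-1)

model = DeepONet()
u = torch.randn(32, 100)     # 32 input functions sampled at 100 sensor points
y = torch.rand(32, 1)        # one query location per function
out = model(u, y)            # predicted G(u)(y), shape (32,)
```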
Depth-2 neural networks under a data-poisoning attack
In this work, we study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup. We focus on doing supervised learning with realizable labels for a class of depth-2 finite-wi…
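A sketch of the threat model described above, realizable depth-2 regression labels corrupted by an additive adversary of bounded magnitude, is given below; the averaging outer layer, the dimensions and the corruption bound theta are illustrative assumptions.

```python
# Sketch of the poisoning setup: clean realizable labels plus bounded adversarial noise.
import torch

d, width, n, theta = 8, 16, 1024, 0.1
W_star = torch.randn(width, d)                       # ground-truth inner weights
X = torch.randn(n, d)
clean_y = torch.relu(X @ W_star.T).mean(dim=1)       # realizable labels (assumed averaging outer layer)
poison = theta * (2 * torch.rand(n) - 1)             # adversarial, |poison| <= theta
y = clean_y + poison                                  # what the learner actually sees
```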
Towards Size-Independent Generalization Bounds for Deep Operator Nets
In recent times, machine learning methods have made significant advances towards becoming a useful tool for analyzing physical systems. A particularly active area in this theme has been "physics-informed machine learning" which focuses on using …
An Empirical Study of the Occurrence of Heavy-Tails in Training a ReLU Gate
A particular direction of recent advances in the study of stochastic deep-learning algorithms has been the uncovering of a rather mysterious heavy-tailed nature of the stationary distribution of these algorithms, even when the data distribution is not…
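One common way to quantify heavy-tailedness empirically is a Hill-type estimate of the tail index computed from the largest iterate magnitudes, as sketched below on synthetic data; the estimator actually used in the paper may differ.

```python
# Sketch: Hill estimator of the tail index alpha (smaller alpha => heavier tail).
import numpy as np

def hill_tail_index(samples, k=100):
    """Hill estimate from the k largest |samples|."""
    x = np.sort(np.abs(np.asarray(samples)))[::-1]   # descending magnitudes
    logs = np.log(x[:k]) - np.log(x[k])              # log-excesses over the k-th largest
    return 1.0 / logs.mean()

# Example: Cauchy samples (alpha ~ 1) versus Gaussian samples (estimate grows large).
rng = np.random.default_rng(0)
print(hill_tail_index(rng.standard_cauchy(100_000)))
print(hill_tail_index(rng.standard_normal(100_000)))
```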
Investigating the locality of neural network training dynamics.
A fundamental quest in the theory of deep-learning is to understand the properties of the trajectories in the weight space that a learning algorithm takes. One such property that has very recently been isolated is that of ($S_{\rm rel}$)…
Dynamics of Local Elasticity During Training of Neural Nets
In the recent past, a property of neural training trajectories in weight-space was isolated, that of "local elasticity" (denoted as $S_{\rm rel}$). Local elasticity attempts to quantify the propagation of the influence of a sampled da…
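A rough way to probe this kind of propagation of influence is sketched below: take one SGD step on a single sample and compare the induced change in predictions on that sample against the change on every other sample. The exact averaging and ratio defining $S_{\rm rel}$ in the paper may differ from this illustrative quantity.

```python
# Sketch: measure how a single-sample update changes predictions elsewhere.
import torch, torch.nn as nn

net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
X, y = torch.randn(128, 8), torch.randn(128, 1)

i = 0                                            # the sampled data point
before = net(X).detach()
loss = (net(X[i:i+1]) - y[i:i+1]).pow(2).mean()  # SGD step on x_i only
opt.zero_grad(); loss.backward(); opt.step()
after = net(X).detach()

delta = (after - before).abs().squeeze()
influence_ratio = delta / (delta[i] + 1e-12)     # change on x_j relative to change on x_i
```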
A Study of the Mathematics of Deep Learning
"Deep Learning"/"Deep Neural Nets" is a technological marvel that is now increasingly deployed at the cutting-edge of artificial intelligence tasks. This dramatic success of deep learning in the last few years has been hinged on an enormou…
A Study of Neural Training with Iterative Non-Gradient Methods
A Study of Neural Training with Non-Gradient and Noise Assisted Gradient Methods.
In this work, we demonstrate provable guarantees on the training of depth-2 neural networks in regimes beyond those previously explored. (1) We start by exhibiting a non-gradient iterative algorithm, Neuro-Tron, which gives a first-of-its-kind …
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm
In this work, we demonstrate provable guarantees on the training of a single ReLU gate in hitherto unexplored regimes. We give a simple iterative stochastic algorithm that can train a ReLU gate in the realizable setting in linear time whil…
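A sketch of a Tron-style (non-gradient) stochastic update for a single ReLU gate in the realizable setting is given below: the update is driven by the raw prediction error and never differentiates the ReLU. This mirrors GLM-Tron-type iterations; the paper's exact algorithm, step size and sampling scheme may differ.

```python
# Sketch: stochastic Tron-style update w <- w + eta * (y - relu(w.x)) * x.
import torch

d, n = 10, 2000
w_star = torch.randn(d)                              # ground-truth gate
X = torch.randn(n, d)
y = torch.relu(X @ w_star)                           # realizable labels

w = torch.zeros(d)
eta = 0.1                                            # illustrative step size
for t in range(2000):
    i = torch.randint(0, n, (1,)).item()             # sample one data point
    err = y[i] - torch.relu(X[i] @ w)
    w = w + eta * err * X[i]                         # no gradient of the ReLU is used
print(torch.norm(w - w_star))
```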
Convergence guarantees for RMSProp and ADAM in non-convex optimization and their comparison to Nesterov acceleration on autoencoders.
RMSProp and ADAM continue to be extremely popular algorithms for training neural nets but their theoretical foundations have remained unclear. In this work we make progress towards that by giving proofs that these adaptive gradient algorit…
Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
RMSProp and ADAM continue to be extremely popular algorithms for training neural nets but their theoretical convergence properties have remained unclear. Further, recent work has seemed to suggest that these algorithms have worse generaliz…
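For reference, the per-coordinate RMSProp update analyzed in this line of work can be sketched as below; the hyperparameters are common defaults rather than the papers' specific choices.

```python
# Sketch of RMSProp: v_t = beta*v_{t-1} + (1-beta)*g_t^2,  w_t = w_{t-1} - eta * g_t / (sqrt(v_t) + eps).
import torch

def rmsprop_step(w, grad, v, eta=1e-3, beta=0.9, eps=1e-8):
    v.mul_(beta).addcmul_(grad, grad, value=1 - beta)   # running second-moment estimate
    w.addcdiv_(grad, v.sqrt() + eps, value=-eta)        # per-coordinate scaled step
    return w, v
```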
Sparse Coding and Autoencoders
In "Dictionary Learning" one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in \mathbb{R}^h$ with a small support of s…
Lower bounds over Boolean inputs for deep neural networks with ReLU gates
Motivated by the resurgence of neural networks in being able to solve complex learning tasks we undertake a study of high depth networks using ReLU gates which implement the function $x \mapsto \max\{0,x\}$. We try to understand the role o…