Anirbit Mukherjee
Noisy PDE Training Requires Bigger PINNs
Physics-Informed Neural Networks (PINNs) are increasingly used to approximate solutions of partial differential equations (PDEs), especially in high dimensions. In real-world applications, data samples are noisy, so it is important to know…
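For concreteness, a minimal sketch of the kind of training objective involved is given below: a PDE-residual term on collocation points combined with a data-fit term on noisy measurements. The 1D heat equation, the exact solution used to synthesize the noisy samples, and all hyperparameters are illustrative assumptions, not the paper's setting.

```python
# Sketch: PINN training with noisy supervising data (assumed 1D heat equation u_t = u_xx).
import torch, torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def residual(x, t):
    x = x.clone().requires_grad_(True); t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

# Noisy measurements of an (assumed) exact solution sin(pi*x)*exp(-pi^2*t).
x_d, t_d = torch.rand(256, 1), torch.rand(256, 1)
u_noisy = torch.sin(torch.pi * x_d) * torch.exp(-torch.pi**2 * t_d) + 0.05 * torch.randn(256, 1)

for step in range(2000):
    x_c, t_c = torch.rand(512, 1), torch.rand(512, 1)          # collocation points
    physics = residual(x_c, t_c).pow(2).mean()
    data_fit = (net(torch.cat([x_d, t_d], dim=1)) - u_noisy).pow(2).mean()
    loss = physics + data_fit
    opt.zero_grad(); loss.backward(); opt.step()
```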
Langevin Monte-Carlo Provably Learns Depth Two Neural Nets at Any Size and Data
In this work, we establish that the Langevin Monte-Carlo algorithm can learn depth-2 neural nets of any size and for any data, and we give non-asymptotic convergence rates for it. We achieve this by showing that under Total Variation …
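As a concrete reference point, the sketch below implements one Langevin Monte-Carlo step on a regularized depth-2 empirical risk; the tanh activation, step size, inverse temperature and regularization constant are illustrative assumptions rather than the paper's exact setting.

```python
# Sketch of one LMC step: theta <- theta - eta * grad + sqrt(2*eta/beta) * Gaussian noise.
import torch

def lmc_step(params, loss_fn, eta=1e-3, beta=1e4):
    loss = loss_fn(params)
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(-eta * g + torch.randn_like(p) * (2 * eta / beta) ** 0.5)
    return loss.item()

# Depth-2 net x -> a^T sigma(W x), with an assumed smooth activation and l2 regularization.
W = torch.randn(16, 4, requires_grad=True)
a = torch.randn(16, requires_grad=True)
X, y = torch.randn(128, 4), torch.randn(128)

def loss_fn(params):
    W, a = params
    pred = torch.tanh(X @ W.T) @ a
    return ((pred - y) ** 2).mean() + 1e-3 * (W.pow(2).sum() + a.pow(2).sum())

for _ in range(100):
    lmc_step([W, a], loss_fn)
```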
Global convergence of SGD on two layer neural nets
In this note, we consider appropriately regularized $\ell_2$-empirical risk of depth $2$ nets with any number of gates and show bounds on how the empirical loss evolves for Stochastic Gradient Descent (SGD) iterates on it, for arbitrary …
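A minimal sketch of the setup described above, SGD on a regularized squared-loss empirical risk of a depth-2 net, is given below; the sigmoid activation, the fixed outer layer and all constants are illustrative assumptions.

```python
# Sketch: minibatch SGD on a regularized squared-loss risk of a depth-2 net.
import torch

n, d, width = 256, 8, 32
X, y = torch.randn(n, d), torch.randn(n)
W = torch.randn(width, d, requires_grad=True)    # inner layer (trained)
a = torch.randn(width) / width                   # outer layer (held fixed, for simplicity)
opt = torch.optim.SGD([W], lr=1e-2)
lam = 1e-3                                       # regularization strength (assumed)

for step in range(1000):
    idx = torch.randint(0, n, (16,))             # minibatch
    pred = torch.sigmoid(X[idx] @ W.T) @ a
    loss = ((pred - y[idx]) ** 2).mean() + lam * W.pow(2).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```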
Improving PINNs By Algebraic Inclusion of Boundary and Initial Conditions
"AI for Science" aims to solve fundamental scientific problems using AI techniques. As most physical phenomena can be described as Partial Differential Equations (PDEs) , approximating their solutions using neural networks has evolved as a…
Investigating the ability of PINNs to solve Burgers' PDE near finite-time blowup
Physics Informed Neural Networks (PINNs) have been achieving ever newer feats of solving complicated Partial Differential Equations (PDEs) numerically while offering an attractive trade-off between accuracy and speed of inference. A partic…
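For reference, the sketch below computes the PINN residual of the inviscid Burgers' equation $u_t + u\,u_x = 0$, whose solutions can blow up in finite time; the network architecture, 1D setting and collocation sampling are illustrative assumptions.

```python
# Sketch: PINN residual for u_t + u * u_x = 0 via automatic differentiation.
import torch, torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))

def burgers_residual(x, t):
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    return u_t + u * u_x                 # squared and averaged to form the physics loss

x, t = torch.rand(512, 1), torch.rand(512, 1)
physics_loss = burgers_residual(x, t).pow(2).mean()
```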
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
We present and analyze a novel regularized form of the gradient clipping algorithm, proving that it converges to global minima of the loss surface of deep neural networks under the squared loss, provided that the layers are of sufficient w…
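The sketch below shows one plausible reading of such a regularized clipping step: norm-based gradient clipping whose scaling factor is bounded below, so the effective step size can never collapse to zero. The exact functional form, and the constants eta, gamma and delta, are assumptions and may differ from the paper's algorithm.

```python
# Sketch: gradient clipping with a floor ("regularization") on the scaling factor.
# Assumes .grad has already been populated by a backward pass.
import torch

def delta_regularized_clip_step(params, eta=0.1, gamma=1.0, delta=1e-2):
    grads = [p.grad for p in params]
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = eta * max(min(1.0, gamma / (total_norm.item() + 1e-12)), delta)
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(-scale * g)
```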
LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class
In this work, we instantiate a novel perturbation-based multi-class explanation framework, LIPEx (Locally Interpretable Probabilistic Explanation). We demonstrate that LIPEx not only locally replicates the probability distributions output …
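The core mechanic can be sketched as follows: fit a small interpretable surrogate that reproduces the classifier's entire output distribution (not just the top-class score) on perturbations around the input. The linear-softmax surrogate, KL objective and Gaussian perturbations below are illustrative assumptions, not necessarily LIPEx's exact construction.

```python
# Sketch: a local, multi-class surrogate fit to the black box's full probability output.
import torch, torch.nn as nn

def fit_local_surrogate(black_box, x0, n_perturb=500, sigma=0.1, steps=300):
    """black_box: assumed to map a batch of inputs to class logits; x0: a 1D feature vector."""
    d = x0.numel()
    Z = x0 + sigma * torch.randn(n_perturb, d)            # local perturbations around x0
    with torch.no_grad():
        P = black_box(Z).softmax(dim=1)                    # full target distributions
    surrogate = nn.Linear(d, P.shape[1])                   # its weight matrix is the explanation
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
    for _ in range(steps):
        Q = surrogate(Z).log_softmax(dim=1)
        loss = nn.functional.kl_div(Q, P, reduction="batchmean")   # match whole distribution
        opt.zero_grad(); loss.backward(); opt.step()
    return surrogate
```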
Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
In this note, we demonstrate a first-of-its-kind provable convergence of SGD to the global minima of appropriately regularized logistic empirical risk of depth $2$ nets -- for arbitrary data and with any number of gates with adequately smo…
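For concreteness, one way to write the kind of objective described here is a regularized logistic empirical risk of a depth-2 net, as below; the Frobenius-norm regularizer and the parameterization are assumptions, and the note's exact form may differ.

```latex
\tilde{L}(\mathbf{W}, \mathbf{a}) \;=\; \frac{1}{n}\sum_{i=1}^{n} \log\!\Big(1 + e^{-y_i \, f(\mathbf{x}_i)}\Big) \;+\; \lambda \,\lVert \mathbf{W} \rVert_F^2,
\qquad
f(\mathbf{x}) \;=\; \sum_{k=1}^{p} a_k \, \sigma(\mathbf{w}_k^{\top} \mathbf{x}).
```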
Size Lowerbounds for Deep Operator Networks
Deep Operator Networks are an increasingly popular paradigm for solving regression in infinite dimensions and hence solve families of PDEs in one shot. In this work, we aim to establish a first-of-its-kind data-dependent lower bound on the …
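For readers unfamiliar with the architecture, a minimal sketch of a Deep Operator Network (DeepONet) is given below: a branch net encodes the input function sampled at sensor points, a trunk net encodes the query location, and the output is their inner product. The widths, depths and sensor count are hypothetical.

```python
# Sketch of the DeepONet architecture: G(u)(y) ~ sum_k branch_k(u) * trunk_k(y).
import torch, torch.nn as nn

class DeepONet(nn.Module):
    def __init__(self, m_sensors=100, q_dim=1, p=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(m_sensors, 128), nn.ReLU(), nn.Linear(128, p))
        self.trunk = nn.Sequential(nn.Linear(q_dim, 128), nn.ReLU(), nn.Linear(128, p))

    def forward(self, u_sensors, y):
        return (self.branch(u_sensors) * self.trunk(y)).sum(dim=-1)

model = DeepONet()
u = torch.randn(32, 100)     # 32 input functions sampled at 100 sensor points
y = torch.rand(32, 1)        # one query location per function
out = model(u, y)            # predicted G(u)(y), shape (32,)
```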
Depth-2 neural networks under a data-poisoning attack
In this work, we study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup. We focus on doing supervised learning with realizable labels for a class of depth-2 finite-wi…
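A sketch of the threat model described above, realizable depth-2 regression labels corrupted by an additive adversary of bounded magnitude, is given below; the averaging outer layer, the dimensions and the corruption bound theta are illustrative assumptions.

```python
# Sketch of the poisoning setup: clean realizable labels plus bounded adversarial noise.
import torch

d, width, n, theta = 8, 16, 1024, 0.1
W_star = torch.randn(width, d)                       # ground-truth inner weights
X = torch.randn(n, d)
clean_y = torch.relu(X @ W_star.T).mean(dim=1)       # realizable labels (assumed averaging outer layer)
poison = theta * (2 * torch.rand(n) - 1)             # adversarial, |poison| <= theta
y = clean_y + poison                                  # what the learner actually sees
```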
Towards Size-Independent Generalization Bounds for Deep Operator Nets
In recent times, machine learning methods have made significant advances towards becoming a useful tool for analyzing physical systems. A particularly active area in this theme has been "physics-informed machine learning" which focuses on using …
An Empirical Study of the Occurrence of Heavy-Tails in Training a ReLU Gate
A particular direction of recent advances in the study of stochastic deep-learning algorithms has been the uncovering of a rather mysterious heavy-tailed nature of the stationary distribution of these algorithms, even when the data distribution is not…
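One common way to quantify heavy-tailedness empirically is a Hill-type estimate of the tail index computed from the largest iterate magnitudes, as sketched below on synthetic data; the estimator actually used in the paper may differ.

```python
# Sketch: Hill estimator of the tail index alpha (smaller alpha => heavier tail).
import numpy as np

def hill_tail_index(samples, k=100):
    """Hill estimate from the k largest |samples|."""
    x = np.sort(np.abs(np.asarray(samples)))[::-1]   # descending magnitudes
    logs = np.log(x[:k]) - np.log(x[k])              # log-excesses over the k-th largest
    return 1.0 / logs.mean()

# Example: Cauchy samples (alpha ~ 1) versus Gaussian samples (estimate grows large).
rng = np.random.default_rng(0)
print(hill_tail_index(rng.standard_cauchy(100_000)))
print(hill_tail_index(rng.standard_normal(100_000)))
```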
Investigating the locality of neural network training dynamics.
A fundamental quest in the theory of deep-learning is to understand the properties of the trajectories in the weight space that a learning algorithm takes. One such property that has very recently been isolated is that of ($S_{\rm rel}$)…
Dynamics of Local Elasticity During Training of Neural Nets
In the recent past, a property of neural training trajectories in weight-space was isolated, that of "local elasticity" (denoted as $S_{\rm rel}$). Local elasticity attempts to quantify the propagation of the influence of a sampled da…
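A rough way to probe this kind of propagation of influence is sketched below: take one SGD step on a single sample and compare the induced change in predictions on that sample against the change on every other sample. The exact averaging and ratio defining $S_{\rm rel}$ in the paper may differ from this illustrative quantity.

```python
# Sketch: measure how a single-sample update changes predictions elsewhere.
import torch, torch.nn as nn

net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
X, y = torch.randn(128, 8), torch.randn(128, 1)

i = 0                                            # the sampled data point
before = net(X).detach()
loss = (net(X[i:i+1]) - y[i:i+1]).pow(2).mean()  # SGD step on x_i only
opt.zero_grad(); loss.backward(); opt.step()
after = net(X).detach()

delta = (after - before).abs().squeeze()
influence_ratio = delta / (delta[i] + 1e-12)     # change on x_j relative to change on x_i
```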
A Study of the Mathematics of Deep Learning
"Deep Learning"/"Deep Neural Nets" is a technological marvel that is now increasingly deployed at the cutting-edge of artificial intelligence tasks. This dramatic success of deep learning in the last few years has been hinged on an enormou…
A Study of Neural Training with Iterative Non-Gradient Methods
A Study of Neural Training with Non-Gradient and Noise Assisted Gradient Methods.
In this work, we demonstrate provable guarantees on the training of depth-2 neural networks in regimes beyond those previously explored. (1) We start by exhibiting a non-gradient iterative algorithm, Neuro-Tron, which gives a first-of-its-kind …
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm
In this work, we demonstrate provable guarantees on the training of a single ReLU gate in hitherto unexplored regimes. We give a simple iterative stochastic algorithm that can train a ReLU gate in the realizable setting in linear time whil…
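A sketch of a Tron-style (non-gradient) stochastic update for a single ReLU gate in the realizable setting is given below: the update is driven by the raw prediction error and never differentiates the ReLU. This mirrors GLM-Tron-type iterations; the paper's exact algorithm, step size and sampling scheme may differ.

```python
# Sketch: stochastic Tron-style update w <- w + eta * (y - relu(w.x)) * x.
import torch

d, n = 10, 2000
w_star = torch.randn(d)                              # ground-truth gate
X = torch.randn(n, d)
y = torch.relu(X @ w_star)                           # realizable labels

w = torch.zeros(d)
eta = 0.1                                            # illustrative step size
for t in range(2000):
    i = torch.randint(0, n, (1,)).item()             # sample one data point
    err = y[i] - torch.relu(X[i] @ w)
    w = w + eta * err * X[i]                         # no gradient of the ReLU is used
print(torch.norm(w - w_star))
```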
Convergence guarantees for RMSProp and ADAM in non-convex optimization and their comparison to Nesterov acceleration on autoencoders.
RMSProp and ADAM continue to be extremely popular algorithms for training neural nets but their theoretical foundations have remained unclear. In this work we make progress towards that by giving proofs that these adaptive gradient algorit…
Convergence guarantees for RMSProp and ADAM in non-convex optimization and an empirical comparison to Nesterov acceleration
RMSProp and ADAM continue to be extremely popular algorithms for training neural nets but their theoretical convergence properties have remained unclear. Further, recent work has seemed to suggest that these algorithms have worse generaliz…
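For reference, the per-coordinate RMSProp update analyzed in this line of work can be sketched as below; the hyperparameters are common defaults rather than the papers' specific choices.

```python
# Sketch of RMSProp: v_t = beta*v_{t-1} + (1-beta)*g_t^2,  w_t = w_{t-1} - eta * g_t / (sqrt(v_t) + eps).
import torch

def rmsprop_step(w, grad, v, eta=1e-3, beta=0.9, eps=1e-8):
    v.mul_(beta).addcmul_(grad, grad, value=1 - beta)   # running second-moment estimate
    w.addcdiv_(grad, v.sqrt() + eps, value=-eta)        # per-coordinate scaled step
    return w, v
```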
Sparse Coding and Autoencoders
In "Dictionary Learning" one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in \mathbb{R}^h$ with a small support of s…
Lower bounds over Boolean inputs for deep neural networks with ReLU gates
Motivated by the resurgence of neural networks in being able to solve complex learning tasks we undertake a study of high depth networks using ReLU gates which implement the function $x \mapsto \max\{0,x\}$. We try to understand the role o…