Giancarlo Kerg
Neural networks with optimized single-neuron adaptation uncover biologically plausible regularization
Neurons in the brain have rich and adaptive input-output properties. Features such as heterogeneous f-I curves and spike frequency adaptation are known to place single neurons in optimal coding regimes when facing changing stimuli. Yet, it…
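As a concrete picture of the kind of single-neuron adaptation this abstract refers to, the toy rate unit below shifts its effective threshold with its own recent activity, in the spirit of spike frequency adaptation. The time constant, gain, and adaptation strength are arbitrary assumptions, and this is not the training setup studied in the paper.

```python
import numpy as np

def adaptive_unit(inputs, tau_a=10.0, beta=0.5, gain=1.0):
    """Toy rate neuron with spike-frequency-adaptation-like dynamics.

    An adaptation variable `a` integrates the unit's own output and is
    subtracted from the input, so the effective f-I curve shifts with
    recent activity.
    """
    a, outputs = 0.0, []
    for x in inputs:
        r = np.maximum(gain * (x - beta * a), 0.0)  # adapted ReLU-like response
        a += (r - a) / tau_a                        # slow adaptation variable
        outputs.append(r)
    return np.array(outputs)

# A constant input produces a response that decays as adaptation builds up.
print(adaptive_unit(np.ones(20)))
```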
On Neural Architecture Inductive Biases for Relational Tasks
Current deep learning approaches have shown good in-distribution generalization performance, but struggle with out-of-distribution generalization. This is especially true in the case of tasks involving abstract relations like recognizing r…
Continuous-Time Meta-Learning with Forward Mode Differentiation
Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector fi…
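The idea of adaptation following the dynamics of a gradient vector field can be pictured with a crude Euler discretization of gradient flow on a task loss, as in the sketch below. This does not reproduce the paper's ODE integration or its forward-mode differentiation through the flow; the step count and horizon are illustrative assumptions.

```python
import torch

def gradient_flow_adapt(params, loss_fn, data, horizon=1.0, n_steps=100):
    """Euler-discretized gradient flow: d(theta)/dt = -grad L(theta).

    A rough illustration of treating adaptation as a continuous-time
    dynamical system rather than a sequence of discrete gradient steps.
    """
    theta = params.clone()
    dt = horizon / n_steps
    for _ in range(n_steps):
        theta = theta.detach().requires_grad_(True)
        loss = loss_fn(theta, data)
        (grad,) = torch.autograd.grad(loss, theta)
        theta = theta - dt * grad  # one Euler step along the gradient vector field
    return theta

# Toy quadratic task: adaptation flows toward the task optimum at 3.0.
loss_fn = lambda theta, target: ((theta - target) ** 2).sum()
print(gradient_flow_adapt(torch.zeros(1), loss_fn, torch.tensor([3.0]), horizon=2.0))
```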
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
The early phase of training a deep neural network has a dramatic effect on the local curvature of the loss function. For instance, using a small learning rate does not guarantee stable optimization because the optimization trajectory has a…
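One cheap proxy for the curvature quantity discussed here is the squared norm of the mini-batch gradient, which approximates the trace of the Fisher Information Matrix for likelihood-based losses. The sketch below adds such a penalty to a standard training loss; the penalty weight and the estimator are assumptions rather than the paper's exact prescription.

```python
import torch
import torch.nn.functional as F

def loss_with_fisher_penalty(model, x, y, penalty_weight=0.1):
    """Cross-entropy loss plus a squared-gradient-norm penalty.

    The squared norm of the mini-batch gradient is a cheap proxy for the
    trace of the Fisher Information Matrix; penalizing it discourages sharp
    growth of curvature early in training.
    """
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()), create_graph=True)
    fisher_proxy = sum(g.pow(2).sum() for g in grads)
    return loss + penalty_weight * fisher_proxy

# Illustrative usage with a linear classifier on random data.
model = torch.nn.Linear(10, 3)
x, y = torch.randn(8, 10), torch.randint(0, 3, (8,))
loss_with_fisher_penalty(model, x, y).backward()
```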
Advantages of biologically-inspired adaptive neural activation in RNNs during learning
Dynamic adaptation in single-neuron response plays a fundamental role in neural coding in biological neural networks. Yet, most neural activation functions used in artificial networks are fixed and mostly considered as an inconsequential a…
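A minimal way to make the activation function itself learnable is to give every unit its own gain and saturation parameters, as sketched below. The specific parameterization (s · tanh(g · x)) is an illustrative choice, not necessarily the one used in the paper.

```python
import torch
import torch.nn as nn

class LearnableActivation(nn.Module):
    """Per-neuron activation with learnable gain and saturation.

    Each unit learns its own slope `gain` and saturation level `saturation`,
    so the network can shape individual input-output curves during training.
    """
    def __init__(self, n_units):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(n_units))
        self.saturation = nn.Parameter(torch.ones(n_units))

    def forward(self, x):
        return self.saturation * torch.tanh(self.gain * x)

# Drop-in replacement for a fixed nonlinearity inside an RNN cell.
act = LearnableActivation(4)
print(act(torch.randn(2, 4)))
```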
Untangling tradeoffs between recurrence and self-attention in neural networks
Attention and self-attention mechanisms are now central to state-of-the-art deep learning on sequential tasks. However, most recent progress hinges on heuristic approaches with limited understanding of attention's role in model optimizati…
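For reference, a single head of scaled dot-product self-attention fits in a few lines; this generic sketch (with arbitrary projection sizes) shows the all-to-all connectivity that is traded off against recurrence in this analysis.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence.

    x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head) projections.
    Every position attends to every other position in a single step.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / k.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

d_model, d_head, seq_len = 8, 4, 5
x = torch.randn(seq_len, d_model)
out = self_attention(x, *(torch.randn(d_model, d_head) for _ in range(3)))
print(out.shape)  # torch.Size([5, 4])
```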
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics
A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary. This e…
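The orthogonal constraint mentioned in the abstract can be enforced exactly by parameterizing the recurrent matrix as the exponential of a skew-symmetric matrix, as sketched below. This shows only the constrained baseline; the non-normal (Schur-structured) relaxation proposed in the paper is not reproduced here.

```python
import torch
import torch.nn as nn

class OrthogonalRecurrence(nn.Module):
    """Recurrent weight kept exactly orthogonal by construction.

    W = expm(A - A^T) is orthogonal for any square A, so hidden-state norms
    are preserved and gradients neither explode nor vanish through W.
    """
    def __init__(self, hidden_size):
        super().__init__()
        self.A = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.01)

    def forward(self, h):
        W = torch.matrix_exp(self.A - self.A.T)  # exactly orthogonal
        return h @ W.T

rec = OrthogonalRecurrence(6)
h = torch.randn(3, 6)
print(torch.allclose(rec(h).norm(dim=1), h.norm(dim=1), atol=1e-5))  # True
```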
h-detach: Modifying the LSTM Gradient Towards Better Optimization
Recurrent neural networks are known for their notorious exploding and vanishing gradient problem (EVGP). This problem becomes more evident in tasks where the information needed to correctly solve them exists over long time scales, because E…
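In the spirit of the title, one way to modify the LSTM gradient is to stochastically detach the previous hidden state before it enters the cell, so backpropagation relies mainly on the additive cell-state path. The detach probability and the use of torch's LSTMCell below are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

def h_detach_rollout(cell, x_seq, h, c, p_detach=0.25):
    """Unroll an LSTMCell, stochastically blocking gradients through h.

    At each step, with probability `p_detach` the previous hidden state is
    detached from the graph before entering the cell, so backpropagated
    gradients are carried mainly by the cell-state path.
    """
    outputs = []
    for x_t in x_seq:
        h_in = h.detach() if torch.rand(()) < p_detach else h
        h, c = cell(x_t, (h_in, c))
        outputs.append(h)
    return torch.stack(outputs), (h, c)

cell = nn.LSTMCell(input_size=4, hidden_size=8)
x_seq = torch.randn(10, 2, 4)  # (time, batch, features)
h0 = c0 = torch.zeros(2, 8)
out, _ = h_detach_rollout(cell, x_seq, h0, c0)
print(out.shape)  # torch.Size([10, 2, 8])
```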