Niru Maheswaranathan
Deep unsupervised learning using nonequilibrium thermodynamics
A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable…
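The full abstract goes on to describe the paper's approach: structure in the data is slowly destroyed by an iterative forward diffusion process, and a generative model is learned by reversing it. Below is a minimal sketch of the forward process only; the noise schedule, step count, and toy one-dimensional "dataset" are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Forward diffusion sketch: repeatedly add small amounts of Gaussian noise
# until the data distribution is destroyed. A learned reverse process
# (not shown) would undo each small step to generate samples.
rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=1000)   # toy bimodal "data" distribution
betas = np.linspace(1e-4, 0.05, 1000)    # illustrative per-step noise variances

for beta in betas:                       # q(x_t | x_{t-1}) = N(sqrt(1-b) x, b I)
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.normal(size=x.shape)

# After enough steps, x is approximately N(0, 1) regardless of the data.
print(x.mean(), x.std())
```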
Practical tradeoffs between memory, compute, and performance in learned optimizers
Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric functions…
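A minimal sketch of this idea, assuming a tiny MLP as the parametric update rule: the MLP maps per-parameter features (here gradient and momentum) to an update, playing the role of Adam's hand-designed formula. The meta-parameters below are random placeholders that meta-training would fit; all shapes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Meta-parameters of the learned optimizer (random here; in practice these
# are meta-trained so the inner optimization makes progress).
W1, b1 = rng.normal(0, 0.1, (8, 2)), np.zeros(8)
W2, b2 = rng.normal(0, 0.1, (1, 8)), np.zeros(1)

def learned_update(grad, momentum):
    feats = np.stack([grad, momentum], axis=-1)   # per-parameter features
    h = np.tanh(feats @ W1.T + b1)
    return (h @ W2.T + b2)[..., 0]                # per-parameter step

theta = rng.normal(size=10)       # parameters of the task being optimized
m = np.zeros_like(theta)
for _ in range(100):
    g = 2 * theta                 # gradient of a toy quadratic loss ||theta||^2
    m = 0.9 * m + 0.1 * g
    theta -= learned_update(g, m)
```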
A mechanistically interpretable model of the retinal neural code for natural scenes with multiscale adaptive dynamics
The visual system processes stimuli over a wide range of spatiotemporal scales, with individual neurons receiving input from tens of thousands of neurons whose dynamics range from milliseconds to tens of seconds. This poses a challenge to …
Understanding How Encoder-Decoder Architectures Attend
Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However, …
Training Learned Optimizers with Randomly Initialized Learned Optimizers
Learned optimizers are increasingly effective, with performance exceeding that of hand-designed optimizers such as Adam (Kingma & Ba, 2014) on specific tasks (Metz et al., 2019). Despite the potential gains available, in curre…
Reverse engineering learned optimizers reveals known and novel mechanisms
Learned optimizers are algorithms that can themselves be trained to solve optimization problems. In contrast to baseline optimizers (such as momentum or Adam) that use simple update rules derived from theoretical principles, learned optimi…
The geometry of integration in text classification RNNs
Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained …
Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves
Much as replacing hand-designed features with learned functions has revolutionized how we solve perceptual tasks, we believe learned algorithms will transform how we train models. In this work we focus on general-purpose learned optimizers…
How recurrent networks implement contextual processing in sentiment analysis
Neural networks have a remarkable capacity for contextual processing: using recent or nearby inputs to modify processing of current input. For example, in natural language, contextual processing is necessary to correctly interpret negation…
Using a thousand optimization tasks to learn hyperparameter search strategies
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers. TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional neu…
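One way such a task set can support hyperparameter search, sketched below under the assumption of a precomputed tasks-by-configurations performance matrix: greedily build an ordered list of configurations that maximizes best-so-far performance across tasks. The matrix here is random placeholder data, and the list length is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
perf = rng.random((1000, 50))      # normalized performance per (task, config)

search_list, covered = [], np.zeros(1000)
for _ in range(10):                # build a 10-entry ordered search list
    # Pick the config that most improves best-so-far performance across tasks.
    gains = np.maximum(perf, covered[:, None]).mean(axis=0)
    gains[search_list] = -np.inf   # don't pick a config twice
    best = int(np.argmax(gains))
    search_list.append(best)
    covered = np.maximum(covered, perf[:, best])
print(search_list)
```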
Meta-Learning Biologically Plausible Semi-Supervised Update Rules
The question of how neurons embedded in a network update their synaptic weights to collectively achieve behavioral goals is a longstanding problem in systems neuroscience. Since Hebb’s hypothesis [10] that cells that fire together strengthen…
From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction
Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about t…
Universality and individuality in neural dynamics across large populations of recurrent networks
Task-based modeling with recurrent neural networks (RNNs) has emerged as a popular way to infer the computational function of different brain regions. These models are quantitatively assessed by comparing the low-dimensional neural represe…
Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics
Recurrent neural networks (RNNs) are a widely used tool for modeling sequential data, yet they are often treated as inscrutable black boxes. Given a trained recurrent network, we would like to reverse engineer it: to obtain a quantitative,…
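The reverse-engineering approach in this line of work centers on fixed points of the hidden-state dynamics. A hedged sketch, using a stand-in vanilla RNN with random weights rather than a trained sentiment model: find states h* with F(h*) ≈ h* by gradient descent on q(h) = 0.5 ||F(h) - h||^2. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
W = rng.normal(0, 1.0 / np.sqrt(n), (n, n))   # stand-in recurrent weights

def F(h):                                     # vanilla RNN step, zero input
    return np.tanh(W @ h)

def find_fixed_point(h, lr=0.05, steps=5000):
    for _ in range(steps):
        r = F(h) - h                          # residual; zero at a fixed point
        J = (1 - F(h) ** 2)[:, None] * W      # Jacobian of F at h
        # gradient of q(h) = 0.5 * ||F(h) - h||^2 with respect to h
        h = h - lr * ((J - np.eye(n)).T @ r)
    return h

h_star = find_fixed_point(rng.normal(size=n))
print("residual norm:", np.linalg.norm(F(h_star) - h_star))
```

Linearizing the dynamics around each such fixed point (via the Jacobian above) is what reveals structure like line attractors.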
Using learned optimizers to make models robust to input noise
State-of-the art vision models can achieve superhuman performance on image classification tasks when testing and training data come from the same distribution. However, when models are tested on corrupted images (e.g. due to scale changes,…
Discovering precise temporal patterns in large-scale neural recordings through robust and interpretable time warping
Though the temporal precision of neural computation has been studied intensively, a data-driven determination of this precision remains a fundamental challenge. Reproducible spike time patterns may be obscured on single trials by uncontrol…
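A hedged sketch of the simplest model in this family, shift-only time warping, on synthetic data: each trial is aligned to a template by the integer shift that maximizes correlation, and the template can then be re-estimated from the aligned trials. All data, names, and constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 200, 50
template = np.exp(-0.5 * ((np.arange(T) - 100) / 5.0) ** 2)
# Synthetic trials: the same pattern, jittered in time plus noise.
trials = np.stack([np.roll(template, rng.integers(-20, 21))
                   + 0.1 * rng.normal(size=T) for _ in range(K)])

def best_shift(trial, template, max_shift=25):
    shifts = np.arange(-max_shift, max_shift + 1)
    scores = [np.dot(np.roll(trial, -s), template) for s in shifts]
    return shifts[int(np.argmax(scores))]

aligned = np.stack([np.roll(tr, -best_shift(tr, template)) for tr in trials])
# Alternating between re-fitting shifts and re-estimating the template
# from aligned trials recovers the underlying spike-time pattern.
template = aligned.mean(axis=0)
```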
Understanding and correcting pathologies in the training of learned optimizers
Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especiall…
Inferring hidden structure in multilayered neural circuits
A central challenge in sensory neuroscience involves understanding how neural circuits shape computations across cascaded cell layers. Here we attempt to reconstruct the response properties of experimentally unobserved neurons in the inter…
Guided evolutionary strategies: Augmenting random search with surrogate gradients
Many applications in machine learning require optimizing a function whose true gradient is unknown, but where surrogate gradient information (directions that may be correlated with, but not necessarily identical to, the true gradient) is a…
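A hedged sketch of the guided ES idea on a toy quadratic, with a deliberately biased surrogate gradient: perturbations are drawn from a Gaussian whose covariance mixes the full parameter space with the subspace spanned by the surrogate directions, and an antithetic pair of evaluations gives the descent estimate. The surrogate, constants, and scaling below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 1                    # parameter dim, number of surrogate directions
alpha, sigma, lr = 0.5, 0.1, 0.5

def loss(x):                     # toy objective
    return 0.5 * np.sum(x ** 2)

def surrogate_grad(x):           # biased gradient estimate (e.g. from a proxy)
    return x + rng.normal(0, 0.3, size=n)

x = rng.normal(size=n)
for _ in range(200):
    U, _ = np.linalg.qr(surrogate_grad(x)[:, None])  # orthonormal subspace basis
    # Sample from N(0, sigma^2 * (alpha/n * I + (1-alpha)/k * U U^T)).
    eps = sigma * (np.sqrt(alpha / n) * rng.normal(size=n)
                   + np.sqrt((1 - alpha) / k) * (U @ rng.normal(size=k)))
    # Antithetic finite-difference estimate of the descent direction.
    g_hat = (loss(x + eps) - loss(x - eps)) / (2 * sigma ** 2) * eps
    x -= lr * g_hat
```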
The dynamic neural code of the retina for natural scenes
Understanding how the visual system encodes natural scenes is a fundamental goal of sensory neuroscience. We show here that a three-layer network model predicts the retinal response to natural scenes with an accuracy nearing the fundamenta…
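A hedged sketch of a three-layer network of this general form: two convolutional "interneuron" layers followed by a readout over ganglion cells. The layer sizes, kernel widths, 40-frame stimulus window, and class name are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RetinalCNN(nn.Module):
    def __init__(self, n_cells=5):
        super().__init__()
        self.conv1 = nn.Conv2d(40, 8, kernel_size=15)   # 40 stimulus frames in
        self.conv2 = nn.Conv2d(8, 8, kernel_size=11)
        self.readout = nn.Linear(8 * 26 * 26, n_cells)  # per-cell firing rates

    def forward(self, stim):                            # stim: (batch, 40, 50, 50)
        x = torch.relu(self.conv1(stim))
        x = torch.relu(self.conv2(x))
        # Softplus keeps predicted firing rates nonnegative.
        return nn.functional.softplus(self.readout(x.flatten(1)))

rates = RetinalCNN()(torch.randn(2, 40, 50, 50))        # predicted firing rates
```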
Recurrent Segmentation for Variable Computational Budgets
State-of-the-art systems for semantic image segmentation use feed-forward pipelines with fixed computational costs. Building an image segmentation system that works across a range of computational budgets is challenging and time-intensive …