Niru Maheswaranathan
Deep unsupervised learning using nonequilibrium thermodynamics
A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable…
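The full abstract goes on to describe the paper's approach: structure in the data is slowly destroyed by an iterative forward diffusion process, and a generative model is learned by reversing it. Below is a minimal sketch of the forward process only; the noise schedule, step count, and toy one-dimensional "dataset" are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Forward diffusion sketch: repeatedly add small amounts of Gaussian noise
# until the data distribution is destroyed. A learned reverse process
# (not shown) would undo each small step to generate samples.
rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=1000)   # toy bimodal "data" distribution
betas = np.linspace(1e-4, 0.05, 1000)    # illustrative per-step noise variances

for beta in betas:                       # q(x_t | x_{t-1}) = N(sqrt(1-b) x, b I)
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.normal(size=x.shape)

# After enough steps, x is approximately N(0, 1) regardless of the data.
print(x.mean(), x.std())
```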
Practical tradeoffs between memory, compute, and performance in learned optimizers
Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric functions…
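A minimal sketch of this idea, assuming a tiny MLP as the parametric update rule: the MLP maps per-parameter features (here gradient and momentum) to an update, playing the role of Adam's hand-designed formula. The meta-parameters below are random placeholders that meta-training would fit; all shapes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Meta-parameters of the learned optimizer (random here; in practice these
# are meta-trained so the inner optimization makes progress).
W1, b1 = rng.normal(0, 0.1, (8, 2)), np.zeros(8)
W2, b2 = rng.normal(0, 0.1, (1, 8)), np.zeros(1)

def learned_update(grad, momentum):
    feats = np.stack([grad, momentum], axis=-1)   # per-parameter features
    h = np.tanh(feats @ W1.T + b1)
    return (h @ W2.T + b2)[..., 0]                # per-parameter step

theta = rng.normal(size=10)       # parameters of the task being optimized
m = np.zeros_like(theta)
for _ in range(100):
    g = 2 * theta                 # gradient of a toy quadratic loss ||theta||^2
    m = 0.9 * m + 0.1 * g
    theta -= learned_update(g, m)
```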
A mechanistically interpretable model of the retinal neural code for natural scenes with multiscale adaptive dynamics
The visual system processes stimuli over a wide range of spatiotemporal scales, with individual neurons receiving input from tens of thousands of neurons whose dynamics range from milliseconds to tens of seconds. This poses a challenge to …
Understanding How Encoder-Decoder Architectures Attend
Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However, …
Training Learned Optimizers with Randomly Initialized Learned Optimizers
Learned optimizers are increasingly effective, with performance exceeding that of hand-designed optimizers such as Adam (Kingma & Ba, 2014) on specific tasks (Metz et al., 2019). Despite the potential gains available, in curre…
Reverse engineering learned optimizers reveals known and novel mechanisms
Learned optimizers are algorithms that can themselves be trained to solve optimization problems. In contrast to baseline optimizers (such as momentum or Adam) that use simple update rules derived from theoretical principles, learned optimi…
The geometry of integration in text classification RNNs
Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained …
Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves
Much as replacing hand-designed features with learned functions has revolutionized how we solve perceptual tasks, we believe learned algorithms will transform how we train models. In this work we focus on general-purpose learned optimizers…
How recurrent networks implement contextual processing in sentiment analysis
Neural networks have a remarkable capacity for contextual processing: using recent or nearby inputs to modify processing of current input. For example, in natural language, contextual processing is necessary to correctly interpret negation…
Using a thousand optimization tasks to learn hyperparameter search strategies
We present TaskSet, a dataset of tasks for use in training and evaluating optimizers. TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional neu…
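One way such a task set can support hyperparameter search, sketched below under the assumption of a precomputed tasks-by-configurations performance matrix: greedily build an ordered list of configurations that maximizes best-so-far performance across tasks. The matrix here is random placeholder data, and the list length is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
perf = rng.random((1000, 50))      # normalized performance per (task, config)

search_list, covered = [], np.zeros(1000)
for _ in range(10):                # build a 10-entry ordered search list
    # Pick the config that most improves best-so-far performance across tasks.
    gains = np.maximum(perf, covered[:, None]).mean(axis=0)
    gains[search_list] = -np.inf   # don't pick a config twice
    best = int(np.argmax(gains))
    search_list.append(best)
    covered = np.maximum(covered, perf[:, best])
print(search_list)
```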
Meta-Learning Biologically Plausible Semi-Supervised Update Rules
The question of how neurons embedded in a network update their synaptic weights to collectively achieve behavioral goals is a longstanding problem in systems neuroscience. Since Hebb’s hypothesis [10] that cells that fire together strengthen…
From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction
Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about t…
Universality and individuality in neural dynamics across large populations of recurrent networks
Task-based modeling with recurrent neural networks (RNNs) has emerged as a popular way to infer the computational function of different brain regions. These models are quantitatively assessed by comparing the low-dimensional neural represe…
Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics
Recurrent neural networks (RNNs) are a widely used tool for modeling sequential data, yet they are often treated as inscrutable black boxes. Given a trained recurrent network, we would like to reverse engineer it: to obtain a quantitative,…
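The reverse-engineering approach in this line of work centers on fixed points of the hidden-state dynamics. A hedged sketch, using a stand-in vanilla RNN with random weights rather than a trained sentiment model: find states h* with F(h*) ≈ h* by gradient descent on q(h) = 0.5 ||F(h) - h||^2. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
W = rng.normal(0, 1.0 / np.sqrt(n), (n, n))   # stand-in recurrent weights

def F(h):                                     # vanilla RNN step, zero input
    return np.tanh(W @ h)

def find_fixed_point(h, lr=0.05, steps=5000):
    for _ in range(steps):
        r = F(h) - h                          # residual; zero at a fixed point
        J = (1 - F(h) ** 2)[:, None] * W      # Jacobian of F at h
        # gradient of q(h) = 0.5 * ||F(h) - h||^2 with respect to h
        h = h - lr * ((J - np.eye(n)).T @ r)
    return h

h_star = find_fixed_point(rng.normal(size=n))
print("residual norm:", np.linalg.norm(F(h_star) - h_star))
```

Linearizing the dynamics around each such fixed point (via the Jacobian above) is what reveals structure like line attractors.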
Using learned optimizers to make models robust to input noise
State-of-the art vision models can achieve superhuman performance on image classification tasks when testing and training data come from the same distribution. However, when models are tested on corrupted images (e.g. due to scale changes,…
Discovering precise temporal patterns in large-scale neural recordings through robust and interpretable time warping
Though the temporal precision of neural computation has been studied intensively, a data-driven determination of this precision remains a fundamental challenge. Reproducible spike time patterns may be obscured on single trials by uncontrol…
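A hedged sketch of the simplest model in this family, shift-only time warping, on synthetic data: each trial is aligned to a template by the integer shift that maximizes correlation, and the template can then be re-estimated from the aligned trials. All data, names, and constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 200, 50
template = np.exp(-0.5 * ((np.arange(T) - 100) / 5.0) ** 2)
# Synthetic trials: the same pattern, jittered in time plus noise.
trials = np.stack([np.roll(template, rng.integers(-20, 21))
                   + 0.1 * rng.normal(size=T) for _ in range(K)])

def best_shift(trial, template, max_shift=25):
    shifts = np.arange(-max_shift, max_shift + 1)
    scores = [np.dot(np.roll(trial, -s), template) for s in shifts]
    return shifts[int(np.argmax(scores))]

aligned = np.stack([np.roll(tr, -best_shift(tr, template)) for tr in trials])
# Alternating between re-fitting shifts and re-estimating the template
# from aligned trials recovers the underlying spike-time pattern.
template = aligned.mean(axis=0)
```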
Understanding and correcting pathologies in the training of learned optimizers
Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especiall…
Inferring hidden structure in multilayered neural circuits
A central challenge in sensory neuroscience involves understanding how neural circuits shape computations across cascaded cell layers. Here we attempt to reconstruct the response properties of experimentally unobserved neurons in the inter…
Guided evolutionary strategies: Augmenting random search with surrogate gradients
Many applications in machine learning require optimizing a function whose true gradient is unknown, but where surrogate gradient information (directions that may be correlated with, but not necessarily identical to, the true gradient) is a…
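A hedged sketch of the guided ES idea on a toy quadratic, with a deliberately biased surrogate gradient: perturbations are drawn from a Gaussian whose covariance mixes the full parameter space with the subspace spanned by the surrogate directions, and an antithetic pair of evaluations gives the descent estimate. The surrogate, constants, and scaling below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 1                    # parameter dim, number of surrogate directions
alpha, sigma, lr = 0.5, 0.1, 0.5

def loss(x):                     # toy objective
    return 0.5 * np.sum(x ** 2)

def surrogate_grad(x):           # biased gradient estimate (e.g. from a proxy)
    return x + rng.normal(0, 0.3, size=n)

x = rng.normal(size=n)
for _ in range(200):
    U, _ = np.linalg.qr(surrogate_grad(x)[:, None])  # orthonormal subspace basis
    # Sample from N(0, sigma^2 * (alpha/n * I + (1-alpha)/k * U U^T)).
    eps = sigma * (np.sqrt(alpha / n) * rng.normal(size=n)
                   + np.sqrt((1 - alpha) / k) * (U @ rng.normal(size=k)))
    # Antithetic finite-difference estimate of the descent direction.
    g_hat = (loss(x + eps) - loss(x - eps)) / (2 * sigma ** 2) * eps
    x -= lr * g_hat
```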
The dynamic neural code of the retina for natural scenes
Understanding how the visual system encodes natural scenes is a fundamental goal of sensory neuroscience. We show here that a three-layer network model predicts the retinal response to natural scenes with an accuracy nearing the fundamenta…
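A hedged sketch of a three-layer network of this general form: two convolutional "interneuron" layers followed by a readout over ganglion cells. The layer sizes, kernel widths, 40-frame stimulus window, and class name are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RetinalCNN(nn.Module):
    def __init__(self, n_cells=5):
        super().__init__()
        self.conv1 = nn.Conv2d(40, 8, kernel_size=15)   # 40 stimulus frames in
        self.conv2 = nn.Conv2d(8, 8, kernel_size=11)
        self.readout = nn.Linear(8 * 26 * 26, n_cells)  # per-cell firing rates

    def forward(self, stim):                            # stim: (batch, 40, 50, 50)
        x = torch.relu(self.conv1(stim))
        x = torch.relu(self.conv2(x))
        # Softplus keeps predicted firing rates nonnegative.
        return nn.functional.softplus(self.readout(x.flatten(1)))

rates = RetinalCNN()(torch.randn(2, 40, 50, 50))        # predicted firing rates
```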
Recurrent Segmentation for Variable Computational Budgets
State-of-the-art systems for semantic image segmentation use feed-forward pipelines with fixed computational costs. Building an image segmentation system that works across a range of computational budgets is challenging and time-intensive …