Praneeth Netrapalli
Second Order Methods for Bandit Optimization and Control
Bandit convex optimization (BCO) is a general framework for online decision making under uncertainty. While tight regret bounds for general convex losses have been established, existing algorithms achieving these bounds have prohibitive co…
Tandem Transformers for Inference Efficient LLMs
The autoregressive nature of conventional large language models (LLMs) inherently limits inference speed, as tokens are generated sequentially. While speculative and parallel decoding techniques attempt to mitigate this, they face limitati…
The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
Deep learning succeeds by doing hierarchical feature learning, yet tuning hyper-parameters (HPs) such as initialization scales and learning rates gives only indirect control over this behavior. In this paper, we introduce a key notion to …
Near Optimal Heteroscedastic Regression with Symbiotic Learning
We consider the problem of heteroscedastic linear regression, where, given $n$ samples $(\mathbf{x}_i, y_i)$ from $y_i = \langle \mathbf{w}^{*}, \mathbf{x}_i \rangle + \epsilon_i \cdot \langle \mathbf{f}^{*}, \mathbf{x}_i \rangle$ with $\mathbf{x…
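A minimal sketch of the data-generating model stated in this abstract, $y_i = \langle \mathbf{w}^*, \mathbf{x}_i \rangle + \epsilon_i \cdot \langle \mathbf{f}^*, \mathbf{x}_i \rangle$. The Gaussian choices for $\mathbf{x}_i$ and $\epsilon_i$ are assumptions made only to make the sketch runnable, and the least-squares baseline is not the paper's algorithm.

```python
# Sketch of the heteroscedastic linear regression model from the abstract:
#   y_i = <w*, x_i> + eps_i * <f*, x_i>
# Distributions of x_i and eps_i are illustrative assumptions (standard Gaussians).
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5

w_star = rng.normal(size=d)          # mean parameter w*
f_star = rng.normal(size=d)          # noise-scale parameter f*

X = rng.normal(size=(n, d))          # assumed covariate distribution
eps = rng.normal(size=n)             # assumed zero-mean unit-variance noise

# Per-sample noise level scales with <f*, x_i>: this is the heteroscedastic part.
y = X @ w_star + eps * (X @ f_star)

# Naive baseline (not the paper's method): ordinary least squares ignores the
# heteroscedastic structure and estimates only w*.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS error on w*:", np.linalg.norm(w_ols - w_star))
```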
Optimistic MLE: A Generic Model-Based Algorithm for Partially Observable Sequential Decision Making
This paper introduces a simple and efficient learning algorithm for general sequential decision making. The algorithm combines Optimism for exploration with Maximum Likelihood Estimation for model estimation, and is thus named OMLE. We prov…
Simplicity Bias in 1-Hidden Layer Neural Networks
Recent works have demonstrated that neural networks exhibit extreme simplicity bias (SB). That is, they learn only the simplest features to solve a task at hand, even in the presence of other, more robust but more complex features. Due to t…
Consistent Multiclass Algorithms for Complex Metrics and Constraints
We present consistent algorithms for multiclass learning with complex performance metrics and constraints, where the objective and constraints are defined by arbitrary functions of the confusion matrix. This setting includes many common pe…
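A small sketch of what a "complex metric defined by a function of the confusion matrix" looks like, as described in this abstract. Macro-F1 and per-class coverage are chosen here purely as illustrative examples of such functions; they are not necessarily the specific metrics studied in the paper.

```python
# Complex metrics as functions of the multiclass confusion matrix (illustrative only).
import numpy as np

def confusion_matrix(y_true, y_pred, k):
    C = np.zeros((k, k))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    return C / len(y_true)              # normalized confusion matrix

def macro_f1(C):
    # Per-class F1 is a non-linear function of row/column sums of C.
    tp = np.diag(C)
    prec = tp / np.maximum(C.sum(axis=0), 1e-12)
    rec = tp / np.maximum(C.sum(axis=1), 1e-12)
    return np.mean(2 * prec * rec / np.maximum(prec + rec, 1e-12))

def coverage(C):
    # Fraction of examples predicted as each class: one possible constraint target.
    return C.sum(axis=0)

y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred = np.array([0, 2, 2, 2, 1, 0, 1, 1])
C = confusion_matrix(y_true, y_pred, k=3)
print("macro-F1:", macro_f1(C), "coverage:", coverage(C))
```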
Multi-User Reinforcement Learning with Low Rank Rewards
In this work, we consider the problem of collaborative multi-user reinforcement learning. In this setting there are multiple users with the same state-action space and transition probabilities but with different rewards. Under the assumpti…
Learning an Invertible Output Mapping Can Mitigate Simplicity Bias in Neural Networks
Deep Neural Networks are known to be brittle to even minor distribution shifts compared to the training distribution. While one line of work has demonstrated that Simplicity Bias (SB) of DNNs - bias towards learning only the simplest featu…
DAFT: Distilling Adversarially Fine-tuned Models for Better OOD Generalization
We consider the problem of OOD generalization, where the goal is to train a model that performs well on test distributions that are different from the training distribution. Deep learning models are known to be fragile to such shifts and c…
All Mistakes Are Not Equal: Comprehensive Hierarchy Aware Multi-label Predictions (CHAMP)
This paper considers the problem of Hierarchical Multi-Label Classification (HMC), where (i) several labels can be present for each example, and (ii) labels are related via a domain-specific hierarchy tree. Guided by the intuition that all…
MET: Masked Encoding for Tabular Data
We consider the task of self-supervised representation learning (SSL) for tabular data: tabular-SSL. Typical contrastive learning based SSL methods require instance-wise data augmentations which are difficult to design for unstructured tab…
Reproducibility in Optimization: Theoretical Framework and Limits
We initiate a formal study of reproducibility in optimization. We define a quantitative measure of reproducibility of optimization procedures in the face of noisy or error-prone operations such as inexact or stochastic gradient computation…
Near-Optimal Lower Bounds For Convex Optimization For All Orders of Smoothness
We study the complexity of optimizing highly smooth convex functions. For a positive integer $p$, we want to find an $\epsilon$-approximate minimum of a convex function $f$, given oracle access to the function and its first $p$ derivatives, assum…
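A sketch, not taken from the paper, of the oracle model the abstract describes: a $p$-th order oracle reveals the value and first $p$ derivatives at the query point, and the goal is an $\epsilon$-approximate minimizer.

```latex
% Illustrative formalization of the p-th order oracle model from the abstract.
\[
  \mathcal{O}_p(f, x) \;=\; \bigl(f(x),\, \nabla f(x),\, \dots,\, \nabla^{p} f(x)\bigr),
  \qquad
  \text{find } \hat{x} \text{ such that } f(\hat{x}) - \min_{x} f(x) \le \epsilon .
\]
```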
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
Q-learning is a popular Reinforcement Learning (RL) algorithm which is widely used in practice with function approximation (Mnih et al., 2015). In contrast, existing theoretical results are pessimistic about Q-learning. For example, (Baird…
Focus on the Common Good: Group Distributional Robustness Follows
We consider the problem of training a classification model with group annotated training data. Recent work has established that, if there is distribution shift across different groups, models trained using the standard empirical risk minim…
Minimax Optimization with Smooth Algorithmic Adversaries
This paper considers minimax optimization $\min_x \max_y f(x, y)$ in the challenging setting where $f$ can be both nonconvex in $x$ and nonconcave in $y$. Though such optimization problems arise in many machine learning paradigms including…
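A toy sketch of the $\min_x \max_y f(x, y)$ format from this abstract. Plain simultaneous gradient descent-ascent (GDA) on an arbitrary smooth test function is shown only to make the problem format concrete; it is not the paper's algorithm, and its failure to converge on nonconvex-nonconcave problems is part of the motivation for the paper's approach.

```python
# Illustrative nonconvex-nonconcave minimax problem and plain GDA (not the paper's method).
import numpy as np

def f(x, y):
    # Arbitrary smooth test function, nonconvex in x and nonconcave in y.
    return np.sin(x) * np.cos(y) + 0.1 * x * y

def grad_f(x, y, h=1e-5):
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return gx, gy

x, y, eta = 1.0, -1.0, 0.05
for _ in range(200):
    gx, gy = grad_f(x, y)
    x, y = x - eta * gx, y + eta * gy    # descend in x, ascend in y
print("final point:", x, y, "f:", f(x, y))
```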
Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
We consider the setting of vector valued non-linear dynamical systems $X_{t+1} = \phi(A^* X_t) + \eta_t$, where $\eta_t$ is unbiased noise and $\phi: \mathbb{R} \to \mathbb{R}$ is a known link function that satisfies certain {\em expansivity property}…
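A minimal sketch simulating the system $X_{t+1} = \phi(A^* X_t) + \eta_t$ from this abstract. The specific link (a leaky ReLU applied coordinate-wise), the Gaussian noise, and the spectral normalization of $A^*$ are illustrative assumptions; the paper's actual conditions, such as the expansivity property of $\phi$, are in the full text.

```python
# Simulate X_{t+1} = phi(A* X_t) + eta_t with an assumed link and noise model.
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 500

A_star = rng.normal(size=(d, d))
A_star *= 0.9 / np.max(np.abs(np.linalg.eigvals(A_star)))     # keep the dynamics stable

phi = lambda z: np.where(z > 0, z, 0.1 * z)                   # assumed link (leaky ReLU)

X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = phi(A_star @ X[t]) + 0.1 * rng.normal(size=d)  # unbiased noise eta_t
print("trajectory generated, last state:", X[-1])
```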
Sample Efficient Linear Meta-Learning by Alternating Minimization
Meta-learning synthesizes and leverages the knowledge from a given set of tasks to rapidly learn new tasks using very little data. Meta-learning of linear regression tasks, where the regressors lie in a low-dimensional subspace, is an exte…
Streaming Linear System Identification with Reverse Experience Replay
We consider the problem of estimating a linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms, which is encountered in several applications including reinforcement learning (RL) and time-series anal…
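A minimal sketch of the single-trajectory LTI identification setting. The state-space form $X_{t+1} = A^* X_t + \eta_t$ and the Gaussian noise are assumptions made for illustration, and ordinary least squares over the whole trajectory is shown only as an offline baseline; the paper's contribution is a streaming method with reverse experience replay, which this sketch does not implement.

```python
# Single-trajectory LTI identification: simulate and fit an offline OLS baseline.
import numpy as np

rng = np.random.default_rng(1)
d, T = 3, 2000

A_star = rng.normal(size=(d, d))
A_star *= 0.8 / np.max(np.abs(np.linalg.eigvals(A_star)))     # stable system (assumed)

X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = A_star @ X[t] + 0.1 * rng.normal(size=d)       # assumed LTI dynamics

# Offline OLS: regress X_{t+1} on X_t over the single trajectory.
A_hat = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T
print("estimation error:", np.linalg.norm(A_hat - A_star))
```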
Do Input Gradients Highlight Discriminative Features?
Post-hoc gradient-based interpretability methods [Simonyan et al., 2013, Smilkov et al., 2017] that provide instance-specific explanations of model predictions are often based on assumption (A): magnitude of input gradients -- gradients of…
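A small sketch of what an "input gradient" is: the gradient of a model's output with respect to its input, whose magnitude is what assumption (A) in this abstract concerns. The tiny fixed 1-hidden-layer network and the finite-difference computation are used purely for illustration; interpretability methods typically use automatic differentiation.

```python
# Input gradient of a toy network's output w.r.t. its input, via finite differences.
import numpy as np

rng = np.random.default_rng(0)
d, h = 8, 16
W1, W2 = rng.normal(size=(h, d)), rng.normal(size=h)

def logit(x):
    return W2 @ np.maximum(W1 @ x, 0.0)       # 1-hidden-layer ReLU network (assumed)

def input_gradient(x, eps=1e-5):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (logit(x + e) - logit(x - e)) / (2 * eps)
    return g

x = rng.normal(size=d)
print("per-coordinate input-gradient magnitudes:", np.abs(input_gradient(x)))
```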
On Nonconvex Optimization for Machine Learning
Gradient descent (GD) and stochastic gradient descent (SGD) are the workhorses of large-scale machine learning. While classical theory focused on analyzing the performance of these methods in convex optimization problems, the most notable …
Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions $f_t$ admit a "pseudo-1d" structure, i.e. $f_t(\mathbf{w}) = \mathrm{loss}_t(\mathrm{pred}_t(\mathbf{w}))$ where the output of $\mathrm{pred}_t$ is o…
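A minimal sketch of the pseudo-1d structure $f_t(\mathbf{w}) = \mathrm{loss}_t(\mathrm{pred}_t(\mathbf{w}))$ from this abstract. The abstract is cut off before fully specifying $\mathrm{pred}_t$; here it is assumed to be a scalar-valued linear predictor, purely for illustration, and only the bandit (zeroth-order) feedback at the queried point is shown.

```python
# Pseudo-1d bandit loss: f_t(w) = loss_t(pred_t(w)), with an assumed linear pred_t.
import numpy as np

rng = np.random.default_rng(0)
d = 10
x_t = rng.normal(size=d)                     # context defining pred_t for round t (assumed)
loss_t = lambda z: (z - 1.0) ** 2            # assumed scalar loss for round t

def f_t(w):
    return loss_t(x_t @ w)                   # loss of a one-dimensional prediction

w = np.zeros(d)
print("bandit feedback at w:", f_t(w))       # the only value revealed to the learner
```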
No Quantum Speedup over Gradient Descent for Non-Smooth Convex Optimization
We study the first-order convex optimization problem, where we have black-box access to a (not necessarily smooth) function f:ℝⁿ → ℝ and its (sub)gradient. Our goal is to find an ε-approximate minimum of f starting from a point that is dis…
Making the Last Iterate of SGD Information Theoretically Optimal
Stochastic gradient descent (SGD) is one of the most widely used algorithms for large scale optimization problems. While classical theoretical analysis of SGD for convex problems studies (suffix) \emph{averages} of iterates and obtains inf…
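A toy sketch contrasting the last SGD iterate with a suffix-averaged iterate, the two quantities this abstract discusses. The quadratic objective, step-size schedule, and noise model are illustrative assumptions; the paper's modified step sizes that make the last iterate optimal are not reproduced here.

```python
# Last iterate vs. suffix average of SGD on a toy strongly convex problem.
import numpy as np

rng = np.random.default_rng(0)
w_star, T = 2.0, 10_000
w, iterates = 0.0, []

for t in range(1, T + 1):
    g = (w - w_star) + rng.normal()          # stochastic gradient of 0.5 * (w - w*)^2
    w -= g / t                               # classical 1/t step size (assumed)
    iterates.append(w)

last = iterates[-1]
suffix_avg = np.mean(iterates[T // 2:])      # average over the last half of iterates
print("last-iterate error:", abs(last - w_star))
print("suffix-average error:", abs(suffix_avg - w_star))
```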
Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method
We consider the classical setting of optimizing a nonsmooth Lipschitz continuous convex function over a convex constraint set, when having access to a (stochastic) first-order oracle (FO) for the function and a projection oracle (PO) for t…
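A minimal sketch of the two oracles named in this abstract: a first-order oracle (FO) returning a subgradient and a projection oracle (PO) for the constraint set. Classical projected subgradient descent is shown only to make the oracle interface concrete; it calls PO every iteration, whereas the paper's methods are designed around using far fewer PO calls. The L1-type objective and the Euclidean-ball constraint are illustrative choices.

```python
# FO/PO oracle interface, illustrated with classical projected subgradient descent.
import numpy as np

rng = np.random.default_rng(0)
d = 20
a = rng.normal(size=d)

def FO(w):
    # Subgradient of the nonsmooth Lipschitz function f(w) = ||w - a||_1 (assumed objective).
    return np.sign(w - a)

def PO(w, radius=1.0):
    # Euclidean projection onto the ball {||w|| <= radius} (assumed constraint set).
    nrm = np.linalg.norm(w)
    return w if nrm <= radius else w * (radius / nrm)

w, eta = np.zeros(d), 0.05
for _ in range(500):
    w = PO(w - eta * FO(w))                  # one FO call and one PO call per iteration
print("final f(w):", np.abs(w - a).sum())
```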
Efficient Domain Generalization via Common-Specific Low-Rank Decomposition
Domain generalization refers to the task of training a model which generalizes to new domains that are not seen during training. We present CSD (Common Specific Decomposition) for this setting, which jointly learns a common component (whic…