Rajiv Khanna
Structure-Aware Spectral Sparsification via Uniform Edge Sampling
Spectral clustering is a fundamental method for graph partitioning, but its reliance on eigenvector computation limits scalability to massive graphs. Classical sparsification methods preserve spectral properties by sampling edges proportio…
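For context, the abstract contrasts uniform edge sampling with classical schemes that sample edges proportionally to quantities such as effective resistances. Below is a minimal sketch, assuming the graph is given as an edge list, of the uniform variant followed by standard spectral clustering; the m/q rescaling, the unnormalized Laplacian, and the toy two-block graph are illustrative choices, not the paper's exact construction.

# Minimal sketch (not the paper's algorithm): sparsify by uniform edge
# sampling, then run spectral clustering on the sparsified graph.
import numpy as np
from scipy.sparse import coo_matrix, csgraph
from sklearn.cluster import KMeans

def uniform_sparsify(edges, weights, n, q, rng):
    """Keep q edges chosen uniformly at random; rescale kept weights by m/q so
    the sparsified Laplacian is an unbiased estimate of the original."""
    m = len(edges)
    idx = rng.choice(m, size=q, replace=False)
    kept = [edges[i] for i in idx]
    w = np.asarray([weights[i] for i in idx]) * (m / q)
    rows = [u for u, v in kept] + [v for u, v in kept]
    cols = [v for u, v in kept] + [u for u, v in kept]
    vals = np.concatenate([w, w])
    return coo_matrix((vals, (rows, cols)), shape=(n, n)).tocsr()

def spectral_clusters(adj, k):
    lap = csgraph.laplacian(adj, normed=False).toarray()
    _, vecs = np.linalg.eigh(lap)                      # ascending eigenvalues
    return KMeans(n_clusters=k, n_init=10).fit_predict(vecs[:, :k])

rng = np.random.default_rng(0)
n = 60                                                  # toy graph: two dense blobs
edges, weights = [], []
for u in range(n):
    for v in range(u + 1, n):
        same = (u < n // 2) == (v < n // 2)
        if rng.random() < (0.5 if same else 0.02):
            edges.append((u, v)); weights.append(1.0)

adj = uniform_sparsify(edges, weights, n, q=len(edges) // 2, rng=rng)
print(spectral_clusters(adj, k=2))                      # cluster labels per node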
Approximating Memorization Using Loss Surface Geometry for Dataset Pruning and Summarization
The sustainable training of modern neural network models represents an open challenge. Several existing methods approach this issue by identifying a subset of relevant data samples from the full training data to be used in model optimizati…
A Precise Characterization of SGD Stability Using Loss Surface Geometry
Stochastic Gradient Descent (SGD) stands as a cornerstone optimization algorithm with proven real-world empirical successes but relatively limited theoretical understanding. Recent research has illuminated a key factor contributing to its …
Membership Privacy Risks of Sharpness Aware Minimization
Optimization algorithms that seek flatter minima such as Sharpness-Aware Minimization (SAM) are widely credited with improved generalization. We ask whether such gains impact membership privacy. Surprisingly, we find that SAM is more prone…
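For reference, SAM (the optimizer whose privacy behavior the abstract examines) first perturbs the weights toward a nearby worst case, then takes the actual update using the gradient from that perturbed point. A minimal PyTorch sketch of one such step follows; it is the standard SAM update, nothing privacy-specific, and it assumes every parameter receives a gradient and uses the common rho=0.05 default.

import torch

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    """One Sharpness-Aware Minimization step: ascend to w + rho * g/||g||,
    take the gradient there, then update from the original weights."""
    loss_fn(model(x), y).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12
    eps = [rho * g / norm for g in grads]
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):   # perturb weights
            p.add_(e)
    model.zero_grad()
    loss_fn(model(x), y).backward()                 # gradient at perturbed point
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):   # restore original weights
            p.sub_(e)
    base_opt.step()                                 # applies the perturbed-point grads
    base_opt.zero_grad()

model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
sam_step(model, torch.nn.functional.cross_entropy, x, y, opt)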
Generalization Properties of Stochastic Optimizers via Trajectory Analysis.
Despite the ubiquitous use of stochastic optimization algorithms in machine learning, the precise impact of these algorithms on generalization performance in realistic non-convex settings is still poorly understood. In this paper, we provi…
Improved Guarantees and a Multiple-descent Curve for Column Subset Selection and the Nystrom Method (Extended Abstract)
The Column Subset Selection Problem (CSSP) and the Nystrom method are among the leading tools for constructing interpretable low-rank approximations of large datasets by selecting a small but representative set of features or instances. A …
LocalNewton: Reducing Communication Bottleneck for Distributed Learning
To address the communication bottleneck problem in distributed optimization within a master-worker framework, we propose LocalNewton, a distributed second-order algorithm with local averaging. In LocalNewton, the worker machines update the…
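The abstract is cut off, but the setup it describes (workers take several local update steps between communication rounds, and a master averages the results) can be illustrated with a toy loop; the Newton step on regularized logistic regression, the shard sizes, and the round counts below are illustrative assumptions, not the paper's exact algorithm.

import numpy as np

def local_newton_demo(X, y, n_workers=4, rounds=5, local_steps=3, lam=1e-2):
    """Toy local-update / averaging loop on l2-regularized logistic regression:
    each round, every worker runs a few Newton steps on its own shard, then the
    master averages the resulting iterates (illustrative sketch only)."""
    shards = np.array_split(np.arange(len(y)), n_workers)
    w = np.zeros(X.shape[1])
    for _ in range(rounds):
        local_iterates = []
        for idx in shards:
            Xi, yi, wi = X[idx], y[idx], w.copy()
            for _ in range(local_steps):
                p = 1.0 / (1.0 + np.exp(-Xi @ wi))
                grad = Xi.T @ (p - yi) / len(yi) + lam * wi
                H = (Xi.T * (p * (1 - p))) @ Xi / len(yi) + lam * np.eye(Xi.shape[1])
                wi -= np.linalg.solve(H, grad)        # local Newton step
            local_iterates.append(wi)
        w = np.mean(local_iterates, axis=0)           # master averages iterates
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
y = (X @ rng.normal(size=10) + 0.1 * rng.normal(size=2000) > 0).astype(float)
print(local_newton_demo(X, y)[:3])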
Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification
Transfer learning has emerged as a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains. This process consists of taking a neural network pre-trained on a large feature-rich source da…
Adversarially-Trained Deep Nets Transfer Better
Transfer learning has emerged as a powerful methodology for adapting pre-trained deep neural networks to new domains. This process consists of taking a neural network pre-trained on a large feature-rich source dataset, freezing the early l…
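The recipe the abstract describes (freeze the early layers of a pretrained network, replace and retrain the head on the target task) looks roughly as follows with torchvision; the paper's point is that the loaded source weights would come from adversarial training rather than the standard ImageNet checkpoint used here, and the class count is an illustrative placeholder.

import torch
import torch.nn as nn
from torchvision import models

# Load a pretrained backbone (torchvision >= 0.13 weights API); in the paper's
# setting this would be an adversarially trained source model instead.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the early (feature-extraction) layers ...
for p in model.parameters():
    p.requires_grad = False

# ... and replace the final classification head for the target task.
num_target_classes = 10                    # placeholder; set to your target dataset
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

# Only the new head's parameters are optimized during fine-tuning.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)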
Boundary thickness and robustness in learning models
Robustness of machine learning models to various adversarial and non-adversarial corruptions continues to be of interest. In this paper, we introduce the notion of the boundary thickness of a classifier, and we describe its connection with…
Bayesian Coresets: Revisiting the Nonconvex Optimization Perspective
Bayesian coresets have emerged as a promising approach for implementing scalable Bayesian inference. The Bayesian coreset problem involves selecting a (weighted) subset of the data samples, such that the posterior inference using the selec…
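A common finite-dimensional surrogate for this problem represents each data point by its log-likelihood evaluated at a set of parameter samples and then seeks a sparse, nonnegative weight vector whose weighted sum matches the full-data sum. The hard-thresholding loop below is a generic sketch of that sparse-regression view, consistent with the abstract's "weighted subset" framing but not the paper's algorithm; the matrix L of log-likelihood evaluations is faked with random numbers.

import numpy as np

def coreset_weights(L, k, iters=200, step=None):
    """Toy coreset construction: L[n, s] holds the log-likelihood of data point
    n at parameter sample s.  Find a k-sparse, nonnegative w so that the
    weighted sum of rows approximates the full sum (illustrative sketch)."""
    n = L.shape[0]
    target = L.sum(axis=0)
    if step is None:
        step = 1.0 / np.linalg.norm(L, 2) ** 2
    w = np.zeros(n)
    for _ in range(iters):
        grad = L @ (L.T @ w - target)            # gradient of 0.5*||L^T w - target||^2
        w = np.maximum(w - step * grad, 0.0)     # gradient step + nonnegativity
        keep = np.argsort(w)[-k:]                # hard-threshold to k largest weights
        mask = np.zeros(n, dtype=bool); mask[keep] = True
        w[~mask] = 0.0
    return w

rng = np.random.default_rng(1)
L = rng.normal(size=(500, 50))                   # stand-in for log-likelihood values
w = coreset_weights(L, k=20)
print(np.count_nonzero(w), np.linalg.norm(L.T @ w - L.sum(axis=0)))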
Bayesian Coresets: An Optimization Perspective.
Bayesian coresets have emerged as a promising approach for implementing scalable Bayesian inference. The Bayesian coreset problem involves selecting a (weighted) subset of the data samples, such that posterior inference using the selected …
Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method
The Column Subset Selection Problem (CSSP) and the Nyström method are among the leading tools for constructing small low-rank approximations of large datasets in machine learning and scientific computing. A fundamental question in this are…
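For context, the Nyström method reconstructs a PSD kernel matrix from a selected subset of its columns via K ≈ C W⁺ Cᵀ, where C collects the selected columns and W their intersection with the selected rows. A minimal sketch follows, with uniformly chosen landmarks standing in for the more refined column-selection rules the paper analyzes; the RBF kernel and bandwidth are illustrative.

import numpy as np

def nystrom(K, idx):
    """Nyström approximation from a column subset: K ≈ C @ pinv(W) @ C.T,
    where C = K[:, idx] and W = K[np.ix_(idx, idx)]."""
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 5.0)                               # toy RBF kernel matrix

idx = rng.choice(300, size=30, replace=False)       # landmark columns (here: uniform)
err = np.linalg.norm(K - nystrom(K, idx), "fro") / np.linalg.norm(K, "fro")
print(f"relative Frobenius error with 30 landmark columns: {err:.3f}")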
Improved guarantees and a multiple-descent curve for the Column Subset Selection Problem and the Nyström method.
The Column Subset Selection Problem (CSSP) and the Nyström method are among the leading tools for constructing small low-rank approximations of large datasets in machine learning and scientific computing. A fundamental question in this are…
Learning Sparse Distributions using Iterative Hard Thresholding
Iterative hard thresholding (IHT) is a projected gradient descent algorithm, known to achieve state of the art performance for a wide range of structured estimation problems, such as sparse inference. In this work, we consider IHT as a sol…
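As the abstract notes, IHT is projected gradient descent onto the set of k-sparse vectors: a gradient step followed by keeping only the k largest-magnitude entries. A minimal sketch on sparse linear regression; the step size, iteration count, and synthetic problem are illustrative.

import numpy as np

def iht(A, b, k, iters=500, step=None):
    """Plain IHT for min ||Ax - b||^2 s.t. ||x||_0 <= k."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * A.T @ (A @ x - b)            # gradient step
        keep = np.argsort(np.abs(x))[-k:]           # hard-threshold: keep k largest
        mask = np.zeros_like(x, dtype=bool); mask[keep] = True
        x[~mask] = 0.0
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 400)) / np.sqrt(100)
x_true = np.zeros(400)
x_true[rng.choice(400, 10, replace=False)] = rng.normal(size=10)
b = A @ x_true
print(np.linalg.norm(iht(A, b, k=10) - x_true))     # recovery error on the toy problem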
Geometric Rates of Convergence for Kernel-based Sampling Algorithms
The rate of convergence of weighted kernel herding (WKH) and sequential Bayesian quadrature (SBQ), two kernel-based sampling algorithms for estimating integrals with respect to some target probability measure, is investigated. Under verifi…
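For reference, both algorithms build a small weighted point set whose kernel mean tracks the target measure's mean embedding. The toy sketch below works over a finite candidate pool whose empirical distribution stands in for the target measure: points are chosen greedily by the herding criterion and then given Bayesian-quadrature-style weights. It is an illustration of the algorithm family, not the paper's analysis; the RBF kernel and pool are assumptions.

import numpy as np

def rbf(X, Y, gamma=0.5):
    d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def herd(candidates, n_pick, gamma=0.5):
    """Greedy kernel herding over a finite pool, then quadrature-style weights."""
    K = rbf(candidates, candidates, gamma)
    z = K.mean(axis=1)                     # mean embedding evaluated at each candidate
    picked = []
    for _ in range(n_pick):
        score = z - K[:, picked].mean(axis=1) if picked else z.copy()
        score[picked] = -np.inf            # no repeats
        picked.append(int(np.argmax(score)))
    # Bayesian-quadrature-style weights: solve K_SS w = z_S
    w = np.linalg.solve(K[np.ix_(picked, picked)] + 1e-8 * np.eye(len(picked)),
                        z[picked])
    return picked, w

rng = np.random.default_rng(0)
pool = rng.normal(size=(400, 2))
idx, w = herd(pool, n_pick=15)
f = lambda X: np.sin(X).sum(axis=1)        # test integrand
print(f(pool[idx]) @ w, f(pool).mean())    # weighted estimate vs full-pool mean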
On Linear Convergence of Weighted Kernel Herding.
We provide a novel convergence analysis of two popular sampling algorithms, Weighted Kernel Herding and Sequential Bayesian Quadrature, that are used to approximate the expectation of a function under a distribution. Existing theoretical a…
Interpreting Black Box Predictions using Fisher Kernels
Research in both machine learning and psychology suggests that salient examples can help humans to interpret learning models. To this end, we take a novel look at black box interpretation of test predictions in terms of training examples. …
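The abstract is truncated, but the title points to Fisher kernels, whose standard form scores similarity between examples by inner products of per-example log-likelihood gradients under the inverse Fisher information. The toy sketch below applies that standard construction to a logistic model and ranks training points by similarity to a test prediction; it is illustrative, not the paper's full procedure, and the data and model are synthetic placeholders.

import numpy as np

def loglik_grads(X, y, w):
    """Per-example gradients of the logistic log-likelihood w.r.t. the weights."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return (y - p)[:, None] * X                       # shape (n, d)

rng = np.random.default_rng(0)
d = 5
w = rng.normal(size=d)                                # stand-in for a trained model
X_train = rng.normal(size=(1000, d))
y_train = (X_train @ w + 0.3 * rng.normal(size=1000) > 0).astype(float)

G = loglik_grads(X_train, y_train, w)
F = (G.T @ G) / len(G) + 1e-6 * np.eye(d)             # empirical Fisher information
F_inv = np.linalg.inv(F)

x_test = rng.normal(size=(1, d))
pred = 1.0 / (1.0 + np.exp(-(x_test @ w)[0]))
y_hat = float(pred > 0.5)
g_test = loglik_grads(x_test, np.array([y_hat]), w)[0]

# Fisher-kernel similarity between the test prediction and each training point
scores = G @ (F_inv @ g_test)
print("most similar training indices:", np.argsort(scores)[-5:][::-1])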
Restricted strong convexity implies weak submodularity
We connect high-dimensional subset selection and submodular maximization. Our results extend the work of Das and Kempe (2011) from the setting of linear regression to arbitrary objective functions. For greedy feature selection, this connec…
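The set function in question is the goodness-of-fit gain of greedy forward feature selection. A minimal sketch of that greedy procedure on a least-squares objective follows; the synthetic data and the choice of squared-error gain are illustrative, and the paper's contribution is the guarantee covering such procedures, not this loop itself.

import numpy as np

def greedy_feature_selection(X, y, k):
    """Forward greedy selection: at each step add the feature whose inclusion
    most improves the least-squares fit."""
    d = X.shape[1]
    selected = []
    base = float(y @ y)
    for _ in range(k):
        best, best_gain = None, -np.inf
        for j in range(d):
            if j in selected:
                continue
            cols = X[:, selected + [j]]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            gain = base - float(np.sum((y - cols @ coef) ** 2))
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
y = X[:, [3, 17, 29]] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.normal(size=200)
print(greedy_feature_selection(X, y, k=3))     # typically recovers {3, 17, 29}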
Boosting Black Box Variational Inference
Approximating a probability density in a tractable manner is a central task in Bayesian statistics. Variational Inference (VI) is a popular technique that achieves tractability by choosing a relatively simple variational family. Borrowing …
Co-regularized Monotone Retargeting for Semi-supervised LeTOR
This work proposes a new model for listwise Learning to Rank (LeTOR) in an inductive semi-supervised setting. We pose the task as that of ranking in a multiview setting, encountered quite commonly in practice. We formulate a novel and effi…
New perspectives and applications for greedy algorithms in machine learning
Approximating probability densities is a core problem in Bayesian statistics, where the inference involves the computation of a posterior distribution. Variational Inference (VI) is a technique to approximate posterior distributions throug…
IHT dies hard: Provable accelerated Iterative Hard Thresholding
We study, both in theory and practice, the use of momentum motions in classic iterative hard thresholding (IHT) methods. By simply modifying plain IHT, we investigate its convergence behavior on convex optimization criteria with non-conv…
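A minimal sketch of the "plain IHT plus momentum" idea on the same kind of sparse regression toy problem as the plain IHT sketch above; the heavy-ball-style extrapolation and the parameter values are illustrative assumptions, not the paper's exact scheme.

import numpy as np

def momentum_iht(A, b, k, iters=300, beta=0.8, step=None):
    """IHT with a momentum modification: take the gradient step from an
    extrapolated point, then hard-threshold to k entries."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x_prev = x = np.zeros(A.shape[1])
    for _ in range(iters):
        yv = x + beta * (x - x_prev)                 # momentum extrapolation
        z = yv - step * A.T @ (A @ yv - b)           # gradient step
        keep = np.argsort(np.abs(z))[-k:]            # keep k largest entries
        x_prev, x = x, np.zeros_like(z)
        x[keep] = z[keep]
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 400)) / np.sqrt(100)
x_true = np.zeros(400)
x_true[rng.choice(400, 10, replace=False)] = rng.normal(size=10)
b = A @ x_true
print(np.linalg.norm(momentum_iht(A, b, k=10) - x_true))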
Boosting Variational Inference: an Optimization Perspective
Variational inference is a popular technique to approximate a possibly intractable Bayesian posterior with a more tractable one. Recently, boosting variational inference has been proposed as a new paradigm to approximate the posterior by a…
Scalable Greedy Feature Selection via Weak Submodularity
Greedy algorithms are widely used for problems in machine learning such as feature selection and set function optimization. Unfortunately, for large datasets, the running time of even greedy algorithms can be quite high. This is because fo…
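One standard way to cut the per-step cost the abstract refers to is to score only a random subsample of the remaining candidates at each greedy step. The sketch below shows that generic acceleration for least-squares feature selection; it is not necessarily the paper's exact mechanism, and the sample fraction and data are illustrative.

import numpy as np

def stochastic_greedy_selection(X, y, k, sample_frac=0.5, rng=None):
    """Greedy feature selection that scores only a random fraction of the
    remaining features per step, trading accuracy for speed."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    selected = []
    for _ in range(k):
        remaining = [j for j in range(d) if j not in selected]
        m = max(1, int(sample_frac * len(remaining)))
        candidates = rng.choice(remaining, size=m, replace=False)
        best, best_err = None, np.inf
        for j in candidates:
            cols = X[:, selected + [int(j)]]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            err = float(np.sum((y - cols @ coef) ** 2))
            if err < best_err:
                best, best_err = int(j), err
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))
y = X[:, [3, 17, 29]] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.normal(size=200)
print(stochastic_greedy_selection(X, y, k=3, rng=rng))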
On Approximation Guarantees for Greedy Low Rank Optimization
We provide new approximation guarantees for greedy low rank matrix estimation under standard assumptions of restricted strong convexity and smoothness. Our novel analysis also uncovers previously unknown connections between the low rank es…
A Unified Optimization View on Generalized Matching Pursuit and Frank-Wolfe
Two of the most fundamental prototypes of greedy optimization are the matching pursuit and Frank-Wolfe algorithms. In this paper, we take a unified view on both classes of methods, leading to the first explicit convergence rates of matchin…
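Both families share the same template: pick the atom (dictionary element) best aligned with the negative gradient, then move toward it. The toy sketch below shows matching pursuit's line-search step and Frank-Wolfe's convex-combination step side by side on a simple quadratic; the objective, atom set, and step rules are illustrative of the template, not the paper's general setting.

import numpy as np

def greedy_atom_descent(A, b, steps=50, mode="mp"):
    """Minimize f(x) = 0.5 * ||x - b||^2 over atoms given by the columns of A.
    mode="mp": matching pursuit (unconstrained step along the chosen atom).
    mode="fw": Frank-Wolfe over the convex hull of the atoms."""
    x = np.zeros_like(b)
    for t in range(steps):
        grad = x - b
        scores = A.T @ grad
        if mode == "mp":
            j = int(np.argmax(np.abs(scores)))       # most aligned atom (sign-free)
            a = A[:, j]
            gamma = -(a @ grad) / (a @ a)            # exact line search along the atom
            x = x + gamma * a
        else:
            j = int(np.argmin(scores))               # vertex minimizing <grad, a>
            a = A[:, j]
            x = x + (2.0 / (t + 2)) * (a - x)        # classic Frank-Wolfe step size
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 100))
A /= np.linalg.norm(A, axis=0)                        # unit-norm atoms
b = rng.normal(size=30)
for mode in ("mp", "fw"):
    x = greedy_atom_descent(A, b, mode=mode)
    print(mode, round(float(np.linalg.norm(x - b)), 3))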