Ronak Mehta
Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty
Animal-worn sensors have revolutionised the study of animal behaviour and ecology. Accelerometers, which measure changes in acceleration across planes of movement, are increasingly being used in conjunction with machine learning models to …
Supervised Stochastic Gradient Algorithms for Multi-Trial Source Separation
We develop a stochastic algorithm for independent component analysis that incorporates multi-trial supervision, which is available in many scientific contexts. The method blends a proximal gradient-type algorithm in the space of invertible…
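For orientation, a minimal sketch of the unsupervised objective such methods build on, namely standard maximum-likelihood ICA over invertible unmixing matrices (my notation; the multi-trial supervision term described in the article is not shown here):

$$ \min_{W \,\text{invertible}} \; -\sum_{i=1}^{n} \log p\big(W x_i\big) \;-\; n \log \lvert \det W \rvert, $$

where $x_i$ are the observed mixtures, $p$ is a factorized source density, and $W$ is the unmixing matrix; proximal gradient-type updates act on $W$ directly, with the log-determinant term penalizing nearly singular $W$.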
A Generalization Theory for Zero-Shot Prediction
A modern paradigm for generalization in machine learning and AI consists of pre-training a task-agnostic foundation model, generally obtained using self-supervised and multimodal contrastive learning. The resulting representations can be u…
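As a generic illustration of what zero-shot prediction with pre-trained representations looks like in practice (hypothetical encoders and random stand-in embeddings; not code from the article), a query embedding is scored against class-description embeddings and the most similar class wins:

import numpy as np

def zero_shot_predict(query_emb: np.ndarray, class_embs: np.ndarray) -> int:
    """Return the index of the class whose embedding is most cosine-similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    C = class_embs / np.linalg.norm(class_embs, axis=1, keepdims=True)
    return int(np.argmax(C @ q))

# Toy usage with random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
class_embs = rng.normal(size=(10, 128))             # e.g. embeddings of 10 class prompts
query_emb = class_embs[3] + 0.1 * rng.normal(size=128)
print(zero_shot_predict(query_emb, class_embs))     # expected: 3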
Min-Max Optimization with Dual-Linear Coupling
We study a class of convex-concave min-max problems in which the coupled component of the objective is linear in at least one of the two decision vectors. We identify such problem structure as interpolating between the bilinearly and nonbi…
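To make the structure concrete (illustrative notation; the article's exact formulation may differ): bilinearly coupled problems take the form $\min_x \max_y f(x) + \langle Ax, y \rangle - g(y)$, whereas dual-linear coupling only requires linearity in one argument,

$$ \min_{x \in \mathcal{X}} \; \max_{y \in \mathcal{Y}} \; f(x) + \langle \Phi(x), y \rangle - g(y), $$

which is linear in $y$ but possibly nonlinear in $x$, sitting between the bilinearly coupled and fully nonbilinear settings.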
Proving the Coding Interview: A Benchmark for Formally Verified Code Generation
We introduce the Formally Verified Automated Programming Progress Standards, or FVAPPS, a benchmark of 4715 samples for writing programs and proving their correctness, the largest formal verification benchmark, including 1083 curated and q…
Nonsilyl bicyclic secondary amine catalyzed Michael addition of nitromethane to β,β-disubstituted α,β-unsaturated aldehydes
An asymmetric Michael addition of nitromethane to β,β-disubstituted α,β-unsaturated aldehydes using a nonsilyl bicyclic secondary amine organocatalyst enables a concise asymmetric synthesis of methsuximide, an anticonvulsant drug.
The Benefits of Balance: From Information Projections to Variance Reduction
Data balancing across multiple modalities and sources appears in various forms in foundation models in machine learning and AI, e.g. in CLIP and DINO. We show that data balancing across modalities and sources actually offers an unsuspected…
Drago: Primal-Dual Coupled Variance Reduction for Faster Distributionally Robust Optimization
We consider the penalized distributionally robust optimization (DRO) problem with a closed, convex uncertainty set, a setting that encompasses learning using $f$-DRO and spectral/$L$-risk minimization. We present Drago, a stochastic primal…
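For orientation, a common way to write a penalized DRO objective in this family (illustrative notation; the article's exact objective may differ) is

$$ \min_{w} \; \max_{q \in \mathcal{Q}} \; \sum_{i=1}^{n} q_i \, \ell_i(w) \;-\; \nu \, D\!\Big(q \,\Big\|\, \tfrac{1}{n}\mathbf{1}\Big), $$

where $\mathcal{Q}$ is the closed, convex uncertainty set over sample weights, $\ell_i$ is the loss on example $i$, and $D$ is a divergence penalty with strength $\nu$; $f$-DRO and spectral/$L$-risk minimization correspond to particular choices of $\mathcal{Q}$ and $D$.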
Distributionally Robust Optimization with Bias and Variance Reduction
We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized condition value…
Front Cover: Polystyrene-Supported Aminocatalyst Derived from Diarylprolinol for Asymmetric α-Amination of Aldehydes (Eur. J. Org. Chem. 21/2023)
The Front Cover illustrates the natural environment that needs to be protected by following the 3R principles (Reduce, Reuse, and Recycle). The background of the picture shows our institute's riverfront, where a reusable polymer-suppor…
Manifold Oblique Random Forests: Towards Closing the Gap on Convolutional Deep Networks
Decision forests, in particular random forests and gradient boosting trees, have demonstrated state-of-the-art accuracy compared to other methods in many supervised learning scenarios. Forests dominate other methods in tabular data, that is…
Stochastic Optimization for Spectral Risk Measures
Spectral risk objectives, also called $L$-risks, allow learning systems to interpolate between optimizing average-case performance (as in empirical risk minimization) and worst-case performance on a task. We develop stochastic algori…
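Concretely, using the standard definition (not quoted from the article): for losses $\ell_1(w), \dots, \ell_n(w)$ and nonnegative, nondecreasing weights $\sigma_1 \le \dots \le \sigma_n$ with $\sum_i \sigma_i = 1$, the spectral risk is

$$ R_\sigma(w) \;=\; \sum_{i=1}^{n} \sigma_i \, \ell_{(i)}(w), $$

where $\ell_{(1)}(w) \le \dots \le \ell_{(n)}(w)$ are the sorted losses. Uniform weights recover the empirical-risk (average-case) objective, while placing all weight on the largest loss recovers the worst case.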
Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity
Deep Unlearning via Randomized Conditionally Independent Hessians
Omnidirectional Transfer for Quasilinear Lifelong Learning
In biological learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa…
Towards a theory of out-of-distribution learning
What is learning? 20th-century formalizations of learning theory, which precipitated revolutions in artificial intelligence, focus primarily on in-distribution learning, that is, learning under the assumption that the training d…
Towards a theory of out-of-distribution learning
Learning is a process wherein a learning agent enhances its performance through exposure to experience or data. Throughout this journey, the agent may encounter diverse learning environments. For example, data may be presented to the leane…
A partition-based similarity for classification distributions
Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners. In particular,…
A general approach to progressive learning
In biological learning, data are used to improve performance simultaneously on the current task, as well as previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula…
Simple Lifelong Learning Machines
In lifelong learning, data are used to improve performance not only on the present task, but also on past and future (unencountered) tasks. While typical transfer learning algorithms can improve performance on future tasks, their performan…
A general approach to progressive intelligence
In biological learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa…
Manifold Oblique Random Forests: Towards Closing the Gap on Convolutional Deep Networks
Decision forests (Forests), in particular random forests and gradient boosting trees, have demonstrated state-of-the-art accuracy compared to other methods in many supervised learning scenarios. In particular, Forests dominate other method…
Independence Testing for Multivariate Time Series
Complex data structures such as time series are increasingly present in modern data science problems. A fundamental question is whether two such time-series are statistically dependent. Many current approaches make parametric assumptions o…
Independence Testing for Temporal Data
Temporal data are increasingly prevalent in modern data science. A fundamental question is whether two time series are related or not. Existing approaches often have limitations, such as relying on parametric assumptions, detecting only li…
A Consistent Independence Test for Multivariate Time-Series
A fundamental problem in statistical data analysis is testing whether two phenomena are related. When the phenomena in question are time series, many challenges emerge. The first is defining a dependence measure between time series at the …
hyppo: A Multivariate Hypothesis Testing Python Package
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are incons…
hyppo: A Comprehensive Multivariate Hypothesis Testing Python Package
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are incons…
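A minimal usage sketch, assuming hyppo is installed (pip install hyppo) and that the independence-testing interface matches the version I have in mind; check the package documentation for the exact signatures:

import numpy as np
from hyppo.independence import Dcorr  # distance correlation independence test

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))                       # 100 samples, 3 dimensions
y = x[:, :1] + 0.5 * rng.normal(size=(100, 1))      # depends on the first coordinate of x

# Distance correlation test of independence; a small p-value suggests dependence.
stat, pvalue = Dcorr().test(x, y)
print(f"dcorr statistic = {stat:.3f}, p-value = {pvalue:.3f}")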
Estimating Information-Theoretic Quantities with Uncertainty Forests
Information-theoretic quantities, such as conditional entropy and mutual information, are critical data summaries for quantifying uncertainty. Existing estimators for these quantities either have strong theoretical guarantees or effective …
Random Forests for Adaptive Nearest Neighbor Estimation of Information-Theoretic Quantities
Information-theoretic quantities, such as conditional entropy and mutual information, are critical data summaries for quantifying uncertainty. Current widely used approaches for computing such quantities rely on nearest neighbor methods an…
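To make the target quantities concrete, here is a naive plug-in illustration of conditional entropy $H(Y \mid X)$ and mutual information $I(X; Y) = H(Y) - H(Y \mid X)$ computed from random-forest class posteriors. This is not the estimator proposed in the articles above; it only shows what is being estimated.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)

X_fit, X_eval, y_fit, y_eval = train_test_split(X, y, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_fit, y_fit)

# Plug-in conditional entropy: average entropy of the predicted posterior p(y | x).
posteriors = np.clip(forest.predict_proba(X_eval), 1e-12, 1.0)
H_y_given_x = -np.mean(np.sum(posteriors * np.log(posteriors), axis=1))

# Marginal entropy of Y and the implied mutual information I(X; Y) = H(Y) - H(Y | X).
p_y = np.bincount(y_eval) / len(y_eval)
H_y = -np.sum(p_y * np.log(p_y + 1e-12))
print(f"H(Y|X) ~ {H_y_given_x:.3f} nats, I(X;Y) ~ {H_y - H_y_given_x:.3f} nats")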