Ronak Mehta
Leveraging machine learning and accelerometry to classify animal behaviours with uncertainty
Animal-worn sensors have revolutionised the study of animal behaviour and ecology. Accelerometers, which measure changes in acceleration across planes of movement, are increasingly being used in conjunction with machine learning models to …
Supervised Stochastic Gradient Algorithms for Multi-Trial Source Separation
We develop a stochastic algorithm for independent component analysis that incorporates multi-trial supervision, which is available in many scientific contexts. The method blends a proximal gradient-type algorithm in the space of invertible…
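For orientation, a minimal sketch of the unsupervised objective such methods build on, namely standard maximum-likelihood ICA over invertible unmixing matrices (my notation; the multi-trial supervision term described in the article is not shown here):

$$ \min_{W \,\text{invertible}} \; -\sum_{i=1}^{n} \log p\big(W x_i\big) \;-\; n \log \lvert \det W \rvert, $$

where $x_i$ are the observed mixtures, $p$ is a factorized source density, and $W$ is the unmixing matrix; proximal gradient-type updates act on $W$ directly, with the log-determinant term penalizing nearly singular $W$.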
A Generalization Theory for Zero-Shot Prediction
A modern paradigm for generalization in machine learning and AI consists of pre-training a task-agnostic foundation model, generally obtained using self-supervised and multimodal contrastive learning. The resulting representations can be u…
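As a generic illustration of what zero-shot prediction with pre-trained representations looks like in practice (hypothetical encoders and random stand-in embeddings; not code from the article), a query embedding is scored against class-description embeddings and the most similar class wins:

import numpy as np

def zero_shot_predict(query_emb: np.ndarray, class_embs: np.ndarray) -> int:
    """Return the index of the class whose embedding is most cosine-similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    C = class_embs / np.linalg.norm(class_embs, axis=1, keepdims=True)
    return int(np.argmax(C @ q))

# Toy usage with random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
class_embs = rng.normal(size=(10, 128))             # e.g. embeddings of 10 class prompts
query_emb = class_embs[3] + 0.1 * rng.normal(size=128)
print(zero_shot_predict(query_emb, class_embs))     # expected: 3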
Min-Max Optimization with Dual-Linear Coupling
We study a class of convex-concave min-max problems in which the coupled component of the objective is linear in at least one of the two decision vectors. We identify such problem structure as interpolating between the bilinearly and nonbi…
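To make the structure concrete (illustrative notation; the article's exact formulation may differ): bilinearly coupled problems take the form $\min_x \max_y f(x) + \langle Ax, y \rangle - g(y)$, whereas dual-linear coupling only requires linearity in one argument,

$$ \min_{x \in \mathcal{X}} \; \max_{y \in \mathcal{Y}} \; f(x) + \langle \Phi(x), y \rangle - g(y), $$

which is linear in $y$ but possibly nonlinear in $x$, sitting between the bilinearly coupled and fully nonbilinear settings.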
Proving the Coding Interview: A Benchmark for Formally Verified Code Generation
We introduce the Formally Verified Automated Programming Progress Standards, or FVAPPS, a benchmark of 4715 samples for writing programs and proving their correctness, the largest formal verification benchmark, including 1083 curated and q…
Nonsilyl bicyclic secondary amine catalyzed Michael addition of nitromethane to β,β-disubstituted α,β-unsaturated aldehydes
An asymmetric Michael addition of nitromethane to β,β-disubstituted α,β-unsaturated aldehydes using a nonsilyl bicyclic secondary amine organocatalyst enables a concise asymmetric synthesis of methsuximide, an anticonvulsant drug.
The Benefits of Balance: From Information Projections to Variance Reduction
Data balancing across multiple modalities and sources appears in various forms in foundation models in machine learning and AI, e.g. in CLIP and DINO. We show that data balancing across modalities and sources actually offers an unsuspected…
Drago: Primal-Dual Coupled Variance Reduction for Faster Distributionally Robust Optimization
We consider the penalized distributionally robust optimization (DRO) problem with a closed, convex uncertainty set, a setting that encompasses learning using $f$-DRO and spectral/$L$-risk minimization. We present Drago, a stochastic primal…
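For orientation, a common way to write a penalized DRO objective in this family (illustrative notation; the article's exact objective may differ) is

$$ \min_{w} \; \max_{q \in \mathcal{Q}} \; \sum_{i=1}^{n} q_i \, \ell_i(w) \;-\; \nu \, D\!\Big(q \,\Big\|\, \tfrac{1}{n}\mathbf{1}\Big), $$

where $\mathcal{Q}$ is the closed, convex uncertainty set over sample weights, $\ell_i$ is the loss on example $i$, and $D$ is a divergence penalty with strength $\nu$; $f$-DRO and spectral/$L$-risk minimization correspond to particular choices of $\mathcal{Q}$ and $D$.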
Distributionally Robust Optimization with Bias and Variance Reduction
We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized condition value…
Front Cover: Polystyrene-Supported Aminocatalyst Derived from Diarylprolinol for Asymmetric α-Amination of Aldehydes (Eur. J. Org. Chem. 21/2023)
The Front Cover illustrates the natural environment that needs to be protected by following the 3R principles (Reduce, Reuse, and Recycle). The background of the picture shows our institute's riverfront, where a reusable polymer-suppor…
Manifold Oblique Random Forests: Towards Closing the Gap on Convolutional Deep Networks
Decision forests, in particular random forests and gradient boosting trees, have demonstrated state-of-the-art accuracy compared to other methods in many supervised learning scenarios. Forests dominate other methods in tabular data, that is…
Stochastic Optimization for Spectral Risk Measures
Spectral risk objectives, also called $L$-risks, allow learning systems to interpolate between optimizing average-case performance (as in empirical risk minimization) and worst-case performance on a task. We develop stochastic algori…
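Concretely, using the standard definition (not quoted from the article): for losses $\ell_1(w), \dots, \ell_n(w)$ and nonnegative, nondecreasing weights $\sigma_1 \le \dots \le \sigma_n$ with $\sum_i \sigma_i = 1$, the spectral risk is

$$ R_\sigma(w) \;=\; \sum_{i=1}^{n} \sigma_i \, \ell_{(i)}(w), $$

where $\ell_{(1)}(w) \le \dots \le \ell_{(n)}(w)$ are the sorted losses. Uniform weights recover the empirical-risk (average-case) objective, while placing all weight on the largest loss recovers the worst case.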
Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity
Deep Unlearning via Randomized Conditionally Independent Hessians
Omnidirectional Transfer for Quasilinear Lifelong Learning
In biological learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa…
Towards a theory of out-of-distribution learning
What is learning? 20th-century formalizations of learning theory, which precipitated revolutions in artificial intelligence, focus primarily on in-distribution learning, that is, learning under the assumption that the training d…
Towards a theory of out-of-distribution learning
Learning is a process wherein a learning agent enhances its performance through exposure to experience or data. Throughout this journey, the agent may encounter diverse learning environments. For example, data may be presented to the leane…
A partition-based similarity for classification distributions
Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners. In particular,…
A general approach to progressive learning
In biological learning, data are used to improve performance simultaneously on the current task, as well as previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula…
Simple Lifelong Learning Machines
In lifelong learning, data are used to improve performance not only on the present task, but also on past and future (unencountered) tasks. While typical transfer learning algorithms can improve performance on future tasks, their performan…
A general approach to progressive intelligence
In biological learning, data are used to improve performance not only on the current task, but also on previously encountered and as yet unencountered tasks. In contrast, classical machine learning starts from a blank slate, or tabula rasa…
Manifold Oblique Random Forests: Towards Closing the Gap on Convolutional Deep Networks
Decision forests (Forests), in particular random forests and gradient boosting trees, have demonstrated state-of-the-art accuracy compared to other methods in many supervised learning scenarios. In particular, Forests dominate other method…
Independence Testing for Multivariate Time Series
Complex data structures such as time series are increasingly present in modern data science problems. A fundamental question is whether two such time-series are statistically dependent. Many current approaches make parametric assumptions o…
Independence Testing for Temporal Data
Temporal data are increasingly prevalent in modern data science. A fundamental question is whether two time series are related or not. Existing approaches often have limitations, such as relying on parametric assumptions, detecting only li…
A Consistent Independence Test for Multivariate Time-Series
A fundamental problem in statistical data analysis is testing whether two phenomena are related. When the phenomena in question are time series, many challenges emerge. The first is defining a dependence measure between time series at the …
hyppo: A Multivariate Hypothesis Testing Python Package
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are incons…
hyppo: A Comprehensive Multivariate Hypothesis Testing Python Package
We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing. While many multivariate independence tests have R packages available, the interfaces are incons…
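A minimal usage sketch, assuming hyppo is installed (pip install hyppo) and that the independence-testing interface matches the version I have in mind; check the package documentation for the exact signatures:

import numpy as np
from hyppo.independence import Dcorr  # distance correlation independence test

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))                       # 100 samples, 3 dimensions
y = x[:, :1] + 0.5 * rng.normal(size=(100, 1))      # depends on the first coordinate of x

# Distance correlation test of independence; a small p-value suggests dependence.
stat, pvalue = Dcorr().test(x, y)
print(f"dcorr statistic = {stat:.3f}, p-value = {pvalue:.3f}")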
Estimating Information-Theoretic Quantities with Uncertainty Forests
Information-theoretic quantities, such as conditional entropy and mutual information, are critical data summaries for quantifying uncertainty. Existing estimators for these quantities either have strong theoretical guarantees or effective …
Random Forests for Adaptive Nearest Neighbor Estimation of Information-Theoretic Quantities
Information-theoretic quantities, such as conditional entropy and mutual information, are critical data summaries for quantifying uncertainty. Current widely used approaches for computing such quantities rely on nearest neighbor methods an…
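To make the target quantities concrete, here is a naive plug-in illustration of conditional entropy $H(Y \mid X)$ and mutual information $I(X; Y) = H(Y) - H(Y \mid X)$ computed from random-forest class posteriors. This is not the estimator proposed in the articles above; it only shows what is being estimated.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 0).astype(int)

X_fit, X_eval, y_fit, y_eval = train_test_split(X, y, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_fit, y_fit)

# Plug-in conditional entropy: average entropy of the predicted posterior p(y | x).
posteriors = np.clip(forest.predict_proba(X_eval), 1e-12, 1.0)
H_y_given_x = -np.mean(np.sum(posteriors * np.log(posteriors), axis=1))

# Marginal entropy of Y and the implied mutual information I(X; Y) = H(Y) - H(Y | X).
p_y = np.bincount(y_eval) / len(y_eval)
H_y = -np.sum(p_y * np.log(p_y + 1e-12))
print(f"H(Y|X) ~ {H_y_given_x:.3f} nats, I(X;Y) ~ {H_y - H_y_given_x:.3f} nats")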