Sai Praneeth Karimireddy
Uncertainty as Feature Gaps: Epistemic Uncertainty Quantification of LLMs in Contextual Question-Answering
Uncertainty Quantification (UQ) research has primarily focused on closed-book factual question answering (QA), while contextual QA remains unexplored, despite its importance in real-world applications. In this work, we focus on UQ for the …
VoxGuard: Evaluating User and Attribute Privacy in Speech via Membership Inference Attacks
Voice anonymization aims to conceal speaker identity and attributes while preserving intelligibility, but current evaluations rely almost exclusively on the Equal Error Rate (EER), which obscures whether adversaries can mount high-precision atta…
Conformal Prediction Adaptive to Unknown Subpopulation Shifts
Conformal prediction is widely used to equip black-box machine learning models with uncertainty quantification, offering formal coverage guarantees under exchangeable data. However, these guarantees fail when faced with subpopulation shift…
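The exchangeability-based coverage guarantee this abstract refers to can be illustrated with a minimal split-conformal sketch. This is generic background, not the paper's subpopulation-adaptive method; the data and function names here are hypothetical.

```python
import numpy as np

def split_conformal_interval(cal_preds, cal_labels, test_pred, alpha=0.1):
    """Split conformal prediction: calibrate a residual threshold on held-out
    data to get a prediction interval with ~(1 - alpha) marginal coverage
    under exchangeability. Illustrative sketch only."""
    n = len(cal_labels)
    scores = np.abs(cal_labels - cal_preds)            # nonconformity scores
    q_level = np.ceil((n + 1) * (1 - alpha)) / n       # finite-sample correction
    qhat = np.quantile(scores, min(q_level, 1.0), method="higher")
    return test_pred - qhat, test_pred + qhat

# toy usage with synthetic exchangeable data
rng = np.random.default_rng(0)
cal_preds = rng.normal(size=500)
cal_labels = cal_preds + rng.normal(scale=0.5, size=500)
lo, hi = split_conformal_interval(cal_preds, cal_labels, test_pred=0.0)
```

Under subpopulation shift, this marginal guarantee can break, which is the failure mode the paper addresses.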
Reconsidering LLM Uncertainty Estimation Methods in the Wild
Large Language Model (LLM) Uncertainty Estimation (UE) methods have become a crucial tool for detecting hallucinations in recent years. While numerous UE methods have been proposed, most existing studies evaluate them in isolated short-for…
LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation
In Federated Learning, it is crucial to handle low-quality, corrupted, or malicious data. However, traditional data valuation methods are not suitable due to privacy concerns. To address this, we propose a simple yet effective approach tha…
A Differentially Private Kaplan-Meier Estimator for Privacy-Preserving Survival Analysis
This paper presents a differentially private approach to Kaplan-Meier estimation that achieves accurate survival probability estimates while safeguarding individual privacy. The Kaplan-Meier estimator is widely used in survival analysis to…
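For context, the general idea of privatizing a Kaplan-Meier curve can be sketched by perturbing the per-interval counts with Laplace noise before forming the usual product-limit estimate. This is a generic illustration under that assumption, not the estimator proposed in the paper; the budget split and all names are hypothetical.

```python
import numpy as np

def dp_kaplan_meier(times, events, time_grid, epsilon, seed=1):
    """Toy DP Kaplan-Meier: add Laplace noise to per-interval death and
    at-risk counts, then take the product-limit estimate. Splitting the
    budget evenly over the count queries is a simplification."""
    rng = np.random.default_rng(seed)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n_bins = len(time_grid) - 1
    eps_per_query = epsilon / (2 * n_bins)     # two count queries per bin
    s, curve = 1.0, []
    for t0, t1 in zip(time_grid[:-1], time_grid[1:]):
        d = np.sum((times >= t0) & (times < t1) & events)   # deaths in bin
        n = np.sum(times >= t0)                             # at risk at t0
        d = d + rng.laplace(scale=1.0 / eps_per_query)      # noisy counts
        n = n + rng.laplace(scale=1.0 / eps_per_query)
        s *= np.clip(1.0 - d / max(n, 1.0), 0.0, 1.0)       # product-limit step
        curve.append(s)
    return np.array(curve)

curve = dp_kaplan_meier(times=[2, 3, 3, 5, 7, 8, 9, 11],
                        events=[1, 1, 0, 1, 1, 0, 1, 1],
                        time_grid=[0, 4, 8, 12], epsilon=5.0)
```

Clipping each factor to [0, 1] keeps the noisy curve a valid, nonincreasing survival estimate.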
Defection-Free Collaboration between Competitors in a Learning System
We study collaborative learning systems in which the participants are competitors who will defect from the system if they lose revenue by collaborating. As such, we frame the system as a duopoly of competitive firms who are each engaged in…
Collaborative Heterogeneous Causal Inference Beyond Meta-analysis
Collaboration between different data centers is often challenged by heterogeneity across sites. To account for the heterogeneity, the state-of-the-art method is to re-weight the covariate distributions in each site to match the distributio…
Privacy Can Arise Endogenously in an Economic System with Learning Agents
We study price-discrimination games between buyers and a seller where privacy arises endogenously--that is, utility maximization yields equilibrium strategies where privacy occurs naturally. In this game, buyers with a high valuation for a…
DAVED: Data Acquisition via Experimental Design for Data Markets
The acquisition of training data is crucial for machine learning applications. Data markets can increase the supply of data, particularly in data-scarce domains such as healthcare, by incentivizing potential data providers to join the mark…
My-This-Your-That - Interpretable Identification of Systematic Bias in Federated Learning for Biomedical Images
Deep learning has the potential to improve and even automate the interpretation of biomedical images, making it more accessible, particularly in low-resource settings where human experts are often lacking. The privacy concerns of these ima…
Scaff-PD: Communication Efficient Fair and Robust Federated Learning
We present Scaff-PD, a fast and communication-efficient algorithm for distributionally robust federated learning. Our approach improves fairness by optimizing a family of distributionally robust objectives tailored to heterogeneous clients…
Provably Personalized and Robust Federated Learning
Identifying clients with similar objectives and learning a model-per-cluster is an intuitive and interpretable approach to personalization in federated learning. However, doing so with provable and optimal guarantees has remained an open c…
Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning
For a federated learning model to perform well, it is crucial to have a diverse and representative dataset. However, the data contributors may only be concerned with the performance on a specific subset of the population, which may not ref…
Federated Conformal Predictors for Distributed Uncertainty Quantification
Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning since it can be easily applied as a post-processing step to already trained models. In this paper, we extend confo…
Online Learning in a Creator Economy
The creator economy has revolutionized the way individuals can profit through online platforms. In this paper, we initiate the study of online learning in the creator economy by modeling the creator economy as a three-party game between th…
FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings
Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models, without centralizing data. The cross-silo FL setting corresponds to the case of few ($2$--$50$) r…
TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions. For neural networks, even when centralized SGD easily finds a solution that is simultane…
Mechanisms that Incentivize Data Sharing in Federated Learning
Federated learning is typically considered a beneficial technology which allows multiple agents to collaborate with each other, improve the accuracy of their models, and solve problems which are otherwise too data-intensive / expensive to …
Optimization with Access to Auxiliary Information
We investigate the fundamental optimization question of minimizing a target function $f$, whose gradients are expensive to compute or have limited availability, given access to some auxiliary side function $h$ whose gradients are cheap or …
Agree to Disagree: Diversity through Disagreement for Better Transferability
Gradient-based learning algorithms have an implicit simplicity bias which in effect can limit the diversity of predictors being sampled by the learning procedure. This behavior can hinder the transferability of trained models by (i) favori…
Byzantine-Robust Decentralized Learning via ClippedGossip
In this paper, we study the challenging task of Byzantine-robust decentralized training on arbitrary communication graphs. Unlike federated learning where workers communicate through a server, workers in the decentralized environment can o…
Linear Speedup in Personalized Collaborative Learning
Collaborative training can improve the accuracy of a model for a user by trading off the model's bias (introduced by using data from other users who are potentially different) against its variance (due to the limited amount of data on any …
Towards Model Agnostic Federated Learning Using Knowledge Distillation
Is it possible to design a universal API for federated learning with which an ad-hoc group of data-holders (agents) collaborate with each other and perform federated learning? Such an API would necessarily need to be model-agnostic i.e. …
Optimal Model Averaging: Towards Personalized Collaborative Learning
In federated learning, differences in the data or objectives between the participating nodes motivate approaches to train a personalized machine learning model for each node. One such approach is weighted averaging between a locally traine…
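The weighted-averaging approach described here can be sketched as a convex combination of local and global parameters. How to choose the weight optimally is the subject of the paper; in this sketch it is simply a user-supplied constant, and all names are hypothetical.

```python
import numpy as np

def interpolate_models(local_params, global_params, alpha):
    """Personalized model as a per-parameter convex combination of a locally
    trained model and the global model: alpha * local + (1 - alpha) * global.
    alpha = 1 recovers the purely local model, alpha = 0 the global one."""
    return {name: alpha * local_params[name] + (1 - alpha) * global_params[name]
            for name in local_params}

# toy parameter dictionaries standing in for model weights
local_p = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
global_p = {"w": np.array([0.0, 0.0]), "b": np.array([0.0])}
mixed = interpolate_models(local_p, global_p, alpha=0.25)
```

Intuitively, alpha trades off the bias of borrowing from dissimilar nodes against the variance of relying only on scarce local data.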
RelaySum for Decentralized Deep Learning on Heterogeneous Data
In decentralized machine learning, workers compute model updates on their local data. Because the workers only communicate with few neighbors without central coordination, these updates propagate progressively over the network. This paradi…
Learning from History for Byzantine Robust Optimization
Byzantine robustness has received significant attention recently given its importance for distributed and federated learning. In spite of this, we identify severe flaws in existing algorithms even when the data across the participants is i…
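As background for the robust-aggregation setting this abstract studies, a standard Byzantine-robust rule is the coordinate-wise trimmed mean. The sketch below shows that generic rule only; the paper's own method additionally exploits worker history, which is not reflected here.

```python
import numpy as np

def trimmed_mean(updates, byz_frac=0.2):
    """Coordinate-wise trimmed mean: sort each coordinate across workers,
    drop the k largest and k smallest values, and average the rest.
    A classic Byzantine-robust aggregation baseline."""
    updates = np.asarray(updates)                  # shape: (n_workers, dim)
    k = int(byz_frac * len(updates))               # values trimmed per side
    sorted_u = np.sort(updates, axis=0)            # sort within each coordinate
    return sorted_u[k:len(updates) - k].mean(axis=0)

# 8 honest workers plus 2 adversarial outliers
honest = [np.array([1.0, 1.0])] * 8
byzantine = [np.array([100.0, -100.0])] * 2
agg = trimmed_mean(honest + byzantine, byz_frac=0.2)
```

With i.i.d. honest updates the outliers are trimmed away; the flaws the paper identifies concern how such stateless rules behave over many optimization rounds.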