Explanipedia

KAIROS: Scalable Model-Agnostic Data Valuation Open

Babak Salimi · 2025

Training data increasingly shapes not only model accuracy but also regulatory compliance and market valuation of AI assets. Yet existing valuation methods remain inadequate: model-based techniques depend on a single fitted model and inheri…

Towards Robust Offline Evaluation: A Causal and Information Theoretic Framework for Debiasing Ranking Systems Open

Ruomeng Xu, Babak Salimi · 2025

Evaluating retrieval-ranking systems is crucial for developing high-performing models. While online A/B testing is the gold standard, its high cost and risks to user experience require effective offline methods. However, relying on histori…

Using Causal Inference to Explore Government Policy Impact on Computer Usage Open

Ming Zhu, Lili Wang, Julien Sebot, Bijan Arbab, Babak Salimi, et al. · 2025

We explore the causal relationship between COVID-19 lockdown policies and changes in personal computer usage. In particular, we examine how lockdown policies affected average daily computer usage, as well as how it affected usage patterns …

A Lightweight Method to Disrupt Memorized Sequences in LLM Open

Parjanya Prashant, Kaustubh Ponkshe, Babak Salimi · 2025

As language models scale, their performance improves dramatically across a wide range of tasks, but so does their tendency to memorize and regurgitate parts of their training data verbatim. This tradeoff poses serious legal, ethical, and s…

Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders Open

Parjanya Prashant, Seyedeh Baharan Khatami, Bruno Ribeiro, Babak Salimi · 2024

We consider the task of out-of-distribution (OOD) generalization, where the distribution shift is due to an unobserved confounder ($Z$) affecting both the covariates ($X$) and the labels ($Y$). This confounding introduces heterogeneity in …

Learning from Uncertain Data: From Possible Worlds to Possible Models Open

Jiongli Zhu, Feng Su, Boris Glavic, Babak Salimi · 2024

We introduce an efficient method for learning linear models from uncertain data, where uncertainty is represented as a set of possible variations in the data, leading to predictive multiplicity. Our approach leverages abstract interpretati…

Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation Open

Jensen Hwa, Qingyu Zhao, Aditya Lahiri, Adnan Masood, Babak Salimi, et al. · 2024

Conditional independence (CI) constraints are critical for defining and evaluating fairness in machine learning, as well as for learning unconfounded or causal representations. Traditional methods for ensuring fairness either blindly learn…

Graph Machine Learning based Doubly Robust Estimator for Network Causal Effects Open

Seyedeh Baharan Khatami, Harsh Parikh, Haowei Chen, Sudeepa Roy, Babak Salimi · 2024

We address the challenge of inferring causal effects in social network data. This results in challenges due to interference -- where a unit's outcome is affected by neighbors' treatments -- and network-induced confounding factors. While th…

OTClean: Data Cleaning for Conditional Independence Violations using Optimal Transport Open

Alireza Pirhadi, Mohammad Hossein Moslemi, Alexander Cloninger, Mostafa Milani, Babak Salimi · 2024

Ensuring Conditional Independence (CI) constraints is pivotal for the development of fair and trustworthy machine learning models. In this paper, we introduce OTClean, a framework that harnesses optimal transport theory for data repair under …

NEXUS: On Explaining Confounding Bias Open

Brit Youngmann, Michael Cafarella, Yuval Moskovitch, Babak Salimi · 2023

When analyzing large datasets, analysts are often interested in the explanations for unexpected results produced by their queries. In this work, we focus on aggregate SQL queries that expose correlations in the data. A major challenge that…

Causal Data Integration Open

Brit Youngmann, Michael Cafarella, Babak Salimi, Anna Zeng · 2023

Causal inference is fundamental to empirical scientific discoveries in natural and social sciences; however, in the process of conducting causal inference, data management problems can lead to false discoveries. Two such problems are (i) n…

Consistent Range Approximation for Fair Predictive Modeling Open

Jiongli Zhu, Nazanin Sabri, Sainyam Galhotra, Babak Salimi · 2022

This paper proposes a novel framework for certifying the fairness of predictive models trained on biased data. It draws from query answering for incomplete and inconsistent databases to formulate the problem of consistent range approximati…

On Explaining Confounding Bias Open

Brit Youngmann, Michael Cafarella, Yuval Moskovitch, Babak Salimi · 2022

When analyzing large datasets, analysts are often interested in the explanations for surprising or unexpected results produced by their queries. In this work, we focus on aggregate SQL queries that expose correlations in the data. A major …

Combining Counterfactuals With Shapley Values To Explain Image Models Open

Aditya Lahiri, Kamran Alipour, Ehsan Adeli, Babak Salimi · 2022

With the widespread use of sophisticated machine learning models in sensitive applications, understanding their decision-making has become an essential task. Models trained on tabular data have witnessed significant progress in explanation…

Interpretable Data-Based Explanations for Fairness Debugging Open

Romila Pradhan, Jiongli Zhu, Boris Glavic, Babak Salimi · 2022

A wide variety of fairness metrics and eXplainable Artificial Intelligence (XAI) approaches have been proposed in the literature to identify bias in machine learning models that are used in critical real-life contexts. However, merely repo…

Explainable AI: Foundations, Applications, Opportunities for Data Management Research Open

Romila Pradhan, Aditya Lahiri, Sainyam Galhotra, Babak Salimi · 2022

Algorithmic decision-making systems are successfully being adopted in a wide range of domains for diverse tasks. While the potential benefits of algorithmic decision-making are many, the importance of trusting these systems has only recent…

Explaining Image Classifiers Using Contrastive Counterfactuals in Generative Latent Spaces Open

Kamran Alipour, Aditya Lahiri, Ehsan Adeli, Babak Salimi, Michael J. Pazzani · 2022

Despite their high accuracies, modern complex image classifiers cannot be trusted for sensitive tasks due to their unknown decision-making process and potential biases. Counterfactual explanations are very effective in providing transparen…

Generating Interpretable Data-Based Explanations for Fairness Debugging using Gopher Open

Jiongli Zhu, Romila Pradhan, Boris Glavic, Babak Salimi · 2022

Machine learning (ML) models, while increasingly being used to make life-altering decisions, are known to reinforce systemic bias and discrimination. Consequently, practitioners and model developers need tools to facilitate debugging for b…

HypeR: Hypothetical Reasoning With What-If and How-To Queries Using a Probabilistic Causal Approach Open

Sainyam Galhotra, Amir Gilad, Sudeepa Roy, Babak Salimi · 2022

What-if (provisioning for an update to a database) and how-to (how to modify the database to achieve a goal) analyses provide insights to users who wish to examine hypothetical scenarios without making actual changes to a database and ther…

Interpretable Data-Based Explanations for Fairness Debugging Open

Romila Pradhan, Jiongli Zhu, Boris Glavic, Babak Salimi · 2021

A wide variety of fairness metrics and eXplainable Artificial Intelligence (XAI) approaches have been proposed in the literature to identify bias in machine learning models that are used in critical real-life contexts. However, merely repo…

Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals Open

Sainyam Galhotra, Romila Pradhan, Babak Salimi · 2021

There has been a recent resurgence of interest in explainable artificial intelligence (XAI) that aims to reduce the opaqueness of AI-based decision-making systems, allowing humans to scrutinize and trust them. Prior work in this context ha…

Heterogeneous Treatment Effects in Social Networks Open

Amir Gilad, Harsh Parikh, Sudeepa Roy, Babak Salimi · 2021

We study treatment effect modifiers for causal analysis in a social network, where neighbors' characteristics or network structure may affect the outcome of a unit, and the goal is to identify sub-populations with varying treatment effects…

Detecting Treatment Effect Modifiers in Social Networks Open

Amir Gilad, Harsh Parikh, Babak Salimi, Sudeepa Roy · 2021

We study treatment effect modifiers for causal analysis in a social network, where neighbors' characteristics or network structure may affect the outcome of a unit, and the goal is to identify sub-populations with varying treatment effects…

Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals Open

Sainyam Galhotra, Romila Pradhan, Babak Salimi · 2021

There has been a recent resurgence of interest in explainable artificial intelligence (XAI) that aims to reduce the opaqueness of AI-based decision-making systems, allowing humans to scrutinize and trust them. Prior work in this context…

Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification Open

Maliha Tashfia Islam, Anna Fariha, Alexandra Meliou, Babak Salimi · 2021

Classification, a heavily-studied data-driven machine learning task, drives an increasing number of prediction systems involving critical human decisions such as loan approval and criminal risk assessment. However, classifiers often demons…

Mining Approximate Acyclic Schemes from Relations Open

Batya Kenig, Pranay Mundra, Guna Prasaad, Babak Salimi, Dan Suciu · 2020

Acyclic schemes have numerous applications in databases and in machine learning, such as improved design, more efficient storage, and increased performance for queries and machine learning algorithms. Multivalued dependencies (MVDs) are th…

Causal Relational Learning Open

Babak Salimi, Harsh Parikh, Moe Kayali, Lise Getoor, Sudeepa Roy, et al. · 2020

Causal inference is at the heart of empirical research in natural and social sciences and is critical for scientific discovery and informed decision making. The gold standard in causal inference is performing randomized controlled trials; …

Causal Relational Learning Open

Babak Salimi, Harsh Parikh, Moe Kayali, Sudeepa Roy, Lise Getoor, et al. · 2020

Causal inference is at the heart of empirical research in natural and social sciences and is critical for scientific discovery and informed decision making. The gold standard in causal inference is performing randomized controlled trials; …

Mining Approximate Acyclic Schemes from Relations Open

Batya Kenig, Pranay Mundra, Guna Prasaad, Babak Salimi, Dan Suciu · 2019

Acyclic schemes have numerous applications in databases and in machine learning, such as improved design, more efficient storage, and increased performance for queries and machine learning algorithms. Multivalued dependencies (MVDs) are th…

Data Management for Causal Algorithmic Fairness Open

Babak Salimi, Bill Howe, Dan Suciu · 2019

Fairness is increasingly recognized as a critical component of machine learning systems. However, it is the underlying data on which these systems are trained that often reflects discrimination, suggesting a data management problem. In thi…
