Ilmun Kim
Locally minimax optimal confidence sets for the best model
This paper tackles a fundamental inference problem: given $n$ observations from a distribution $P$ over $\mathbb{R}^d$ with unknown mean $\boldsymbol{\mu}$, we must form a confidence set for the index (or indices) corresponding to the smallest…
The projected covariance measure for assumption-lean variable significance testing
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test w…
Minimax Optimal Two-Sample Testing under Local Differential Privacy
We explore the trade-off between privacy and statistical utility in private two-sample testing under local differential privacy (LDP) for both multinomial and continuous data. We begin by addressing the multinomial case, where we introduce…
General Frameworks for Conditional Two-Sample Testing
We study the problem of conditional two-sample testing, which aims to determine whether two populations have the same distribution after accounting for confounding factors. This problem commonly arises in various applications, such as doma…
Robust Kernel Hypothesis Testing under Data Corruption
We propose a general method for constructing robust permutation tests under data corruption. The proposed tests effectively control the non-asymptotic type I error under data corruption, and we prove their consistency in power under minima…
Enhancing Sufficient Dimension Reduction via Hellinger Correlation
In this work, we develop a new theory and method for sufficient dimension reduction (SDR) in single-index models, where SDR is a sub-field of supervised dimension reduction based on conditional independence. Our work is primarily motivated…
Semi-Supervised U-statistics
Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming. The prevalence of such datasets has consistently driven the demand for new tools and methods that exploit the po…
Conditional independence testing for discrete distributions: Beyond $χ^2$- and $G$-tests
Differentially Private Permutation Tests: Applications to Kernel Methods
Recent years have witnessed growing concerns about the privacy of sensitive data. In response to these concerns, differential privacy has emerged as a rigorous framework for privacy protection, gaining widespread recognition in both academ…
Nearly Minimax Optimal Wasserstein Conditional Independence Testing
This paper is concerned with minimax conditional independence testing. In contrast to some previous works on the topic, which use the total variation distance to separate the null from the alternative, here we use the Wasserstein distance.…
Conditional Independence Testing for Discrete Distributions: Beyond $χ^2$- and $G$-tests
This paper is concerned with the problem of conditional independence testing for discrete data. In recent years, researchers have shed new light on this fundamental problem, emphasizing finite-sample optimality. The non-asymptotic viewpoin…
The Projected Covariance Measure for assumption-lean variable significance testing
Testing the significance of a variable or group of variables $X$ for predicting a response $Y$, given additional covariates $Z$, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test w…
Comparing multiple latent space embeddings using topological analysis
The latent space model is one of the well-known methods for statistical inference of network data. While the model has been much studied for a single network, it has not attracted much attention to analyze collectively when multiple networ…
Comments on "Testing Conditional Independence of Discrete Distributions"
In this short note, we identify and address an error in the proof of Theorem 1.3 in Canonne et al. (2018), a recent breakthrough in conditional independence testing. After correcting the error, we show that the general sample complexity re…
Efficient Aggregated Kernel Tests using Incomplete $U$-statistics
We propose a series of computationally efficient nonparametric tests for the two-sample, independence, and goodness-of-fit problems, using the Maximum Mean Discrepancy (MMD), Hilbert-Schmidt Independence Criterion (HSIC), and Kernel Stein …
Local permutation tests for conditional independence
In this paper, we investigate local permutation tests for testing conditional independence between two random vectors $X$ and $Y$ given $Z$. The local permutation test determines the significance of a test statistic by locally shuffling sa…
MMD Aggregated Two-Sample Test
We propose two novel nonparametric two-sample kernel tests based on the Maximum Mean Discrepancy (MMD). First, for a fixed kernel, we construct an MMD test using either permutations or a wild bootstrap, two popular numerical procedures to …
Comparing a large number of multivariate distributions
In this paper, we propose a test for the equality of multiple distributions based on kernel mean embeddings. Our framework provides a flexible way to handle multivariate data by virtue of kernel methods and allows the number of distributio…
Dimension-agnostic inference
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understandi…
Dimension-agnostic inference using cross U-statistics
Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understandi…
Minimax optimality of permutation tests
Permutation tests are widely used in statistics, providing a finite-sample guarantee on the type I error rate whenever the distribution of the samples under the null hypothesis is invariant to some rearrangement. Despite its increasing pop…
Statistical Theory and Methods for Comparing Distributions
With the recent advancement of data collection techniques, there has been an explosive growth in the size and complexity of data sets in many application domains. The rise of such unprecedented data has posed new challenges as well as new oppor…
Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations
Complex phenomena in engineering and the sciences are often modeled with computationally intensive feed-forward simulations for which a tractable analytic likelihood does not exist. In these cases, it is sometimes necessary to estimate an …
Comparing a Large Number of Multivariate Distributions
In this paper, we propose a test for the equality of multiple distributions based on kernel mean embeddings. Our framework provides a flexible way to handle multivariate or even high-dimensional data by virtue of kernel methods and allows …
Multinomial Goodness-of-Fit Based on U-Statistics: High-Dimensional Asymptotic and Minimax Optimality
We consider multinomial goodness-of-fit tests in the high-dimensional regime where the number of bins increases with the sample size. In this regime, Pearson's chi-squared test can suffer from low power due to the substantial bias as well …
Robust Multivariate Nonparametric Tests via Projection-Pursuit
In this work, we generalize the Cramér-von Mises statistic via projection pursuit to obtain robust tests for the multivariate two-sample problem. The proposed tests are consistent against all fixed alternatives, robust to heavy-tailed da…
Robust Multivariate Nonparametric Tests via Projection-Averaging
In this work, we generalize the Cramér-von Mises statistic via projection-averaging to obtain a robust test for the multivariate two-sample problem. The proposed test is consistent against all fixed alternatives, robust to heavy-tailed dat…
Classification accuracy as a proxy for two sample testing
When data analysts train a classifier and check if its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimen…
Kullback-Leibler Information of Consecutive Order Statistics
A calculation of the Kullback-Leibler information of consecutive order statistics is complicated because it depends on a multi-dimensional integral. Park (2014) discussed a representation of the Kullback-Leibler information of the first r …