Related papers: Kernel Discrepancy-Based Rerandomization for Contr…
Controlled experiments are widely used in many applications to investigate the causal relationship between input factors and experimental outcomes. A completely randomized design is usually used to randomly assign treatment levels to…
We present an optimized rerandomization design procedure for a non-sequential treatment-control experiment. Randomized experiments are the gold standard for finding causal effects in nature. But sometimes random assignments result in…
Generative models, like large language models, are becoming increasingly relevant in our daily lives, yet a theoretical framework to assess their generalization behavior and uncertainty does not exist. Particularly, the problem of…
Rerandomization is a modern experimental design technique that repeatedly randomizes treatment assignments until covariates are deemed balanced between treatment groups. This enhances the precision and coherence of causal effect estimators,…
Completely randomized experiments have been the gold standard for drawing causal inference because they can balance all potential confounding on average. However, they may suffer from unbalanced covariates for realized treatment…
Regression tasks, notably in safety-critical domains, require proper uncertainty quantification, yet the literature remains largely classification-focused. In this light, we introduce a family of measures for total, aleatoric, and epistemic…
We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…
Comparing conditional distributions is a fundamental challenge in statistics and machine learning, with applications across a wide range of domains. While proposed methods for measuring discrepancies using kernel embeddings of distributions…
A kernel-based procedure for correcting experimental data for distortions due to finite resolution and limited detector acceptance is presented. The unfolding problem is known to be an ill-posed problem that cannot be solved without…
Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend the subset selection problem to the setting of…
Randomized experiments are a crucial tool for causal inference in many different fields. Rerandomization addresses any covariate imbalance in such experiments by resampling treatment assignments until certain balance criteria are satisfied.…
The Maximum Mean Discrepancy (MMD) is a kernel-based metric widely used for nonparametric tests and estimation. Recently, it has also been studied as an objective function for parametric estimation, as it has been shown to yield robust…
We investigate the discrepancy principle for choosing smoothing parameters for kernel density estimation. The method is based on the distance between the empirical and estimated distribution functions. We prove some new positive and…
We develop a unified theory of designs for controlled experiments that balance baseline covariates a priori (before treatment and before randomization) using the framework of minimax variance and a new method called kernel allocation. We…
This paper studies kernel ridge regression in high dimensions under covariate shifts and analyzes the role of importance re-weighting. We first derive the asymptotic expansion of high-dimensional kernels under covariate shifts. By a…
The paper presents a new copula-based method for measuring dependence between random variables. Our approach extends the Maximum Mean Discrepancy to the copula of the joint distribution. We prove that this approach has several advantageous…
We consider the problem of causal structure learning in the setting of heterogeneous populations, i.e., populations in which a single causal structure does not adequately represent all population members, as is common in biological and…
The kernel trick concept, formulated as an inner product in a feature space, facilitates powerful extensions to many well-known algorithms. While the kernel matrix involves inner products in the feature space, the sample covariance matrix…
We propose a framework to construct practical kernel-based two-sample tests from the family of $f$-divergences. The test statistic is computed from the witness function of a regularized variational representation of the divergence, which we…
Consider a Gaussian nonparametric regression problem having both an unknown mean function and unknown variance function. This article presents a class of difference-based kernel estimators for the variance function. Optimal convergence…