Related papers: Kernel Discrepancy-Based Rerandomization for Contr…

Covariate Balancing Based on Kernel Density Estimates for Controlled Experiments

Controlled experiments are widely used in many applications to investigate the causal relationship between input factors and experimental outcomes. A completely randomized design is usually used to randomly assign treatment levels to…

Methodology · Statistics 2026-05-12 Yiou Li, Lulu Kang, Xiao Huang

Optimal Rerandomization via a Criterion that Provides Insurance Against Failed Experiments

We present an optimized rerandomization design procedure for a non-sequential treatment-control experiment. Randomized experiments are the gold standard for finding causal effects in nature. But sometimes random assignments result in…

Methodology · Statistics 2021-01-26 Adam Kapelner, Abba M. Krieger, Michael Sklar, David Azriel

A Bias-Variance-Covariance Decomposition of Kernel Scores for Generative Models

Generative models, like large language models, are becoming increasingly relevant in our daily lives, yet a theoretical framework to assess their generalization behavior and uncertainty does not exist. Particularly, the problem of…

Machine Learning · Computer Science 2024-07-11 Sebastian G. Gruber, Florian Buettner

Does Rerandomization Help Beyond Covariate Adjustment? A Review and Guide for Theory and Practice

Rerandomization is a modern experimental design technique that repeatedly randomizes treatment assignments until covariates are deemed balanced between treatment groups. This enhances the precision and coherence of causal effect estimators,…

Methodology · Statistics 2025-12-08 Antônio Carlos Herling Ribeiro Junior, Zach Branson

Rerandomization with Diminishing Covariate Imbalance and Diverging Number of Covariates

Completely randomized experiments have been the gold standard for drawing causal inference because they can balance all potential confounding on average. However, they may suffer from unbalanced covariates for realized treatment…

Statistics Theory · Mathematics 2022-10-18 Yuhao Wang, Xinran Li

Uncertainty Quantification for Regression: A Unified Framework based on kernel scores

Regression tasks, notably in safety-critical domains, require proper uncertainty quantification, yet the literature remains largely classification-focused. In this light, we introduce a family of measures for total, aleatoric, and epistemic…

Machine Learning · Computer Science 2025-10-30 Christopher Bülte, Yusuf Sale, Gitta Kutyniok, Eyke Hüllermeier

Variable Selection for Kernel Two-Sample Tests

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…

Machine Learning · Statistics 2024-12-23 Jie Wang, Santanu S. Dey, Yao Xie

Measuring Differences between Conditional Distributions using Kernel Embeddings

Comparing conditional distributions is a fundamental challenge in statistics and machine learning, with applications across a wide range of domains. While proposed methods for measuring discrepancies using kernel embeddings of distributions…

Machine Learning · Statistics 2026-05-05 Peter Moskvichev, Siu Lun Chau, Dino Sejdinovic

Kernel based unfolding of data obtained from detectors with finite resolution and limited acceptance

A kernel based procedure for correcting experimental data for distortions due to the finite resolution and limited detector acceptance is presented. The unfolding problem is known to be an ill-posed problem that cannot be solved without…

Data Analysis, Statistics and Probability · Physics 2012-09-19 N. D. Gagunashvili, M. Schmelling

Optimizing Kernel Discrepancies via Subset Selection

Kernel discrepancies are a powerful tool for analyzing worst-case errors in quasi-Monte Carlo (QMC) methods. Building on recent advances in optimizing such discrepancy measures, we extend the subset selection problem to the setting of…

Machine Learning · Statistics 2025-11-05 Deyao Chen, François Clément, Carola Doerr, Nathan Kirk

Fast Rerandomization via the BRAIN

Randomized experiments are a crucial tool for causal inference in many different fields. Rerandomization addresses any covariate imbalance in such experiments by resampling treatment assignments until certain balance criteria are satisfied.…

Methodology · Statistics 2025-05-27 Jiuyao Lu, Daogao Liu, Zhanran Lin, Xiaomeng Wang

regMMD: An R package for parametric estimation and regression with maximum mean discrepancy

The Maximum Mean Discrepancy (MMD) is a kernel-based metric widely used for nonparametric tests and estimation. Recently, it has also been studied as an objective function for parametric estimation, as it has been shown to yield robust…

Computation · Statistics 2025-04-25 Pierre Alquier, Mathieu Gerber

The Discrepancy Principle for Choosing Bandwidths in Kernel Density Estimation

We investigate the discrepancy principle for choosing smoothing parameters for kernel density estimation. The method is based on the distance between the empirical and estimated distribution functions. We prove some new positive and…

Statistics Theory · Mathematics 2015-03-19 Thoralf Mildenberger

Optimal A Priori Balance in the Design of Controlled Experiments

We develop a unified theory of designs for controlled experiments that balance baseline covariates a priori (before treatment and before randomization) using the framework of minimax variance and a new method called kernel allocation. We…

Statistics Theory · Mathematics 2017-08-02 Nathan Kallus

High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization

This paper studies kernel ridge regression in high dimensions under covariate shifts and analyzes the role of importance re-weighting. We first derive the asymptotic expansion of high dimensional kernels under covariate shifts. By a…

Machine Learning · Statistics 2024-06-06 Yihang Chen, Fanghui Liu, Taiji Suzuki, Volkan Cevher

Copula-based Kernel Dependency Measures

The paper presents a new copula based method for measuring dependence between random variables. Our approach extends the Maximum Mean Discrepancy to the copula of the joint distribution. We prove that this approach has several advantageous…

Machine Learning · Computer Science 2019-08-15 Barnabas Poczos, Zoubin Ghahramani, Jeff Schneider

A Distance Covariance-based Kernel for Nonlinear Causal Clustering in Heterogeneous Populations

We consider the problem of causal structure learning in the setting of heterogeneous populations, i.e., populations in which a single causal structure does not adequately represent all population members, as is common in biological and…

Machine Learning · Statistics 2022-02-21 Alex Markham, Richeek Das, Moritz Grosse-Wentrup

Regularization of the Kernel Matrix via Covariance Matrix Shrinkage Estimation

The kernel trick concept, formulated as an inner product in a feature space, facilitates powerful extensions to many well-known algorithms. While the kernel matrix involves inner products in the feature space, the sample covariance matrix…

Computation · Statistics 2017-07-20 Tomer Lancewicki

Regularized $f$-Divergence Kernel Tests

We propose a framework to construct practical kernel-based two-sample tests from the family of $f$-divergences. The test statistic is computed from the witness function of a regularized variational representation of the divergence, which we…

Machine Learning · Statistics 2026-01-28 Mónica Ribero, Antonin Schrab, Arthur Gretton

Variance estimation in nonparametric regression via the difference sequence method

Consider a Gaussian nonparametric regression problem having both an unknown mean function and unknown variance function. This article presents a class of difference-based kernel estimators for the variance function. Optimal convergence…

Statistics Theory · Mathematics 2009-09-29 Lawrence D. Brown, M. Levine