Related papers: Optimizing Kernel Discrepancies via Subset Selecti…

200 papers

For two decades, reproducing kernels and their associated discrepancies have facilitated elegant theoretical analyses in the setting of quasi Monte Carlo. These same tools are now receiving interest in statistics and related fields, as…

Methodology · Statistics 2023-08-24 Chris J. Oates

We consider the problem of improving the efficiency of randomized Fourier feature maps to accelerate training and testing speed of kernel methods on large datasets. These approximate feature maps arise as Monte Carlo approximations to…

Machine Learning · Statistics 2015-08-11 Haim Avron, Vikas Sindhwani, Jiyan Yang, Michael Mahoney

Approximate Markov chain Monte Carlo (MCMC) offers the promise of more rapid sampling at the cost of more biased inference. Since standard MCMC diagnostics fail to detect these biases, researchers have developed computable Stein discrepancy…

Machine Learning · Statistics 2020-10-16 Jackson Gorham, Lester Mackey

Maximum mean discrepancies (MMDs) like the kernel Stein discrepancy (KSD) have grown central to a wide range of applications, including hypothesis testing, sampler selection, distribution approximation, and variational inference. In each…

Machine Learning · Statistics 2025-03-26 Alessandro Barp, Carl-Johann Simon-Gabriel, Mark Girolami, Lester Mackey

This article provides a practical introduction to kernel discrepancies, focusing on the Maximum Mean Discrepancy (MMD), the Hilbert-Schmidt Independence Criterion (HSIC), and the Kernel Stein Discrepancy (KSD). Various estimators for these…

Machine Learning · Statistics 2025-11-03 Antonin Schrab

Much of machine learning relies on comparing distributions with discrepancy measures. Stein's method creates discrepancy measures between two distributions that require only the unnormalized density of one and samples from the other. Stein…

Machine Learning · Statistics 2020-07-21 Raghav Singhal, Xintian Han, Saad Lahlou, Rajesh Ranganath

We introduce kernel thinning, a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $\mathbf{k}_{\star}$ and $O(n^2)$ time, kernel…

Machine Learning · Statistics 2024-05-14 Raaz Dwivedi, Lester Mackey

Motivated by applications in instance selection, we introduce the star discrepancy subset selection problem, which consists of finding a subset of m out of n points that minimizes the star discrepancy. First, we show that this problem is…

Computational Geometry · Computer Science 2022-01-05 François Clément, Carola Doerr, Luís Paquete

Message-Passing Monte Carlo (MPMC) was recently introduced as a novel low-discrepancy sampling approach leveraging tools from geometric deep learning. While originally designed for generating uniform point sets, we extend this framework to…

Machine Learning · Computer Science 2025-03-28 Nathan Kirk, T. Konstantin Rusch, Jakob Zech, Daniela Rus

We consider the variable selection problem for two-sample tests, aiming to select the most informative variables to determine whether two collections of samples follow the same distribution. To address this, we propose a novel framework…

Machine Learning · Statistics 2024-12-23 Jie Wang, Santanu S. Dey, Yao Xie

This paper introduces a kernel discrepancy-based framework for rerandomization to enhance the precision of causal inference in controlled experiments. We demonstrate that the kernel discrepancy is the key part of the variance upper bound…

Methodology · Statistics 2025-11-05 Yiou Li, Lulu Kang

Modern large-scale kernel-based tests such as maximum mean discrepancy (MMD) and kernelized Stein discrepancy (KSD) optimize kernel hyperparameters on a held-out sample via data splitting to obtain the most powerful test statistics. While…

Machine Learning · Computer Science 2020-10-20 Jonas M. Kübler, Wittawat Jitkrittum, Bernhard Schölkopf, Krikamol Muandet

Several statistical approaches based on reproducing kernels have been proposed to detect abrupt changes arising in the full distribution of the observations and not only in the mean or variance. Some of these approaches enjoy good…

Statistics Theory · Mathematics 2017-10-13 Alain Celisse, Guillemette Marot, Morgane Pierre-Jean, Guillem Rigaill

The fast computation of large kernel sums is a challenging task, which arises as a subproblem in any kernel method. We approach the problem by slicing, which relies on random projections to one-dimensional subspaces and fast Fourier…

Numerical Analysis · Mathematics 2025-02-25 Johannes Hertrich, Tim Jahn, Michael Quellmalz

We propose novel kernel-based tests for assessing the equivalence between distributions. Traditional goodness-of-fit testing is inappropriate for concluding the absence of distributional differences, because failure to reject the null…

Machine Learning · Statistics 2026-03-17 Xing Liu, Axel Gandy

Geometric discrepancies are standard measures to quantify the irregularity of distributions. They are an important notion in numerical integration. One of the most important discrepancy notions is the so-called star discrepancy.…

Neural and Evolutionary Computing · Computer Science 2013-10-08 Carola Doerr, Francois-Michel De Rainville

We discuss the problem of defining an estimate for the error in quasi-Monte Carlo integration. The key issue is the definition of an ensemble of quasi-random point sets that, on the one hand, includes a sufficiency of equivalent point sets,…

Computational Physics · Physics 2008-02-03 Fred James, Jiri Hoogland, Ronald Kleiss

The $L_{\infty}$ star discrepancy is a measure for the regularity of a finite set of points taken from $[0,1)^d$. Low discrepancy point sets are highly relevant for Quasi-Monte Carlo methods in numerical integration and several other…

Neural and Evolutionary Computing · Computer Science 2023-06-30 François Clément, Diederick Vermetten, Jacob de Nobel, Alexandre D. Jesus, Luís Paquete, Carola Doerr

In many contemporary statistical and machine learning methods, one needs to optimize an objective function that depends on the discrepancy between two probability distributions. The discrepancy can be referred to as a metric for…

Machine Learning · Computer Science 2025-02-11 Yijin Ni, Xiaoming Huo

The use of heuristics to assess the convergence and compress the output of Markov chain Monte Carlo can be sub-optimal in terms of the empirical approximations that are produced. Typically a number of the initial states are attributed to…
