Related papers: Uncertainty quantification for distributed regress…
Gaussian Process Regression and Kernel Ridge Regression are popular nonparametric regression approaches. Unfortunately, they suffer from high computational complexity rendering them inapplicable to the modern massive datasets. To that end a…
In this paper, we investigate a divide and conquer approach to Kernel Ridge Regression (KRR). Given n samples, the division step involves separating the points based on some underlying disjoint partition of the input space (possibly via…
In modern scientific research, massive datasets with huge numbers of observations are frequently encountered. To facilitate the computational process, a divide-and-conquer scheme is often used for the analysis of big data. In such a…
We establish optimal convergence rates for a decomposition-based scalable approach to kernel ridge regression. The method is simple to describe: it randomly partitions a dataset of size N into m subsets of equal size, computes an…
Distributed machine learning systems have been receiving increasing attentions for their efficiency to process large scale data. Many distributed frameworks have been proposed for different machine learning tasks. In this paper, we study…
Regression tasks, notably in safety-critical domains, require proper uncertainty quantification, yet the literature remains largely classification-focused. In this light, we introduce a family of measures for total, aleatoric, and epistemic…
We provide uniform confidence bands for kernel ridge regression (KRR), a widely used nonparametric regression estimator for nonstandard data such as preferences, sequences, and graphs. Despite the prevalence of these data--e.g., student…
The divide-and-conquer method has been widely used for estimating large-scale kernel ridge regression estimates. Unfortunately, when the response variable is highly skewed, the divide-and-conquer kernel ridge regression (dacKRR) may…
A long-standing problem in the construction of asymptotically correct confidence bands for a regression function $m(x)=E[Y|X=x]$, where $Y$ is the response variable influenced by the covariate $X$, involves the situation where $Y$ values…
We propose a data-driven approach to quantify the uncertainty of models constructed by kernel methods. Our approach minimizes the needed distributional assumptions, hence, instead of working with, for example, Gaussian processes or…
Imputation is a popular technique for handling missing data. We consider a nonparametric approach to imputation using the kernel ridge regression technique and propose consistent variance estimation. The proposed variance estimator is based…
In this article, we introduce a kernel-based consensual aggregation method for regression problems. We aim to flexibly combine individual regression estimators $r_1, r_2, \ldots, r_M$ using a weighted average where the weights are defined…
Depth measures have gained popularity in the statistical literature for defining level sets in complex data structures like multivariate data, functional data, and graphs. Despite their versatility, integrating depth measures into…
In this paper, we propose a random projection approach to estimate variance in kernel ridge regression. Our approach leads to a consistent estimator of the true variance, while being computationally more efficient. Our variance estimator is…
The divide and conquer strategy, which breaks a massive data set into a se- ries of manageable data blocks, and then combines the independent results of data blocks to obtain a final decision, has been recognized as a state-of-the-art…
I propose kernel ridge regression estimators for nonparametric dose response curves and semiparametric treatment effects in the setting where an analyst has access to a selected sample rather than a random sample; only for select…
The problem of learning functions over spaces of probabilities - or distribution regression - is gaining significant interest in the machine learning community. A key challenge behind this problem is to identify a suitable representation…
We propose statistical inferential procedures for panel data models with interactive fixed effects in a kernel ridge regression framework.Compared with traditional sieve methods, our method is automatic in the sense that it does not require…
General predictive models do not provide a measure of confidence in predictions without Bayesian assumptions. A way to circumvent potential restrictions is to use conformal methods for constructing non-parametric confidence regions, that…
This study presents an efficient incremental/decremental approach for big streams based on Kernel Ridge Regression (KRR), a frequently used data analysis in cloud centers. To avoid reanalyzing the whole dataset whenever sensors receive new…