Related papers: Optimal Rates for Vector-Valued Spectral Regulariz…
The saturation effects, which originally refer to the fact that kernel ridge regression (KRR) fails to achieve the information-theoretical lower bound when the regression function is over-smooth, have been observed for almost 20 years and…
This paper provides a comprehensive error analysis of learning with vector-valued random features (RF). The theory is developed for RF ridge regression in a fully general infinite-dimensional input-output setting, but nonetheless applies to…
Kernel ridge regression (KRR), also known as the least-squares support vector machine, is a fundamental method for learning functions from finite samples. While most existing analyses focus on the noisy setting with constant-level label…
Vector-valued learning, where the output space admits a vector-valued structure, is an important problem that covers a broad family of important domains, e.g. multi-task learning and transfer learning. Using local Rademacher complexity and…
Regularization is an essential element of virtually all kernel methods for nonparametric regression problems. A critical factor in the effectiveness of a given kernel method is the type of regularization that is employed. This article…
Many fundamental machine learning tasks can be formulated as a problem of learning with vector-valued functions, where we learn multiple scalar-valued functions together. Although there is some generalization analysis on different specific…
We present the first optimal rates for infinite-dimensional vector-valued ridge regression on a continuous scale of norms that interpolate between $L_2$ and the hypothesis space, which we consider as a vector-valued reproducing kernel…
Motivated by the studies of neural networks (e.g.,the neural tangent kernel theory), we perform a study on the large-dimensional behavior of kernel ridge regression (KRR) where the sample size $n \asymp d^{\gamma}$ for some $\gamma > 0$.…
Existing statistical learning guarantees for general kernel regressors often yield loose bounds when used with finite-rank kernels. Yet, finite-rank kernels naturally appear in several machine learning problems, e.g.\ when fine-tuning a…
Kernel ridge regression (KRR) is a well-known and popular nonparametric regression approach with many desirable properties, including minimax rate-optimality in estimating functions that belong to common reproducing kernel Hilbert spaces…
Empirical Risk Minimization (ERM) algorithms are widely used in a variety of estimation and prediction tasks in signal-processing and machine learning applications. Despite their popularity, a theory that explains their statistical…
In this paper, we study regression problems over a separable Hilbert space with the square loss, covering non-parametric regression over a reproducing kernel Hilbert space. We investigate a class of spectral/regularized algorithms,…
While matrix variate regression models have been studied in many existing works, classical statistical and computational methods for the analysis of the regression coefficient estimation are highly affected by high dimensional and noisy…
Existing theory suggests that for linear regression problems categorized by capacity and source conditions, gradient descent (GD) is always minimax optimal, while both ridge regression and online stochastic gradient descent (SGD) are…
The saturation effect refers to the phenomenon that the kernel ridge regression (KRR) fails to achieve the information theoretical lower bound when the smoothness of the underground truth function exceeds certain level. The saturation…
We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks, with emphasis on how different regularization terms affect the model performance. We first derive the convergence rate…
We study realizable continual linear regression under random task orderings, a common setting for developing continual learning theory. In this setup, the worst-case expected loss after $k$ learning iterations admits a lower bound of…
We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces. The algorithm is a randomized adaptation of reduced rank regression, a technique…
We consider a statistical inverse learning problem, where we observe the image of a function $f$ through a linear operator $A$ at i.i.d. random design points $X_i$, superposed with an additive noise. The distribution of the design points is…
This paper presents an algorithm, Voted Kernel Regularization , that provides the flexibility of using potentially very complex kernel functions such as predictors based on much higher-degree polynomial kernels, while benefitting from…