Related papers: Aggregation for Gaussian regression
This paper studies statistical aggregation procedures in regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of…
Let $\cF$ be a set of $M$ classification procedures with values in $[-1,1]$. Given a loss function, we want to construct a procedure which mimics at the best possible rate the best procedure in $\cF$. This fastest rate is called optimal…
In the regression problem, we consider the problem of estimating the variance function by the means of aggregation methods. We focus on two particular aggregation setting: Model Selection aggregation (MS) and Convex aggregation (C) where…
$L_1$ regularized logistic regression has now become a workhorse of data mining and bioinformatics: it is widely used for many classification problems, particularly ones with many features. However, $L_1$ regularization typically selects…
It is generally believed that ensemble approaches, which combine multiple algorithms or models, can outperform any single algorithm at machine learning tasks, such as prediction. In this paper, we propose Bayesian convex and linear…
In the same spirit as Tsybakov (2003), we define the optimality of an aggregation procedure in the problem of classification. Using an aggregate with exponential weights, we obtain an optimal rate of convex aggregation for the hinge risk…
In order to scale standard Gaussian process (GP) regression to large-scale datasets, aggregation models employ factorized training process and then combine predictions from distributed experts. The state-of-the-art aggregation models,…
We study the problem of model selection type aggregation with respect to the Kullback-Leibler divergence for various probabilistic models. Rather than considering a convex combination of the initial estimators $f_1, \ldots, f_N$, our…
Gradient sampling (GS) has proved to be an effective methodology for the minimization of objective functions that may be nonconvex and/or nonsmooth. The most computationally expensive component of a contemporary GS method is the need to…
In this paper, we present a study of a kernel-based consensual aggregation on randomly projected high-dimensional features of predictions for regression. The aggregation scheme is composed of two steps: the high-dimensional features of…
Given a dictionary of $M_n$ initial estimates of the unknown true regression function, we aim to construct linearly aggregated estimators that target the best performance among all the linear combinations under a sparse $q$-norm ($0 \leq q…
We study the problem of aggregation of estimators when the estimators are not independent of the data used for aggregation and no sample splitting is allowed. If the estimators are deterministic vectors, it is well known that the minimax…
Let $F$ be a finite model of cardinality $M$ and denote by $\operatorname {conv}(F)$ its convex hull. The problem of convex aggregation is to construct a procedure having a risk as close as possible to the minimal risk over $\operatorname…
We introduce a very general method for sparse and large-scale variable selection. The large-scale regression settings is such that both the number of parameters and the number of samples are extremely large. The proposed method is based on…
Gaussian process (GP) models have received increasing attention in recent years due to their superb prediction accuracy and modeling flexibility. To address the computational burdens of GP models for large-scale datasets, distributed…
We consider a recursive algorithm to construct an aggregated estimator from a finite number of base decision rules in the classification problem. The estimator approximately minimizes a convex risk functional under the l1-constraint. It is…
Many decision problems cannot be solved exactly and use several estimation algorithms that assign scores to the different available options. The estimation errors can have various correlations, from low (e.g. between two very different…
Convex regression is a promising area for bridging statistical estimation and deterministic convex optimization. New piecewise linear convex regression methods are fast and scalable, but can have instability when used to approximate…
Rating aggregation plays a crucial role in various fields, such as product recommendations, hotel rankings, and teaching evaluations. However, traditional averaging methods can be affected by participation bias, where some raters do not…
Gaussian mixture models are widely used to study clustering problems. These model-based clustering methods require an accurate estimation of the unknown data density by Gaussian mixtures. In Maugis and Michel (2009), a penalized maximum…