Related papers: New developments in Sparse PLS regression
We develop a new robust stopping criterion for component construction in Partial Least Squares Regression (PLSR), characterised by a high level of stability. This new criterion is defined as a universal one since it is suitable both for PLSR…
Latent structure methods, specifically linear continuous latent structure methods, are a type of fundamental statistical learning strategy. They are widely used for dimension reduction, regression and prediction, in the fields of…
Motivation: The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection,…
Partial least squares (PLS) regression combines dimensionality reduction and prediction using a latent variable model. Since partial least squares regression (PLS-R) does not require matrix inversion or diagonalization, it can be applied to…
Relating a set of variables X to a response y is crucial in chemometrics. A quantitative prediction objective can be enriched by qualitative data interpretation, for instance by locating the most influential features. When high-dimensional…
This paper investigates some theoretical properties of the Partial Least Squares (PLS) method. We focus our attention on the single-component case, which provides a useful framework to understand the underlying mechanism. We provide a…
Partial least squares, as a dimension reduction method, has become increasingly important for its ability to deal with problems with a large number of variables. Since noisy variables may weaken the performance of the model, the sparse…
Partial Least Squares (PLS) is a dimension reduction method used to remove multicollinearities in a regression model. However, contrary to Principal Components Analysis (PCA), the PLS components are also chosen to be optimal for predicting…
This paper presents a new variable selection approach integrated with Gaussian process (GP) regression. We consider a sparse projection of input variables and a general stationary covariance model that depends on the Euclidean distance…
Gaussian processes (GPs) have gained popularity as flexible machine learning models for regression and function approximation with an in-built method for uncertainty quantification. However, GPs suffer when the amount of training data is…
We introduce and study the Group Square-Root Lasso (GSRL) method for estimation in high dimensional sparse regression models with group structure. The new estimator minimizes the square root of the residual sum of squares plus a penalty…
Partial Least Squares (PLS) methods have been heavily exploited to analyse the association between two blocks of data. These powerful approaches can be applied to data sets where the number of variables is greater than the number of…
With massive high-dimensional data now commonplace in research and industry, there is a strong and growing demand for more scalable computational techniques for data analysis and knowledge discovery. Key to turning these data into knowledge…
The Bayesian Lasso is constructed in the linear regression framework and applies Gibbs sampling to estimate the regression parameters. This paper develops a new sparse learning model, named the Bayesian Lasso Sparse (BLS) model, that…
High-dimensional compositional data are commonplace in the modern omics sciences amongst others. Analysis of compositional data requires a proper choice of orthonormal coordinate representation as their relative nature is not compatible…
The discovery of Partial Differential Equations (PDEs) is an essential task for applied science and engineering. However, data-driven discovery of PDEs is generally challenging, primarily stemming from the sensitivity of the discovered…
The bootstrap is commonly used as a tool for non-parametric statistical inference to estimate meaningful parameters in variable selection models. However, for massive datasets that grow at an exponential rate, the computation of Bootstrap…
The generalized linear model (GLM) plays a key role in regression analyses. For high-dimensional data, the sparse GLM has been used, but it is not robust against outliers. Recently, robust methods have been proposed for the specific…
This article investigates uncertainty quantification of the generalized linear lasso (GLL), a popular variable selection method in high-dimensional regression settings. In many fields of study, researchers use data-driven methods to select…
Existing partial sequence labeling models mainly focus on the max-margin framework, which fails to provide an uncertainty estimation of the prediction. Further, the unique ground truth disambiguation strategy employed by these models may include…