Related papers: Structural randomised selection
High resolution microarrays and second-generation sequencing platforms are powerful tools to investigate genome-wide alterations in DNA copy number, methylation and gene expression associated with a disease. An integrated genomic profiling…
The motivation of this work is to improve the performance of standard stacking approaches or ensembles, which are composed of simple, heterogeneous base models, through the integration of the generation and selection stages for regression…
The two primary approaches for high-dimensional regression problems are sparse methods (e.g., best subset selection, which uses the L0-norm in the penalty) and ensemble methods (e.g., random forests). Although sparse methods typically yield…
Motivation: The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection,…
We aim to develop a time series modeling methodology tailored to high-dimensional environments, addressing two critical challenges: variable selection from a large pool of candidates, and the detection of structural break points, where the…
Automated model selection is an important application in science and engineering. In this work, we develop a learning approach for identifying structured dynamical systems from undersampled and noisy spatiotemporal data. The learning is…
Predicting clinical variables from whole-brain neuroimages is a high dimensional problem that requires some type of feature selection or extraction. Penalized regression is a popular embedded feature selection method for high dimensional…
We study the performance of sparse regression methods and propose new techniques to distill the governing equations of dynamical systems from data. We first look at the generic methodology of learning interpretable equation forms from data,…
Sparse sampling schemes have the potential to dramatically reduce image acquisition time while simultaneously reducing radiation damage to samples. However, for a sparse sampling scheme to be useful it is important that we are able to…
A compositional tree refers to a tree structure on a set of random variables where each random variable is a node and composition occurs at each non-leaf node of the tree. As a generalization of compositional data, compositional trees…
We propose a method for variable selection and basis learning for high-dimensional classification with ordinal responses. The proposed method extends sparse multiclass linear discriminant analysis, with the aim of identifying not only the…
Large-scale {\it in vitro} drug sensitivity screens are an important tool in personalized oncology to predict the effectiveness of potential cancer drugs. The prediction of the sensitivity of cancer cell lines to a panel of drugs is a…
We consider the problem of sparse variable selection on high dimension heterogeneous data sets, which has been taking on renewed interest recently due to the growth of biological and medical data sets with complex, non-i.i.d. structures and…
Causal structure learning, also known as causal discovery, aims to estimate causal relationships between variables as a form of a causal directed acyclic graph (DAG) from observational data. One of the major frameworks is the order-based…
Sparse regression has emerged as a popular technique for learning dynamical systems from temporal data, beginning with the SINDy (Sparse Identification of Nonlinear Dynamics) framework proposed by arXiv:1509.03580. Quantifying the…
Covariance regression offers an effective way to model the large covariance matrix with the auxiliary similarity matrices. In this work, we propose a sparse covariance regression (SCR) approach to handle the potentially high-dimensional…
We study a generalized framework for structured sparsity. It extends the well-known methods of Lasso and Group Lasso by incorporating additional constraints on the variables as part of a convex optimization problem. This framework provides…
We consider the application of a popular penalised regression method, Ridge Regression, to data with very high dimensions and many more covariates than observations. Our motivation is the problem of out-of-sample prediction and the setting…
Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…
We propose a new method for supervised learning, especially suited to wide data where the number of features is much greater than the number of observations. The method combines the lasso ($\ell_1$) sparsity penalty with a quadratic penalty…