Related papers: Distributed Sparse Multicategory Discriminant Anal…
In recent years many sparse linear discriminant analysis methods have been proposed for high-dimensional classification and variable selection. However, most of these proposals focus on binary classification and they are not directly…
Sliced inverse regression is a popular tool for sufficient dimension reduction, which replaces covariates with a minimal set of their linear combinations without loss of information on the conditional distribution of the response given the…
Sparse linear discriminant analysis via penalized optimal scoring is a successful tool for classification in high-dimensional settings. While the variable selection consistency of sparse optimal scoring has been established, the…
This article proposes diffusion LMS strategies for distributed estimation over adaptive networks that are able to exploit sparsity in the underlying system model. The approach relies on convex regularization, common in compressive sensing,…
In many social, economical, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is…
In multiple domains, statistical tasks are performed in distributed settings, with data split among several end machines that are connected to a fusion center. In various applications, the end machines have limited bandwidth and power, and…
We consider the problem of multivariate regression in a setting where the relevant predictors could be shared among different responses. We propose an algorithm which decomposes the coefficient matrix into the product of a long matrix and a…
Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…
Recent studies in the literature have paid much attention to the sparsity in linear classification tasks. One motivation of imposing sparsity assumption on the linear discriminant direction is to rule out the noninformative features, making…
The performance of machine learning and pattern recognition algorithms generally depends on data representation. That is why, much of the current effort in performing machine learning algorithms goes into the design of preprocessing…
With the availability of extraordinarily huge data sets, solving the problems of distributed statistical methodology and computing for such data sets has become increasingly crucial in the big data area. In this paper, we focus on the…
We consider the scenario where one observes an outcome variable and sets of features from multiple assays, all measured on the same set of samples. One approach that has been proposed for dealing with this type of data is ``sparse multiple…
Clustering high-dimensional data often requires some form of dimensionality reduction, where clustered variables are separated from "noise-looking" variables. We cast this problem as finding a low-dimensional projection of the data which is…
In distributed optimization for large-scale learning, a major performance limitation comes from the communications between the different entities. When computations are performed by workers on local data while a coordinator machine…
We consider a non-convex constrained Lagrangian formulation of a fundamental bi-criteria optimization problem for variable selection in statistical learning; the two criteria are a smooth (possibly) nonconvex loss function, measuring the…
As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer…
Sparse representations using overcomplete dictionaries have proved to be a powerful tool in many signal processing applications such as denoising, super-resolution, inpainting, compression or classification. The sparsity of the…
Sparse coding has been popularly used as an effective data representation method in various applications, such as computer vision, medical imaging and bioinformatics, etc. However, the conventional sparse coding algorithms and its manifold…
We devise a one-shot approach to distributed sparse regression in the high-dimensional setting. The key idea is to average "debiased" or "desparsified" lasso estimators. We show the approach converges at the same rate as the lasso as long…
We propose a communication-efficient distributed estimation method for sparse linear discriminant analysis (LDA) in the high dimensional regime. Our method distributes the data of size $N$ into $m$ machines, and estimates a local sparse LDA…