Related papers: Feature Selection in High-dimensional Spaces Using…
When processing high-dimensional datasets, a common pre-processing step is feature selection. Filter-based feature selection algorithms are not tailored to a specific classification method, but rather rank the relevance of each feature with…
Feature selection is playing an increasingly significant role with respect to many computer vision applications spanning from object recognition to visual object tracking. However, most of the recent solutions in feature selection are not…
Feature selection is a critical step in the analysis of high-dimensional data, where the number of features often vastly exceeds the number of samples. Effective feature selection not only improves model performance and interpretability but…
In this paper, we focus on the unsupervised multi-view feature selection which tries to handle high dimensional data in the field of multi-view learning. Although some graph-based methods have achieved satisfactory performance, they ignore…
High-dimensional k-sample comparison is a common applied problem. We construct a class of easy-to-implement nonparametric distribution-free tests based on new tools and unexplored connections with spectral graph theory. The test is shown to…
Unsupervised feature selection is an important method to reduce dimensions of high dimensional data without labels, which is benefit to avoid ``curse of dimensionality'' and improve the performance of subsequent machine learning tasks, like…
The applications of traditional statistical feature selection methods to high-dimension, low sample-size data often struggle and encounter challenging problems, such as overfitting, curse of dimensionality, computational infeasibility, and…
Feature selection is a dimensionality reduction technique that selects a subset of representative features from high dimensional data by eliminating irrelevant and redundant features. Recently, feature selection combined with sparse…
In this paper, we introduce a novel unsupervised, graph-based filter feature selection technique which exploits the power of topologically constrained network representations. We model dependency structures among features using a family of…
The problem of learning the structure of a high dimensional graphical model from data has received considerable attention in recent years. In many applications such as sensor networks and proteomics it is often expensive to obtain samples…
Feature selection is a powerful dimension reduction technique which selects a subset of relevant features for model construction. Numerous feature selection methods have been proposed, but most of them fail under the high-dimensional and…
We consider the use of Bayesian information criteria for selection of the graph underlying an Ising model. In an Ising model, the full conditional distributions of each variable form logistic regression models, and variable selection…
Undirected graphical models are widely used to model the conditional independence structure of vector-valued data. However, in many modern applications, for example those involving EEG and fMRI data, observations are more appropriately…
We consider the problem of high-dimensional non-linear variable selection for supervised learning. Our approach is based on performing linear selection among exponentially many appropriately defined positive definite kernels that…
This paper introduces and demonstrates a computational pipeline for the statistical analysis of shape graph datasets, namely geometric networks embedded in 2D or 3D spaces. Unlike traditional abstract graphs, our purpose is not only to…
We develop some graph-based tests for spherical symmetry of a multivariate distribution using a method based on data augmentation. These tests are constructed using a new notion of signs and ranks that are computed along a path obtained by…
Feature selection is an important task in many problems occurring in pattern recognition, bioinformatics, machine learning and data mining applications. The feature selection approach enables us to reduce the computation burden and the…
We consider the problem of estimating the number of clusters (k) in a dataset. We propose a non-parametric approach to the problem that utilizes similarity graphs to construct a robust statistic that effectively captures similarity…
Numerous social, medical, engineering and biological challenges can be framed as graph-based learning tasks. Here, we propose a new feature based approach to network classification. We show how dynamics on a network can be useful to reveal…
We present a nonparametric method for selecting informative features in high-dimensional clustering problems. We start with a screening step that uses a test for multimodality. Then we apply kernel density estimation and mode clustering to…