Related papers: Macrostate Data Clustering
Spectral clustering uses the global information embedded in eigenvectors of an inter-item similarity matrix to correctly identify clusters of irregular shape, an ability lacking in commonly used approaches such as k-means and agglomerative…
Clustering is a central tool in biomedical research for discovering heterogeneous patient subpopulations, where group boundaries are often diffuse rather than sharply separated. Traditional methods produce hard partitions, whereas soft…
In a data matrix, we may distinguish between cases, each represented by a row vector for a statistical unit, and cells, which correspond to single entries of the data matrix. Recent developments in Robust Statistics have introduced the…
In several environmental applications data are functions of time, essentially continuous, observed and recorded discretely, and spatially correlated. Most of the methods for analyzing such data are extensions of spatial statistical tools…
Identification of clusters in an unlabeled data set is one of the most important problems in Unsupervised Machine Learning. State-of-the-art clustering algorithms are based on either the statistical properties or the geometric…
We propose a novel method for building fuzzy clusters of large data sets, using a smoothing numerical approach. The usual sum-of-squares criterion is relaxed so the search for good fuzzy partitions is made on a continuous space, rather than…
Statistical jump models have been recently introduced to detect persistent regimes by clustering temporal features and discouraging frequent regime changes. However, they are limited to hard clustering and thereby do not account for…
Clustering is a widely used unsupervised learning method for finding structure in the data. However, the resulting clusters are typically presented without any guarantees on their robustness; slightly changing the used data sample or…
In this paper a fuzzy clustering model for fuzzy data with outliers is proposed. The model is based on Wasserstein distance between interval valued data which is generalized to fuzzy data. In addition, Keller's approach is used to identify…
Multi-view data clustering refers to categorizing a data set by making good use of related information from multiple representations of the data. It becomes important nowadays because more and more data can be collected in a variety of…
Conventional clustering algorithms have difficulty handling the challenges posed by collections of natural data, which are often vague and uncertain. Fuzzy clustering methods have the potential to manage such situations…
We consider the problem of clustering functional data according to their covariance structure. We contribute a soft clustering methodology based on the Wasserstein-Procrustes distance, where the in-between cluster variability is penalised…
Fuzzy clustering is a well-known unsupervised learning method used to collect similar data elements within a cluster according to some similarity measure. However, clustering algorithms suffer from some drawbacks. Among the main weakness…
This paper focuses on the efficient clustering of arbitrary color data. To tackle this problem, we model the inherent uncertainty and vagueness of color data using fuzzy color…
The input of most clustering algorithms is a symmetric matrix quantifying similarity within data pairs. Such a matrix is here turned into a quadratic set function measuring cluster score or similarity within data subsets larger than pairs.…
Inference in clustering is paramount to uncovering inherent group structure in data. Clustering methods which assess statistical significance have recently drawn attention owing to their importance for the identification of patterns in high…
A fuzzy clustering algorithm for multidimensional data is proposed in this article. The data is described by vectors whose components are linguistic variables defined in an ordinal scale. The obtained results confirm the efficiency of the…
Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying…
Clustering methods have been used in a variety of fields such as image processing, data mining, pattern recognition, and statistical analysis. Generally, clustering algorithms consider all variables equally relevant or not…
In recent years, the problem of fuzzy clustering has attracted wide attention. In existing methods, the membership iteration is mostly performed globally, which causes considerable problems in noisy environments, and iterative calculations for…