Related papers: High Dimensional Cluster Analysis Using Path Lengt…
The description of complex configuration is a difficult issue. We present a powerful technique for cluster identification and characterization. The scheme is designed to treat with and analyze the experimental and/or simulation data from…
Datasets in high-dimension do not typically form clusters in their original space; the issue is worse when the number of points in the dataset is small. We propose a low-computation method to find statistically significant clustering…
Clustering in high-dimensional spaces is a difficult problem which is recurrent in many domains, for example in image analysis. The difficulty is due to the fact that high-dimensional data usually live in different low-dimensional subspaces…
The problem of dimension reduction is of increasing importance in modern data analysis. In this paper, we consider modeling the collection of points in a high dimensional space as a union of low dimensional subspaces. In particular we…
This paper introduces a new clustering technique, called {\em dimensional clustering}, which clusters each data point by its latent {\em pointwise dimension}, which is a measure of the dimensionality of the data set local to that point.…
Multi-view clustering has been widely used in recent years in comparison to single-view clustering, for clear reasons, as it offers more insights into the data, which has brought with it some challenges, such as how to combine these views…
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…
As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…
With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…
In cluster analysis, a common first step is to scale the data aiming to better partition them into clusters. Even though many different techniques have throughout many years been introduced to this end, it is probably fair to say that the…
Understanding the global organization of complicated and high dimensional data is of primary interest for many branches of applied sciences. It is typically achieved by applying dimensionality reduction techniques mapping the considered…
Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The…
Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. "Structure" can be understood as symmetry and a range of symmetries are expressed by hierarchy. Such symmetries directly…
Clustering methods are a valuable tool for the identification of patterns in high dimensional data with applications in many scientific problems. However, quantifying uncertainty in clustering is a challenging problem, particularly when…
This paper focuses on density-based clustering, particularly the Density Peak (DP) algorithm and the one based on density-connectivity DBSCAN; and proposes a new method which takes advantage of the individual strengths of these two methods…
High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called…
In high-dimension, low-sample size (HDLSS) data, it is not always true that closeness of two objects reflects a hidden cluster structure. We point out the important fact that it is not the closeness, but the "values" of distance that…
Clustering of high-dimensional data sets is a growing need in artificial intelligence, machine learning and pattern recognition. In this paper, we propose a new clustering method based on a combinatorial-topological approach applied to…
In this work, we introduce a novel methodology for divisive hierarchical clustering. Our divisive (``top-down'') approach is motivated by the fact that agglomerative hierarchical clustering (``bottom-up''), which is commonly used for…
Big Data processing systems handle huge unstructured and structured data to store, process, and analyze through cluster analysis which helps in identifying unseen patterns to find the relationships between them. Clustering analysis over the…