Related papers: Shape complexity in cluster analysis
A hierarchical scheme for clustering data is presented which applies to spaces with a high number of dimension ($N_{_{D}}>3$). The data set is first reduced to a smaller set of partitions (multi-dimensional bins). Multiple clustering…
With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…
Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…
Data clustering is the process of identifying natural groupings or clusters within multidimensional data based on some similarity measure. Clustering is a fundamental process in many different disciplines. Hence, researchers from different…
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…
Clustering is a common technique for statistical data analysis, Clustering is the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very…
In machine learning and data mining, Cluster analysis is one of the most widely used unsupervised learning technique. Philosophy of this algorithm is to find similar data items and group them together based on any distance function in…
In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial…
Although distance measures are used in many machine learning algorithms, the literature on the context-independent selection and evaluation of distance measures is limited in the sense that prior knowledge is used. In cluster analysis,…
Clustering is one of the main tasks in exploratory data analysis and descriptive statistics where the main objective is partitioning observations in groups. Clustering has a broad range of application in varied domains like climate,…
Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…
Data clustering is an approach to seek for structure in sets of complex data, i.e., sets of "objects". The main objective is to identify groups of objects which are similar to each other, e.g., for classification. Here, an introduction to…
We propose a new approach for scaling prior to cluster analysis based on the concept of pooled variance. Unlike available scaling procedures such as the standard deviation and the range, our proposed scale avoids dampening the beneficial…
This paper presents a new statistical method for clustering step data, a popular form of health record data easily obtained from wearable devices. Since step data are high-dimensional and zero-inflated, classical methods such as K-means and…
Identifying the number $K$ of clusters in a dataset is one of the most difficult problems in clustering analysis. A choice of $K$ that correctly characterizes the features of the data is essential for building meaningful clusters. In this…
The description of complex configuration is a difficult issue. We present a powerful technique for cluster identification and characterization. The scheme is designed to treat with and analyze the experimental and/or simulation data from…
Understanding the global organization of complicated and high dimensional data is of primary interest for many branches of applied sciences. It is typically achieved by applying dimensionality reduction techniques mapping the considered…
Clustering algorithms remain valuable tools for grouping and summarizing the most important aspects of data. Example areas where this is the case include image segmentation, dimension reduction, signals analysis, model order reduction,…
We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…
Cluster analysis methods seek to partition a data set into homogeneous subgroups. It is useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…