Related papers: Shape complexity in cluster analysis

High Dimensional Cluster Analysis Using Path Lengths

A hierarchical scheme for clustering data is presented which applies to spaces with a high number of dimensions ($N_D > 3$). The data set is first reduced to a smaller set of partitions (multi-dimensional bins). Multiple clustering…

Data Analysis, Statistics and Probability · Physics 2017-10-16 Kevin McIlhany, Stephen Wiggins

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

Introduction to Clustering Algorithms and Applications

Data clustering is the process of identifying natural groupings or clusters within multidimensional data based on some similarity measure. Clustering is a fundamental process in many different disciplines. Hence, researchers from different…

Machine Learning · Computer Science 2014-08-26 Sibei Yang, Liangde Tao, Bingchen Gong

Clustering Plotted Data by Image Segmentation

Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…

Machine Learning · Computer Science 2021-10-12 Tarek Naous, Srinjay Sarkar, Abubakar Abid, James Zou

Graph partitioning advance clustering technique

Clustering is a common technique for statistical data analysis. Clustering is the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very…

Machine Learning · Computer Science 2012-03-12 T Soni Madhulatha

Unique Metric for Health Analysis with Optimization of Clustering Activity and Cross Comparison of Results from Different Approach

In machine learning and data mining, cluster analysis is one of the most widely used unsupervised learning techniques. The philosophy of this algorithm is to find similar data items and group them together based on any distance function in…

Machine Learning · Statistics 2018-10-09 Kumarjit Pathak, Jitin Kapila

Information based clustering

In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial…

Quantitative Methods · Quantitative Biology 2009-11-11 Noam Slonim, Gurinder Singh Atwal, Gasper Tkacik, William Bialek

The Exploitation of Distance Distributions for Clustering

Although distance measures are used in many machine learning algorithms, the literature on the context-independent selection and evaluation of distance measures is limited in the sense that prior knowledge is used. In cluster analysis,…

Machine Learning · Computer Science 2021-08-24 Michael C. Thrun

Quartile Clustering: A quartile based technique for Generating Meaningful Clusters

Clustering is one of the main tasks in exploratory data analysis and descriptive statistics, where the main objective is partitioning observations into groups. Clustering has a broad range of applications in varied domains like climate,…

Databases · Computer Science 2012-03-20 Saptarsi Goswami, Amlan Chakrabarti

Clustering of Big Data with Mixed Features

Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…

Machine Learning · Statistics 2020-11-13 Joshua Tobin, Mimi Zhang

Practical Introduction to Clustering Data

Data clustering is an approach to seeking structure in sets of complex data, i.e., sets of "objects". The main objective is to identify groups of objects which are similar to each other, e.g., for classification. Here, an introduction to…

Data Analysis, Statistics and Probability · Physics 2016-02-17 Alexander K. Hartmann

Pooled variable scaling for cluster analysis

We propose a new approach for scaling prior to cluster analysis based on the concept of pooled variance. Unlike available scaling procedures such as the standard deviation and the range, our proposed scale avoids dampening the beneficial…

Methodology · Statistics 2020-07-28 Jakob Raymaekers, Ruben H. Zamar

Multi-feature Clustering of Step Data using Multivariate Functional Principal Component Analysis

This paper presents a new statistical method for clustering step data, a popular form of health record data easily obtained from wearable devices. Since step data are high-dimensional and zero-inflated, classical methods such as K-means and…

Methodology · Statistics 2020-10-16 Wookyeong Song, Hee-Seok Oh, Yaeji Lim, Ying Kuen Cheung

Selection of the Number of Clusters in Functional Data Analysis

Identifying the number $K$ of clusters in a dataset is one of the most difficult problems in clustering analysis. A choice of $K$ that correctly characterizes the features of the data is essential for building meaningful clusters. In this…

Methodology · Statistics 2019-05-06 Adriano Zanin Zambom, Julian A. Collazos, Ronaldo Dias

Cluster Identification and Characterization of Physical Fields

The description of complex configurations is a difficult issue. We present a powerful technique for cluster identification and characterization. The scheme is designed to treat and analyze experimental and/or simulation data from…

Statistical Mechanics · Physics 2013-08-29 Guangcai Zhang, Aiguo Xu, Guo Lu, Zeyao Mo

ClusterGraph: a new tool for visualization and compression of multidimensional data

Understanding the global organization of complicated and high dimensional data is of primary interest for many branches of applied sciences. It is typically achieved by applying dimensionality reduction techniques mapping the considered…

Computational Geometry · Computer Science 2024-11-11 Paweł Dłotko, Davide Gurnari, Mathis Hallier, Anna Jurek-Loughrey

A Unifying Family of Data-Adaptive Partitioning Algorithms

Clustering algorithms remain valuable tools for grouping and summarizing the most important aspects of data. Example areas where this is the case include image segmentation, dimension reduction, signals analysis, model order reduction,…

Numerical Analysis · Mathematics 2024-12-24 Guy B. Oldaker, Maria Emelianenko

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis, Omar Oubari, Gregor Lenz

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous subgroups. They are useful in a wide variety of applications, including document processing and modern genetics. Conventional clustering methods are unsupervised, meaning…

Methodology · Statistics 2014-07-11 Eric Bair