Related papers: High Dimensional Cluster Analysis Using Path Lengt…

Cluster Identification and Characterization of Physical Fields

The description of complex configuration is a difficult issue. We present a powerful technique for cluster identification and characterization. The scheme is designed to treat with and analyze the experimental and/or simulation data from…

Statistical Mechanics · Physics 2013-08-29 Guangcai Zhang , Aiguo Xu , Guo Lu , Zeyao Mo

Clustering small datasets in high-dimension by random projection

Datasets in high-dimension do not typically form clusters in their original space; the issue is worse when the number of points in the dataset is small. We propose a low-computation method to find statistically significant clustering…

Machine Learning · Statistics 2020-08-24 Alden Bradford , Tarun Yellamraju , Mireille Boutin

High-Dimensional Data Clustering

Clustering in high-dimensional spaces is a difficult problem which is recurrent in many domains, for example in image analysis. The difficulty is due to the fact that high-dimensional data usually live in different low-dimensional subspaces…

Statistics Theory · Mathematics 2016-08-16 Charles Bouveyron , Stéphane Girard , Cordelia Schmid

Subspace Clustering through Sub-Clusters

The problem of dimension reduction is of increasing importance in modern data analysis. In this paper, we consider modeling the collection of points in a high dimensional space as a union of low dimensional subspaces. In particular we…

Machine Learning · Statistics 2020-06-12 Weiwei Li , Jan Hannig , Sayan Mukherjee

Clustering by latent dimensions

This paper introduces a new clustering technique, called {\em dimensional clustering}, which clusters each data point by its latent {\em pointwise dimension}, which is a measure of the dimensionality of the data set local to that point.…

Machine Learning · Statistics 2018-05-29 Shohei Hidaka , Neeraj Kashyap

High-dimensional multi-view clustering methods

Multi-view clustering has been widely used in recent years in comparison to single-view clustering, for clear reasons, as it offers more insights into the data, which has brought with it some challenges, such as how to combine these views…

Machine Learning · Computer Science 2025-11-25 Alaeddine Zahir , Khalide Jbilou , Ahmed Ratnani

Clustering Plotted Data by Image Segmentation

Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…

Machine Learning · Computer Science 2021-10-12 Tarek Naous , Srinjay Sarkar , Abubakar Abid , James Zou

Variable Selection for Clustering and Classification

As data sets continue to grow in size and complexity, effective and efficient techniques are needed to target important features in the variable space. Many of the variable selection techniques that are commonly used alongside clustering…

Computation · Statistics 2013-03-22 Jeffrey L. Andrews , Paul D. McNicholas

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Shape complexity in cluster analysis

In cluster analysis, a common first step is to scale the data aiming to better partition them into clusters. Even though many different techniques have throughout many years been introduced to this end, it is probably fair to say that the…

Machine Learning · Computer Science 2023-05-30 Eduardo J. Aguilar , Valmir C. Barbosa

ClusterGraph: a new tool for visualization and compression of multidimensional data

Understanding the global organization of complicated and high dimensional data is of primary interest for many branches of applied sciences. It is typically achieved by applying dimensionality reduction techniques mapping the considered…

Computational Geometry · Computer Science 2024-11-11 Paweł Dłotko , Davide Gurnari , Mathis Hallier , Anna Jurek-Loughrey

Deep Continuous Clustering

Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The…

Machine Learning · Computer Science 2018-03-06 Sohil Atul Shah , Vladlen Koltun

Hierarchical Clustering for Finding Symmetries and Other Patterns in Massive, High Dimensional Datasets

Data analysis and data mining are concerned with unsupervised pattern finding and structure determination in data sets. "Structure" can be understood as symmetry and a range of symmetries are expressed by hierarchy. Such symmetries directly…

Machine Learning · Statistics 2015-03-17 Fionn Murtagh , Pedro Contreras

U-statistical inference for hierarchical clustering

Clustering methods are a valuable tool for the identification of patterns in high dimensional data with applications in many scientific problems. However, quantifying uncertainty in clustering is a challenging problem, particularly when…

Methodology · Statistics 2018-06-01 Marcio Valk , Gabriela Bettella Cybis

Hierarchical clustering that takes advantage of both density-peak and density-connectivity

This paper focuses on density-based clustering, particularly the Density Peak (DP) algorithm and the one based on density-connectivity DBSCAN; and proposes a new method which takes advantage of the individual strengths of these two methods…

Machine Learning · Computer Science 2024-01-30 Ye Zhu , Kai Ming Ting , Yuan Jin , Maia Angelova

Clustering High-dimensional Data via Feature Selection

High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called…

Methodology · Statistics 2022-10-31 Tianqi Liu , Yu Lu , Biqing Zhu , Hongyu Zhao

Clustering for high-dimension, low-sample size data using distance vectors

In high-dimension, low-sample size (HDLSS) data, it is not always true that closeness of two objects reflects a hidden cluster structure. We point out the important fact that it is not the closeness, but the "values" of distance that…

Machine Learning · Statistics 2013-12-30 Yoshikazu Terada

Hyperoctant Search Clustering: A Method for Clustering Data in High-Dimensional Hyperspheres

Clustering of high-dimensional data sets is a growing need in artificial intelligence, machine learning and pattern recognition. In this paper, we propose a new clustering method based on a combinatorial-topological approach applied to…

Machine Learning · Computer Science 2025-03-12 Mauricio Toledo-Acosta , Luis Ángel Ramos-García , Jorge Hermosillo-Valadez

Divisive Hierarchical Clustering of Variables Identified by Singular Vectors

In this work, we introduce a novel methodology for divisive hierarchical clustering. Our divisive (``top-down'') approach is motivated by the fact that agglomerative hierarchical clustering (``bottom-up''), which is commonly used for…

Methodology · Statistics 2025-10-07 Jan O. Bauer

Writing summary for the state-of-the-art methods for big data clustering in distributed environment

Big Data processing systems handle huge unstructured and structured data to store, process, and analyze through cluster analysis which helps in identifying unseen patterns to find the relationships between them. Clustering analysis over the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-11-11 Dipesh Gyawali