Related papers: A Fast General-Purpose Clustering Algorithm Based …

CLUE: A Fast Parallel Clustering Algorithm for High Granularity Calorimeters in High Energy Physics

One of the challenges of high granularity calorimeters, such as that to be built to cover the endcap region in the CMS Phase-2 Upgrade for HL-LHC, is that the large number of channels causes a surge in the computing load when clustering…

Instrumentation and Detectors · Physics 2020-01-29 Marco Rovere , Ziheng Chen , Antonio Di Pilato , Felice Pantaleo , Chris Seez

A Fast Algorithm for Clustering High Dimensional Feature Vectors

We propose an algorithm for clustering high dimensional data. If $P$ features for $N$ objects are represented in an $N\times P$ matrix ${\bf X}$, where $N\ll P$, the method is based on exploiting the cluster-dependent structure of the…

Machine Learning · Statistics 2018-11-05 Shahina Rahman , Valen E. Johnson

A Complexity Agnostic Clustering Engine for Time Projection Chambers and its Implementation in FPGA

A clustering functional block implemented in field-programable-gate-array (FPGA) for time projection chambers (TPC) operating with predictable time regardless the complexity of the event is described in this paper. The clustering functional…

Instrumentation and Detectors · Physics 2026-04-20 Jinyuan Wu , Michael Wang , Datao Gong

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

Distributed Spatial Data Clustering as a New Approach for Big Data Analysis

In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with data clustering task of the analysis. The approach consists of two main phases, the first phase…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-05 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

Probabilistic Partitive Partitioning (PPP)

Clustering is a NP-hard problem. Thus, no optimal algorithm exists, heuristics are applied to cluster the data. Heuristics can be very resource-intensive, if not applied properly. For substantially large data sets computational efficiencies…

Databases · Computer Science 2020-03-11 Mujahid Sultan

A Rapid Review of Clustering Algorithms

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin , Amir Aryani , Stephen Petrie , Aishwarya Nambissan , Aland Astudillo , Shengyuan Cao

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Faster Parallel Exact Density Peaks Clustering

Clustering multidimensional points is a fundamental data mining task, with applications in many fields, such as astronomy, neuroscience, bioinformatics, and computer vision. The goal of clustering algorithms is to group similar objects…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-22 Yihao Huang , Shangdi Yu , Julian Shun

Fast Density Estimation for Density-based Clustering Methods

Density-based clustering algorithms are widely used for discovering clusters in pattern recognition and machine learning since they can deal with non-hyperspherical clusters and are robustness to handle outliers. However, the runtime of…

Machine Learning · Computer Science 2022-07-07 Difei Cheng , Ruihang Xu , Bo Zhang , Ruinan Jin

Fast and explainable clustering based on sorting

We introduce a fast and explainable clustering method called CLASSIX. It consists of two phases, namely a greedy aggregation phase of the sorted data into groups of nearby data points, followed by the merging of groups into clusters. The…

Machine Learning · Computer Science 2024-02-16 Xinye Chen , Stefan Güttel

Contraction Clustering (RASTER): A Very Fast Big Data Algorithm for Sequential and Parallel Density-Based Clustering in Linear Time, Constant Memory, and a Single Pass

Clustering is an essential data mining tool for analyzing and grouping similar objects. In big data applications, however, many clustering algorithms are infeasible due to their high memory requirements and/or unfavorable runtime…

Data Structures and Algorithms · Computer Science 2026-01-27 Gregor Ulm , Simon Smith , Adrian Nilsson , Emil Gustavsson , Mats Jirstrand

Data Processing with FPGAs on Modern Architectures

Trends in hardware, the prevalence of the cloud, and the rise of highly demanding applications have ushered an era of specialization that quickly changes how data is processed at scale. These changes are likely to continue and accelerate in…

Databases · Computer Science 2023-06-27 Wenqi Jiang , Dario Korolija , Gustavo Alonso

Clustering Plotted Data by Image Segmentation

Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group together similar…

Machine Learning · Computer Science 2021-10-12 Tarek Naous , Srinjay Sarkar , Abubakar Abid , James Zou

Clustering by Attention: Leveraging Prior Fitted Transformers for Data Partitioning

Clustering is a core task in machine learning with wide-ranging applications in data mining and pattern recognition. However, its unsupervised nature makes it inherently challenging. Many existing clustering algorithms suffer from critical…

Machine Learning · Computer Science 2025-07-29 Ahmed Shokry , Ayman Khalafallah

Clustering by Constructing Hyper-Planes

As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Luhong Diao , Jinying Gao1 , Manman Deng

Review of Clustering Methods for Functional Data

Functional data clustering is to identify heterogeneous morphological patterns in the continuous functions underlying the discrete measurements/observations. Application of functional data clustering has appeared in many publications across…

Methodology · Statistics 2022-10-04 Mimi Zhang , Andrew Parnell

A parallel sampling based clustering

The problem of automatically clustering data is an age old problem. People have created numerous algorithms to tackle this problem. The execution time of any of this algorithm grows with the number of input points and the number of cluster…

Machine Learning · Computer Science 2014-12-08 Aditya AV Sastry , Kalyan Netti

F-tree: an algorithm for clustering transactional data using frequency tree

Clustering is an important data mining technique that groups similar data records, recently categorical transaction clustering is received more attention. In this research, we study the problem of categorical data clustering for…

Databases · Computer Science 2017-05-03 Mahmoud Mahdi , Samir Abdelrahman , Reem Bahgat , Ismail Ismail

Two-stage Ensemble Clustering of Functional Data Using Random Projections

We propose a computationally simple framework for clustering functional data based on Gaussian-process-generated random projections. In this approach, each curve is first projected onto a large collection of independent Gaussian process…

Methodology · Statistics 2026-05-22 Sourav Chakrabarty , Anirvan Chakraborty , Shyamal K. De