Related papers: Distributed Lance-William Clustering Algorithm

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Clustering by Constructing Hyper-Planes

As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Luhong Diao , Jinying Gao1 , Manman Deng

Algorithms and Complexity of Range Clustering

We introduce a novel criterion in clustering that seeks clusters with limited range of values associated with each cluster's elements. In clustering or classification the objective is to partition a set of objects into subsets, called…

Data Structures and Algorithms · Computer Science 2018-05-15 Dorit S. Hochbaum

A matching based clustering algorithm for categorical data

Cluster analysis is one of the essential tasks in data mining and knowledge discovery. Each type of data poses unique challenges in achieving relatively efficient partitioning of the data into homogeneous groups. While the algorithms for…

Machine Learning · Computer Science 2018-12-11 Ruben A. Gevorgyan , Yenok B. Hakobyan

Scalable Density-Based Distributed Clustering

Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data has to be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex…

Databases · Computer Science 2014-09-24 Eshref Januzaj , Hans-Peter Kriegel , Martin Pfeifle

Neural Network Clustering Based on Distances Between Objects

We present an algorithm of clustering of many-dimensional objects, where only the distances between objects are used. Centers of classes are found with the aid of neuron-like procedure with lateral inhibition. The result of clustering does…

Computer Vision and Pattern Recognition · Computer Science 2007-05-23 Leonid B. Litinskii , Dmitry E. Romanov

Hierarchical Clustering Supported by Reciprocal Nearest Neighbors

Clustering is a fundamental analysis tool aiming at classifying data points into groups based on their similarity or distance. It has found successful applications in all natural and social sciences, including biology, physics, economics,…

Information Retrieval · Computer Science 2021-02-24 Wen-Bo Xie , Yan-Li Lee , Cong Wang , Duan-Bing Chen , Tao Zhou

A Rapid Review of Clustering Algorithms

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin , Amir Aryani , Stephen Petrie , Aishwarya Nambissan , Aland Astudillo , Shengyuan Cao

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

Near-Optimal Comparison Based Clustering

The goal of clustering is to group similar objects into meaningful partitions. This process is well understood when an explicit similarity measure between the objects is given. However, far less is known when this information is not readily…

Machine Learning · Computer Science 2020-10-12 Michaël Perrot , Pascal Mattia Esser , Debarghya Ghoshdastidar

An Overview on Clustering Methods

Clustering is a common technique for statistical data analysis, which is used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the process of grouping similar…

Data Structures and Algorithms · Computer Science 2012-05-08 T. Soni Madhulatha

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis , Omar Oubari , Gregor Lenz

Clustering-based Low Rank Approximation Method

We propose a clustering-based generalized low rank approximation method, which takes advantage of appealing features from both the generalized low rank approximation of matrices (GLRAM) and cluster analysis. It exploits a more general form…

Optimization and Control · Mathematics 2025-02-21 Yujun Zhu , Jie Zhu , Hizba Arshad , Zhongming Wang , Ju Ming

Active clustering for labeling training data

Gathering training data is a key step of any supervised learning task, and it is both critical and expensive. Critical, because the quantity and quality of the training data has a high impact on the performance of the learned function.…

Data Structures and Algorithms · Computer Science 2021-10-28 Quentin Lutz , Élie de Panafieu , Alex Scott , Maya Stein

Distributed Clustering and Learning Over Networks

Distributed processing over networks relies on in-network processing and cooperation among neighboring agents. Cooperation is beneficial when agents share a common objective. However, in many applications agents may belong to different…

Optimization and Control · Mathematics 2023-07-19 Xiaochuan Zhao , Ali H. Sayed

Parallel D2-Clustering: Large-Scale Clustering of Discrete Distributions

The discrete distribution clustering algorithm, namely D2-clustering, has demonstrated its usefulness in image classification and annotation where each object is represented by a bag of weighed vectors. The high computational complexity of…

Machine Learning · Computer Science 2013-02-07 Yu Zhang , James Z. Wang , Jia Li

Clustering categorical data via ensembling dissimilarity matrices

We present a technique for clustering categorical data by generating many dissimilarity matrices and averaging over them. We begin by demonstrating our technique on low dimensional categorical data and comparing it to several other…

Machine Learning · Statistics 2017-09-20 Saeid Amiri , Bertrand Clarke , Jennifer Clarke

Crowdsourced correlation clustering with relative distance comparisons

Crowdsourced, or human computation based clustering algorithms usually rely on relative distance comparisons, as these are easier to elicit from human workers than absolute distance information. A relative distance comparison is a statement…

Data Structures and Algorithms · Computer Science 2017-09-26 Antti Ukkonen

Clustering matrices through optimal permutations

Matrices are two-dimensional data structures allowing one to conceptually organize information. For example, adjacency matrices are useful to store the links of a network; correlation matrices are simple ways to arrange gene co-expression…

Disordered Systems and Neural Networks · Physics 2022-09-29 Flaviano Morone

A Quantum Annealing-Based Approach to Extreme Clustering

Clustering, or grouping, dataset elements based on similarity can be used not only to classify a dataset into a few categories, but also to approximate it by a relatively large number of representative elements. In the latter scenario,…

Machine Learning · Computer Science 2019-09-13 Tim Jaschek , Marko Bucyk , Jaspreet S. Oberoi