Related papers: Efficient techniques for mining spatial databases

Clustering with Obstacles in Spatial Databases

Clustering large spatial databases is an important problem, which tries to find the densely populated regions in a spatial area to be used in data mining, knowledge discovery, or efficient information retrieval. However most algorithms have…

Databases · Computer Science 2009-09-25 Mohamed A. El-Zawawy , Mohamed E. El-Sharkawi

A Short Survey on Data Clustering Algorithms

With rapidly increasing data, clustering algorithms are important tools for data analytics in modern research. They have been successfully applied to a wide range of domains; for instance, bioinformatics, speech recognition, and financial…

Data Structures and Algorithms · Computer Science 2015-12-01 Ka-Chun Wong

Clustering by Constructing Hyper-Planes

As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Luhong Diao , Jinying Gao1 , Manman Deng

Methods of Hierarchical Clustering

We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments. We look at hierarchical self-organizing maps, and mixture models. We review grid-based…

Information Retrieval · Computer Science 2011-05-03 Fionn Murtagh , Pedro Contreras

Algorithm for Spatial Clustering with Obstacles

In this paper, we propose an efficient clustering technique to solve the problem of clustering in the presence of obstacles. The proposed algorithm divides the spatial area into rectangular cells. Each cell is associated with statistical…

Databases · Computer Science 2009-09-25 Mohamed E. El-Sharkawi , Mohamed A. El-Zawawy

StruClus: Structural Clustering of Large-Scale Graph Databases

We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…

Databases · Computer Science 2016-10-03 Till Schäfer , Petra Mutzel

A Rapid Review of Clustering Algorithms

Clustering algorithms aim to organize data into groups or clusters based on the inherent patterns and similarities within the data. They play an important role in today's life, such as in marketing and e-commerce, healthcare, data…

Machine Learning · Computer Science 2024-01-17 Hui Yin , Amir Aryani , Stephen Petrie , Aishwarya Nambissan , Aland Astudillo , Shengyuan Cao

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

Clustering Strategies in Satellite-Aided Communications

With the rapid advancement of next-generation satellite networks, addressing clustering tasks, user grouping, and efficient link management has become increasingly critical to optimize network performance and reduce interference. In this…

Information Theory · Computer Science 2025-09-18 Tam Ninh Thi-Thanh , Nguyen Minh Quan , Do Son Tung , Trinh Van Chien , Hung Tran

GriT-DBSCAN: A Spatial Clustering Algorithm for Very Large Databases

DBSCAN is a fundamental spatial clustering algorithm with numerous practical applications. However, a bottleneck of the algorithm is in the worst case, the run time complexity is $O(n^2)$. To address this limitation, we propose a new…

Databases · Computer Science 2022-11-08 Xiaogang Huang , Tiefeng Ma , Conan Liu , Shuangzhe Liu

Distributed Clustering Algorithm for Spatial Data Mining

Distributed data mining techniques and mainly distributed clustering are widely used in the last decade because they deal with very large and heterogeneous datasets which cannot be gathered centrally. Current distributed clustering…

Databases · Computer Science 2018-02-02 Malika Bendechache , M-Tahar Kechadi

Distributed Spatial Data Clustering as a New Approach for Big Data Analysis

In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with data clustering task of the analysis. The approach consists of two main phases, the first phase…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-05 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

Data mining techniques on astronomical spectra data. I : Clustering Analysis

Clustering is an effective tool for astronomical spectral analysis, to mine clustering patterns among data. With the implementation of large sky surveys, many clustering methods have been applied to tackle spectroscopic and photometric data…

Instrumentation and Methods for Astrophysics · Physics 2022-12-19 Haifeng Yang , Chenhui Shi , Jianghui Cai , Lichan Zhou , Yuqing Yang , Xujun Zhao , Yanting He , Jing Hao

Parallel K-Medoids++ Spatial Clustering Algorithm Based on MapReduce

Clustering analysis has received considerable attention in spatial data mining for several years. With the rapid development of the geospatial information technologies, the size of spatial information data is growing exponentially which…

Distributed, Parallel, and Cluster Computing · Computer Science 2016-08-25 Xia Yue , Wang Man , Jun Yue , Guangcao Liu

Clustering Time Series Data Stream - A Literature Survey

Mining Time Series data has a tremendous growth of interest in today's world. To provide an indication various implementations are studied and summarized to identify the different problems in existing applications. Clustering time series is…

Information Retrieval · Computer Science 2010-05-25 V. Kavitha , M. Punithavalli

Automatic Clustering in Hyrise

Physical data layout is an important performance factor for modern databases. Clustering, i.e., storing similar values in proximity, can lead to performance gains in several ways. We present an automated model to determine beneficial…

Databases · Computer Science 2021-03-30 Alexander Löser

Spectral Clustering of Categorical and Mixed-type Data via Extra Graph Nodes

Clustering data objects into homogeneous groups is one of the most important tasks in data mining. Spectral clustering is arguably one of the most important algorithms for clustering, as it is appealing for its theoretical soundness and is…

Machine Learning · Statistics 2024-03-12 Dylan Soemitro , Jeova Farias Sales Rocha Neto

On a Distributed Approach for Density-based Clustering

Efficient extraction of useful knowledge from these data is still a challenge, mainly when the data is distributed, heterogeneous and of different quality depending on its corresponding local infrastructure. To reduce the overhead cost,…

Databases · Computer Science 2017-04-17 Nhien-An Le-Khac , M-Tahar Kechadi

Clustering of Big Data with Mixed Features

Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We…

Machine Learning · Statistics 2020-11-13 Joshua Tobin , Mimi Zhang

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis , Omar Oubari , Gregor Lenz