Related papers: Scalable Density-Based Distributed Clustering

200 papers

Efficient extraction of useful knowledge from these data is still a challenge, mainly when the data is distributed, heterogeneous and of different quality depending on its corresponding local infrastructure. To reduce the overhead cost,…

Databases · Computer Science 2017-04-17 Nhien-An Le-Khac , M-Tahar Kechadi

We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based…

Machine Learning · Computer Science 2025-08-06 Ninh Pham , Yingtao Zheng , Hugo Phibbs

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

In this paper, we present distributed generalized clustering algorithms that can handle large scale data across multiple machines in spite of straggling or unreliable machines. We propose a novel data assignment scheme that enables us to…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-17 Venkata Gandikota , Arya Mazumdar , Ankit Singh Rawat

In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with data clustering task of the analysis. The approach consists of two main phases, the first phase…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-05 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

One important tool is the optimal clustering of data into useful categories. Dividing similar objects into a smaller number of clusters is of importance in many applications. These include search engines, monitoring of academic performance,…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-21 Gavriel Yarmish , Philip Listowsky , Simon Dexter

Nowadays, huge amounts of data are naturally collected in distributed sites due to different facts and moving these data through the network for extracting useful knowledge is almost unfeasible for either technical reasons or policies.…

Databases · Computer Science 2017-03-30 Lamine M. Aouad , Nhien-An Le-Khac , Tahar Kechadi

The increasing popularity of cloud computing has resulted in a proliferation of data centers. Effective placement of data centers improves network performance and minimizes clients' perceived latency. The problem of determining the optimal…

Networking and Internet Architecture · Computer Science 2018-02-06 Wuqiong Luo , Wee Peng Tay , Peng Sun , Yonggang Wen

Nowadays, with the widespread use of smartphones and other portable gadgets equipped with a variety of sensors, data is ubiquitously available and the focus of machine learning has shifted from being able to infer from small training samples to…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-07-07 Radu Cristian Ionescu

Clustering algorithms are fundamental tools across many fields, with density-based methods offering particular advantages in identifying arbitrarily shaped clusters and handling noise. However, their effectiveness is often limited by the…

Machine Learning · Computer Science 2025-12-01 Meysam Shirdel Bilehsavar , Razieh Ghaedi , Samira Seyed Taheri , Xinqi Fan , Christian O'Reilly

We propose an algorithm that builds and maintains clusters over a network subject to mobility. This algorithm is fully decentralized and makes all the different clusters grow concurrently. The algorithm uses circulating tokens that collect…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-11-15 Thibault Bernard , Alain Bui , Laurence Pilard , Devan Sohier

A widely used approach to clustering a single data stream is the two-phased approach in which the online phase creates and maintains micro-clusters while the off-line phase generates the macro-clustering from the micro-clusters. We use this…

Databases · Computer Science 2012-09-20 Dang-Hoan Tran

Objective: The main objective of this paper is to construct a distributed clustering algorithm based upon spatial data correlation among sensor nodes and perform data accuracy for each distributed cluster at their respective cluster head…

Networking and Internet Architecture · Computer Science 2011-08-15 Jyotirmoy Karjee , H. S Jamadagni

This paper presents a distributed resource selection mechanism for diverse cloud-edge environments, enabling dynamic and context-aware allocation of resources to meet the demands of complex distributed applications. By distributing the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-10 Quentin Renau , Amjad Ullah , Emma Hart

In this paper, we present a new approach of distributed clustering for spatial datasets, based on an innovative and efficient aggregation technique. This distributed approach consists of two phases: 1) local clustering phase, where each…

Databases · Computer Science 2018-02-05 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

The applicability of agglomerative clustering, for inferring both hierarchical and flat clustering, is limited by its scalability. Existing scalable hierarchical clustering methods sacrifice quality for speed and often lead to over-merging…

Co-clustering simultaneously clusters rows and columns, revealing more fine-grained groups. However, existing co-clustering methods suffer from poor scalability and cannot handle large-scale data. This paper presents a novel and scalable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-20 Zihan Wu , Zhaoke Huang , Hong Yan

The last decades have seen a surge of interests in distributed computing thanks to advances in clustered computing and big data technology. Existing distributed algorithms typically assume all the data are already in one place, and…

Machine Learning · Computer Science 2019-05-07 Donghui Yan , Yingjie Wang , Jin Wang , Guodong Wu , Honggang Wang

The clusters of a distribution are often defined by the connected components of a density level set. However, this definition depends on the user-specified level. We address this issue by proposing a simple, generic algorithm, which uses an…

Methodology · Statistics 2015-10-29 Ingo Steinwart

Clustering analysis is of substantial significance for data mining. The properties of big data raise higher demand for more efficient and economical distributed clustering methods. However, existing distributed clustering methods mainly…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-03 Yifeng Xiao , Jiang Xue , Deyu Meng