Related papers: Variance-based Clustering Technique for Distribute…

200 papers

Efficient extraction of useful knowledge from these data is still a challenge, mainly when the data is distributed, heterogeneous and of different quality depending on its corresponding local infrastructure. To reduce the overhead cost,…

Databases · Computer Science 2017-04-17 Nhien-An Le-Khac , M-Tahar Kechadi

Distributed data mining techniques, and mainly distributed clustering, have been widely used in the last decade because they deal with very large and heterogeneous datasets which cannot be gathered centrally. Current distributed clustering…

Databases · Computer Science 2018-02-02 Malika Bendechache , M-Tahar Kechadi

In this paper, we present distributed generalized clustering algorithms that can handle large scale data across multiple machines in spite of straggling or unreliable machines. We propose a novel data assignment scheme that enables us to…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-03-17 Venkata Gandikota , Arya Mazumdar , Ankit Singh Rawat

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of…

Data Structures and Algorithms · Computer Science 2019-04-12 He Sun , Luca Zanetti

Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data has to be located at the site where it is scrutinized. Nowadays, large amounts of heterogeneous, complex…

Databases · Computer Science 2014-09-24 Eshref Januzaj , Hans-Peter Kriegel , Martin Pfeifle

Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of…

Data Structures and Algorithms · Computer Science 2017-11-06 He Sun , Luca Zanetti

In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with the data clustering task of the analysis. The approach consists of two main phases, the first phase…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-05 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

The problem of optimizing distributed database includes: fragmentation and positioning data. Several different approaches and algorithms have been proposed to solve this problem. In this paper, we propose an algorithm that builds the…

Databases · Computer Science 2015-05-08 Van Nghia Luong , Ha Huy Cuong Nguyen , Van Son Le

We propose a simple, projection-based algorithm for clustering mixtures of discrete (Bernoulli) distributions. Unlike previous approaches that rely on coordinate-specific "combinatorial projections," our algorithm is rotationally…

Data Structures and Algorithms · Computer Science 2026-04-28 Pradipta Mitra

In this paper, we present a new approach of distributed clustering for spatial datasets, based on an innovative and efficient aggregation technique. This distributed approach consists of two phases: 1) local clustering phase, where each…

Databases · Computer Science 2018-02-05 Malika Bendechache , Nhien-An Le-Khac , M-Tahar Kechadi

As a kind of basic machine learning method, clustering algorithms group data points into different categories based on their similarity or distribution. We present a clustering algorithm by finding hyper-planes to distinguish the data…

Computer Vision and Pattern Recognition · Computer Science 2020-04-28 Luhong Diao , Jinying Gao , Manman Deng

Data mining algorithms were originally designed assuming the data is available at one centralized site. These algorithms also assume that the whole dataset fits into main memory while the algorithm runs. But in today's scenario the data…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-12-03 Aruna Govada , Bhavul Gauri , S. K. Sahay

We introduce and address a novel distributed clustering problem where each participant has a private dataset containing only a subset of all available features, and some features are included in multiple datasets. This scenario occurs in…

Data Structures and Algorithms · Computer Science 2025-10-14 Alessio Maritan , Luca Schenato

Stochastic variance reduced methods have gained a lot of interest recently for empirical risk minimization due to their appealing run time complexity. When the data size is large and disjointly stored on different machines, it becomes…

Machine Learning · Computer Science 2020-08-26 Shicong Cen , Huishuai Zhang , Yuejie Chi , Wei Chen , Tie-Yan Liu

Link discovery is an active field of research to support data integration in the Web of Data. Due to the huge size and number of available data sources, efficient and effective link discovery is a very challenging task. Common pairwise link…

Databases · Computer Science 2017-08-31 Markus Nentwig , Anika Groß , Maximilian Möller , Erhard Rahm

The problem of automatically clustering data is an age-old problem. People have created numerous algorithms to tackle this problem. The execution time of any of these algorithms grows with the number of input points and the number of cluster…

Machine Learning · Computer Science 2014-12-08 Aditya AV Sastry , Kalyan Netti

Spectral clustering approaches have led to well-accepted algorithms for finding accurate clusters in a given dataset. However, their application to large-scale datasets has been hindered by computational complexity of eigenvalue…

Machine Learning · Computer Science 2016-03-17 Shahzad Bhatti , Carolyn Beck , Angelia Nedic

This paper studies a class of distributed optimization problems with coupled equality constraints in networked systems. Many existing distributed algorithms rely on solving local subproblems via the $\operatorname{argmin}$ operator in each…

Optimization and Control · Mathematics 2025-11-26 Chenyang Qiu , Zongli Lin

We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based…

Machine Learning · Computer Science 2025-08-06 Ninh Pham , Yingtao Zheng , Hugo Phibbs