Related papers: Parallel Index-Based Structural Graph Clustering a…

200 papers

Structural clustering is one of the most popular graph clustering methods, and it has achieved great performance improvements by utilizing GPUs. Even so, the state-of-the-art GPU-based structural clustering algorithm, GPUSCAN, still…

Databases · Computer Science 2023-12-01 Long Yuan, Zeyu Zhou, Xuemin Lin, Zi Chen, Xiang Zhao, Fan Zhang

Graph clustering has many important applications in computing, but due to growing sizes of graphs, even traditionally fast clustering methods such as spectral partitioning can be computationally expensive for real-world graphs of interest.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-06-11 Julian Shun, Farbod Roosta-Khorasani, Kimon Fountoulakis, Michael W. Mahoney

DBSCAN is a popular density-based clustering algorithm. It computes the $\epsilon$-neighborhood graph of a dataset and uses the connected components of the high-degree nodes to decide the clusters. However, the full neighborhood graph may…

Machine Learning · Computer Science 2020-10-23 Heinrich Jiang, Jennifer Jang, Jakub Łącki

Computing strongly connected components (SCC) is a fundamental problem in graph processing. As today's real-world graphs are getting larger and larger, parallel SCC is increasingly important. SCC is challenging in the parallel setting and…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-16 Letong Wang, Xiaojun Dong, Yan Gu, Yihan Sun

Clustering is a fundamental task in machine learning. One of the most successful and broadly used algorithms is DBSCAN, a density-based clustering algorithm. DBSCAN requires $\epsilon$-nearest neighbor graphs of the input dataset, which are…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-12 Youguang Chen, William Ruys, George Biros

We develop an algorithm that finds the consensus of many different clustering solutions of a graph. We formulate the problem as a median set partitioning problem and propose a greedy optimization technique. Unlike other approaches that find…

Information Retrieval · Computer Science 2024-08-22 Md Taufique Hussain, Mahantesh Halappanavar, Samrat Chatterjee, Filippo Radicchi, Santo Fortunato, Ariful Azad

We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…

Databases · Computer Science 2016-10-03 Till Schäfer, Petra Mutzel

Nearest Neighbor Search (NNS) has recently drawn a rapid increase of interest due to its core role in managing high-dimensional vector data in data science and AI applications. The interest is fueled by the success of neural embedding,…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-01 Zhen Peng, Minjia Zhang, Kai Li, Ruoming Jin, Bin Ren

DBSCAN is a well-known density-based clustering algorithm to discover arbitrary shape clusters. While conceptually simple in serial, the algorithm is challenging to efficiently parallelize on manycore GPU architectures. Common pitfalls,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-06-30 Andrey Prokopenko, Damien Lebrun-Grandie, Daniel Arndt

The DBSCAN method for spatial clustering has received significant attention due to its applicability in a variety of data analysis tasks. There are fast sequential algorithms for DBSCAN in Euclidean space that take $O(n\log n)$ work for two…

Data Structures and Algorithms · Computer Science 2021-01-29 Yiqiu Wang, Yan Gu, Julian Shun

Graph-based ANNS algorithms have gained increasing research interest and market adoption due to their efficiency and accuracy in retrieval. Existing approaches primarily rely on CPUs for graph index construction and retrieval, but this…

Databases · Computer Science 2026-05-12 Lan Lu, Peiqi Yin, Isaac Yang, Tao Luo, Hua Fan, Wenchao Zhou, Feifei Li, Boon Thau Loo

We describe an approach to parallel graph partitioning that scales to hundreds of processors and produces a high solution quality. For example, for many instances from Walshaw's benchmark collection we improve the best known partitioning.…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-04-08 Manuel Holtgrewe, Peter Sanders, Christian Schulz

This paper presents new parallel algorithms for generating Euclidean minimum spanning trees and spatial clustering hierarchies (known as HDBSCAN$^*$). Our approach is based on generating a well-separated pair decomposition followed by using…

Data Structures and Algorithms · Computer Science 2021-04-05 Yiqiu Wang, Shangdi Yu, Yan Gu, Julian Shun

We design and implement parallel prefix sum (scan) algorithms using Ascend AI accelerators. Ascend accelerators feature specialized computing units: the cube units for efficient matrix multiplication and the vector units for optimized…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-01-05 Bartłomiej Wróblewski, Gioele Gottardo, Anastasios Zouzias

Similarity search is critical for many database applications, including the increasingly popular online services for Content-Based Multimedia Retrieval (CBMR). These services, which include image search engines, must handle an overwhelming…

Distributed, Parallel, and Cluster Computing · Computer Science 2013-10-16 Thiago S. F. X. Teixeira, George Teodoro, Eduardo Valle, Joel H. Saltz

The Graph Convolutional Network (GCN) model and its variants are powerful graph embedding tools for facilitating classification and clustering on graphs. However, a major challenge is to reduce the complexity of layered GCNs and make them…

Machine Learning · Computer Science 2020-08-06 Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna

We present an accelerated algorithm for hierarchical density based clustering. Our new algorithm improves upon HDBSCAN*, which itself provided a significant qualitative improvement over the popular DBSCAN algorithm. The accelerated HDBSCAN*…

Machine Learning · Statistics 2018-12-20 Leland McInnes, John Healy

Cluster analysis plays a crucial role in database mining, and one of the most widely used algorithms in this field is DBSCAN. However, DBSCAN has several limitations, such as difficulty in handling high-dimensional large-scale data,…

Machine Learning · Computer Science 2024-04-30 Weibing Zhao

Remarkable achievements have been attained by deep neural networks in various applications. However, the increasing depth and width of such models also lead to explosive growth in both storage and computation, which has restricted the…

Machine Learning · Computer Science 2019-06-11 Linfeng Zhang, Zhanhong Tan, Jiebo Song, Jingwei Chen, Chenglong Bao, Kaisheng Ma

Obtaining scalable algorithms for hierarchical agglomerative clustering (HAC) is of significant interest due to the massive size of real-world datasets. At the same time, efficiently parallelizing HAC is difficult due to the seemingly…

Data Structures and Algorithms · Computer Science 2022-06-24 Laxman Dhulipala, David Eisenstat, Jakub Łącki, Vahab Mirrokni, Jessica Shi