Related papers: Fast Mapping onto Census Blocks

Efficient Large Scale Clustering based on Data Partitioning

Clustering techniques are very attractive for extracting and identifying patterns in datasets. However, their application to very large spatial datasets presents numerous challenges such as high-dimensionality data, heterogeneity, and high…

Databases · Computer Science 2018-02-27 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

A Scalable Approach to Clustering Embedding Projections

Interactive visualization of embedding projections is a useful technique for understanding data and evaluating machine learning models. Labeling data within these visualizations is critical for interpretation, as labels provide an overview…

Human-Computer Interaction · Computer Science 2025-05-20 Donghao Ren, Fred Hohman, Dominik Moritz

LDC-Net: A Unified Framework for Localization, Detection and Counting in Dense Crowds

The rapid development in visual crowd analysis shows a trend to count people by positioning or even detecting, rather than simply summing a density map. It also enlightens us back to the essence of the field, detection to count, which can…

Computer Vision and Pattern Recognition · Computer Science 2021-10-12 Qi Wang, Tao Han, Junyu Gao, Yuan Yuan, Xuelong Li

Fast communication-efficient spectral clustering over distributed data

The last decades have seen a surge of interests in distributed computing thanks to advances in clustered computing and big data technology. Existing distributed algorithms typically assume all the data are already in one place, and…

Machine Learning · Computer Science 2019-05-07 Donghui Yan, Yingjie Wang, Jin Wang, Guodong Wu, Honggang Wang

Fast clustering for scalable statistical analysis on structured images

The use of brain images as markers for diseases or behavioral differences is challenged by the small effects size and the ensuing lack of power, an issue that has incited researchers to rely more systematically on large cohorts. Coupled…

Machine Learning · Statistics 2015-11-17 Bertrand Thirion, Andrés Hoyos-Idrobo, Jonas Kahn, Gael Varoquaux

Faster Parallel Exact Density Peaks Clustering

Clustering multidimensional points is a fundamental data mining task, with applications in many fields, such as astronomy, neuroscience, bioinformatics, and computer vision. The goal of clustering algorithms is to group similar objects…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-05-22 Yihao Huang, Shangdi Yu, Julian Shun

Distributed Spatial Data Clustering as a New Approach for Big Data Analysis

In this paper we propose a new approach for Big Data mining and analysis. This new approach works well on distributed datasets and deals with data clustering task of the analysis. The approach consists of two main phases, the first phase…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-03-05 Malika Bendechache, Nhien-An Le-Khac, M-Tahar Kechadi

Fast consensus clustering in complex networks

Algorithms for community detection are usually stochastic, leading to different partitions for different choices of random seeds. Consensus clustering has proven to be an effective technique to derive more stable and accurate partitions…

Physics and Society · Physics 2019-04-23 Aditya Tandon, Aiiad Albeshri, Vijey Thayananthan, Wadee Alhalabi, Santo Fortunato

Scalable Varied-Density Clustering via Graph Propagation

We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based…

Machine Learning · Computer Science 2025-08-06 Ninh Pham, Yingtao Zheng, Hugo Phibbs

Scaling Graph Clustering with Distributed Sketches

The unsupervised learning of community structure, in particular the partitioning vertices into clusters or communities, is a canonical and well-studied problem in exploratory graph analysis. However, like most graph analyses the…

Machine Learning · Computer Science 2020-07-27 Benjamin W. Priest, Alec Dunton, Geoffrey Sanders

Modeling and Monitoring of Indoor Populations using Sparse Positioning Data (Extension)

In large venues like shopping malls and airports, knowledge on the indoor populations fuels applications such as business analytics, venue management, and safety control. In this work, we provide means of modeling populations in partitions…

Databases · Computer Science 2024-10-30 Xiao Li, Huan Li, Hua Lu, Christian S. Jensen

Large Scale Spectral Clustering Using Approximate Commute Time Embedding

Spectral clustering is a novel clustering method which can detect complex shapes of data clusters. However, it requires the eigen decomposition of the graph Laplacian matrix, which is proportional to $O(n^3)$ and thus is not suitable for…

Machine Learning · Computer Science 2013-07-02 Nguyen Lu Dang Khoa, Sanjay Chawla

A Fast and Efficient Incremental Approach toward Dynamic Community Detection

Community detection is a discovery tool used by network scientists to analyze the structure of real-world networks. It seeks to identify natural divisions that may exist in the input networks that partition the vertices into coherent…

Social and Information Networks · Computer Science 2019-09-24 Neda Zarayeneh, Ananth Kalyanaraman

Fast k-NN search

Efficient index structures for fast approximate nearest neighbor queries are required in many applications such as recommendation systems. In high-dimensional spaces, many conventional methods suffer from excessive usage of memory and slow…

Machine Learning · Statistics 2019-04-24 Ville Hyvönen, Teemu Pitkänen, Sotiris Tasoulis, Elias Jääsaari, Risto Tuomainen, Liang Wang, Jukka Corander, Teemu Roos

StruClus: Structural Clustering of Large-Scale Graph Databases

We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the…

Databases · Computer Science 2016-10-03 Till Schäfer, Petra Mutzel

Indexing and Partitioning the Spatial Linear Model for Large Data Sets

We consider four main goals when fitting spatial linear models: 1) estimating covariance parameters, 2) estimating fixed effects, 3) kriging (making point predictions), and 4) block-kriging (predicting the average value over a region). Each…

Methodology · Statistics 2023-05-16 Jay M. Ver Hoef, Michael Dumelle, Matt Higham, Erin E. Peterson, Daniel J. Isaak

Fast event-based epidemiological simulations on national scales

We present a computational modeling framework for data-driven simulations and analysis of infectious disease spread in large populations. For the purpose of efficient simulations, we devise a parallel solution algorithm targeting…

Populations and Evolution · Quantitative Biology 2018-02-19 Pavol Bauer, Stefan Engblom, Stefan Widgren

KNN-DBSCAN: a DBSCAN in high dimensions

Clustering is a fundamental task in machine learning. One of the most successful and broadly used algorithms is DBSCAN, a density-based clustering algorithm. DBSCAN requires $\epsilon$-nearest neighbor graphs of the input dataset, which are…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-12 Youguang Chen, William Ruys, George Biros

Fast Planning Over Roadmaps via Selective Densification

We propose the Selective Densification method for fast motion planning through configuration space. We create a sequence of roadmaps by iteratively adding configurations. We organize these roadmaps into layers and add edges between…

Robotics · Computer Science 2020-02-13 Brad Saund, Dmitry Berenson

A sampling-based approach for efficient clustering in large datasets

We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our…

Machine Learning · Computer Science 2022-03-30 Georgios Exarchakis, Omar Oubari, Gregor Lenz