An efficient density-based clustering algorithm using reverse nearest neighbour

Machine Learning, 2018-11-20, v1

Abstract

Density-based clustering is the task of discovering high-density regions of entities (clusters) that are separated from each other by contiguous regions of low density. DBSCAN is, arguably, the most popular density-based clustering algorithm. However, its cluster recovery capabilities depend on the combination of its two parameters. In this paper we present a new density-based clustering algorithm which uses reverse nearest neighbour (RNN) queries and has a single parameter. We also show that it is possible to estimate a good value for this parameter using a clustering validity index. The RNN queries enable our algorithm to estimate densities taking more than a single entity into account, and to recover clusters that are not well separated or that have different densities. Our experiments on synthetic and real-world data sets show that our proposed algorithm outperforms DBSCAN and its recent variant ISDBSCAN.
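
To make the idea of a reverse nearest neighbour query concrete, here is a minimal sketch of one way to count reverse k-nearest neighbours with NumPy. It is not the algorithm proposed in the paper, only an illustration of the underlying query: the function name `reverse_knn_counts`, the parameter name `k`, the choice of Euclidean distance, and the brute-force distance matrix are all assumptions made for this example.

```python
import numpy as np


def reverse_knn_counts(X, k):
    """For each point, count how many other points list it among
    their k nearest neighbours (illustrative helper, hypothetical name)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Brute-force pairwise squared Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour

    # Row i holds the indices of the k nearest neighbours of point i.
    knn = np.argsort(dist, axis=1)[:, :k]

    # A point's reverse-neighbour count is the number of neighbour
    # lists in which its index appears.
    return np.bincount(knn.ravel(), minlength=n)


# Three nearby points and one isolated point.
points = [[0, 0], [0, 1], [1, 0], [10, 10]]
counts = reverse_knn_counts(points, k=2)
print(counts)
```

In this toy input the isolated point appears in no other point's neighbour list, so its count is zero, while the three nearby points each appear in several lists. A count of this kind reflects how a point relates to several of its neighbours at once, which is the intuition behind estimating density from more than a single entity.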

Cite

@article{arxiv.1811.07615,
  title  = {An efficient density-based clustering algorithm using reverse nearest neighbour},
  author = {Stiphen Chowdhury and Renato Cordeiro de Amorim},
  journal= {arXiv preprint arXiv:1811.07615},
  year   = {2018}
}

Comments

Accepted in: Computing Conference 2019 in London, UK. http://saiconference.com/Computing