Random Indexing K-tree

Christopher M. De Vries; Lance De Vine; Shlomo Geva

Information Retrieval | 2010-02-02 | v2 | Artificial Intelligence | Data Structures and Algorithms

Abstract

Random Indexing (RI) K-tree combines two algorithms, Random Indexing and K-tree, for clustering. Document clustering poses many large-scale problems. RI K-tree scales well with large inputs due to its low complexity. It also exhibits features that are useful for managing a changing collection. Furthermore, it resolves earlier problems with sparse document vectors in K-tree. The algorithms and data structures are defined, explained and motivated. Specific modifications to K-tree are made for use with RI. Experiments were run to measure cluster quality. The results indicate that RI K-tree improves document cluster quality over the original K-tree algorithm.

Keywords

decision tree, cluster analysis, random forest

Cite

@article{arxiv.1001.0833,
  title  = {Random Indexing K-tree},
  author = {Christopher M. De Vries and Lance De Vine and Shlomo Geva},
  journal= {arXiv preprint arXiv:1001.0833},
  year   = {2010}
}

Comments

8 pages, ADCS 2009. The hyperref and cleveref LaTeX packages conflicted, so cleveref was removed.
