A Fast Greedy Algorithm for Outlier Mining

Databases 2007-05-23 v1 Artificial Intelligence

Abstract

The task of outlier detection is to find small groups of data objects that are exceptional when compared with the remaining large amount of data. In [38], outlier detection in categorical data is formulated as an optimization problem, and a local-search heuristic algorithm (LSA) is presented. However, as is the case with most iterative algorithms, LSA remains very time-consuming on very large datasets. In this paper, we present a very fast greedy algorithm for mining outliers under the same optimization model. Experimental results on real datasets and large synthetic datasets show that (1) our algorithm performs comparably to state-of-the-art outlier detection algorithms at identifying true outliers, and (2) our algorithm can be an order of magnitude faster than LSA.
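To make the greedy idea concrete, the following is a minimal sketch and not the paper's optimized procedure. The abstract does not state the objective function, so the sketch assumes one: the sum of per-attribute Shannon entropies of the records that remain after removing k outliers. The names greedy_outliers, dataset_entropy and attribute_entropy are hypothetical. In each of k passes, the sketch removes the single record whose removal leaves the lowest assumed objective value.

```python
from collections import Counter
from math import log2


def attribute_entropy(counts, n):
    """Shannon entropy of one attribute, given its value counts over n records."""
    return -sum((c / n) * log2(c / n) for c in counts.values() if c > 0)


def dataset_entropy(counters, n):
    """ASSUMED objective: sum of per-attribute entropies (attributes treated as independent)."""
    if n == 0:
        return 0.0
    return sum(attribute_entropy(c, n) for c in counters)


def greedy_outliers(records, k):
    """Greedily pick k record indices; each pass removes the record whose
    removal leaves the lowest assumed entropy. Naive and unoptimized."""
    if k > len(records):
        raise ValueError("k cannot exceed the number of records")
    if k == 0:
        return []
    remaining = set(range(len(records)))
    num_attrs = len(records[0])
    # One value-count table per categorical attribute.
    counters = [Counter(r[j] for r in records) for j in range(num_attrs)]
    outliers = []
    for _ in range(k):
        best_idx, best_entropy = None, None
        n_after = len(remaining) - 1
        for i in sorted(remaining):
            # Tentatively remove record i, score the rest, then restore it.
            for j, v in enumerate(records[i]):
                counters[j][v] -= 1
            e = dataset_entropy(counters, n_after)
            for j, v in enumerate(records[i]):
                counters[j][v] += 1
            if best_entropy is None or e < best_entropy:
                best_idx, best_entropy = i, e
        # Commit the best removal found in this pass.
        for j, v in enumerate(records[best_idx]):
            counters[j][v] -= 1
        remaining.remove(best_idx)
        outliers.append(best_idx)
    return outliers
```

For example, with four identical records ("a", "x") followed by one record ("b", "y"), calling greedy_outliers(records, 1) selects the last record, since removing it leaves a perfectly uniform dataset. Unlike a local search that repeatedly swaps candidate outliers, this scheme commits to one record per pass and never revisits a choice.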

Cite

@article{arxiv.cs/0507065,
  title  = {A Fast Greedy Algorithm for Outlier Mining},
  author = {Zengyou He and Xiaofei Xu and Shengchun Deng},
  journal= {arXiv preprint arXiv:cs/0507065},
  year   = {2007}
}

Comments

11 pages