Genetic Programming for Document Segmentation and Region Classification Using Discipulus

Computer Vision and Pattern Recognition 2013-03-05 v1 Neural and Evolutionary Computing

Abstract

Document segmentation is the process of dividing a document into distinct regions. A document is a collection of information and a standard means of conveying it to others. Extracting data from documents by hand requires considerable human effort, is time-consuming, and can severely limit the use of information systems. Automatic information extraction from documents has therefore become an important problem, and document segmentation has been shown to help address it. This paper proposes a new approach to segment a document image and classify its regions as text, image, drawing, or table. The document image is divided into blocks using the run-length smearing algorithm, and features are extracted from every block. The Discipulus tool is used to construct a genetic-programming-based classifier model, which achieves 97.5% classification accuracy.
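As an illustration of the run-length smearing step mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes a binary image stored as a list of rows with 1 for foreground (ink) and 0 for background, fills only background runs that lie between two foreground pixels, and combines the horizontal and vertical passes with a logical AND. The function names `smear_row` and `rlsa` and the threshold parameters are hypothetical; the abstract does not state which thresholds or boundary rules the paper uses.

```python
def smear_row(row, threshold):
    """Fill runs of 0s no longer than `threshold` that lie between two 1s."""
    out = list(row)
    last_one = None  # index of the most recent foreground pixel seen
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_one is not None:
                gap = i - last_one - 1
                if 0 < gap <= threshold:
                    # Short background gap: merge it into the foreground.
                    for j in range(last_one + 1, i):
                        out[j] = 1
            last_one = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Smear rows and columns separately, then AND the two results."""
    horizontal = [smear_row(row, h_threshold) for row in image]
    # Transpose to smear columns with the same row routine, then transpose back.
    columns = [list(col) for col in zip(*image)]
    smeared_cols = [smear_row(col, v_threshold) for col in columns]
    vertical = [list(row) for row in zip(*smeared_cols)]
    return [
        [h & v for h, v in zip(h_row, v_row)]
        for h_row, v_row in zip(horizontal, vertical)
    ]


if __name__ == "__main__":
    line = [1, 0, 0, 1, 0, 0, 0, 1]
    print(smear_row(line, 2))
```

In this sketch, connected foreground areas of the smeared image would then serve as candidate blocks from which features are extracted for the classifier.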

Cite

@article{arxiv.1303.0460,
  title  = {Genetic Programming for Document Segmentation and Region Classification Using Discipulus},
  author = {N. Priyadharshini and M. S. Vijaya},
  journal= {arXiv preprint arXiv:1303.0460},
  year   = {2013}
}

Comments

8 pages, 13 figures
