Machine learning approach for text and document mining

Vishwanath Bijalwan; Pinki Kumari; Jordan Pascual; Vijay Bhaskar Semwal

Information Retrieval 2014-06-09 v1 Machine Learning

Abstract

Text Categorization (TC), also known as Text Classification, is the task of automatically assigning each document in a collection to one or more categories from a predefined set. If each document belongs to exactly one category, the task is single-label classification; otherwise, it is multi-label classification. TC draws on tools from Information Retrieval (IR) and Machine Learning (ML) and has received much attention in recent years from both academic researchers and industry developers. In this paper, we first categorize the documents using a KNN-based machine learning approach and then return the most relevant documents.
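The two-step idea in the abstract (classify with KNN, then return the most relevant documents) can be illustrated with a minimal sketch. The abstract does not state the paper's features, similarity measure, or value of k, so everything here is an assumption chosen for illustration: whitespace tokenization, smoothed TF-IDF weights, cosine similarity, and hypothetical helper names such as `knn_classify`.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document.

    Also returns the IDF table so a query can be weighted the same way.
    The smoothed IDF formula is an assumption, not taken from the paper.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_classify(query, train_texts, train_labels, k=3):
    """Label a query by majority vote among its k most similar training texts.

    Returns the predicted label and the neighbours that carry that label,
    which stand in here for the "most relevant documents".
    """
    docs = [tokenize(t) for t in train_texts]
    vectors, idf = tfidf_vectors(docs)
    # Terms unseen in training carry no weight in the query vector.
    q_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    # Sort by descending similarity; break ties by training index.
    ranked = sorted(
        ((cosine(q_vec, vec), i) for i, vec in enumerate(vectors)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    top = ranked[:k]
    votes = Counter(train_labels[i] for _, i in top)
    label = votes.most_common(1)[0][0]
    relevant = [train_texts[i] for _, i in top if train_labels[i] == label]
    return label, relevant


# Toy data, invented for illustration only.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the election results were announced",
    "parliament passed the new budget",
]
train_labels = ["sports", "sports", "politics", "politics"]

label, relevant = knn_classify(
    "a late goal decided the football match", train_texts, train_labels, k=3
)
print(label, relevant)
```

This sketch handles only the single-label case. A multi-label variant could, for example, return every label whose vote share among the k neighbours exceeds a threshold, though that too would be a design choice not specified in the abstract.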

Keywords

text classification, information retrieval, information extraction

Cite

@article{arxiv.1406.1580,
  title  = {Machine learning approach for text and document mining},
  author = {Vishwanath Bijalwan and Pinki Kumari and Jordan Pascual and Vijay Bhaskar Semwal},
  journal= {arXiv preprint arXiv:1406.1580},
  year   = {2014}
}

Comments

arXiv admin note: text overlap with arXiv:1003.1795, arXiv:1212.2065 by other authors
