Structuring an unordered text document

Shashank Yadav; Tejas Shimpi; C. Ravindranath Chowdary; Prashant Sharma; Deepansh Agrawal; Shivang Agarwal

Information Retrieval 2019-01-30 v1 Computation and Language

Abstract

Segmenting an unordered text document into sections is useful in many text processing applications, such as multi-document summarization and question answering. This paper proposes structuring an unordered text document based on the keywords in the document. We test our approach on Wikipedia documents using both statistical and predictive methods, such as the TextRank algorithm and Google's Universal Sentence Encoder (USE). Our experimental results show that the proposed model can effectively structure an unordered document into sections.
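
To make the idea of keyword-based structuring concrete, here is a minimal sketch that groups sentences under the keyword they match best. It is not the paper's method: the abstract does not give the algorithm, so plain token overlap is used here as a stand-in for the TextRank and USE similarity scores, and the names `tokenize` and `structure_by_keywords`, as well as the "unassigned" bucket, are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def structure_by_keywords(sentences, keywords):
    """Group sentences under the keyword they share the most tokens with.

    Assumption: token overlap stands in for a real similarity measure
    (for example, embedding similarity). Sentences that share no token
    with any keyword are placed under "unassigned".
    """
    sections = defaultdict(list)
    for sentence in sentences:
        tokens = tokenize(sentence)
        best_keyword, best_score = "unassigned", 0
        for keyword in keywords:
            score = len(tokens & tokenize(keyword))
            # Strict comparison keeps the earliest keyword on ties.
            if score > best_score:
                best_keyword, best_score = keyword, score
        sections[best_keyword].append(sentence)
    return dict(sections)


if __name__ == "__main__":
    demo_sentences = [
        "The river floods every spring.",
        "The city council meets monthly.",
        "Flood defences protect the river banks.",
    ]
    for keyword, group in structure_by_keywords(
        demo_sentences, ["river", "council"]
    ).items():
        print(keyword, group)
```

Swapping the overlap score for a sentence-embedding similarity would keep the same grouping loop while changing only how `score` is computed.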

Keywords

text classification, information retrieval, text summarization

Cite

@article{arxiv.1901.10133,
  title  = {Structuring an unordered text document},
  author = {Shashank Yadav and Tejas Shimpi and C. Ravindranath Chowdary and Prashant Sharma and Deepansh Agrawal and Shivang Agarwal},
  journal= {arXiv preprint arXiv:1901.10133},
  year   = {2019}
}
