Keyphrase Extraction using Sequential Labeling

Computation and Language | 2016-08-04 | v2 | Artificial Intelligence | Information Retrieval

Abstract

Keyphrases efficiently summarize a document's content and are used in various document processing and retrieval tasks. Several unsupervised techniques and classifiers exist for extracting keyphrases from text documents. Most of these methods operate at the phrase level and rely on part-of-speech (POS) filters for candidate phrase generation. In addition, they do not directly handle keyphrases of varying lengths. In this paper, we overcome these modeling shortcomings by addressing keyphrase extraction as a sequential labeling task. We explore a basic set of features commonly used in NLP tasks, as well as predictions from various unsupervised methods, to train our taggers. In addition to providing a more natural formulation of the keyphrase extraction problem, we show that tagging models yield significant performance benefits over existing state-of-the-art extraction methods.
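To make the sequential labeling framing concrete, the following is a minimal sketch of how a document and its gold keyphrases might be turned into per-token labels, and how a predicted label sequence might be decoded back into phrases of any length. The binary label names `KP` and `O`, the whitespace tokenization, and the helper names `label_tokens` and `decode_phrases` are assumptions made for illustration; the abstract does not specify the paper's label scheme, features, or tagger.

```python
from typing import List, Sequence


def label_tokens(tokens: Sequence[str], keyphrases: Sequence[str]) -> List[str]:
    """Assign one label per token: "KP" if the token lies inside a gold
    keyphrase occurrence, "O" otherwise (hypothetical label names)."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for phrase in keyphrases:
        words = phrase.lower().split()
        n = len(words)
        if n == 0:
            continue
        # Mark every occurrence of the phrase in the token sequence.
        for i in range(len(tokens) - n + 1):
            if lowered[i:i + n] == words:
                labels[i:i + n] = ["KP"] * n
    return labels


def decode_phrases(tokens: Sequence[str], labels: Sequence[str]) -> List[str]:
    """Turn a label sequence back into phrases: each maximal run of "KP"
    tokens becomes one phrase, so no fixed phrase length is needed."""
    phrases: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "KP":
            current.append(token)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


# Toy example (invented sentence, not data from the paper).
tokens = "We study keyphrase extraction with conditional random fields".split()
gold = ["keyphrase extraction", "conditional random fields"]
labels = label_tokens(tokens, gold)
recovered = decode_phrases(tokens, labels)
```

In this framing, a tagger would be trained to predict `labels` from token-level features, and decoding yields phrases without a POS-based candidate filter. One limitation of the binary scheme assumed here is that two keyphrases that are directly adjacent in the text decode as a single phrase; a begin/inside/outside scheme would avoid that.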

Cite

@article{arxiv.1608.00329,
  title  = {Keyphrase Extraction using Sequential Labeling},
  author = {Sujatha Das Gollapalli and Xiao-li Li},
  journal= {arXiv preprint arXiv:1608.00329},
  year   = {2016}
}

Comments

10 pages including 2 pages of references, 6 figures