Related papers: Learning Document-Level Semantic Properties from F…

Mining the Web for Lexical Knowledge to Improve Keyphrase Extraction: Learning from Labeled and Unlabeled Data

Keyphrases are useful for a variety of purposes, including summarizing, indexing, labeling, categorizing, clustering, highlighting, browsing, and searching. The task of automatic keyphrase extraction is to select keyphrases from within the…

Machine Learning · Computer Science 2007-05-23 Peter D. Turney

Content-Based Quality Estimation for Automatic Subject Indexing of Short Texts under Precision and Recall Constraints

Semantic annotations have to satisfy quality constraints to be useful for digital libraries, which is particularly challenging on large and diverse datasets. Confidence scores of multi-label classification methods typically refer only to…

Information Retrieval · Computer Science 2018-06-08 Martin Toepfer, Christin Seifert

Improving Label Quality by Jointly Modeling Items and Annotators

We propose a fully Bayesian framework for learning ground truth labels from noisy annotators. Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic David and…

Artificial Intelligence · Computer Science 2021-06-22 Tharindu Cyril Weerasooriya, Alexander G. Ororbia, Christopher M. Homan

Keyphrase Annotation with Graph Co-Ranking

Keyphrase annotation is the task of identifying textual units that represent the main content of a document. Keyphrase annotation is either carried out by extracting the most important phrases from a document, keyphrase extraction, or by…

Computation and Language · Computer Science 2016-11-08 Adrien Bougouin, Florian Boudin, Béatrice Daille

Auto-Annotation Quality Prediction for Semi-Supervised Learning with Ensembles

Auto-annotation by ensemble of models is an efficient method of learning on unlabeled data. Wrong or inaccurate annotations generated by the ensemble may lead to performance degradation of the trained model. To deal with this problem we…

Computer Vision and Pattern Recognition · Computer Science 2024-03-14 Dror Simon, Miriam Farber, Roman Goldenberg

Bayesian Methods for Semi-supervised Text Annotation

Human annotations are an important source of information in the development of natural language understanding approaches. As under the pressure of productivity annotators can assign different labels to a given text, the quality of produced…

Computation and Language · Computer Science 2020-10-29 Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

Learning Semantic Correspondence with Sparse Annotations

Finding dense semantic correspondence is a fundamental problem in computer vision, which remains challenging in complex scenes due to background clutter, extreme intra-class variation, and a severe lack of ground truth. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2022-08-18 Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

Discovering Semantic Latent Structures in Psychological Scales: A Response-Free Pathway to Efficient Simplification

Psychological scale refinement traditionally relies on response-based methods such as factor analysis, item response theory, and network psychometrics to optimize item composition. Although rigorous, these approaches require large samples…

Computation and Language · Computer Science 2026-03-10 Bo Wang, Yuxuan Zhang, Yueqin Hu, Hanchao Hou, Kaiping Peng, Shiguang Ni

Semi-Supervised Learning for Neural Keyphrase Generation

We study the problem of generating keyphrases that summarize the key points for a given document. While sequence-to-sequence (seq2seq) models have achieved remarkable performance on this task (Meng et al., 2017), model training often relies…

Computation and Language · Computer Science 2019-09-09 Hai Ye, Lu Wang

Modeling Social Annotation: a Bayesian Approach

Collaborative tagging systems, such as Delicious, CiteULike, and others, allow users to annotate resources, e.g., Web pages or scientific papers, with descriptive labels called tags. The social annotations contributed by thousands of users,…

Artificial Intelligence · Computer Science 2010-05-28 Anon Plangprasopchok, Kristina Lerman

Exploring Latent Semantic Factors to Find Useful Product Reviews

Online reviews provided by consumers are a valuable asset for e-Commerce platforms, influencing potential consumers in making purchasing decisions. However, these reviews are of varying quality, with the useful ones buried deep within a…

Artificial Intelligence · Computer Science 2017-05-09 Subhabrata Mukherjee, Kashyap Popat, Gerhard Weikum

Implicit Knowledge in Argumentative Texts: An Annotated Corpus

When speaking or writing, people omit information that seems clear and evident, such that only part of the message is expressed in words. Especially in argumentative texts it is very common that (important) parts of the argument are implied…

Computation and Language · Computer Science 2019-12-24 Maria Becker, Katharina Korfhage, Anette Frank

Enhancing Automatic Keyphrase Labelling with Text-to-Text Transfer Transformer (T5) Architecture: A Framework for Keyphrase Generation and Filtering

Automatic keyphrase labelling stands for the ability of models to retrieve words or short phrases that adequately describe documents' content. Previous work has put much effort into exploring extractive techniques to address this task;…

Information Retrieval · Computer Science 2024-09-26 Jorge Gabín, M. Eduardo Ares, Javier Parapar

Modeling Loosely Annotated Images with Imagined Annotations

In this paper, we present an approach to learning latent semantic analysis models from loosely annotated images for automatic image annotation and indexing. The given annotation in training images is loose due to: (1) ambiguous…

Information Retrieval · Computer Science 2008-05-30 Hong Tang, Nozha Boujemma, Yunhao Chen

A Joint Learning Approach based on Self-Distillation for Keyphrase Extraction from Scientific Documents

Keyphrase extraction is the task of extracting a small set of phrases that best describe a document. Most existing benchmark datasets for the task typically have limited numbers of annotated documents, making it challenging to train…

Computation and Language · Computer Science 2020-10-26 Tuan Manh Lai, Trung Bui, Doo Soon Kim, Quan Hung Tran

Understanding Archives: Towards New Research Interfaces Relying on the Semantic Annotation of Documents

The digitisation campaigns carried out by libraries and archives in recent years have facilitated access to documents in their collections. However, exploring and exploiting these documents remain difficult tasks due to the sheer quantity…

Digital Libraries · Computer Science 2024-03-29 Nicolas Gutehrlé, Iana Atanassova

Pre-Trained Vision-Language Models as Partial Annotators

Pre-trained vision-language models learn massive data to model unified representations of images and natural languages, which can be widely applied to downstream machine learning tasks. In addition to zero-shot inference, in order to better…

Computer Vision and Pattern Recognition · Computer Science 2024-06-28 Qian-Wei Wang, Yuqiu Xie, Letian Zhang, Zimo Liu, Shu-Tao Xia

An Analysis of the Semantic Annotation Task on the Linked Data Cloud

Semantic annotation, the process of identifying key-phrases in texts and linking them to concepts in a knowledge base, is an important basis for semantic information retrieval and the Semantic Web uptake. Despite the emergence of semantic…

Computation and Language · Computer Science 2018-11-15 Gagnon Michel, Zouaq Amal, Aranha Francisco, Ensan Faezeh, Jean-Louis Ludovic

Discovering Attribute Shades of Meaning with the Crowd

To learn semantic attributes, existing methods typically train one discriminative model for each word in a vocabulary of nameable properties. However, this "one model per word" assumption is problematic: while a word might have a precise…

Computer Vision and Pattern Recognition · Computer Science 2015-05-18 Adriana Kovashka, Kristen Grauman

Patent Sentiment Analysis to Highlight Patent Paragraphs

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps the patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any…

Machine Learning · Computer Science 2021-11-19 Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres