Related papers: Evaluating topic coherence measures

Modeling Topical Coherence in Discourse without Supervision

Coherence of text is an important attribute to be measured for both manually and automatically generated discourse; but well-defined quantitative metrics for it are still elusive. In this paper, we present a metric for scoring topical…

Computation and Language · Computer Science 2018-09-05 Disha Shrivastava, Abhijit Mishra, Karthik Sankaranarayanan

How to Find Strong Summary Coherence Measures? A Toolbox and a Comparative Study for Summary Coherence Measure Evaluation

Automatically evaluating the coherence of summaries is of great significance both to enable cost-efficient summarizer evaluation and as a tool for improving coherence by selecting high-scoring candidate summaries. While many different…

Computation and Language · Computer Science 2022-09-16 Julius Steen, Katja Markert

Enhance Topics Analysis based on Keywords Properties

Topic Modelling is one of the most prevalent text analysis techniques used to explore and retrieve collections of documents. The evaluation of topic model algorithms is still a very challenging task due to the absence of gold-standard…

Information Retrieval · Computer Science 2022-03-10 Antonio Penta

Objectifying the Subjective: Cognitive Biases in Topic Interpretations

Interpretation of topics is crucial for their downstream applications. State-of-the-art evaluation measures of topic quality such as coherence and word intrusion do not measure how much a topic facilitates the exploration of a corpus. To…

Computation and Language · Computer Science 2025-07-28 Swapnil Hingmire, Ze Shi Li, Shiyu Zeng, Ahmed Musa Awon, Luiz Franciscatto Guerra, Neil Ernst

Hierarchical Re-estimation of Topic Models for Measuring Topical Diversity

A high degree of topical diversity is often considered to be an important characteristic of interesting text documents. A recent proposal for measuring topical diversity identifies three elements for assessing diversity: words, topics, and…

Information Retrieval · Computer Science 2017-01-17 Hosein Azarbonyad, Mostafa Dehghani, Tom Kenter, Maarten Marx, Jaap Kamps, Maarten de Rijke

A Comprehensive Comparative Study of Word and Sentence Similarity Measures

Sentence similarity is considered the basis of many natural language tasks such as information retrieval, question answering and text summarization. The semantic meaning between compared text fragments is based on the words semantic…

Information Retrieval · Computer Science 2016-10-17 Issa Atoum, Ahmed Otoom, Narayanan Kulathuramaiyer

Automatic Evaluation of Local Topic Quality

Topic models are typically evaluated with respect to the global topic distributions that they generate, using metrics such as coherence, but without regard to local (token-level) topic assignments. Token-level assignments are important for…

Information Retrieval · Computer Science 2019-05-31 Jeffrey Lund, Piper Armstrong, Wilson Fearn, Stephen Cowley, Courtni Byun, Jordan Boyd-Graber, Kevin Seppi

Estimating coherence measures from limited experimental data available

Quantifying coherence has received increasing attention, and considerable work has been directed towards finding coherence measures. While various coherence measures have been proposed in theory, an important issue following is how to…

Quantum Physics · Physics 2018-05-02 Da-Jian Zhang, C. L. Liu, Xiao-Dong Yu, D. M. Tong

A modified model for topic detection from a corpus and a new metric evaluating the understandability of topics

This paper presents a modified neural model for topic detection from a corpus and proposes a new metric to evaluate the detected topics. The new model builds upon the embedded topic model incorporating some modifications such as document…

Computation and Language · Computer Science 2023-06-09 Tomoya Kitano, Yuto Miyatake, Daisuke Furihata

Topic Modeling based on Keywords and Context

Current topic models often suffer from discovering topics not matching human intuition, unnatural switching of topics within documents and high computational demands. We address these concerns by proposing a topic model and an inference…

Computation and Language · Computer Science 2018-02-06 Johannes Schneider

Optimized Tracking of Topic Evolution

Topic evolution modeling has been researched for a long time and has gained considerable interest. A state-of-the-art method has been recently using word modeling algorithms in combination with community detection mechanisms to achieve…

Computation and Language · Computer Science 2019-12-17 Patrick Kiss, Elaheh Momeni

Quantifying Coherence

We introduce a rigorous framework for the quantification of coherence and identify intuitive and easily computable measures of coherence. We achieve this by adopting the viewpoint of coherence as a physical resource. By determining defining…

Quantum Physics · Physics 2014-10-07 T. Baumgratz, M. Cramer, M. B. Plenio

Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…

Machine Learning · Computer Science 2022-03-16 Dongsheng Wang, Dandan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou

Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret.…

Computation and Language · Computer Science 2021-06-18 Federico Bianchi, Silvia Terragni, Dirk Hovy

Scientometrics: Untangling the topics

Measuring science is based on comparing articles to similar others. However, keyword-based groups of thematically similar articles are dominantly small. These small sizes keep the statistical errors of comparisons high. With the growing…

Digital Libraries · Computer Science 2014-11-13 Adam Szanto-Varnagy, Peter Pollner, Tamas Vicsek, Illes J. Farkas

Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence

Topic model evaluation, like evaluation of other unsupervised methods, can be contentious. However, the field has coalesced around automated estimates of topic coherence, which rely on the frequency of word co-occurrences in a reference…

Computation and Language · Computer Science 2021-10-29 Alexander Hoyle, Pranav Goel, Denis Peskov, Andrew Hian-Cheong, Jordan Boyd-Graber, Philip Resnik

Coherence-Aware Neural Topic Modeling

Topic models are evaluated based on their ability to describe documents well (i.e. low perplexity) and to produce topics that carry coherent semantic meaning. In topic modeling so far, perplexity is a direct optimization target. However,…

Computation and Language · Computer Science 2018-09-11 Ran Ding, Ramesh Nallapati, Bing Xiang

Topic Modelling: Going Beyond Token Outputs

Topic modelling is a text mining technique for identifying salient themes from a number of documents. The output is commonly a set of topics consisting of isolated tokens that often co-occur in such documents. Manual effort is often…

Computation and Language · Computer Science 2024-04-26 Lowri Williams, Eirini Anthi, Laura Arman, Pete Burnap

A Topic Coverage Approach to Evaluation of Topic Models

Topic models are widely used unsupervised models capable of learning topics - weighted lists of words and documents - from large collections of text documents. When topic models are used for discovery of topics in text collections, a…

Information Retrieval · Computer Science 2021-09-03 Damir Korenčić, Strahil Ristov, Jelena Repar, Jan Šnajder

Measuring Semantic Coherence of a Conversation

Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation.…

Computation and Language · Computer Science 2018-06-19 Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, Axel Polleres