Related papers: Conceptualization Topic Modeling

Text Modeling using Unsupervised Topic Models and Concept Hierarchies

Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can potentially discover a broad range of themes in a data set,…

Artificial Intelligence · Computer Science 2008-08-08 Chaitanya Chemudugunta , Padhraic Smyth , Mark Steyvers

Unveiling the semantic structure of text documents using paragraph-aware Topic Models

Classic Topic Models are built under the Bag Of Words assumption, in which word position is ignored for simplicity. Besides, symmetric priors are typically used in most applications. In order to easily learn topics with different properties…

Computation and Language · Computer Science 2018-06-27 Simón Roca-Sotelo , Jerónimo Arenas-García

A new evaluation framework for topic modeling algorithms based on synthetic corpora

Topic models are in widespread use in natural language processing and beyond. Here, we propose a new framework for the evaluation of probabilistic topic modeling algorithms based on synthetic corpora containing an unambiguously defined…

Computation and Language · Computer Science 2019-01-29 Hanyu Shi , Martin Gerlach , Isabel Diersen , Doug Downey , Luis A. N. Amaral

Dirichlet belief networks for topic structure learning

Recently, considerable research effort has been devoted to developing deep architectures for topic models to learn topic structures. Although several deep models have been proposed to learn better topic proportions of documents, how to…

Information Retrieval · Computer Science 2018-11-05 He Zhao , Lan Du , Wray Buntine , Mingyuan Zhou

Learning Concept Hierarchies through Probabilistic Topic Modeling

With the advent of semantic web, various tools and techniques have been introduced for presenting and organizing knowledge. Concept hierarchies are one such technique which gained significant attention due to its usefulness in creating…

Artificial Intelligence · Computer Science 2016-11-30 V. S. Anoop , S. Asharaf , P. Deepak

Knowledge-Aware Bayesian Deep Topic Model

We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling. Although embedded topic models (ETMs) and its variants have gained promising performance in text analysis, they mainly focus…

Computation and Language · Computer Science 2022-09-29 Dongsheng Wang , Yishi Xu , Miaoge Li , Zhibin Duan , Chaojie Wang , Bo Chen , Mingyuan Zhou

A Bayesian approach to modeling topic-metadata relationships

The objective of advanced topic modeling is not only to explore latent topical structures, but also to estimate relationships between the discovered topics and theoretically relevant metadata. Methods used to estimate such relationships…

Computation and Language · Computer Science 2025-04-29 P. Schulze , S. Wiegrebe , P. W. Thurner , C. Heumann , M. Aßenmacher

A Query-Driven Topic Model

Topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus. It has been increasingly widely adopted as a tool in the social sciences, including political science, digital humanities and sociological…

Information Retrieval · Computer Science 2022-01-12 Zheng Fang , Yulan He , Rob Procter

Exploratory topic modeling with distributional semantics

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge…

Information Retrieval · Computer Science 2015-07-20 Samuel Rönnqvist

Keyword-based Topic Modeling and Keyword Selection

Certain type of documents such as tweets are collected by specifying a set of keywords. As topics of interest change with time it is beneficial to adjust keywords dynamically. The challenge is that these need to be specified ahead of…

Machine Learning · Statistics 2020-01-23 Xingyu Wang , Lida Zhang , Diego Klabjan

Tired of Topic Models? Clusters of Pretrained Word Embeddings Make for Fast and Good Topics too!

Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way…

Computation and Language · Computer Science 2020-10-08 Suzanna Sia , Ayush Dalmia , Sabrina J. Mielke

Discovering topics with neural topic models built from PLSA assumptions

In this paper we present a model for unsupervised topic discovery in texts corpora. The proposed model uses documents, words, and topics lookup table embedding as neural network model parameters to build probabilities of words given topics,…

Computation and Language · Computer Science 2019-11-26 Sileye 0. Ba

An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models

Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features. While many multilingual topic models have been developed, their assumptions on the…

Computation and Language · Computer Science 2019-06-11 Shudong Hao , Michael J. Paul

Hierarchical Topic Presence Models

Topic models analyze text from a set of documents. Documents are modeled as a mixture of topics, with topics defined as probability distributions on words. Inferences of interest include the most probable topics and characterization of a…

Information Retrieval · Computer Science 2021-04-19 Jason Wang , Robert E. Weiss

Generalized Topic Modeling

Recently there has been significant activity in developing algorithms with provable guarantees for topic modeling. In standard topic models, a topic (such as sports, business, or politics) is viewed as a probability distribution $\vec a_i$…

Machine Learning · Computer Science 2016-11-07 Avrim Blum , Nika Haghtalab

Bilingual Topic Models for Comparable Corpora

Probabilistic topic models like Latent Dirichlet Allocation (LDA) have been previously extended to the bilingual setting. A fundamental modeling assumption in several of these extensions is that the input corpora are in the form of document…

Computation and Language · Computer Science 2021-12-01 Georgios Balikas , Massih-Reza Amini , Marianne Clausel

Visualizing Topics with Multi-Word Expressions

We describe a new method for visualizing topics, the distributions over terms that are automatically extracted from large text corpora using latent variable models. Our method finds significant $n$-grams related to a topic, which are then…

Machine Learning · Statistics 2009-07-07 David M. Blei , John D. Lafferty

Combining Thesaurus Knowledge and Probabilistic Topic Models

In this paper we present the approach of introducing thesaurus knowledge into probabilistic topic models. The main idea of the approach is based on the assumption that the frequencies of semantically related words and phrases, which are met…

Computation and Language · Computer Science 2017-08-01 Natalia Loukachevitch , Michael Nokel , Kirill Ivanov

Probabilistic Modelling of Morphologically Rich Languages

This thesis investigates how the sub-structure of words can be accounted for in probabilistic models of language. Such models play an important role in natural language processing tasks such as translation or speech recognition, but often…

Computation and Language · Computer Science 2015-08-19 Jan A. Botha

Explainable and Discourse Topic-aware Neural Language Understanding

Marrying topic models and language models exposes language understanding to a broader source of document-level context beyond sentences via topics. While introducing topical semantics in language models, existing approaches incorporate…

Computation and Language · Computer Science 2023-06-28 Yatin Chaudhary , Hinrich Schütze , Pankaj Gupta