Related papers: Embedded Topic Models Enhanced by Wikification

200 papers

Topics generated by topic models are typically represented as lists of terms. To reduce the cognitive overhead of interpreting these topics for end-users, we propose labelling a topic with a succinct phrase that summarises its theme or idea.…

Computation and Language · Computer Science 2016-12-26 Shraey Bhatia, Jey Han Lau, Timothy Baldwin

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…

Information Retrieval · Computer Science 2019-07-12 Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret.…

Computation and Language · Computer Science 2021-06-18 Federico Bianchi, Silvia Terragni, Dirk Hovy

Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way…

Computation and Language · Computer Science 2020-10-08 Suzanna Sia, Ayush Dalmia, Sabrina J. Mielke

Topic modeling analyzes documents to learn meaningful patterns of words. For documents collected in sequence, dynamic topic models capture how these patterns vary over time. We develop the dynamic embedded topic model (D-ETM), a generative…

Computation and Language · Computer Science 2019-10-14 Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

Topic models and all their variants analyse text by learning meaningful representations through word co-occurrences. As pointed out by Williamson et al. (2010), such models implicitly assume that the probability of a topic to be active and…

Computation and Language · Computer Science 2023-01-27 Kostadin Cvejoski, Ramsés J. Sánchez, César Ojeda

Topic models aim to reveal latent structures within a corpus of text, typically through the use of term-frequency statistics over bag-of-words representations from documents. In recent years, conceptual entities -- interpretable,…

Computation and Language · Computer Science 2024-08-27 Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya

Topic modeling is a powerful technique to discover hidden topics and patterns within a collection of documents without prior knowledge. Traditional topic modeling and clustering-based techniques encounter challenges in capturing contextual…

Computation and Language · Computer Science 2024-10-04 Melkamu Abay Mersha, Mesay Gemeda Yigezu, Jugal Kalita

Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, there emerged a new and increasingly popular research area, neural topic models, with over a hundred…

Machine Learning · Computer Science 2021-03-02 He Zhao, Dinh Phung, Viet Huynh, Yuan Jin, Lan Du, Wray Buntine

By illuminating latent structures in a corpus of text, topic models are an essential tool for categorizing, summarizing, and exploring large collections of documents. Probabilistic topic models, such as latent Dirichlet allocation (LDA),…

Information Retrieval · Computer Science 2021-12-07 Bahareh Harandizadeh, J. Hunter Priniski, Fred Morstatter

Current topic models often suffer from discovering topics not matching human intuition, unnatural switching of topics within documents and high computational demands. We address these concerns by proposing a topic model and an inference…

Computation and Language · Computer Science 2018-02-06 Johannes Schneider

Topic models are frequently used in machine learning owing to their high interpretability and modular structure. However, extending a topic model to include a supervisory signal, to incorporate pre-trained word embedding vectors and to…

Machine Learning · Statistics 2019-09-17 Ryohei Hisano

We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling. Although embedded topic models (ETMs) and their variants have achieved promising performance in text analysis, they mainly focus…

Computation and Language · Computer Science 2022-09-29 Dongsheng Wang, Yishi Xu, Miaoge Li, Zhibin Duan, Chaojie Wang, Bo Chen, Mingyuan Zhou

Inferring topics from the overwhelming amount of short texts has become a critical but challenging problem for many content analysis tasks, such as content characterization, user interest profiling, and emerging topic detection. Existing methods such…

Computation and Language · Computer Science 2016-09-28 Jipeng Qiang, Ping Chen, Tong Wang, Xindong Wu

Topic modeling is a widely used approach for analyzing and exploring large document collections. Recent research efforts have incorporated pre-trained contextualized language models, such as BERT embeddings, into topic modeling. However,…

Computation and Language · Computer Science 2025-02-18 Suman Adhya, Debarshi Kumar Sanyal

We investigate ways in which to improve the interpretability of LDA topic models by better analyzing and visualizing their outputs. We focus on examining what we refer to as topic similarity networks: graphs in which nodes represent latent…

Computation and Language · Computer Science 2014-09-29 Arun S. Maiya, Robert M. Rolfe

Topic models are often used to identify human-interpretable topics to help make sense of large document collections. We use knowledge distillation to combine the best attributes of probabilistic topic models and pretrained transformers. Our…

Computation and Language · Computer Science 2020-10-07 Alexander Hoyle, Pranav Goel, Philip Resnik

Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks. Generative topic models infer topic-word distributions, taking…

Information Retrieval · Computer Science 2018-08-14 Pankaj Gupta, Florian Buettner, Hinrich Schütze

Embedding-based neural topic models can explicitly represent words and topics by embedding them in a homogeneous feature space, which yields higher interpretability. However, there are no explicit constraints for the training of…

Computation and Language · Computer Science 2022-06-17 Wei Shao, Lei Huang, Shuqi Liu, Shihua Ma, Linqi Song

In recent years, Pre-trained Language Models (PLMs) have shown their superiority by pre-training on unstructured text corpora and then fine-tuning on downstream tasks. On entity-rich textual resources like Wikipedia, Knowledge-Enhanced PLMs…

Computation and Language · Computer Science 2023-05-04 Yichuan Li, Jialong Han, Kyumin Lee, Chengyuan Ma, Benjamin Yao, Derek Liu