English
Related papers

Related papers: Generalized Topic Modeling

200 papers

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document…

Machine Learning · Computer Science 2022-03-16 Dongsheng Wang , Dandan Guo , He Zhao , Huangjie Zheng , Korawat Tanwisuth , Bo Chen , Mingyuan Zhou

Word embedding maps words into a low-dimensional continuous embedding space by exploiting the local word collocation patterns in a small context window. On the other hand, topic modeling maps documents onto a low-dimensional topic space, by…

Computation and Language · Computer Science 2016-08-09 Shaohua Li , Tat-Seng Chua , Jun Zhu , Chunyan Miao

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…

Information Retrieval · Computer Science 2019-07-12 Adji B. Dieng , Francisco J. R. Ruiz , David M. Blei

Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks. Generative topic models infer topic-word distributions, taking…

Information Retrieval · Computer Science 2018-08-14 Pankaj Gupta , Florian Buettner , Hinrich Schütze

Probabilistic topic models are a powerful tool for extracting latent themes from large text datasets. In many text datasets, we also observe per-document covariates (e.g., source, style, political affiliation) that act as environments that…

Computation and Language · Computer Science 2024-11-04 Dominic Sobhani , Amir Feder , David Blei

Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collection such as the Internet, there has also been great interest…

Information Retrieval · Computer Science 2012-06-18 Amit Gruber , Michal Rosen-Zvi , Yair Weiss

Topic Modeling is an approach used for automatic comprehension and classification of data in a variety of settings, and perhaps the canonical application is in uncovering thematic structure in a corpus of documents. A number of foundational…

Machine Learning · Computer Science 2012-04-13 Sanjeev Arora , Rong Ge , Ankur Moitra

Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way…

Computation and Language · Computer Science 2020-10-08 Suzanna Sia , Ayush Dalmia , Sabrina J. Mielke

Classic Topic Models are built under the Bag Of Words assumption, in which word position is ignored for simplicity. Besides, symmetric priors are typically used in most applications. In order to easily learn topics with different properties…

Computation and Language · Computer Science 2018-06-27 Simón Roca-Sotelo , Jerónimo Arenas-García

Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can potentially discover a broad range of themes in a data set,…

Artificial Intelligence · Computer Science 2008-08-08 Chaitanya Chemudugunta , Padhraic Smyth , Mark Steyvers

Topic modeling is used for discovering latent semantic structure, usually referred to as topics, in a large collection of documents. The most widely used methods are Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis.…

Computation and Language · Computer Science 2020-08-24 Dimo Angelov

Most topic models are constructed under the assumption that documents follow a multinomial distribution. The Poisson distribution is an alternative distribution to describe the probability of count data. For topic modelling, the Poisson…

Computation and Language · Computer Science 2020-04-27 Jocelyn Mazarura , Alta de Waal , Pieter de Villiers

Topic modeling analyzes a collection of documents to learn meaningful patterns of words. However, previous topic models consider only the spelling of words and do not take into consideration the homography of words. In this study, we…

Computation and Language · Computer Science 2024-10-04 Takashi Shibuya , Takehito Utsuro

Recently, topic modeling has been widely used to discover the abstract topics in text corpora. Most of the existing topic models are based on the assumption of three-layer hierarchical Bayesian structure, i.e. each document is modeled as a…

Computation and Language · Computer Science 2017-04-10 Yi-Kun Tang , Xian-Ling Mao , Heyan Huang , Guihua Wen

Certain type of documents such as tweets are collected by specifying a set of keywords. As topics of interest change with time it is beneficial to adjust keywords dynamically. The challenge is that these need to be specified ahead of…

Machine Learning · Statistics 2020-01-23 Xingyu Wang , Lida Zhang , Diego Klabjan

Current topic models often suffer from discovering topics not matching human intuition, unnatural switching of topics within documents and high computational demands. We address these concerns by proposing a topic model and an inference…

Computation and Language · Computer Science 2018-02-06 Johannes Schneider

An important aspect of text mining involves information retrieval in form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) or Latent Semantic…

Machine Learning · Computer Science 2025-11-04 Satyajeet Sahoo , Jhareswar Maiti

Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret.…

Computation and Language · Computer Science 2021-06-18 Federico Bianchi , Silvia Terragni , Dirk Hovy

Distributed word representations have been demonstrated to be effective in capturing semantic and syntactic regularities. Unsupervised representation learning from large unlabeled corpora can learn similar representations for those words…

Computation and Language · Computer Science 2015-12-01 Chunting Zhou , Chonglin Sun , Zhiyuan Liu , Francis C. M. Lau

Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, there emerged a new and increasingly popular research area, neural topic models, with over a hundred…

Machine Learning · Computer Science 2021-03-02 He Zhao , Dinh Phung , Viet Huynh , Yuan Jin , Lan Du , Wray Buntine
‹ Prev 1 2 3 10 Next ›