Related papers: Probit Normal Correlated Topic Models

200 papers

Topic models are popular statistical tools for detecting latent semantic topics in a text corpus. They have been utilized in various applications across different fields. However, traditional topic models have some limitations, including…

Computation and Language · Computer Science 2023-10-10 Pritom Saha Akash, Trisha Das, Kevin Chen-Chuan Chang

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics,…

Applications · Statistics 2009-09-29 David M. Blei, John D. Lafferty

Probabilistic topic models like Latent Dirichlet Allocation (LDA) have been previously extended to the bilingual setting. A fundamental modeling assumption in several of these extensions is that the input corpora are in the form of document…

Computation and Language · Computer Science 2021-12-01 Georgios Balikas, Massih-Reza Amini, Marianne Clausel

Traditional Relational Topic Models provide a way to discover the hidden topics from a document network. Many theoretical and practical tasks, such as dimensional reduction, document clustering, and link prediction, benefit from this revealed…

Machine Learning · Statistics 2015-03-31 Junyu Xuan, Jie Lu, Guangquan Zhang, Richard Yi Da Xu, Xiangfeng Luo

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge…

Information Retrieval · Computer Science 2015-07-20 Samuel Rönnqvist

An important aspect of text mining involves information retrieval in the form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) or Latent Semantic…

Machine Learning · Computer Science 2025-11-04 Satyajeet Sahoo, Jhareswar Maiti

The abundant sequential documents such as online archival, social media and news feeds are streamingly updated, where each chunk of documents is incorporated with smoothly evolving yet dependent topics. Such digital texts have attracted…

Information Retrieval · Computer Science 2021-06-28 Jinjin Guo, Longbing Cao, Zhiguo Gong

Topic models provide a useful tool to organize and understand the structure of large corpora of text documents, in particular, to discover hidden thematic structure. Clustering documents from big unstructured corpora into topics is an…

Statistics Theory · Mathematics 2021-07-09 Olga Klopp, Maxim Panov, Suzanne Sigalla, Alexandre Tsybakov

Dynamic topic modeling is widely used to analyze evolving trends in scientific literature, medical records, and social media. Traditional topic models represent each topic through a single probability vector on the multinomial simplex and…

Machine Learning · Computer Science 2026-05-28 Hanjia Gao, Hanwen Ye, Qing Nie, Annie Qu

A popular approach to topic modeling involves extracting co-occurring n-grams of a corpus into semantic themes. The set of n-grams in a theme represents an underlying topic, but most topic modeling approaches are not able to label these…

Computation and Language · Computer Science 2017-05-19 Justin Wood, Patrick Tan, Wei Wang, Corey Arnold

This study introduces Bidirectional Topic Matching (BTM), a novel method for cross-corpus topic modeling that quantifies thematic overlap and divergence between corpora. BTM is a flexible framework that can incorporate various topic…

Computation and Language · Computer Science 2024-12-25 Raven Adam, Marie Lisa Kogler

Correlated topic modeling has been limited to small model and problem sizes due to its high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations…

Machine Learning · Computer Science 2017-07-04 Junxian He, Zhiting Hu, Taylor Berg-Kirkpatrick, Ying Huang, Eric P. Xing

Topic modeling is a well-established technique for exploring text corpora. Conventional topic models (e.g., LDA) represent topics as bags of words that often require "reading the tea leaves" to interpret; additionally, they offer users…

Computation and Language · Computer Science 2024-04-03 Chau Minh Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer

Topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus. It has been increasingly widely adopted as a tool in the social sciences, including political science, digital humanities and sociological…

Information Retrieval · Computer Science 2022-01-12 Zheng Fang, Yulan He, Rob Procter

Most topic models are constructed under the assumption that documents follow a multinomial distribution. The Poisson distribution is an alternative distribution to describe the probability of count data. For topic modelling, the Poisson…

Computation and Language · Computer Science 2020-04-27 Jocelyn Mazarura, Alta de Waal, Pieter de Villiers

Recent years have witnessed a surge of interest in using neural topic models for automatic topic extraction from text, since they avoid the complicated mathematical derivations for model inference as in traditional topic models such as…

Computation and Language · Computer Science 2020-04-28 Rui Wang, Xuemeng Hu, Deyu Zhou, Yulan He, Yuxuan Xiong, Chenchen Ye, Haiyang Xu

As electronically stored data grow in daily life, obtaining novel and relevant information becomes challenging in text mining. Thus people have sought statistical methods based on term frequency, matrix algebra, or topic modeling for text…

Information Retrieval · Computer Science 2019-07-04 Clint P. George, Wei Xia, George Michailidis

We address two challenges in topic models: (1) Context information around words helps in determining their actual meaning, e.g., "networks" used in the contexts "artificial neural networks" vs. "biological neuron networks". Generative topic…

Computation and Language · Computer Science 2019-01-16 Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze

Classic Topic Models are built under the Bag Of Words assumption, in which word position is ignored for simplicity. Besides, symmetric priors are typically used in most applications. In order to easily learn topics with different properties…

Computation and Language · Computer Science 2018-06-27 Simón Roca-Sotelo, Jerónimo Arenas-García

We proposed a novel multilayer correlated topic model (MCTM) to analyze how the main ideas inherit and vary between a document and its different segments, which helps understand an article's structure. The variational…

Information Retrieval · Computer Science 2021-01-07 Ye Tian