Related papers: A Spectral Algorithm for Latent Dirichlet Allocati…

200 papers

We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using Topic Modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods that is predominantly used to generate…

Information Retrieval · Computer Science 2015-07-24 Ashwinkumar Ganesan, Kiante Brantley, Shimei Pan, Jian Chen

In latent Dirichlet allocation (LDA), topics are multinomial distributions over the entire vocabulary. However, the vocabulary usually contains many words that are not relevant in forming the topics. We adopt a variable selection method…

Machine Learning · Computer Science 2012-05-08 Dongwoo Kim, Yeonseung Chung, Alice Oh

Topic modeling, a method for extracting the underlying themes from a collection of documents, is an increasingly important component of the design of intelligent systems enabling the sense-making of highly dynamic and diverse streams of…

Information Retrieval · Computer Science 2019-10-07 Chris Gropp, Alexander Herzog, Ilya Safro, Paul W. Wilson, Amy W. Apon

Latent Dirichlet Allocation (LDA), which mines the thematic structure of documents, plays an important role in natural language processing and machine learning. However, the probability distribution from LDA only describes the statistical…

Computation and Language · Computer Science 2015-06-30 Li-Qiang Niu, Xin-Yu Dai

The standard LDA model suffers from the problem that the topic assignment of each word is independent, so word correlation is neglected. To address this problem, we propose a model called Word Related Latent Dirichlet Allocation…

Computation and Language · Computer Science 2014-11-11 Xun Wang

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics,…

Applications · Statistics 2009-09-29 David M. Blei, John D. Lafferty

The question of how to determine the number of independent latent factors (topics) in mixture models such as Latent Dirichlet Allocation (LDA) is of great practical importance. In most applications, the exact number of topics is unknown,…

Machine Learning · Statistics 2014-01-23 E. D. Gutiérrez

Distributed dense word vectors have been shown to be effective at capturing token-level semantic and syntactic regularities in language, while topic models can form interpretable representations over documents. In this work, we describe…

Computation and Language · Computer Science 2016-05-09 Christopher E Moody

Topic models, such as Latent Dirichlet Allocation (LDA), posit that documents are drawn from admixtures of distributions over words, known as topics. The inference problem of recovering topics from admixtures is NP-hard. Assuming…

Machine Learning · Statistics 2014-11-05 Trapit Bansal, Chiranjib Bhattacharyya, Ravindran Kannan

Traditionally, Latent Dirichlet Allocation (LDA) ingests words in a collection of documents to discover their latent topics using word-document co-occurrences. However, it is unclear how to achieve the best results for languages without…

Computation and Language · Computer Science 2021-08-25 Jin Cheevaprawatdomrong, Alexandra Schofield, Attapol T. Rutherford

Topic modeling provides useful tools for organizing large text corpora. A widely used method is Latent Dirichlet Allocation (LDA), a generative probabilistic model which models single texts in a collection of texts as mixtures of latent…

Computation and Language · Computer Science 2020-04-02 Jonas Rieger, Lars Koppers, Carsten Jentsch, Jörg Rahnenführer

Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and…

Information Retrieval · Computer Science 2018-12-07 Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao

Topic modeling has found wide application in many problems where latent structures of the data are crucial for typical inference tasks. When applying a topic model, a relatively standard pre-processing step is to first build a vocabulary of…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Yuzhen Ding, Baoxin Li

Supervised topic models simultaneously model the latent topic structure of large collections of documents and a response variable associated with each document. Existing inference methods are based on variational approximation or Monte…

Machine Learning · Computer Science 2016-02-22 Yong Ren, Yining Wang, Jun Zhu

Word clouds have become a standard tool for presenting results of natural language processing methods such as topic modelling. They display the most important words, where word size is often chosen proportional to the relevance of words within a…

Computation · Statistics 2023-02-14 Peter Winker

Social scientists employ latent Dirichlet allocation (LDA) to find highly specific topics in large corpora, but they often struggle in this task because (1) LDA, in general, takes a significant amount of time to fit on large corpora; (2)…

Methodology · Statistics 2025-12-23 Kohei Watanabe

In this paper, we propose guaranteed spectral methods for learning a broad range of topic models, which generalize the popular Latent Dirichlet Allocation (LDA). We overcome the limitation of LDA to incorporate arbitrary topic correlations,…

Machine Learning · Computer Science 2016-11-15 Forough Arabshahi, Animashree Anandkumar

Nowadays, data analysis has become a problem as the amount of data is constantly increasing. In order to overcome this problem in textual data, many models and methods are used in natural language processing. The topic modeling field is one…

Computation and Language · Computer Science 2021-10-22 Zekeriya Anil Guven, Banu Diri, Tolgahan Cakaloglu

As electronically stored data grow in daily life, obtaining novel and relevant information becomes challenging in text mining. Thus people have sought statistical methods based on term frequency, matrix algebra, or topic modeling for text…

Information Retrieval · Computer Science 2019-07-04 Clint P. George, Wei Xia, George Michailidis

Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document…

Machine Learning · Computer Science 2025-11-04 Biyi Fang, Truong Vo, Kripa Rajshekhar, Diego Klabjan