English
Related papers

Related papers: Combinatorial Topic Models using Small-Variance As…

200 papers

In this paper, we propose guaranteed spectral methods for learning a broad range of topic models, which generalize the popular Latent Dirichlet Allocation (LDA). We overcome the limitation of LDA to incorporate arbitrary topic correlations,…

Machine Learning · Computer Science 2016-11-15 Forough Arabshahi , Animashree Anandkumar

Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents. However, classic topic modelling approaches (e.g., LDA) have certain…

Computation and Language · Computer Science 2024-03-27 Yida Mu , Chun Dong , Kalina Bontcheva , Xingyi Song

This article describes posterior maximization for topic models, identifying computational and conceptual gains from inference under a non-standard parametrization. We then show that fitted parameters can be used as the basis for a novel…

Applications · Statistics 2011-12-30 Matthew A. Taddy

Word clouds became a standard tool for presenting results of natural language processing methods such as topic modelling. They exhibit most important words, where word size is often chosen proportional to the relevance of words within a…

Computation · Statistics 2023-02-14 Peter Winker

As the emergence and the thriving development of social networks, a huge number of short texts are accumulated and need to be processed. Inferring latent topics of collected short texts is useful for understanding its hidden structure and…

Machine Learning · Statistics 2018-04-04 Zhenghang Cui , Issei Sato , Masashi Sugiyama

An important aspect of text mining involves information retrieval in form of discovery of semantic themes (topics) from documents using topic modelling. While generative topic models like Latent Dirichlet Allocation (LDA) or Latent Semantic…

Machine Learning · Computer Science 2025-11-04 Satyajeet Sahoo , Jhareswar Maiti

The tremendous growth of social media content on the Internet has inspired the development of the text analytics to understand and solve real-life problems. Leveraging statistical topic modelling helps researchers and practitioners in…

Social and Information Networks · Computer Science 2016-08-09 Marina Sokolova , Kanyi Huang , Stan Matwin , Joshua Ramisch , Vera Sazonova , Renee Black , Chris Orwa , Sidney Ochieng , Nanjira Sambuli

Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic…

Information Retrieval · Computer Science 2019-07-12 Adji B. Dieng , Francisco J. R. Ruiz , David M. Blei

By illuminating latent structures in a corpus of text, topic models are an essential tool for categorizing, summarizing, and exploring large collections of documents. Probabilistic topic models, such as latent Dirichlet allocation (LDA),…

Information Retrieval · Computer Science 2021-12-07 Bahareh Harandizadeh , J. Hunter Priniski , Fred Morstatter

Supervised topic models utilize document's side information for discovering predictive low dimensional representations of documents. Existing models apply the likelihood-based estimation. In this paper, we present a general framework of…

Machine Learning · Statistics 2013-04-09 Jun Zhu , Amr Ahmed , Eric P. Xing

Latent Dirichlet analysis, or topic modeling, is a flexible latent variable framework for modeling high-dimensional sparse count data. Various learning algorithms have been developed in recent years, including collapsed Gibbs sampling,…

Machine Learning · Computer Science 2012-05-14 Arthur Asuncion , Max Welling , Padhraic Smyth , Yee Whye Teh

Using nonparametric methods has been increasingly explored in Bayesian hierarchical modeling as a way to increase model flexibility. Although the field shows a lot of promise, inference in many models, including Hierachical Dirichlet…

Machine Learning · Statistics 2015-01-19 Alexander Spangher

Distributed dense word vectors have been shown to be effective at capturing token-level semantic and syntactic regularities in language, while topic models can form interpretable representations over documents. In this work, we describe…

Computation and Language · Computer Science 2016-05-09 Christopher E Moody

Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals:…

Machine Learning · Computer Science 2017-12-05 Michael C. Hughes , Gabriel Hope , Leah Weiner , Thomas H. McCoy , Roy H. Perlis , Erik B. Sudderth , Finale Doshi-Velez

Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). By now, their training is implemented on general purpose computers (GPCs),…

Machine Learning · Computer Science 2018-04-11 Zihao Xiao , Jianfei Chen , Jun Zhu

Latent Dirichlet Allocation (LDA) is a foundational model for discovering latent thematic structure in discrete data, but its Dirichlet prior cannot represent the rich correlations and hierarchical relationships often present among topics.…

Machine Learning · Computer Science 2026-02-24 Zheng Wang , Nizar Bouguila

Latent Dirichlet Allocation (LDA) is a probabilistic model used to uncover latent topics in a corpus of documents. Inference is often performed using variational Bayes (VB) algorithms, which calculate a lower bound to the posterior…

Machine Learning · Computer Science 2022-08-26 Rebecca M. C. Taylor , Dirko Coetsee , Johan A. du Preez

Originally designed to model text, topic modeling has become a powerful tool for uncovering latent structure in domains including medicine, finance, and vision. The goals for the model vary depending on the application: in some cases, the…

Machine Learning · Statistics 2014-11-24 Finale Doshi-Velez , Byron Wallace , Ryan Adams

To solve the big topic modeling problem, we need to reduce both time and space complexities of batch latent Dirichlet allocation (LDA) algorithms. Although parallel LDA algorithms on the multi-processor architecture have low time and space…

Machine Learning · Computer Science 2013-11-19 Jian-Feng Yan , Jia Zeng , Zhi-Qiang Liu , Yang Gao

Topic models provide a useful text-mining tool for learning, extracting, and discovering latent structures in large text corpora. Although a plethora of methods have been proposed for topic modeling, lacking in the literature is a formal…

Machine Learning · Statistics 2022-08-12 Yinyin Chen , Shishuang He , Yun Yang , Feng Liang