English
Related papers

Related papers: Minimum Volume Topic Modeling

200 papers

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In…

Machine Learning · Computer Science 2016-05-30 Ke Jiang , Suvrit Sra , Brian Kulis

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the…

Machine Learning · Statistics 2016-10-31 Mikhail Yurochkin , XuanLong Nguyen

Latent Dirichlet Allocation (LDA) is a prominent generative probabilistic model used for uncovering abstract topics within document collections. In this paper, we explore the effectiveness of augmenting topic models with Large Language…

Computation and Language · Computer Science 2025-07-14 Mengze Hong , Chen Jason Zhang , Di Jiang

The problem of topic modeling can be seen as a generalization of the clustering problem, in that it posits that observations are generated due to multiple latent factors (e.g., the words in each document are generated as a mixture of…

Machine Learning · Computer Science 2013-01-21 Animashree Anandkumar , Dean P. Foster , Daniel Hsu , Sham M. Kakade , Yi-Kai Liu

In latent Dirichlet allocation (LDA), topics are multinomial distributions over the entire vocabulary. However, the vocabulary usually contains many words that are not relevant in forming the topics. We adopt a variable selection method…

Machine Learning · Computer Science 2012-05-08 Dongwoo Kim , Yeonseung Chung , Alice Oh

Topic modeling, a method for extracting the underlying themes from a collection of documents, is an increasingly important component of the design of intelligent systems enabling the sense-making of highly dynamic and diverse streams of…

Information Retrieval · Computer Science 2019-10-07 Chris Gropp , Alexander Herzog , Ilya Safro , Paul W. Wilson , Amy W. Apon

Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data, text documents. Researchers have published many articles in the field of topic modeling and…

Information Retrieval · Computer Science 2018-12-07 Hamed Jelodar , Yongli Wang , Chi Yuan , Xia Feng , Xiahui Jiang , Yanchao Li , Liang Zhao

Nowadays, data analysis has become a problem as the amount of data is constantly increasing. In order to overcome this problem in textual data, many models and methods are used in natural language processing. The topic modeling field is one…

Computation and Language · Computer Science 2021-10-22 Zekeriya Anil Guven , Banu Diri , Tolgahan Cakaloglu

Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document…

Machine Learning · Computer Science 2025-11-04 Biyi Fang , Truong Vo , Kripa Rajshekhar , Diego Klabjan

Latent Dirichlet Allocation (LDA) mining thematic structure of documents plays an important role in nature language processing and machine learning areas. However, the probability distribution from LDA only describes the statistical…

Computation and Language · Computer Science 2015-06-30 Li-Qiang Niu , Xin-Yu Dai

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics,…

Applications · Statistics 2009-09-29 David M. Blei , John D. Lafferty

The question of how to determine the number of independent latent factors (topics) in mixture models such as Latent Dirichlet Allocation (LDA) is of great practical importance. In most applications, the exact number of topics is unknown,…

Machine Learning · Statistics 2014-01-23 E. D. Gutiérrez

Topic modeling is admittedly a convenient way to monitor markets trend. Conventionally, Latent Dirichlet Allocation, LDA, is considered a must-do model to gain this type of information. By given the merit of deducing keyword with token…

Computation and Language · Computer Science 2023-09-19 Ching-Hsun Tseng , Shin-Jye Lee , Po-Wei Cheng , Chien Lee , Chih-Chieh Hung

We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using Topic Modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods that is predominantly used to generate…

Information Retrieval · Computer Science 2015-07-24 Ashwinkumar Ganesan , Kiante Brantley , Shimei Pan , Jian Chen

With the advent and popularity of big data mining and huge text analysis in modern times, automated text summarization became prominent for extracting and retrieving important information from documents. This research investigates aspects…

Information Retrieval · Computer Science 2023-05-31 Daniel F. O. Onah , Elaine L. L. Pang , Mahmoud El-Haj

This paper proposes a topic modeling method that scales linearly to billions of documents. We make three core contributions: i) we present a topic modeling method, Tensor Latent Dirichlet Allocation (TLDA), that has identifiable and…

Machine Learning · Computer Science 2026-01-14 Sara Kangaslahti , Danny Ebanks , Jean Kossaifi , Anqi Liu , R. Michael Alvarez , Animashree Anandkumar

Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the…

Machine Learning · Statistics 2020-01-08 Iryna Korshunova , Hanchen Xiong , Mateusz Fedoryszak , Lucas Theis

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document collections. Because of the increasing prevalence of large datasets, there is a need to improve the scalability of inference of LDA. In this…

Artificial Intelligence · Computer Science 2011-07-20 Ke Zhai , Jordan Boyd-Graber , Nima Asadi

Topic models are one of the most popular methods for learning representations of text, but a major challenge is that any change to the topic model requires mathematically deriving a new inference algorithm. A promising approach to address…

Machine Learning · Statistics 2017-03-07 Akash Srivastava , Charles Sutton

Latent Dirichlet Allocation (LDA) is a topic model widely used in natural language processing and machine learning. Most approaches to training the model rely on iterative algorithms, which makes it difficult to run LDA on big corpora that…

Machine Learning · Statistics 2020-10-23 Alexander Terenin , Måns Magnusson , Leif Jonsson , David Draper
‹ Prev 1 2 3 10 Next ›