English
Related papers

Related papers: A Scalable Asynchronous Distributed Algorithm for …

200 papers

Topic modeling, a method for extracting the underlying themes from a collection of documents, is an increasingly important component of the design of intelligent systems enabling the sense-making of highly dynamic and diverse streams of…

Information Retrieval · Computer Science 2019-10-07 Chris Gropp , Alexander Herzog , Ilya Safro , Paul W. Wilson , Amy W. Apon

This paper presents an intertemporal bimodal network to analyze the evolution of the semantic content of a scientific field within the framework of topic modeling, namely using the Latent Dirichlet Allocation (LDA). The main contribution is…

Computation and Language · Computer Science 2020-02-13 Luigi Di Caro , Marco Guerzoni , Massimiliano Nuccio , Giovanni Siragusa

We introduce the author-topic model, a generative model for documents that extends Latent Dirichlet Allocation (LDA; Blei, Ng, & Jordan, 2003) to include authorship information. Each author is associated with a multinomial distribution over…

Information Retrieval · Computer Science 2012-07-19 Michal Rosen-Zvi , Thomas Griffiths , Mark Steyvers , Padhraic Smyth

Social scientists employ latent Dirichlet allocation (LDA) to find highly specific topics in large corpora, but they often struggle in this task because (1) LDA, in general, takes a significant amount of time to fit on large corpora; (2)…

Methodology · Statistics 2025-12-23 Kohei Watanabe

This paper proposes a topic modeling method that scales linearly to billions of documents. We make three core contributions: i) we present a topic modeling method, Tensor Latent Dirichlet Allocation (TLDA), that has identifiable and…

Machine Learning · Computer Science 2026-01-14 Sara Kangaslahti , Danny Ebanks , Jean Kossaifi , Anqi Liu , R. Michael Alvarez , Animashree Anandkumar

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document collections. Because of the increasing prevalence of large datasets, there is a need to improve the scalability of inference of LDA. In this…

Artificial Intelligence · Computer Science 2011-07-20 Ke Zhai , Jordan Boyd-Graber , Nima Asadi

This paper presents an algorithmic family of dynamic topic models called Aligned Neural Topic Models (ANTM), which combine novel data mining algorithms to provide a modular framework for discovering evolving topics. ANTM maintains the…

Information Retrieval · Computer Science 2023-06-06 Hamed Rahimi , Hubert Naacke , Camelia Constantin , Bernd Amann

Topic modeling is widely used for uncovering thematic structures within text corpora, yet traditional models often struggle with specificity and coherence in domain-focused applications. Guided approaches, such as SeededLDA and CorEx,…

Computation and Language · Computer Science 2025-05-23 Chia-Hsuan Chang , Jui-Tse Tsai , Yi-Hang Tsai , San-Yih Hwang

Latent Dirichlet allocation (LDA) is a popular topic modeling technique in academia but less so in industry, especially in large-scale applications involving search engine and online advertising systems. A main underlying reason is that the…

Information Retrieval · Computer Science 2015-12-08 Yi Wang , Xuemin Zhao , Zhenlong Sun , Hao Yan , Lifeng Wang , Zhihui Jin , Liubin Wang , Yang Gao , Ching Law , Jia Zeng

Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent…

Topic modeling is a very powerful technique in data analysis and data mining but it is generally slow. Many parallelization approaches have been proposed to speed up the learning process. However, they are usually not very efficient because…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-02-24 Hung Nghiep Tran , Atsuhiro Takasu

Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice to deal with multimodal data, such as in image annotation tasks. Another popular approach to model the multimodal data is through deep neural networks,…

Computer Vision and Pattern Recognition · Computer Science 2016-01-01 Yin Zheng , Yu-Jin Zhang , Hugo Larochelle

In this paper, we provide the first practical algorithms with provable guarantees for the problem of inferring the topics assigned to each document in an LDA topic model. This is the primary inference problem for many applications of topic…

Machine Learning · Computer Science 2025-06-10 Adam Breuer

One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach which infers the latent topical structure of a…

Machine Learning · Statistics 2018-07-20 Martin Gerlach , Tiago P. Peixoto , Eduardo G. Altmann

There is an explosion of data, documents, and other content, and people require tools to analyze and interpret these, tools to turn the content into information and knowledge. Topic modeling have been developed to solve these problems.…

Computation and Language · Computer Science 2015-10-23 Aaron Q Li

Topic modelling in Natural Language Processing uncovers hidden topics in large, unlabelled text datasets. It is widely applied in fields such as information retrieval, content summarisation, and trend analysis across various disciplines.…

Computation and Language · Computer Science 2025-11-18 Saranzaya Magsarjav , Melissa Humphries , Jonathan Tuke , Lewis Mitchell

Context: Topic modeling finds human-readable structures in unstructured textual data. A widely used topic modeler is Latent Dirichlet allocation. When run on different datasets, LDA suffers from "order effects" i.e. different topics are…

Software Engineering · Computer Science 2018-03-16 Amritanshu Agrawal , Wei Fu , Tim Menzies

The increasing volume of short texts generated on social media sites, such as Twitter or Facebook, creates a great demand for effective and efficient topic modeling approaches. While latent Dirichlet allocation (LDA) can be applied, it is…

Computation and Language · Computer Science 2013-01-29 Jeon-Hyung Kang , Jun Ma , Yan Liu

For organizing large text corpora topic modeling provides useful tools. A widely used method is Latent Dirichlet Allocation (LDA), a generative probabilistic model which models single texts in a collection of texts as mixtures of latent…

Computation and Language · Computer Science 2020-04-02 Jonas Rieger , Lars Koppers , Carsten Jentsch , Jörg Rahnenführer

The rapid expansion of biomedical publications creates challenges for organizing knowledge and detecting emerging trends, underscoring the need for scalable and interpretable methods. Common clustering and topic modeling approaches such as…

Machine Learning · Computer Science 2026-02-25 Lana E. Yeganova , Won G. Kim , Shubo Tian , Natalie Xie , Donald C. Comeau , W. John Wilbur , Zhiyong Lu
‹ Prev 1 2 3 10 Next ›