
Related papers: Graph-Sparse LDA: A Topic Model with Structured Sp…

200 papers

Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text…

Computation and Language · Computer Science 2016-06-02 Georgios Balikas, Massih-Reza Amini, Marianne Clausel

Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and…

Information Retrieval · Computer Science 2018-12-07 Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao

We investigate ways in which to improve the interpretability of LDA topic models by better analyzing and visualizing their outputs. We focus on examining what we refer to as topic similarity networks: graphs in which nodes represent latent…

Computation and Language · Computer Science 2014-09-29 Arun S. Maiya, Robert M. Rolfe

Supervised topic models can help clinical researchers find interpretable co-occurrence patterns in count data that are relevant for diagnostics. However, standard formulations of supervised Latent Dirichlet Allocation have two problems.…

Machine Learning · Statistics 2016-12-07 Michael C. Hughes, Huseyin Melih Elibol, Thomas McCoy, Roy Perlis, Finale Doshi-Velez

Topological data analysis (TDA) has emerged as one of the most promising techniques to reconstruct the unknown shapes of high-dimensional spaces from observed data samples. TDA, thus, yields key shape descriptors in the form of persistent…

Machine Learning · Statistics 2017-11-15 Wei Guo, Krithika Manohar, Steven L. Brunton, Ashis G. Banerjee

Topic modeling is a state-of-the-art technique for analyzing text corpora. It uses a statistical model, most commonly Latent Dirichlet Allocation (LDA), to discover abstract topics that occur in the document collection. However, the…

Human-Computer Interaction · Computer Science 2021-10-19 Valerie Müller, Christian Sieg, Lars Linsen

We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using Topic Modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods that is predominantly used to generate…

Information Retrieval · Computer Science 2015-07-24 Ashwinkumar Ganesan, Kiante Brantley, Shimei Pan, Jian Chen

We study a parametric family of latent variable models, namely topic models, equipped with a hierarchical structure among the topic variables. Such models may be viewed as a finite mixture of the latent Dirichlet allocation (LDA) induced…

Statistics Theory · Mathematics 2024-08-27 Sunrit Chakraborty, Rayleigh Lei, XuanLong Nguyen

Tabular datasets with low-sample-size or many variables are prevalent in biomedicine. Practitioners in this domain prefer linear or tree-based models over neural networks since the latter are harder to interpret and tend to overfit when…

Machine Learning · Computer Science 2022-02-09 Junchen Yang, Ofir Lindenbaum, Yuval Kluger

A popular approach to topic modeling involves extracting co-occurring n-grams of a corpus into semantic themes. The set of n-grams in a theme represents an underlying topic, but most topic modeling approaches are not able to label these…

Computation and Language · Computer Science 2017-05-19 Justin Wood, Patrick Tan, Wei Wang, Corey Arnold

With the advent and popularity of big data mining and huge text analysis in modern times, automated text summarization became prominent for extracting and retrieving important information from documents. This research investigates aspects…

Information Retrieval · Computer Science 2023-05-31 Daniel F. O. Onah, Elaine L. L. Pang, Mahmoud El-Haj

Extracting and identifying latent topics in large text corpora has gained increasing importance in Natural Language Processing (NLP). Most models, whether probabilistic models similar to Latent Dirichlet Allocation (LDA) or neural topic…

Computation and Language · Computer Science 2023-03-31 Anton Thielmann, Quentin Seifert, Arik Reuter, Elisabeth Bergherr, Benjamin Säfken

One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach which infers the latent topical structure of a…

Machine Learning · Statistics 2018-07-20 Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

We propose a general framework for topic-specific summarization of large text corpora, and illustrate how it can be used for analysis in two quite different contexts: an OSHA database of fatality and catastrophe reports (to facilitate…

Computation and Language · Computer Science 2016-07-26 Luke Miratrix, Robin Ackerman

Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images. Applications require LDA to handle both large datasets and a large number of topics. Though distributed CPU systems have been…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-21 Kaiwei Li, Jianfei Chen, Wenguang Chen, Jun Zhu

Online learning algorithms update models via one sample per iteration, making them efficient for processing large-scale datasets and useful for detecting malicious events on the fly for social benefit, such as disease outbreaks and traffic congestion.…

Machine Learning · Computer Science 2019-05-28 Baojian Zhou, Feng Chen, Yiming Ying

Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents. However, classic topic modelling approaches (e.g., LDA) have certain…

Computation and Language · Computer Science 2024-03-27 Yida Mu, Chun Dong, Kalina Bontcheva, Xingyi Song

Topic modelling in Natural Language Processing uncovers hidden topics in large, unlabelled text datasets. It is widely applied in fields such as information retrieval, content summarisation, and trend analysis across various disciplines.…

Computation and Language · Computer Science 2025-11-18 Saranzaya Magsarjav, Melissa Humphries, Jonathan Tuke, Lewis Mitchell

In this paper, we provide the first practical algorithms with provable guarantees for the problem of inferring the topics assigned to each document in an LDA topic model. This is the primary inference problem for many applications of topic…

Machine Learning · Computer Science 2025-06-10 Adam Breuer

Topic modeling has found wide application in many problems where latent structures of the data are crucial for typical inference tasks. When applying a topic model, a relatively standard pre-processing step is to first build a vocabulary of…

Computer Vision and Pattern Recognition · Computer Science 2020-01-17 Yuzhen Ding, Baoxin Li