Related papers: Supervised topic models for clinical interpretabil…

200 papers

Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals:…

Machine Learning · Computer Science 2017-12-05 Michael C. Hughes, Gabriel Hope, Leah Weiner, Thomas H. McCoy, Roy H. Perlis, Erik B. Sudderth, Finale Doshi-Velez

We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive an approximate maximum-likelihood procedure for parameter estimation, which…

Machine Learning · Statistics 2010-03-04 David M. Blei, Jon D. McAuliffe

Supervised topic models utilize document's side information for discovering predictive low dimensional representations of documents. Existing models apply the likelihood-based estimation. In this paper, we present a general framework of…

Machine Learning · Statistics 2013-04-09 Jun Zhu, Amr Ahmed, Eric P. Xing

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In…

Machine Learning · Computer Science 2016-05-30 Ke Jiang, Suvrit Sra, Brian Kulis

Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document…

Machine Learning · Computer Science 2025-11-04 Biyi Fang, Truong Vo, Kripa Rajshekhar, Diego Klabjan

Supervisory signals have the potential to make low-dimensional data representations, like those learned by mixture and topic models, more interpretable and useful. We propose a framework for training latent variable models that explicitly…

Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data, text documents. Researchers have published many articles in the field of topic modeling and…

Information Retrieval · Computer Science 2018-12-07 Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao

The standard LDA model suffers from the problem that the topic assignment of each word is independent, so word correlation is neglected. To address this problem, in this paper, we propose a model called Word Related Latent Dirichlet Allocation…

Computation and Language · Computer Science 2014-11-11 Xun Wang

Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the…

Machine Learning · Statistics 2020-01-08 Iryna Korshunova, Hanchen Xiong, Mateusz Fedoryszak, Lucas Theis

Originally designed to model text, topic modeling has become a powerful tool for uncovering latent structure in domains including medicine, finance, and vision. The goals for the model vary depending on the application: in some cases, the…

Machine Learning · Statistics 2014-11-24 Finale Doshi-Velez, Byron Wallace, Ryan Adams

Two challenging problems in the clinical study of cancer are the characterization of cancer subtypes and the classification of individual patients according to those subtypes. Statistical approaches addressing these problems are hampered by…

Methodology · Statistics 2012-02-28 John A. Dawson, Christina Kendziorski

The problem of topic modeling can be seen as a generalization of the clustering problem, in that it posits that observations are generated due to multiple latent factors (e.g., the words in each document are generated as a mixture of…

Machine Learning · Computer Science 2013-01-21 Animashree Anandkumar, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Yi-Kai Liu

In this paper, we provide the first practical algorithms with provable guarantees for the problem of inferring the topics assigned to each document in an LDA topic model. This is the primary inference problem for many applications of topic…

Machine Learning · Computer Science 2025-06-10 Adam Breuer

Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice to perform scene recognition and annotation. Recently, a new type of topic model called the Document Neural Autoregressive Distribution Estimator…

Computer Vision and Pattern Recognition · Computer Science 2013-05-24 Yin Zheng, Yu-Jin Zhang, Hugo Larochelle

Latent Dirichlet Allocation (LDA) models trained without stopword removal often produce topics with high posterior probabilities on uninformative words, obscuring the underlying corpus content. Even when canonical stopwords are manually…

Computation and Language · Computer Science 2017-10-17 Angela Fan, Finale Doshi-Velez, Luke Miratrix

With the advent and popularity of big data mining and huge text analysis in modern times, automated text summarization became prominent for extracting and retrieving important information from documents. This research investigates aspects…

Information Retrieval · Computer Science 2023-05-31 Daniel F. O. Onah, Elaine L. L. Pang, Mahmoud El-Haj

Probabilistic topic models are generative models that describe the content of documents by discovering the latent topics underlying them. However, the structure of the textual input, and for instance the grouping of words in coherent text…

Computation and Language · Computer Science 2016-06-02 Georgios Balikas, Massih-Reza Amini, Marianne Clausel

In latent Dirichlet allocation (LDA), topics are multinomial distributions over the entire vocabulary. However, the vocabulary usually contains many words that are not relevant in forming the topics. We adopt a variable selection method…

Machine Learning · Computer Science 2012-05-08 Dongwoo Kim, Yeonseung Chung, Alice Oh

Topic modeling, a method for extracting the underlying themes from a collection of documents, is an increasingly important component of the design of intelligent systems enabling the sense-making of highly dynamic and diverse streams of…

Information Retrieval · Computer Science 2019-10-07 Chris Gropp, Alexander Herzog, Ilya Safro, Paul W. Wilson, Amy W. Apon

In this paper, we present hierarchical relation-based latent Dirichlet allocation (hrLDA), a data-driven hierarchical topic model for extracting terminological ontologies from a large number of heterogeneous documents. In contrast to…

Computation and Language · Computer Science 2020-01-10 Xiaofeng Zhu, Diego Klabjan, Patrick Bless