Related papers: On Estimation and Selection for Topic Models

200 papers

Topic models provide a useful text-mining tool for learning, extracting, and discovering latent structures in large text corpora. Although a plethora of methods have been proposed for topic modeling, lacking in the literature is a formal…

Machine Learning · Statistics 2022-08-12 Yinyin Chen, Shishuang He, Yun Yang, Feng Liang

In an effort to develop topic modeling methods that can be quickly applied to large data sets, we revisit the problem of maximum-likelihood estimation in topic models. It is known, at least informally, that maximum-likelihood estimation in…

Machine Learning · Statistics 2026-02-10 Peter Carbonetto, Abhishek Sarkar, Zihao Wang, Matthew Stephens

Likelihood-based learning of graphical models faces challenges of computational complexity and robustness to model misspecification. This paper studies methods that fit parameters directly to maximize a measure of the accuracy of predicted…

Machine Learning · Computer Science 2014-07-04 Justin Domke

Supervised topic models use documents' side information to discover predictive low-dimensional representations of documents. Existing models apply likelihood-based estimation. In this paper, we present a general framework of…

Machine Learning · Statistics 2013-04-09 Jun Zhu, Amr Ahmed, Eric P. Xing

Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer models of the likelihood-to-evidence…

Machine Learning · Computer Science 2022-06-08 Giulio Isacchini, Natanael Spisak, Armita Nourmohammad, Thierry Mora, Aleksandra M. Walczak

Topic models have become popular tools for dimension reduction and exploratory analysis of text data, which consists of observed frequencies of a vocabulary of $p$ words in $n$ documents, stored in a $p\times n$ matrix. The main premise is…

Machine Learning · Statistics 2020-01-23 Xin Bing, Florentina Bunea, Marten Wegkamp

Topic models have achieved significant success in analyzing large-scale text corpora. In practical applications, we are always confronted with the challenge of model selection, i.e., how to appropriately set the number of topics. Following…

Machine Learning · Statistics 2015-02-18 Dehua Cheng, Xinran He, Yan Liu

We propose a neural network based approach for learning topics from text and image datasets. The model makes no assumptions about the conditional distribution of the observed features given the latent topics. This allows us to perform topic…

Machine Learning · Computer Science 2017-03-01 Gaurav Pandey, Ambedkar Dukkipati

We propose a new method of estimation in topic models that is not a variation on the existing simplex-finding algorithms, and that estimates the number of topics K from the observed data. We derive new finite sample minimax lower bounds…

Machine Learning · Statistics 2019-09-06 Xin Bing, Florentina Bunea, Marten Wegkamp

We here introduce a novel classification approach adopted from the nonlinear model identification framework, which jointly addresses the feature selection and classifier design tasks. The classifier is constructed as a polynomial expansion…

Machine Learning · Computer Science 2016-07-29 Aida Brankovic, Alessandro Falsone, Maria Prandini, Luigi Piroddi

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In…

Machine Learning · Computer Science 2016-05-30 Ke Jiang, Suvrit Sra, Brian Kulis

Unsupervised estimation of latent variable models is a fundamental problem central to numerous applications of machine learning and statistics. This work presents a principled approach for estimating broad classes of such models, including…

Machine Learning · Statistics 2013-05-27 Animashree Anandkumar, Daniel Hsu, Adel Javanmard, Sham M. Kakade

Variational inference is a very efficient and popular heuristic used in various forms in the context of latent variable models. It's closely related to Expectation Maximization (EM), and is applied when exact EM is computationally…

Machine Learning · Computer Science 2015-08-25 Pranjal Awasthi, Andrej Risteski

Recently, topic modeling has been widely used to discover the abstract topics in text corpora. Most of the existing topic models are based on the assumption of a three-layer hierarchical Bayesian structure, i.e., each document is modeled as a…

Computation and Language · Computer Science 2017-04-10 Yi-Kun Tang, Xian-Ling Mao, Heyan Huang, Guihua Wen

In this paper we present a modification to a latent topic model, which makes the model exploit supervision to produce a factorized representation of the observed data. The structured parameterization separately encodes variance that is…

Machine Learning · Computer Science 2013-04-24 Cheng Zhang, Carl Henrik Ek, Andreas Damianou, Hedvig Kjellstrom

We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population…

Machine Learning · Computer Science 2015-01-27 Weicong Ding, Prakash Ishwar, Venkatesh Saligrama

Probabilistic topic models such as latent Dirichlet allocation (LDA) are popularly used with Bayesian inference methods such as Gibbs sampling to learn posterior distributions over topic model parameters. We derive a novel measure of LDA…

Computation and Language · Computer Science 2019-09-17 Linzi Xing, Michael J. Paul, Giuseppe Carenini

Current topic models often suffer from topics that do not match human intuition, unnatural switching of topics within documents, and high computational demands. We address these concerns by proposing a topic model and an inference…

Computation and Language · Computer Science 2018-02-06 Johannes Schneider

The estimation of parameters from data is a common problem in many areas of the physical sciences, and frequently used algorithms rely on sets of simulated data which are fit to data. In this article, an analytic solution for…

Data Analysis, Statistics and Probability · Physics 2022-09-27 Daniel Britzger

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model inference have been based on a maximum likelihood objective. Efficient algorithms exist…

Machine Learning · Computer Science 2012-12-20 Sanjeev Arora, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu