Related papers: Spectral Learning for Supervised Topic Models

200 papers

This paper presents an algorithm for the unsupervised learning of latent variable models from unlabeled sets of data. We base our technique on spectral decomposition, providing a technique that proves to be robust both in theory and in…

Machine Learning · Statistics 2017-04-05 Matteo Ruffini, Marta Casanellas, Ricard Gavaldà

The problem of topic modeling can be seen as a generalization of the clustering problem, in that it posits that observations are generated due to multiple latent factors (e.g., the words in each document are generated as a mixture of…

Machine Learning · Computer Science 2013-01-21 Animashree Anandkumar, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Yi-Kai Liu

In this paper, we propose guaranteed spectral methods for learning a broad range of topic models, which generalize the popular Latent Dirichlet Allocation (LDA). We overcome the limitation of LDA to incorporate arbitrary topic correlations,…

Machine Learning · Computer Science 2016-11-15 Forough Arabshahi, Animashree Anandkumar

The question of how to determine the number of independent latent factors (topics) in mixture models such as Latent Dirichlet Allocation (LDA) is of great practical importance. In most applications, the exact number of topics is unknown,…

Machine Learning · Statistics 2014-01-23 E. D. Gutiérrez

We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive an approximate maximum-likelihood procedure for parameter estimation, which…

Machine Learning · Statistics 2010-03-04 David M. Blei, Jon D. McAuliffe

Recently there has been substantial interest in spectral methods for learning dynamical systems. These methods are popular since they often offer a good tradeoff between computational and statistical efficiency. Unfortunately, they can be…

Machine Learning · Statistics 2015-11-05 Ahmed Hefny, Carlton Downey, Geoffrey Gordon

Spectral topic modeling algorithms operate on matrices/tensors of word co-occurrence statistics to learn topic-specific word distributions. This approach removes the dependence on the original documents and produces substantial gains in…

Computation and Language · Computer Science 2017-11-21 Moontae Lee, David Bindel, David Mimno

This paper re-visits the spectral method for learning latent variable models defined in terms of observable operators. We give a new perspective on the method, showing that operators can be recovered by minimizing a loss defined on a finite…

Machine Learning · Computer Science 2012-07-03 Borja Balle, Ariadna Quattoni, Xavier Carreras

We study the problem of learning a latent variable model from a stream of data. Latent variable models are popular in practice because they can explain observed data in terms of unobserved concepts. These models have been traditionally…

Machine Learning · Computer Science 2018-04-27 Tong Yu, Branislav Kveton, Zheng Wen, Hung Bui, Ole J. Mengshoel

Supervised topic models can help clinical researchers find interpretable co-occurrence patterns in count data that are relevant for diagnostics. However, standard formulations of supervised Latent Dirichlet Allocation have two problems.…

Machine Learning · Statistics 2016-12-07 Michael C. Hughes, Huseyin Melih Elibol, Thomas McCoy, Roy Perlis, Finale Doshi-Velez

We provide an end-to-end differentially private spectral algorithm for learning LDA, based on matrix/tensor decompositions, and establish theoretical guarantees on utility/consistency of the estimated model parameters. The spectral…

Machine Learning · Statistics 2020-01-20 Christopher DeCarolis, Mukul Ram, Seyed A. Esmaeili, Yu-Xiang Wang, Furong Huang

Supervised topic models utilize documents' side information to discover predictive low-dimensional representations of documents. Existing models apply likelihood-based estimation. In this paper, we present a general framework of…

Machine Learning · Statistics 2013-04-09 Jun Zhu, Amr Ahmed, Eric P. Xing

Phase retrieval refers to the problem of recovering a signal $\mathbf{x}_{\star}\in\mathbb{C}^n$ from its phaseless measurements $y_i=|\mathbf{a}_i^{\mathrm{H}}\mathbf{x}_{\star}|$, where $\{\mathbf{a}_i\}_{i=1}^m$ are the measurement…

Information Theory · Computer Science 2020-09-10 Junjie Ma, Rishabh Dudeja, Ji Xu, Arian Maleki, Xiaodong Wang

In the internet era there has been an explosion in the amount of digital text information available, leading to difficulties of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference…

Machine Learning · Computer Science 2013-05-14 James Foulds, Levi Boyles, Christopher Dubois, Padhraic Smyth, Max Welling

Line spectral estimation theory aims to estimate the off-the-grid spectral components of a time signal with optimal precision. Recent results have shown that it is possible to recover signals having sparse line spectra from few temporal…

Information Theory · Computer Science 2017-01-31 Maxime Ferreira Da Costa, Wei Dai

The rapid adoption of synthetic data for training Large Language Models (LLMs) has introduced the technical challenge of "model collapse", a degenerative process where recursive training on model-generated content leads to a contraction of…

Machine Learning · Computer Science 2026-03-24 Yi Gu, Lingyou Pang, Xiangkun Ye, Tianyu Wang, Jianyu Lin, Carey E. Priebe, Alexander Aue

Spectral methods have been the mainstay in several domains such as machine learning and scientific computing. They involve finding a certain kind of spectral decomposition to obtain basis functions that can capture important structures for…

Machine Learning · Computer Science 2020-04-20 Majid Janzamin, Rong Ge, Jean Kossaifi, Anima Anandkumar

Originally designed to model text, topic modeling has become a powerful tool for uncovering latent structure in domains including medicine, finance, and vision. The goals for the model vary depending on the application: in some cases, the…

Machine Learning · Statistics 2014-11-24 Finale Doshi-Velez, Byron Wallace, Ryan Adams

Latent class models are widely used for identifying unobserved subgroups from multivariate categorical data in social sciences, with binary data as a particularly popular example. However, accurately recovering individual latent class…

Methodology · Statistics 2026-02-25 Zhongyuan Lyu, Yuqi Gu

Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images. Applications require LDA to handle both large datasets and a large number of topics. Though distributed CPU systems have been…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-21 Kaiwei Li, Jianfei Chen, Wenguang Chen, Jun Zhu