Related papers: A Practical Algorithm for Topic Modeling with Prov…

200 papers

Recently, there has been considerable progress on designing algorithms with provable guarantees -- typically using linear algebraic methods -- for parameter learning in latent variable models. But designing provable algorithms for inference…

Machine Learning · Computer Science 2016-05-30 Sanjeev Arora, Rong Ge, Frederic Koehler, Tengyu Ma, Ankur Moitra

In this paper, we provide the first practical algorithms with provable guarantees for the problem of inferring the topics assigned to each document in an LDA topic model. This is the primary inference problem for many applications of topic…

Machine Learning · Computer Science 2025-06-10 Adam Breuer

Machine unlearning algorithms are increasingly important as legal concerns arise around the provenance of training data, but verifying the success of unlearning is often difficult. Provable guarantees for unlearning are often limited to…

Machine Learning · Computer Science 2025-04-22 Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal

Current topic models often suffer from discovering topics not matching human intuition, unnatural switching of topics within documents and high computational demands. We address these concerns by proposing a topic model and an inference…

Computation and Language · Computer Science 2018-02-06 Johannes Schneider

We propose a new method of estimation in topic models, that is not a variation on the existing simplex finding algorithms, and that estimates the number of topics K from the observed data. We derive new finite sample minimax lower bounds…

Machine Learning · Statistics 2019-09-06 Xin Bing, Florentina Bunea, Marten Wegkamp

Topic models provide a useful text-mining tool for learning, extracting, and discovering latent structures in large text corpora. Although a plethora of methods have been proposed for topic modeling, lacking in the literature is a formal…

Machine Learning · Statistics 2022-08-12 Yinyin Chen, Shishuang He, Yun Yang, Feng Liang

Correlated topic modeling has been limited to small model and problem sizes due to their high computational cost and poor scaling. In this paper, we propose a new model which learns compact topic embeddings and captures topic correlations…

Machine Learning · Computer Science 2017-07-04 Junxian He, Zhiting Hu, Taylor Berg-Kirkpatrick, Ying Huang, Eric P. Xing

In this paper we discuss a well known computing problem -- inference for models with intractable normalizing functions. Models with intractable normalizing functions arise in a wide variety of areas, for instance network models, models for…

Methodology · Statistics 2026-03-19 Murali Haran, Bokgyeong Kang, Jaewoo Park

We consider the problem of explaining the predictions of an arbitrary blackbox model $f$: given query access to $f$ and an instance $x$, output a small set of $x$'s features that in conjunction essentially determines $f(x)$. We design an…

Machine Learning · Computer Science 2021-11-03 Guy Blanc, Jane Lange, Li-Yang Tan

One of the core problems in statistical models is the estimation of a posterior distribution. For topic models, the problem of posterior inference for individual texts is particularly important, especially when dealing with data streams,…

Machine Learning · Statistics 2016-08-18 Khoat Than, Tung Doan

Non-convex optimization problems often arise from probabilistic modeling, such as estimation of posterior distributions. Non-convexity makes the problems intractable, and poses various obstacles for us to design efficient algorithms. In…

Machine Learning · Computer Science 2013-12-18 Khoat Than, Tu Bao Ho

Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way…

Computation and Language · Computer Science 2020-10-08 Suzanna Sia, Ayush Dalmia, Sabrina J. Mielke

Topic modeling is a widely used technique for revealing underlying thematic structures within textual data. However, existing models have certain limitations, particularly when dealing with short text datasets that lack co-occurring words.…

Artificial Intelligence · Computer Science 2023-12-18 Han Wang, Nirmalendu Prakash, Nguyen Khoi Hoang, Ming Shan Hee, Usman Naseem, Roy Ka-Wei Lee

Extracting topics from text has become an essential task, especially with the rapid growth of unstructured textual data. Most existing works rely on highly computational methods to address this challenge. In this paper, we argue that…

Computation and Language · Computer Science 2025-11-07 Salma Mekaoui, Hiba Sofyan, Imane Amaaz, Imane Benchrif, Arsalane Zarghili, Ilham Chaker, Nikola S. Nikolov

We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a…

Systems and Control · Electrical Eng. & Systems 2022-07-13 Kush Grover, Jan Křetínský, Tobias Meggendorfer, Maximilian Weininger

The number of topics might be the most important parameter of a topic model. The topic modelling community has developed a set of various procedures to estimate the number of topics in a dataset, but there has not yet been a sufficiently…

Computation and Language · Computer Science 2024-07-31 Victor Bulatov, Vasiliy Alekseev, Konstantin Vorontsov

Topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus. It has been increasingly widely adopted as a tool in the social sciences, including political science, digital humanities and sociological…

Information Retrieval · Computer Science 2022-01-12 Zheng Fang, Yulan He, Rob Procter

Many complex multi-target prediction problems that concern large target spaces are characterised by a need for efficient prediction strategies that avoid the computation of predictions for all targets explicitly. Examples of such problems…

Information Retrieval · Computer Science 2018-03-06 Michiel Stock, Krzysztof Dembczynski, Bernard De Baets, Willem Waegeman

Topic modelling is a text mining technique for identifying salient themes from a number of documents. The output is commonly a set of topics consisting of isolated tokens that often co-occur in such documents. Manual effort is often…

Computation and Language · Computer Science 2024-04-26 Lowri Williams, Eirini Anthi, Laura Arman, Pete Burnap

Topic models are in widespread use in natural language processing and beyond. Here, we propose a new framework for the evaluation of probabilistic topic modeling algorithms based on synthetic corpora containing an unambiguously defined…

Computation and Language · Computer Science 2019-01-29 Hanyu Shi, Martin Gerlach, Isabel Diersen, Doug Downey, Luis A. N. Amaral