Related papers: Topic subject creation using unsupervised learning…

200 papers

Aviation safety is paramount in the modern world, with a continuous commitment to reducing accidents and improving safety standards. Central to this endeavor is the analysis of aviation accident reports, rich textual resources that hold…

Computation and Language · Computer Science 2024-03-11 Aziida Nanyonga, Hassan Wasswa, Graham Wild

Topic modeling is a technique for organizing and extracting themes from large collections of unstructured text. Non-negative matrix factorization (NMF) is a common unsupervised approach that decomposes a term frequency-inverse document…

Machine Learning · Computer Science 2024-07-30 Selma Wanna, Ryan Barron, Nick Solovyev, Maksim E. Eren, Manish Bhattarai, Kim Rasmussen, Boian S. Alexandrov

Topic Modeling is an approach used for automatic comprehension and classification of data in a variety of settings, and perhaps the canonical application is in uncovering thematic structure in a corpus of documents. A number of foundational…

Machine Learning · Computer Science 2012-04-13 Sanjeev Arora, Rong Ge, Ankur Moitra

Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and…

Information Retrieval · Computer Science 2018-12-07 Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In…

Machine Learning · Computer Science 2016-05-30 Ke Jiang, Suvrit Sra, Brian Kulis

A text mining approach is proposed based on latent Dirichlet allocation (LDA) to analyze the Consumer Financial Protection Bureau (CFPB) consumer complaints. The proposed approach aims to extract latent topics in the CFPB complaint…

Information Retrieval · Computer Science 2018-07-20 Kaveh Bastani, Hamed Namavari, Jeffry Shaffer

Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document…

Machine Learning · Computer Science 2025-11-04 Biyi Fang, Truong Vo, Kripa Rajshekhar, Diego Klabjan

In this work, automatic analysis of themes contained in a large corpus of judgments from the public procurement domain is performed. The employed technique is unsupervised latent Dirichlet allocation (LDA). In addition, it is proposed to use…

Computation and Language · Computer Science 2014-12-18 Michał Łopuszyński

Non-negative matrix factorization with the generalized Kullback-Leibler divergence (NMF) and latent Dirichlet allocation (LDA) are two popular approaches for dimensionality reduction of non-negative data. Here, we show that NMF with…

Machine Learning · Computer Science 2024-06-03 Benedikt Geiger, Peter J. Park

Latent Dirichlet Allocation (LDA) is a prominent generative probabilistic model used for uncovering abstract topics within document collections. In this paper, we explore the effectiveness of augmenting topic models with Large Language…

Computation and Language · Computer Science 2025-07-14 Mengze Hong, Chen Jason Zhang, Di Jiang

Non-negative matrix factorization (NMF) based topic modeling is widely used in natural language processing (NLP) to uncover hidden topics of short text documents. Usually, training a high-quality topic model requires a large amount of textual…

Computation and Language · Computer Science 2022-05-27 Shijing Si, Jianzong Wang, Ruiyi Zhang, Qinliang Su, Jing Xiao

Topic modeling is admittedly a convenient way to monitor market trends. Conventionally, Latent Dirichlet Allocation (LDA) is considered a must-do model for gaining this type of information. Given the merit of deducing keywords with token…

Computation and Language · Computer Science 2023-09-19 Ching-Hsun Tseng, Shin-Jye Lee, Po-Wei Cheng, Chien Lee, Chih-Chieh Hung

Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the…

Machine Learning · Statistics 2020-01-08 Iryna Korshunova, Hanchen Xiong, Mateusz Fedoryszak, Lucas Theis

This paper presents an intertemporal bimodal network to analyze the evolution of the semantic content of a scientific field within the framework of topic modeling, namely using the Latent Dirichlet Allocation (LDA). The main contribution is…

Computation and Language · Computer Science 2020-02-13 Luigi Di Caro, Marco Guerzoni, Massimiliano Nuccio, Giovanni Siragusa

Nowadays, data analysis has become a problem as the amount of data is constantly increasing. In order to overcome this problem in textual data, many models and methods are used in natural language processing. The topic modeling field is one…

Computation and Language · Computer Science 2021-10-22 Zekeriya Anil Guven, Banu Diri, Tolgahan Cakaloglu

The exponential growth of online social network platforms and applications has led to a staggering volume of user-generated textual content, including comments and reviews. Consequently, users often face difficulties in extracting valuable…

Computation and Language · Computer Science 2023-08-23 Anusuya Krishnan

The tremendous growth of social media content on the Internet has inspired the development of text analytics to understand and solve real-life problems. Leveraging statistical topic modelling helps researchers and practitioners in…

Social and Information Networks · Computer Science 2016-08-09 Marina Sokolova, Kanyi Huang, Stan Matwin, Joshua Ramisch, Vera Sazonova, Renee Black, Chris Orwa, Sidney Ochieng, Nanjira Sambuli

A popular approach to topic modeling involves extracting co-occurring n-grams of a corpus into semantic themes. The set of n-grams in a theme represents an underlying topic, but most topic modeling approaches are not able to label these…

Computation and Language · Computer Science 2017-05-19 Justin Wood, Patrick Tan, Wei Wang, Corey Arnold

The standard LDA model suffers from the problem that the topic assignment of each word is independent, so word correlation is neglected. To address this problem, in this paper, we propose a model called Word Related Latent Dirichlet Allocation…

Computation and Language · Computer Science 2014-11-11 Xun Wang

We propose a geometric algorithm for topic learning and inference that is built on the convex geometry of topics arising from the Latent Dirichlet Allocation (LDA) model and its nonparametric extensions. To this end we study the…

Machine Learning · Statistics 2016-10-31 Mikhail Yurochkin, XuanLong Nguyen