Related papers: Determining Research Priorities Using Machine Lear…
We summarize the first exploratory investigation into whether Machine Learning techniques can augment science strategic planning. We find that an approach based on Latent Dirichlet Allocation using abstracts drawn from high impact astronomy…
Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data, text documents. Researchers have published many articles in the field of topic modeling and…
Current daily paper releases are becoming increasingly large and areas of research are growing in diversity. This makes it harder for scientists to keep up to date with current state of the art and identify relevant work within their lines…
Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics,…
Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent…
Social scientists employ latent Dirichlet allocation (LDA) to find highly specific topics in large corpora, but they often struggle in this task because (1) LDA, in general, takes a significant amount of time to fit on large corpora; (2)…
Latent Dirichlet Allocation (LDA) is a three-level hierarchical Bayesian model for topic inference. In spite of its great success, inferring the latent topic distribution with LDA is time-consuming. Motivated by the transfer learning…
Latent Dirichlet Allocation (LDA) is a prominent generative probabilistic model used for uncovering abstract topics within document collections. In this paper, we explore the effectiveness of augmenting topic models with Large Language…
In this paper, we investigate mathematical content representations suitable for the automated classification of and the similarity search in STEM documents using standard machine learning algorithms: the Latent Dirichlet Allocation (LDA)…
Robotic Process Automation (RPA) has rapidly evolved into a widely recognized and influential software technology. Its growing relevance has sparked diverse research efforts across various disciplines. This study aims to map the scientific…
Latent Dirichlet allocation (LDA) is a popular topic modeling technique in academia but less so in industry, especially in large-scale applications involving search engine and online advertising systems. A main underlying reason is that the…
Background: Radiography (X-rays) is the dominant modality in orthopedics, and improving the interpretation of radiographs is clinically relevant. Machine learning (ML) has revolutionized data analysis and has been applied to medicine, with…
We present LDAExplore, a tool to visualize topic distributions in a given document corpus that are generated using Topic Modeling methods. Latent Dirichlet Allocation (LDA) is one of the basic methods that is predominantly used to generate…
This paper presents an intertemporal bimodal network to analyze the evolution of the semantic content of a scientific field within the framework of topic modeling, namely using the Latent Dirichlet Allocation (LDA). The main contribution is…
The problem of searching for experts in a given academic field is hugely important in both industry and academia. We study exactly this issue with respect to a database of authors and their publications. The idea is to use Latent Semantic…
Recommendation systems have an important place to help online users in the internet society. Recommendation Systems in computer science are of very practical use these days in various aspects of the Internet portals, such as social…
Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document…
Latent Dirichlet Allocation (LDA) is a foundational model for discovering latent thematic structure in discrete data, but its Dirichlet prior cannot represent the rich correlations and hierarchical relationships often present among topics.…
Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the…
In this paper we demonstrate the applicability of latent Dirichlet allocation (LDA) for classifying large Web document collections. One of our main results is a novel influence model that gives a fully generative model of the document…