Related papers: Component models for large networks

200 papers

In this paper we demonstrate the applicability of latent Dirichlet allocation (LDA) for classifying large Web document collections. One of our main results is a novel influence model that gives a fully generative model of the document…

Information Retrieval · Computer Science 2010-06-28 István Bíró, Jácint Szabó

Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent…

One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach which infers the latent topical structure of a…

Machine Learning · Statistics 2018-07-20 Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

Social scientists employ latent Dirichlet allocation (LDA) to find highly specific topics in large corpora, but they often struggle in this task because (1) LDA, in general, takes a significant amount of time to fit on large corpora; (2)…

Methodology · Statistics 2025-12-23 Kohei Watanabe

Mixed-membership (MM) models such as Latent Dirichlet Allocation (LDA) have been applied to microbiome compositional data to identify latent subcommunities of microbial species. These subcommunities are informative for understanding the…

Applications · Statistics 2022-05-18 Patrick LeBlanc, Li Ma

Latent Dirichlet Allocation (LDA) is a foundational model for discovering latent thematic structure in discrete data, but its Dirichlet prior cannot represent the rich correlations and hierarchical relationships often present among topics.…

Machine Learning · Computer Science 2026-02-24 Zheng Wang, Nizar Bouguila

This paper presents an intertemporal bimodal network to analyze the evolution of the semantic content of a scientific field within the framework of topic modeling, namely using the Latent Dirichlet Allocation (LDA). The main contribution is…

Computation and Language · Computer Science 2020-02-13 Luigi Di Caro, Marco Guerzoni, Massimiliano Nuccio, Giovanni Siragusa

Despite many years of research into latent Dirichlet allocation (LDA), applying LDA to collections of non-categorical items is still challenging. Yet many problems with much richer data share a similar structure and could benefit from the…

Machine Learning · Statistics 2020-01-08 Iryna Korshunova, Hanchen Xiong, Mateusz Fedoryszak, Lucas Theis

Compositional Data Analysis (CoDa) has gained popularity in recent years. This type of data consists of values from disjoint categories that sum up to a constant. Both Dirichlet regression and logistic-normal regression have become popular…

Methodology · Statistics 2024-06-25 Joaquín Martínez-Minaya, Haavard Rue

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics,…

Applications · Statistics 2009-09-29 David M. Blei, John D. Lafferty

With the emergence of social networking services, researchers enjoy the increasing availability of large-scale heterogenous datasets capturing online user interactions and behaviors. Traditional analysis of techno-social systems data has…

Social and Information Networks · Computer Science 2017-03-07 Yoon-Sik Cho, Greg Ver Steeg, Emilio Ferrara, Aram Galstyan

The evolution of communities in dynamic (time-varying) network data is a prominent topic of interest. A popular approach to understanding these dynamic networks is to embed the dyadic relations into a latent metric space. While methods for…

Methodology · Statistics 2020-03-18 Joshua Daniel Loyal, Yuguo Chen

As electronically stored data grow in daily life, obtaining novel and relevant information becomes challenging in text mining. Thus people have sought statistical methods based on term frequency, matrix algebra, or topic modeling for text…

Information Retrieval · Computer Science 2019-07-04 Clint P. George, Wei Xia, George Michailidis

Topic modeling, a method for extracting the underlying themes from a collection of documents, is an increasingly important component of the design of intelligent systems enabling the sense-making of highly dynamic and diverse streams of…

Information Retrieval · Computer Science 2019-10-07 Chris Gropp, Alexander Herzog, Ilya Safro, Paul W. Wilson, Amy W. Apon

We consider the estimation of Dirichlet Process Mixture Models (DPMMs) in distributed environments, where data are distributed across multiple computing nodes. A key advantage of Bayesian nonparametric models such as DPMMs is that they…

Machine Learning · Statistics 2017-09-20 Ruohui Wang, Dahua Lin

Recommendation systems have an important place to help online users in the internet society. Recommendation Systems in computer science are of very practical use these days in various aspects of the Internet portals, such as social…

Information Retrieval · Computer Science 2018-12-21 Hamed Jelodar, Yongli Wang, Mahdi Rabbani, Ru-xin Zhao, Seyedvalyallah Ayobi, Peng Hu, Isma Masood

Network data are observed in various applications where the individual entities of the system interact with or are connected to each other, and often these interactions are defined by their associated strength or importance. Clustering is a…

Methodology · Statistics 2025-06-02 Iuliia Promskaia, Adrian O'Hagan, Michael Fop

Latent Dirichlet Allocation (LDA) is a prominent generative probabilistic model used for uncovering abstract topics within document collections. In this paper, we explore the effectiveness of augmenting topic models with Large Language…

Computation and Language · Computer Science 2025-07-14 Mengze Hong, Chen Jason Zhang, Di Jiang

Latent Dirichlet allocation (LDA) is a popular topic modeling technique in academia but less so in industry, especially in large-scale applications involving search engine and online advertising systems. A main underlying reason is that the…

Information Retrieval · Computer Science 2015-12-08 Yi Wang, Xuemin Zhao, Zhenlong Sun, Hao Yan, Lifeng Wang, Zhihui Jin, Liubin Wang, Yang Gao, Ching Law, Jia Zeng

Individual events at high-energy colliders like the LHC can be represented by a sequence of measurements, or 'point patterns' in an observable space. Starting from this data representation, we build a simple Bayesian probabilistic model for…

High Energy Physics - Phenomenology · Physics 2020-12-17 Darius A. Faroughy