Vectorized Bayesian Inference for Latent Dirichlet-Tree Allocation

Machine Learning, 2026-02-24, v1

Abstract

Latent Dirichlet Allocation (LDA) is a foundational model for discovering latent thematic structure in discrete data, but its Dirichlet prior cannot represent the rich correlations and hierarchical relationships often present among topics. We introduce Latent Dirichlet-Tree Allocation (LDTA), a generalization of LDA that replaces the Dirichlet prior with an arbitrary Dirichlet-Tree (DT) distribution. LDTA preserves LDA's generative structure while enabling expressive, tree-structured priors over topic proportions. For inference, we develop universal mean-field variational inference and Expectation Propagation algorithms that provide tractable updates for every DT distribution. Our theoretical development shows that both inference methods are naturally vectorized, and we provide fully vectorized, GPU-accelerated implementations. The resulting framework substantially expands the modeling capacity of LDA while maintaining scalability and computational efficiency.
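
To make the generative structure concrete, the following is a minimal sketch of drawing one document under a Dirichlet-Tree prior, based on the standard definition of the DT distribution: each internal node draws Dirichlet branch probabilities, and a leaf (topic) probability is the product of the branch probabilities on its root-to-leaf path. It is not the authors' implementation and does not show the vectorized inference; the function names, the dictionary-based tree encoding, and the toy tree and parameters are all assumptions chosen for illustration.

```python
import numpy as np


def sample_dirichlet_tree(children, alpha, root, rng):
    """Draw leaf probabilities from a Dirichlet-Tree (illustrative sketch).

    children: dict, internal node -> list of child nodes (hypothetical encoding)
    alpha:    dict, internal node -> concentration parameters, one per child
    Returns a dict mapping each leaf to its probability.
    """
    leaf_prob = {}
    stack = [(root, 1.0)]
    while stack:
        node, mass = stack.pop()
        if node not in children:  # a leaf, i.e. a topic
            leaf_prob[node] = mass
            continue
        # Each internal node splits its mass with an ordinary Dirichlet draw.
        branch = rng.dirichlet(alpha[node])
        for child, b in zip(children[node], branch):
            stack.append((child, mass * b))
    return leaf_prob


def generate_document(children, alpha, root, topic_word, n_words, rng):
    """LDA-style generation, with theta drawn from a DT prior instead of a Dirichlet."""
    leaf_prob = sample_dirichlet_tree(children, alpha, root, rng)
    # Assumption: leaves are labeled with the topic indices 0..K-1.
    theta = np.array([leaf_prob[k] for k in sorted(leaf_prob)])
    z = rng.choice(len(theta), size=n_words, p=theta)  # topic of each word
    vocab_size = topic_word.shape[1]
    words = np.array([rng.choice(vocab_size, p=topic_word[k]) for k in z])
    return theta, z, words


# Toy tree (assumed): two groups of two topics each.
children = {"root": ["a", "b"], "a": [0, 1], "b": [2, 3]}
alpha = {"root": [1.0, 1.0], "a": [2.0, 2.0], "b": [0.5, 0.5]}

rng = np.random.default_rng(0)
topic_word = rng.dirichlet(np.ones(10), size=4)  # 4 topics, 10-word vocabulary
theta, z, words = generate_document(children, alpha, "root", topic_word, 20, rng)
```

A tree with a single internal node whose children are all leaves reduces to the ordinary Dirichlet prior, which is the sense in which this construction generalizes LDA. In the toy tree, topics 0 and 1 share the branch through node `a`, so their proportions are correlated in a way a flat Dirichlet cannot express.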

Cite

@article{arxiv.2602.18795,
  title   = {Vectorized Bayesian Inference for Latent Dirichlet-Tree Allocation},
  author  = {Zheng Wang and Nizar Bouguila},
  journal = {arXiv preprint arXiv:2602.18795},
  year    = {2026}
}

Comments

Submitted to JMLR, under review