Dirichlet-vMF Mixture Model

Computation and Language 2017-02-27 v1

Abstract

This document describes the multi-document von Mises-Fisher mixture model with a Dirichlet prior, referred to as VMFMix. VMFMix is analogous to Latent Dirichlet Allocation (LDA) in that both can capture co-occurrence patterns across multiple documents. The difference is that in VMFMix, the topic-word distribution is defined on a continuous n-dimensional hypersphere. Hence VMFMix can be used to derive topic embeddings, i.e., representative vectors, from multiple sets of embedding vectors. An efficient variational Expectation-Maximization inference algorithm is derived. The performance of VMFMix on two document classification tasks is reported, with some preliminary analysis.
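
For orientation, the LDA analogy suggests a generative structure along the following lines. This is only a sketch of one plausible reading of the abstract, not the paper's exact formulation: the symbols $\alpha$, $\theta_d$, $z_{di}$, $\mu_k$, $\kappa_k$ and the topic count $K$ are assumed notation, and the hypersphere is assumed to be the set of unit vectors in $\mathbb{R}^n$. Only the vMF density itself is the standard definition.

```latex
% Standard von Mises-Fisher density for a unit vector x in R^n,
% with mean direction \mu (\|\mu\| = 1) and concentration \kappa > 0.
% I_v denotes the modified Bessel function of the first kind of order v.
\[
  f(x \mid \mu, \kappa) = C_n(\kappa)\, \exp\!\left(\kappa\, \mu^{\top} x\right),
  \qquad
  C_n(\kappa) = \frac{\kappa^{\,n/2 - 1}}{(2\pi)^{n/2}\, I_{n/2 - 1}(\kappa)} .
\]

% Assumed LDA-like generative process for document d and its i-th embedding vector:
% the Dirichlet prior governs per-document topic proportions, and each topic k
% emits embedding vectors from its own vMF component.
\begin{align*}
  \theta_d &\sim \mathrm{Dirichlet}(\alpha), \\
  z_{di} \mid \theta_d &\sim \mathrm{Categorical}(\theta_d), \qquad z_{di} \in \{1, \dots, K\}, \\
  x_{di} \mid z_{di} = k &\sim \mathrm{vMF}(\mu_k, \kappa_k).
\end{align*}
```

Under this reading, each mean direction $\mu_k$ would play the role of a topic embedding, in the same way that a topic-word multinomial characterizes a topic in LDA.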

Cite

@article{arxiv.1702.07495,
  title  = {Dirichlet-vMF Mixture Model},
  author = {Shaohua Li},
  journal= {arXiv preprint arXiv:1702.07495},
  year   = {2017}
}

Comments

5 pages
