Efficient Dynamic WFST Decoding for Personalized Language Models

Computation and Language 2019-10-24 v1 Machine Learning

Abstract

We propose a two-layer cache mechanism to speed up dynamic WFST decoding with personalized language models. The first layer is a public cache that stores most of the static part of the graph and is shared globally among all users. The second layer is a private cache that holds the graph representing the personalized language model and is shared only among the utterances of a particular user. We also propose two simple yet effective pre-initialization methods, one based on breadth-first search and the other based on a data-driven exploration of decoder states using previous utterances. Experiments on a calling speech recognition task with a personalized contact list show that the proposed public cache reduces decoding time by a factor of three compared to decoding without pre-initialization. Adding the private cache provides further efficiency gains, reducing the decoding time by a factor of five.
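
The two-layer idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `TwoLayerCache`, the helpers `bfs_warm`, `make_expand` and `is_personal`, and the toy three-state grammar are all hypothetical names invented for illustration. It assumes that expanding a decoder state is the expensive step and that each state can be classified as static or personalized. Only the breadth-first pre-initialization is shown; the data-driven variant is omitted.

```python
from collections import deque
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Arc = Tuple[str, Hashable]  # (label, destination state)


class TwoLayerCache:
    """Memoizes lazily expanded states in a shared layer or a per-user layer."""

    def __init__(
        self,
        expand: Callable[[Hashable], List[Arc]],
        is_personal: Callable[[Hashable], bool],
        public: Optional[Dict[Hashable, List[Arc]]] = None,
    ) -> None:
        self.expand = expand            # assumed to be the costly operation
        self.is_personal = is_personal  # assumed static/personal classifier
        self.public = {} if public is None else public  # shared by all users
        self.private: Dict[Hashable, List[Arc]] = {}    # one per user
        self.expansions = 0             # counts cache misses

    def arcs(self, state: Hashable) -> List[Arc]:
        layer = self.private if self.is_personal(state) else self.public
        if state not in layer:
            layer[state] = self.expand(state)
            self.expansions += 1
        return layer[state]


def bfs_warm(cache: TwoLayerCache, start: Hashable, max_states: int) -> int:
    """Pre-initialize the cache breadth-first; returns the states visited."""
    seen = {start}
    queue = deque([start])
    visited = 0
    while queue and visited < max_states:
        state = queue.popleft()
        visited += 1
        for _label, dest in cache.arcs(state):
            if dest not in seen:
                seen.add(dest)
                queue.append(dest)
    return visited


def make_expand(contacts: List[str]) -> Callable[[Hashable], List[Arc]]:
    """Toy grammar: 'call <contact>' with a per-user contact list."""
    static = {
        "start": [("call", "verb")],
        "verb": [("$CONTACT", "contact:0")],
    }

    def expand(state: Hashable) -> List[Arc]:
        if state == "contact:0":
            return [(name, "end") for name in contacts]
        return static.get(state, [])

    return expand


def is_personal(state: Hashable) -> bool:
    return isinstance(state, str) and state.startswith("contact:")


# Two users share one public layer; each gets its own private layer.
shared_public: Dict[Hashable, List[Arc]] = {}
alice = TwoLayerCache(make_expand(["bob", "carol"]), is_personal, shared_public)
bfs_warm(alice, "start", max_states=10)
dave = TwoLayerCache(make_expand(["erin"]), is_personal, shared_public)
bfs_warm(dave, "start", max_states=10)
```

In this toy setup the first user pays for expanding the static states, and the second user only expands the personalized contact state, since everything else is already in the shared layer. A real decoder would also need eviction and thread-safe access to the shared layer, which this sketch leaves out.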

Cite

@article{arxiv.1910.10670,
  title  = {Efficient Dynamic WFST Decoding for Personalized Language Models},
  author = {Jun Liu and Jiedan Zhu and Vishal Kathuria and Fuchun Peng},
  journal= {arXiv preprint arXiv:1910.10670},
  year   = {2019}
}

Comments

5 pages, 4 figures
