Automata and Graph Compression

Information Theory | 2015-02-26 | v1 | Data Structures and Algorithms | Formal Languages and Automata Theory | math.IT

Abstract

We present a theoretical framework for the compression of automata, which are widely used in speech processing and other natural language processing tasks. The framework extends to graph compression. By analogy with stationary ergodic processes, we formulate a probabilistic process of graph and automata generation that captures real-world phenomena, and we provide a universal compression scheme, LZA, for this probabilistic model. Further, we show that LZA significantly outperforms other compression techniques, such as gzip and the UNIX compress command, on several synthetic and real data sets.

Cite

@article{arxiv.1502.07288,
  title  = {Automata and Graph Compression},
  author = {Mehryar Mohri and Michael Riley and Ananda Theertha Suresh},
  journal= {arXiv preprint arXiv:1502.07288},
  year   = {2015}
}

Comments

15 pages
