Automata and Graph Compression
Abstract
We present a theoretical framework for the compression of automata, which are widely used in speech processing and other natural language processing tasks. The framework extends to graph compression. By analogy with stationary ergodic processes, we formulate a probabilistic process of graph and automata generation that captures real-world phenomena, and we provide a universal compression scheme, LZA, for this probabilistic model. Further, we show that LZA significantly outperforms other compression techniques, such as gzip and the UNIX compress command, on several synthetic and real data sets.
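To make the gzip baseline mentioned in the abstract concrete, the following minimal sketch serializes an automaton's transition list as text and compresses it with Python's `zlib` module, which uses the same DEFLATE method as gzip. This is not LZA, which the abstract does not describe. The helper names (`random_automaton`, `serialize`, `deserialize`), the text encoding, and the uniformly random synthetic automaton are all assumptions made for illustration only.

```python
import random
import zlib


def random_automaton(num_states, alphabet, out_degree, seed=0):
    """Hypothetical generator: each state gets `out_degree` labeled
    transitions to uniformly random destination states."""
    rng = random.Random(seed)
    transitions = []
    for src in range(num_states):
        for label in rng.sample(alphabet, out_degree):
            transitions.append((src, label, rng.randrange(num_states)))
    return transitions


def serialize(transitions):
    """Assumed text format: one 'source label destination' triple per line."""
    lines = (f"{src} {label} {dst}" for src, label, dst in transitions)
    return "\n".join(lines).encode("utf-8")


def deserialize(data):
    """Inverse of serialize()."""
    transitions = []
    for line in data.decode("utf-8").splitlines():
        src, label, dst = line.split(" ")
        transitions.append((int(src), label, int(dst)))
    return transitions


# Baseline measurement: general-purpose DEFLATE on the serialized automaton.
transitions = random_automaton(num_states=200, alphabet=list("abcd"), out_degree=2)
raw = serialize(transitions)
packed = zlib.compress(raw, 9)
restored = deserialize(zlib.decompress(packed))

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
```

A general-purpose compressor of this kind sees only a byte stream and has no knowledge of the automaton's structure; this is the sort of baseline the abstract reports LZA outperforming.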
- Publication: arXiv e-prints
- Pub Date: February 2015
- DOI: 10.48550/arXiv.1502.07288
- arXiv: arXiv:1502.07288
- Bibcode: 2015arXiv150207288M
- Keywords: Computer Science - Information Theory; Computer Science - Data Structures and Algorithms; Computer Science - Formal Languages and Automata Theory
- E-Print: 15 pages