An exploration of dynamic Markov compression

Type of content
Theses / Dissertations
Publisher's DOI/URI
Thesis discipline
Computer Science
Degree name
Master of Science
Publisher
University of Canterbury. Computer Science
Journal Title
Journal ISSN
Volume Title
Language
Date
1994
Authors
Whitehead, Robert Fletcher
Abstract

Data Compression is the process of removing redundancy from data. Dynamic Markov Compression (DMC), developed by Cormack and Horspool, is a method for performing statistical data compression of a binary source. DMC generates a finite context state model by adaptively generating a Finite State Machine (FSM) that captures symbol frequencies within the source message. Traditionally DMC has operated on a binary alphabet creating a FSM of binary transitions between states. The compression of DMC is comparable with PPM, which is one of the best compression methods in use today. However because DMC is a bit-wise modeller, it is considerably slower than PPM and requires large amounts of memory to achieve this compression. This thesis extends the bit-wise modeller DMC to a finite character alphabet, which promises to improve compression speed. Considerable attention is given to efficient data structures capable of representing individual states of the FSM. While the performance of Character DMC fails to match bit-wise DMC or PPM, a significant improvement in compression speed can be achieved. Evolving Character DMC to capture variable length phrases was the next approach taken. Several inherent problems involving locating words within each state have surfaced. Again, the comparable compression with PPM was disappointing, and the compression speed was less than half that of Character DMC. A final implementation attempted to avoid the large resource requirements of DMC It achieves this by restricting both the length of the source message, and the amount of memory available for the FSM. Reasonable compression is achieved, at very high speeds. Such an implementation is ideal for network packets, or hard disk block compression.

Description
Citation
Keywords
Ngā upoko tukutuku/Māori subject headings
ANZSRC fields of research
Rights
Copyright Robert Fletcher Whitehead