Patent 5,488,364 on compression of random data

This document is an analysis of patent 5,488,364 on the compression of random data. Read first the analysis of patent 5,533,051 on the same subject if you have not yet done so.
  Patent 5,488,364 "Recursive data compression".
  Filed Feb. 28, 1994; granted Jan. 30, 1996
  Inventor Michael L. Cole
The full text of the patent is available here (uncompressed or compressed). Thanks to Gregory Aharonian for providing this text.

The patented method is claimed to be able to compress random data, and thus to be applicable recursively to already compressed data. However it merely transforms a random stream to another almost equally random stream:

The patent summary states:
Averaged over a statistically significant sample of random input data, the first block will have a percentage of adjacent bits of the "1" state that is high relative to that of the original input data, and the second block will have a percentage of adjacent bits of the "0" state that is high relative to that of the original input data.
This is the flaw. For random input data, the first block will have by construction a simple majority of "1" bits, but not a high percentage of "1" bits. The very small reduction in the entropy of each block is cancelled by the bits required to specify the "bit-reversal key" and the "keyword".

The patent claims themselves are harmless. They describe a compression method which will actually slightly expand random data instead of compressing it. It is therefore useless for recursive compression.

Jean-loup Gailly

Back to the gzip home page
back to Jean-loup's page