Universal data compression

From Wikipedia, the free encyclopedia

This article does not cite any references or sources. (January 2007)
Please help improve this article by adding citations to reliable sources. Unverifiable material may be challenged and removed.

The introduction to this article provides insufficient context for those unfamiliar with the subject.
Please help improve the article with a good introductory style.

Universal data compression is impossible, though some claim to have found a way to do it. Any data compression algorithm necessarily leaves some input data uncompressed. A simple counting argument illustrates this.

Suppose there is an algorithm A which compresses all data sequences of length b (in bits) to shorter length data sequences. Let c be the length of the longest compressed sequence. Since A compresses all data sequences it must be the case that c < b. Now, there are $2 b$ sequences of length b, but only $2 c$ sequences of length c. Since $2 c < 2 b$ , it must be the case that at least two sequences, say $U 1$ and $U 2$ compress to the same value, say $C$ . As a result, when uncompressing $C$ it is not possible to determine whether $U 1$ or $U 2$ should be produced. Therefore, there is no way to undo algorithm A, and A is not correct.

Categories: Data compression

Hidden categories: Articles lacking sources from January 2007 | All articles lacking sources | Wikipedia articles needing context | Wikipedia introduction cleanup

Universal data compression

From Wikipedia, the free encyclopedia

Views

Navigation

Interaction

Search