Problem 03

Problem 03

Let S be the set of all finite binary strings. Denote by |x| the length of a string x. A strong compression algorithm is a mapping f : S→S with the properties ∀x∈S: |f(x)| ≤ |x| (no bit string is longer after compression) and ∃x∈S: |f(x)| < |x| (at least one string is shorter after compression). Show that every strong compression algorithm yields data loss, i.e. that f cannot be injective.


Let f be a strong compression algorithm and let y be a binary string which is shorter after compression. Let T = {x∈S : |x| < |y|}. Since no string is longer after compression, T∪{y} is mapped to T by f, and contains strictly more elements (since y∉T). The pigeonhole principle yields that there are at least two strings of T∪{y} mapped to the same string of T, hence f is not injective.


This problem is particularly nice for computer science students, as it can make them realize from the start that these abstract mathematical definitions (which are often hard and confusing for them) are actually relevant for their field.