Pearson hashing
From Wikipedia, the free encyclopedia
Pearson hashing[1] is a hash function designed for fast execution on processors with 8-bit registers. Given an input consisting of any number of bytes, it produces as output a single byte that is strongly dependent[1] on every byte of the input. Its implementation requires only a few instructions, plus a 256-byte lookup table containing a permutation of the values 0 through 255.
This hash function is a CBC-MAC that uses an 8-bit random block cipher implemented via the permutation table. An 8-bit block cipher has negligible cryptographic security, so the Pearson hash function is not cryptographically strong; but it offers these benefits:
- It is extremely simple.
- It executes quickly on resource-limited processors.
- There is no simple class of inputs for which collisions (identical outputs) are especially likely.
- Given a small, privileged set of inputs (e.g., reserved words for a compiler), the permutation table can be adjusted so that those inputs yield distinct hash values, producing what is called a perfect hash function.
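The last point can be illustrated with a minimal sketch. It assumes a brute-force strategy: reshuffle the table with a seeded random number generator until the privileged inputs hash to distinct values. This random search is a stand-in for illustration and is not necessarily the adjustment procedure described in the reference. The names find_perfect_table and keywords, and the seed and max_tries parameters, are hypothetical.

```python
import random

def pearson_hash(data, table):
    """Pearson hash of a bytes object, using the given 256-entry permutation."""
    h = 0
    for byte in data:
        h = table[h ^ byte]
    return h

def find_perfect_table(keys, seed=0, max_tries=1000):
    """Search for a permutation under which all keys hash to distinct values.

    Illustrative brute-force search (an assumption of this sketch): shuffle
    the table repeatedly until no two keys collide. The keys must be distinct.
    """
    rng = random.Random(seed)
    table = list(range(256))
    for _ in range(max_tries):
        rng.shuffle(table)
        hashes = {pearson_hash(key, table) for key in keys}
        if len(hashes) == len(keys):
            return list(table)
    raise ValueError("no collision-free table found within max_tries")

# Hypothetical set of reserved words for a small language.
keywords = [b"if", b"else", b"while", b"for", b"return", b"break"]
table = find_perfect_table(keywords)
```

With only a handful of keys, a randomly chosen permutation is already collision-free most of the time, so the search usually ends after very few shuffles. Larger key sets (which can never exceed 256 entries for a one-byte hash) would likely call for a more targeted adjustment.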
The algorithm was originally described by the following pseudocode, which computes the hash of message C using the permutation table T and the auxiliary array h:
h[0] := 0
for i in 1..n loop
    index := h[i-1] xor C[i]
    h[i] := T[index]
end loop
return h[n]
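In the Python programming language, the hash algorithm can be implemented as follows (assuming that permutation_table is defined externally as a 256-entry permutation of the values 0 through 255, and that the input is a bytes object):
def pearson_hash(data):
    h = 0
    for byte in data:
        # XOR the running hash with the next input byte, then look up the result.
        h = permutation_table[h ^ byte]
    return h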
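For a runnable illustration, the permutation table has to come from somewhere. The sketch below assumes it is produced by shuffling the values 0 through 255 with a seeded generator. The helper make_permutation_table and the seed value are hypothetical choices for this example, and any permutation of 0 through 255 would serve.

```python
import random

def make_permutation_table(seed=1990):
    """Return a pseudo-random permutation of 0..255 (seeded, so repeatable)."""
    rng = random.Random(seed)
    table = list(range(256))
    rng.shuffle(table)
    return table

def pearson_hash(data, table):
    """Pearson hash of a bytes object; the result is always a single byte."""
    h = 0
    for byte in data:
        h = table[h ^ byte]
    return h

permutation_table = make_permutation_table()

# Print the hash of a few short inputs.
for word in (b"", b"a", b"ab", b"ba"):
    print(word, pearson_hash(word, permutation_table))
```

Because the table is a permutation, two inputs that differ only in their final byte always hash to different values. In particular, the 256 possible one-byte inputs produce 256 distinct outputs. The empty input hashes to 0, the initial value of h.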
References
- Peter K. Pearson, "Fast Hashing of Variable-Length Text Strings". Communications of the ACM 33(6), 677 (1990).

