Decoding methods

From Wikipedia, the free encyclopedia


In communication theory and coding theory, decoding is the process of translating received messages into codewords of a given code. This article discusses common methods of mapping messages to codewords. These methods are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.

1 Notation
2 Ideal observer decoding
- 2.1 Decoding conventions
3 Maximum likelihood decoding
4 Minimum distance decoding
5 Syndrome decoding
6 See also
7 References

Notation

Henceforth $C \subset \mathbb{F}_2^n$ shall be a code of length $n$ ; $x, y$ shall be elements of $\mathbb{F}_2^n$ ; and $d (x, y)$ shall represent the Hamming distance between $x, y$ . Note that $C$ is not necessarily linear.
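As an illustration of this notation, the following minimal Python sketch represents words of $\mathbb{F}_2^n$ as tuples of 0s and 1s and computes the Hamming distance $d(x, y)$. The function name `hamming_distance` and the small example code `C` are assumptions made for illustration only; they are not part of the definitions above.

```python
def hamming_distance(x, y):
    """Number of positions at which the equal-length words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


# A hypothetical code of length 3. It is not linear:
# (0,1,1) + (1,1,0) = (1,0,1), which is not in C.
C = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]

print(hamming_distance((0, 0, 0), (0, 1, 1)))  # 2
```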

Ideal observer decoding

Given a received message $x \in \mathbb{F}_2^n$ , ideal observer decoding picks a codeword $y \in C$ to maximise:

$\mathbb{P}(y \mbox{ sent} \mid x \mbox{ received})$

i.e. choose the codeword $y$ that is most likely to have been sent, given that the message $x$ was received.
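A minimal Python sketch of this rule follows. It assumes a binary symmetric channel with crossover probability `p` and a known prior distribution over codewords; the names `likelihood`, `ideal_observer_decode` and `prior` are illustrative choices, not part of the definition. By Bayes' theorem the posterior probability is proportional to the likelihood times the prior, because $\mathbb{P}(x \mbox{ received})$ does not depend on $y$.

```python
def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))


def likelihood(x, y, p):
    """P(x received | y sent), assuming a binary symmetric channel."""
    d = hamming_distance(x, y)
    return (1 - p) ** (len(x) - d) * p ** d


def ideal_observer_decode(x, prior, p):
    """Return the codeword y maximising P(y sent | x received).

    `prior` maps each codeword to P(y sent). The posterior is proportional
    to likelihood * prior, so the normalising constant is omitted.
    """
    return max(prior, key=lambda y: likelihood(x, y, p) * prior[y])


# Assumed example: a strongly skewed prior on the length-3 repetition code.
prior = {(0, 0, 0): 0.95, (1, 1, 1): 0.05}
print(ideal_observer_decode((1, 1, 0), prior, p=0.1))  # (0, 0, 0)
```

In this example the received word is closer to (1, 1, 1), yet the skewed prior makes (0, 0, 0) the more probable transmitted codeword.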

Decoding conventions

Note that the most likely codeword may not be unique: more than one codeword may attain the same maximal probability for the received message. In such a case, the sender and receiver(s) must agree on a decoding convention in advance. Popular conventions include:

- Request that the codeword be resent
- Choose any random codeword from the set of most likely codewords

Maximum likelihood decoding

Given a received message $x \in \mathbb{F}_2^n$, maximum likelihood decoding picks a codeword $y \in C$ to maximise:

$\mathbb{P}(x \mbox{ received} \mid y \mbox{ sent})$

i.e. choose the codeword $y$ that maximises the probability that $x$ was received, given that $y$ was sent. Note that if all codewords are equally likely to be sent during ordinary use, then this scheme is equivalent to ideal observer decoding. By Bayes' theorem:

$\begin{align} \mathbb{P}(x \mbox{ received} \mid y \mbox{ sent}) & {} = \frac{ \mathbb{P}(x \mbox{ received} , y \mbox{ sent}) }{\mathbb{P}(y \mbox{ sent} )} \\ & {} = \mathbb{P}(y \mbox{ sent} \mid x \mbox{ received}) \cdot \frac{\mathbb{P}(x \mbox{ received})}{\mathbb{P}(y \mbox{ sent})}. \end{align}$

Since $x$ is fixed and $\mathbb{P}(y \mbox{ sent})$ is the same for every codeword, the final factor does not depend on $y$, so the same codeword maximises both probabilities.

As with ideal observer decoding, a convention must be agreed to for non-unique decoding.
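The equivalence under a uniform prior can be checked numerically. The sketch below assumes a binary symmetric channel with crossover probability `p`; the helper names (`likelihood`, `ml_decode`, `ideal_observer_decode`) and the repetition code are illustrative assumptions.

```python
from itertools import product


def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))


def likelihood(x, y, p):
    """P(x received | y sent), assuming a binary symmetric channel."""
    d = hamming_distance(x, y)
    return (1 - p) ** (len(x) - d) * p ** d


def ml_decode(x, code, p):
    """Return the codeword y maximising P(x received | y sent)."""
    return max(code, key=lambda y: likelihood(x, y, p))


def ideal_observer_decode(x, prior, p):
    """Return the codeword y maximising P(y sent | x received)."""
    return max(prior, key=lambda y: likelihood(x, y, p) * prior[y])


code = [(0, 0, 0), (1, 1, 1)]                  # length-3 repetition code
uniform = {y: 1 / len(code) for y in code}     # all codewords equally likely

# Both decoders agree on every possible received word.
for x in product((0, 1), repeat=3):
    assert ml_decode(x, code, 0.1) == ideal_observer_decode(x, uniform, 0.1)

print(ml_decode((1, 1, 0), code, 0.1))  # (1, 1, 1)
```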

Minimum distance decoding

Given a received message $x \in \mathbb{F}_2^n$, minimum distance decoding picks a codeword $y \in C$ to minimise the Hamming distance:

$d(x,y) = \# \{i : x_i \not = y_i \}$

i.e. choose the codeword $y$ that is as close as possible to $x$ .

Note that if the probability of error $p$ on a discrete memoryless channel is strictly less than one half, then minimum distance decoding is equivalent to maximum likelihood decoding, since if

$d(x,y) = d,\,$

then:

$\begin{align} \mathbb{P}(x \mbox{ received} \mid y \mbox{ sent}) & {} = (1-p)^{n-d} \cdot p^d \\ & {} = (1-p)^n \cdot \left( \frac{p}{1-p}\right)^d \end{align}$

which (since $p$ is less than one half, so that $p/(1-p) < 1$) is maximised by minimising $d$.
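A short numerical check of this monotonicity, as a sketch: the function name `bsc_likelihood` and the values `n = 7`, `p = 0.1` are arbitrary illustrative assumptions. It evaluates $(1-p)^{n-d} \cdot p^d$ for every distance $d$ and confirms that the value strictly decreases as $d$ grows.

```python
def bsc_likelihood(n, d, p):
    """(1 - p)^(n - d) * p^d: probability of one specific pattern of d errors."""
    return (1 - p) ** (n - d) * p ** d


def is_strictly_decreasing(values):
    return all(a > b for a, b in zip(values, values[1:]))


n, p = 7, 0.1
values = [bsc_likelihood(n, d, p) for d in range(n + 1)]

# With p < 1/2, a smaller distance always means a larger likelihood.
assert is_strictly_decreasing(values)
```

If $p$ were greater than one half, the ordering would reverse and the most distant codeword would be the most likely one.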

Minimum distance decoding is also known as nearest neighbour decoding. It can be assisted or automated by using a standard array. Minimum distance decoding is a reasonable decoding method when the following conditions are met:

- The probability $p$ that an error occurs is independent of the position of the symbol
- Errors are independent events: an error at one position in the message does not affect other positions

These assumptions may be reasonable for transmissions over a binary symmetric channel. They may be unreasonable for other media, such as a DVD, where a single scratch on the disk can cause an error in many neighbouring symbols or codewords.

As with other decoding methods, a convention must be agreed to for non-unique decoding.
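A brute-force nearest neighbour decoder, including the two tie conventions listed earlier, might be sketched as follows. The function names and the `on_tie` parameter are hypothetical; returning `None` is simply one assumed way of signalling that a resend should be requested.

```python
import random


def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))


def nearest_codewords(x, code):
    """All codewords at minimum Hamming distance from x."""
    best = min(hamming_distance(x, y) for y in code)
    return [y for y in code if hamming_distance(x, y) == best]


def minimum_distance_decode(x, code, on_tie="resend"):
    """Decode x to its nearest codeword.

    On a tie, either pick a random nearest codeword (on_tie="random")
    or return None so that the caller can request retransmission.
    """
    candidates = nearest_codewords(x, code)
    if len(candidates) == 1:
        return candidates[0]
    if on_tie == "random":
        return random.choice(candidates)
    return None


code = [(0, 0, 0, 0), (1, 1, 1, 1)]  # length-4 repetition code
print(minimum_distance_decode((1, 0, 0, 0), code))  # (0, 0, 0, 0)
print(minimum_distance_decode((1, 1, 0, 0), code))  # None (tie)
```

This search compares the received word against every codeword, so its cost grows with $|C|$.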

Syndrome decoding

Syndrome decoding is a highly efficient method of decoding a linear code over a noisy channel, i.e. one on which errors are made. In essence, syndrome decoding is minimum distance decoding using a reduced lookup table. It is the linearity of the code which allows for the lookup table to be reduced in size.

Suppose that $C\subset \mathbb{F}_2^n$ is a linear code of length $n$ and minimum distance $d$ with parity-check matrix $H$ . Then clearly $C$ is capable of correcting up to

$t = \left\lfloor\frac{d-1}{2}\right\rfloor$

errors made by the channel (since if no more than $t$ errors are made then minimum distance decoding will still correctly decode the incorrectly transmitted codeword).
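For a small code, $d$ and $t$ can be computed by brute force. The sketch below compares all pairs of codewords; the function names and the repetition code used as input are illustrative assumptions.

```python
from itertools import combinations


def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))


def minimum_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


def correctable_errors(code):
    """t = floor((d - 1) / 2) for a code with minimum distance d."""
    return (minimum_distance(code) - 1) // 2


repetition = [(0,) * 5, (1,) * 5]      # length-5 repetition code, d = 5
print(minimum_distance(repetition))    # 5
print(correctable_errors(repetition))  # 2
```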

Now suppose that a codeword $x \in \mathbb{F}_2^n$ is sent over the channel and the error pattern $e \in \mathbb{F}_2^n$ occurs. Then $z = x + e$ is received. Ordinary minimum distance decoding would look up the vector $z$ in a table of size $|C|$ for the nearest match, i.e. an element (not necessarily unique) $c \in C$ with

$d(c,z) \leq d(y,z)$

for all $y \in C$ . Syndrome decoding takes advantage of the property of the parity-check matrix that:

$Hx = 0$

for all $x \in C$ . The syndrome of the received $z = x + e$ is defined to be:

$Hz = H(x + e) = Hx + He = 0 + He = He$
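The identity above can be checked on a concrete example. The sketch below assumes one common parity-check matrix for the binary [7,4] Hamming code, whose columns are the binary representations of 1 to 7; this matrix, the chosen codeword and the helper names `syndrome` and `add` are illustrative assumptions and do not come from the text above.

```python
def syndrome(H, z):
    """Compute H z over F_2, with H given as a list of rows."""
    return tuple(sum(h * b for h, b in zip(row, z)) % 2 for row in H)


def add(x, e):
    """Componentwise addition in F_2^n."""
    return tuple((a + b) % 2 for a, b in zip(x, e))


# Assumed parity-check matrix of the [7,4] Hamming code.
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]

x = (1, 1, 1, 0, 0, 0, 0)   # a codeword: H x = 0
e = (0, 0, 0, 0, 1, 0, 0)   # a single error in the fifth position
z = add(x, e)               # the received word

print(syndrome(H, x))  # (0, 0, 0)
print(syndrome(H, z))  # (1, 0, 1)
print(syndrome(H, e))  # (1, 0, 1), the same as the syndrome of z
```

With this particular matrix the syndrome of a single-bit error is the corresponding column of $H$, so reading (1, 0, 1) as the binary number 5 identifies the position of the error.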

Under the assumption that no more than $t$ errors were made during transmission the receiver looks up the value $H e$ in a table of size

$\sum_{i=0}^t \binom{n}{i} < |C|$

(for a binary code) against pre-computed values of $He$ for all error patterns $e \in \mathbb{F}_2^n$ of weight at most $t$. Knowing what $e$ is, it is then trivial to decode $x$ as: