Pseudorandom generator

Let G be a deterministic polynomial time function from N^<ω to N^<ω with stretch function

l: N → N,

so that if x has length n then G(x) has length l(n). Let G_n be the distribution on strings of length l(n) obtained by applying G to a string of length n chosen uniformly at random.

Then we say G is a pseudorandom generator if

{G_n}_{n ∈ N}

is pseudorandom.
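
For reference, the notion of pseudorandomness usually meant here is computational indistinguishability from the uniform distribution. The article does not spell this out, so the formulation below is an assumption about the intended definition, and the symbols D (a distinguisher), U_k (the uniform distribution on strings of length k) and p (a polynomial) are notation introduced only for this statement. For every probabilistic polynomial-time algorithm D, every polynomial p, and all sufficiently large n:

$\left| \Pr[D(G(U_n)) = 1] - \Pr[D(U_{l(n)}) = 1] \right| < \frac{1}{p(n)}\,$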

In effect, G translates a random input of length n to a pseudorandom output of length l(n). Assuming

l(n) > n,

this expands a random sequence. The expansion can be repeated, since G can be applied to its own output: G_n can be replaced by the distribution of G(G(x)).

Often, we are concerned not with the behavior of G on all strings, but only on strings of some prescribed length. This case allows a slightly easier definition:

A function $G_l: \left \{0,1\right\}^n \rightarrow \left \{0,1\right\}^m\,$ with $n < m\,$ is a pseudorandom generator if

$G_l\,$ can be computed in $poly(n)\,$ time, and
the distribution of $G_l(x)\,$, for x chosen uniformly at random from $\left \{0,1\right\}^n\,$, is pseudorandom.
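
To illustrate what the pseudorandomness condition rules out, the following Python sketch measures how well a specific test tells a generator's output apart from uniform strings. It is only an illustration: the names parity_generator, parity_distinguisher and advantage are hypothetical, and the generator is deliberately a bad one (it appends the parity of its seed), so a simple test distinguishes it easily.

```python
from itertools import product

def parity_generator(seed):
    """Toy map {0,1}^n -> {0,1}^(n+1): append the parity of the seed.
    Deliberately NOT pseudorandom; used only as a counterexample."""
    return tuple(seed) + (sum(seed) % 2,)

def parity_distinguisher(bits):
    """Output 1 iff the last bit equals the parity of the preceding bits."""
    return int(sum(bits[:-1]) % 2 == bits[-1])

def advantage(generator, distinguisher, n):
    """Exact distinguishing advantage, computed by brute-force enumeration
    of all seeds of length n and all strings of the output length."""
    seeds = list(product((0, 1), repeat=n))
    m = len(generator(seeds[0]))
    strings = list(product((0, 1), repeat=m))
    p_generated = sum(distinguisher(generator(s)) for s in seeds) / len(seeds)
    p_uniform = sum(distinguisher(u) for u in strings) / len(strings)
    return abs(p_generated - p_uniform)

print(advantage(parity_generator, parity_distinguisher, 4))
```

The test accepts every output of the toy generator but only half of all strings of length n+1, so the advantage is 1/2 for every n. A pseudorandom generator must keep this quantity negligibly small for every efficient test, not just this one.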

It is an open problem whether pseudorandom generators exist. It is known that if one-way functions or hard-core predicates exist, then pseudorandom generators exist. It is also known that if there is a pseudorandom generator with

l(n) > n,

there is some other pseudorandom generator with

l(n) > p(n)

for any polynomial p(n). This follows from the following theorem:

Theorem: If there is a pseudorandom generator

$G_l: \left \{0,1\right\}^{n} \rightarrow \left \{0,1\right\}^{n+1}\,$

then for any $m = poly(n)\,$, there is a pseudorandom generator

$G'_l: \left \{0,1\right\}^n \rightarrow \left \{0,1\right\}^m\,$
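
One standard way to obtain the longer output in this theorem is to iterate the one-bit-stretch generator: at each step, the first n output bits become the next state and the remaining bit is emitted. The Python sketch below shows only the shape of that iteration. The function toy_g is a hypothetical stand-in built from SHAKE-256; it is a heuristic placeholder, not a generator with any proven pseudorandomness, and the names toy_g and stretch are introduced here for illustration.

```python
import hashlib

def toy_g(seed_bits):
    """Hypothetical stand-in for a generator {0,1}^n -> {0,1}^(n+1).
    Hashes the seed with SHAKE-256; a heuristic placeholder, NOT a proven PRG."""
    n = len(seed_bits)
    data = bytes(seed_bits)  # one byte (value 0 or 1) per seed bit
    digest = hashlib.shake_256(data).digest((n + 1 + 7) // 8)
    bits = [(byte >> i) & 1 for byte in digest for i in range(8)]
    return bits[: n + 1]

def stretch(g, seed_bits, m):
    """Produce m output bits from a generator g that stretches by one bit.
    Each call to g yields n bits of new state plus one output bit."""
    state = list(seed_bits)
    n = len(state)
    output = []
    for _ in range(m):
        y = g(state)
        state, bit = y[:n], y[n]
        output.append(bit)
    return output

seed = [1, 0, 1, 1, 0, 0, 1, 0]
print(stretch(toy_g, seed, 20))
```

Because each output bit depends only on the state reached so far, asking for fewer bits yields a prefix of the longer output. The security argument (a hybrid argument over the m steps) is not reproduced here.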

Pseudorandom generators have numerous applications. In cryptography, a simple application is providing an efficient analog of the one-time pad. It is well known that in order to encrypt a message m so that the ciphertext provides no information about the plaintext, the key k must be uniformly random over strings of length |m|. Then m can be encrypted via $c=k\oplus m$. This scheme is very costly in terms of key length. Key length can be reduced if we give up perfect secrecy and settle for semantic security, that is, security against computationally bounded adversaries only. Given G with polynomial stretch $n^{c+1}$, a sequence of $n^c$ messages of length n can be encrypted by XORing each with the corresponding segment of G(k); this idea inspired stream ciphers.

Pseudorandom generators may also be used to construct symmetric-key cryptosystems in which any polynomial number of messages can be safely encrypted under the same key; that is, the polynomial $n^c$ need not be known a priori at the time of key generation. Such a construction can be based on a generalization of pseudorandom generators called pseudorandom functions. A family of pseudorandom functions (PRFs) is a collection of efficiently computable keyed functions that act randomly in the sense that no efficient algorithm can distinguish between an oracle for the function corresponding to a random key and an oracle for a truly random function. It is known that if PRGs exist, then so do PRFs (for more details see pseudorandom function).

One application of PRFs is in computational learning theory. Loosely speaking, given a sequence of examples $(x_1,f(x_1)),(x_2,f(x_2)),\ldots,(x_m,f(x_m))$, the goal is to efficiently find a succinct representation of a function f, out of a given class of functions, that is consistent with the examples. PRF families (if they exist) are a natural example of a class of functions that have small representation size but are not efficiently learnable.
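
The one-time-pad analog described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a recommended cipher: toy_prg is a hypothetical stand-in for G built from SHAKE-256 (a heuristic choice with no proof of pseudorandomness), the names toy_prg, xor_bytes and encrypt_all are made up for this example, and lengths are counted in bytes rather than bits.

```python
import hashlib

def toy_prg(key: bytes, out_len: int) -> bytes:
    """Hypothetical stand-in for G: expand a short key to out_len bytes.
    Uses SHAKE-256 as a heuristic placeholder, NOT a proven PRG."""
    return hashlib.shake_256(key).digest(out_len)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_all(key: bytes, messages: list) -> list:
    """XOR the i-th message with the i-th segment of G(key).
    All messages must have the same length. Applying the function
    again with the same key decrypts, since XOR is its own inverse."""
    block = len(messages[0])
    if any(len(m) != block for m in messages):
        raise ValueError("all messages must have the same length")
    pad = toy_prg(key, block * len(messages))
    return [xor_bytes(m, pad[i * block:(i + 1) * block])
            for i, m in enumerate(messages)]

key = b"short secret key"
messages = [b"attack at dawn!!", b"retreat at dusk!"]
ciphertexts = encrypt_all(key, messages)
print(encrypt_all(key, ciphertexts) == messages)
```

Each segment of G(k) is used exactly once. Reusing a segment for two messages would leak their XOR, which is why the number of messages must be bounded by the stretch of G in this scheme.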

Another application is in derandomizing algorithms. A nice pseudorandom generator is a pseudorandom generator,

$G:\{0,1\}^n\rightarrow\{0,1\}^m$

with

$n=O(\log m)\,$.

If a nice pseudorandom generator exists, then P=BPP. In fact, this strong derandomization result already follows from the existence of a weaker type of pseudorandom generator, a Nisan-Wigderson type generator with exponential stretch. Its definition weakens the definition of a PRG above in two essential ways. First, it allows $G_l$ to run in time exponential in n. Second, the output distribution is only required to be indistinguishable from uniform by circuits of size S'(n), for some fixed exponential S' which is smaller than S, rather than by every efficient distinguisher as in the definition above. It is easy to see that the existence of nice pseudorandom generators of this kind for some polynomial S(n) suffices to imply P=BPP, and that their existence follows from plausible hardness assumptions (that some problems in EXP do not have subexponential-size circuits).

In a nutshell, the idea is to replace the randomness used by a BPP algorithm A with G(s), where s is a short random string of length O(log n). By the pseudorandomness of G, the behavior of A on any given input x does not change much, so we can run A on G(s) for every seed s, count the number of 1s output, and answer according to the majority. The argument works because $A(x,\cdot)$ can be viewed as a non-uniform distinguisher of the appropriate size. For more details on this result and other derandomization results see BPP.
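
The seed-enumeration idea can be sketched in Python as follows. Everything here is illustrative: the randomized algorithm is Freivalds' check of a matrix product, chosen only as a convenient example, toy_generator is a hypothetical stand-in for G based on SHAKE-256 with no proven pseudorandomness, and the names toy_generator, freivalds, derandomize and ROUNDS are introduced for this sketch.

```python
import hashlib
from itertools import product

ROUNDS = 8  # independent repetitions inside the randomized algorithm

def toy_generator(seed_bits, m):
    """Hypothetical stand-in for G: {0,1}^d -> {0,1}^m.
    Uses SHAKE-256 as a heuristic placeholder, NOT a proven PRG."""
    digest = hashlib.shake_256(bytes(seed_bits)).digest((m + 7) // 8)
    bits = [(byte >> i) & 1 for byte in digest for i in range(8)]
    return bits[:m]

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def freivalds(x, r):
    """Randomized check that A*B == C, consuming ROUNDS*n bits from r.
    Returns 1 (accept) or 0 (reject); a wrong C passes one round
    with probability at most 1/2 over truly random bits."""
    A, B, C = x
    n = len(A)
    for k in range(ROUNDS):
        v = r[k * n:(k + 1) * n]
        if mat_vec(A, mat_vec(B, v)) != mat_vec(C, v):
            return 0
    return 1

def derandomize(algorithm, x, num_random_bits, seed_len):
    """Run the algorithm on G(s) for every seed s of length seed_len
    and return the majority answer. Takes 2**seed_len runs."""
    seeds = list(product((0, 1), repeat=seed_len))
    ones = sum(algorithm(x, toy_generator(s, num_random_bits)) for s in seeds)
    return int(2 * ones > len(seeds))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
good_C = [[2, 1], [4, 3]]  # equals A*B
bad_C = [[2, 1], [4, 5]]   # differs from A*B in one entry
print(derandomize(freivalds, (A, B, good_C), ROUNDS * 2, 6))
print(derandomize(freivalds, (A, B, bad_C), ROUNDS * 2, 6))
```

The loop over all seeds is what makes the procedure deterministic. Its cost is 2 to the power of the seed length, which is polynomial in the input size exactly when the seed has length O(log n).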

For more on these and other applications of PRGs, see chapters 10 and 17 of a draft of a book by Arora and Barak, available at http://www.cs.princeton.edu/theory/complexity/