Multinomial distribution

Multinomial
Parameters: n > 0, number of trials (integer); p_1, \ldots, p_k, event probabilities (\sum_{i=1}^k p_i = 1)
Support: X_i \in \{0,\dots,n\}, with \sum_{i=1}^k X_i = n
Probability mass function (pmf): \frac{n!}{x_1!\cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}
Mean: \operatorname{E}(X_i) = n p_i
Variance: \operatorname{var}(X_i) = n p_i (1 - p_i); \operatorname{cov}(X_i,X_j) = -n p_i p_j (i \neq j)
Moment-generating function (mgf): \left( \sum_{i=1}^k p_i e^{t_i} \right)^n

In probability theory, the multinomial distribution is a generalization of the binomial distribution.

The binomial distribution is the probability distribution of the number of "successes" in n independent Bernoulli trials, with the same probability of "success" on each trial. In a multinomial distribution, there are n independent trials, and each trial results in exactly one of a fixed finite number k of possible outcomes, with probabilities p1, ..., pk (so that pi ≥ 0 for i = 1, ..., k and \sum_{i=1}^k p_i = 1). Let the random variable Xi count the number of times outcome i is observed over the n trials. The vector X=(X_1,\ldots,X_k) then follows a multinomial distribution with parameters n and p, where p = (p1, ..., pk).
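The definition above can be illustrated by simulating the n trials directly and tallying the outcomes. This is only a minimal sketch: the function name `sample_multinomial` and the example parameters are assumptions made for illustration, not part of the article.

```python
import random
from collections import Counter

def sample_multinomial(n, probs, rng=random):
    """Draw one multinomial count vector by running n independent trials.

    Each trial picks one of k = len(probs) outcomes with the given
    probabilities; the result lists how often each outcome occurred.
    """
    # One categorical draw per trial (outcome labels are 0 .. k-1).
    outcomes = rng.choices(range(len(probs)), weights=probs, k=n)
    tally = Counter(outcomes)
    # Outcomes that never occurred get a count of 0.
    return [tally[i] for i in range(len(probs))]

# Hypothetical example: 10 trials, 3 outcomes.
rng = random.Random(0)
counts = sample_multinomial(10, [0.2, 0.3, 0.5], rng)
print(counts, sum(counts))  # the counts always sum to n = 10
```

Whatever the random draws are, the returned vector has k entries, each between 0 and n, and the entries sum to n.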

Specification

Probability mass function

The probability mass function of the multinomial distribution is:

\begin{align}
f(x_1,\ldots,x_k;n,p_1,\ldots,p_k) &= \Pr(X_1 = x_1 \mbox{ and } \dots \mbox{ and } X_k = x_k) \\
&= \begin{cases} \displaystyle \frac{n!}{x_1!\cdots x_k!}\, p_1^{x_1}\cdots p_k^{x_k}, & \mbox{when } \sum_{i=1}^k x_i = n, \\
0 & \mbox{otherwise,} \end{cases}
\end{align}

for non-negative integers x1, ..., xk.
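The formula can be evaluated directly with integer factorials. The following is a minimal sketch; the function name `multinomial_pmf` and its argument order are assumptions chosen for this example.

```python
from math import factorial, prod

def multinomial_pmf(counts, n, probs):
    """Evaluate f(x_1, ..., x_k; n, p_1, ..., p_k) for counts = (x_1, ..., x_k)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    # The pmf is zero unless the counts are non-negative and sum to n.
    if any(x < 0 for x in counts) or sum(counts) != n:
        return 0.0
    # Multinomial coefficient n! / (x_1! ... x_k!); each division is exact.
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    return coeff * prod(p ** x for p, x in zip(probs, counts))

# Example: n = 3 trials, one observation of each of three outcomes.
# The coefficient is 3!/(1! 1! 1!) = 6 and the product is 0.5 * 0.3 * 0.2.
print(multinomial_pmf((1, 1, 1), 3, (0.5, 0.3, 0.2)))
```

Up to floating-point rounding, the example evaluates to 6 × 0.03 = 0.18.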

Properties

The expected value is

\operatorname{E}(X_i) = n p_i.

The covariance matrix is as follows. Each diagonal entry is the variance of a binomially distributed random variable, and is therefore

\operatorname{var}(X_i)=np_i(1-p_i).

The off-diagonal entries are the covariances:

\operatorname{cov}(X_i,X_j)=-np_i p_j

for i, j distinct.

All covariances are negative because, for fixed n, an increase in one component of a multinomial vector requires a decrease in another component.

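The covariance matrix is a k × k nonnegative-definite matrix. When every pi is positive its rank is k − 1, because the components are constrained to sum to n.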
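The variance and covariance formulas above can be assembled into the full matrix. In this minimal sketch (the helper name `multinomial_covariance` is an assumption for illustration), every row sums to zero, which reflects the constraint that the components add up to n and is why the matrix cannot have full rank.

```python
def multinomial_covariance(n, probs):
    """Return the k x k covariance matrix as a list of rows.

    Diagonal entries are n p_i (1 - p_i); off-diagonal entries are -n p_i p_j.
    """
    k = len(probs)
    return [
        [
            n * probs[i] * (1 - probs[i]) if i == j else -n * probs[i] * probs[j]
            for j in range(k)
        ]
        for i in range(k)
    ]

# Hypothetical example parameters.
cov = multinomial_covariance(10, [0.2, 0.3, 0.5])
for row in cov:
    # Each row sums to zero up to floating-point rounding.
    print(row, sum(row))
```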

The off-diagonal entries of the corresponding correlation matrix are

\rho(X_i,X_j) = -\sqrt{\frac{p_i p_j}{ (1-p_i)(1-p_j)}}.

Note that the sample size drops out of this expression.
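To see why n cancels, the correlation can be computed from the covariance and the two variances and compared with the closed form. This is a small numerical check; the function names are assumptions made for the example.

```python
from math import sqrt

def correlation_from_moments(n, p_i, p_j):
    """cov(X_i, X_j) / sqrt(var(X_i) var(X_j)) for distinct i and j."""
    cov = -n * p_i * p_j
    var_i = n * p_i * (1 - p_i)
    var_j = n * p_j * (1 - p_j)
    # The factor n in the numerator cancels sqrt(n * n) in the denominator.
    return cov / sqrt(var_i * var_j)

def correlation_closed_form(p_i, p_j):
    """Closed-form correlation, which does not involve n."""
    return -sqrt(p_i * p_j / ((1 - p_i) * (1 - p_j)))

for n in (1, 10, 1000):
    print(n, correlation_from_moments(n, 0.2, 0.3), correlation_closed_form(0.2, 0.3))
```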

Each of the k components separately has a binomial distribution with parameters n and pi, for the appropriate value of the subscript i.
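The binomial marginal can be checked numerically by summing the joint pmf over the other components. The sketch below does this for k = 3 only; the helper names and the example parameters are assumptions for illustration.

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, n, probs):
    """Joint pmf; zero unless the non-negative counts sum to n."""
    if any(x < 0 for x in counts) or sum(counts) != n:
        return 0.0
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    return coeff * prod(p ** x for p, x in zip(probs, counts))

def marginal_of_first(x, n, probs):
    """Pr(X_1 = x) for k = 3, summing over every (x_2, x_3) with x + x_2 + x_3 = n."""
    return sum(
        multinomial_pmf((x, x2, n - x - x2), n, probs) for x2 in range(n - x + 1)
    )

def binomial_pmf(x, n, p):
    """Binomial pmf with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, probs = 5, (0.2, 0.3, 0.5)
for x in range(n + 1):
    # The two columns agree up to floating-point rounding.
    print(x, marginal_of_first(x, n, probs), binomial_pmf(x, n, probs[0]))
```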

The support of the multinomial distribution is the set \{(n_1,\dots,n_k)\in \mathbb{N}^{k} \mid n_1+\cdots+n_k=n\}, where \mathbb{N} includes 0. Its number of elements is

{n+k-1 \choose k-1} = {n+k-1 \choose n} = \left(\!\!{k \choose n}\!\!\right),

the number of n-combinations of a multiset with k types, or multiset coefficient.
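The size of the support can be checked by brute-force enumeration against the binomial coefficient C(n + k − 1, k − 1). This is a minimal sketch; the generator name `support` is an assumption for the example.

```python
from math import comb

def support(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        # Fix the first component, then distribute the remainder over k - 1 slots.
        for rest in support(n - first, k - 1):
            yield (first,) + rest

for n, k in [(1, 3), (4, 3), (5, 4)]:
    size = sum(1 for _ in support(n, k))
    print(n, k, size, comb(n + k - 1, k - 1))
```

For n = 1 and k = 3 the support is {(1,0,0), (0,1,0), (0,0,1)}, which has 3 = C(3, 2) elements.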
