Differential entropy
Differential entropy (also referred to as continuous entropy) is a concept in information theory that attempts to extend the idea of (Shannon) entropy, a measure of the average surprisal of a random variable, to continuous probability distributions.
Definition
Let X be a random variable with a probability density function f whose support is a set $\mathcal{X}$. The differential entropy h(X) or h(f) is defined as

$h(X) = -\int_{\mathcal{X}} f(x)\log f(x)\,dx .$
As with its discrete analog, the units of differential entropy depend on the base of the logarithm, which is usually 2 (i.e., the units are bits). See logarithmic units for logarithms taken in different bases. Related concepts such as joint, conditional differential entropy, and relative entropy are defined in a similar fashion. One must take care in trying to apply properties of discrete entropy to differential entropy, since probability density functions can be greater than 1. For example, Uniform(0,1/2) has negative differential entropy $\int_0^{1/2} -2\log 2\,dx = -\log 2$.
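The definition can be checked numerically for simple densities. The following is a minimal sketch (not part of the article) assuming NumPy and SciPy are available; it evaluates $-\int f \log f$ with natural logarithms (nats) for the Uniform(0,1/2) example above and for a standard normal, and compares with the closed forms.

```python
# Illustrative sketch: numerically evaluate h(X) = -integral f(x) ln f(x) dx
# for two simple densities and compare with their closed forms (in nats).
import numpy as np
from scipy.integrate import quad

def differential_entropy(pdf, a, b):
    """Numerically integrate -f(x) * ln f(x) over the support [a, b]."""
    integrand = lambda x: -pdf(x) * np.log(pdf(x)) if pdf(x) > 0 else 0.0
    value, _ = quad(integrand, a, b)
    return value

# Uniform(0, 1/2): f(x) = 2 on (0, 1/2), so h = -ln 2 (negative).
uniform_pdf = lambda x: 2.0
print(differential_entropy(uniform_pdf, 0.0, 0.5), -np.log(2))

# Standard normal: closed form is (1/2) ln(2*pi*e).
normal_pdf = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
print(differential_entropy(normal_pdf, -10, 10), 0.5 * np.log(2 * np.pi * np.e))
```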
The definition of differential entropy above can be obtained by partitioning the range of X into bins of length Δ with associated sample points iΔ within the bins, for X Riemann integrable. This gives a quantized version of X, defined by $X_\Delta = i\Delta$ if $i\Delta \le X < (i+1)\Delta$. Then the entropy of $X_\Delta$ is

$H(X_\Delta) = -\sum_i f(i\Delta)\Delta \log\!\big(f(i\Delta)\Delta\big) = -\sum_i f(i\Delta)\Delta \log f(i\Delta) - \sum_i f(i\Delta)\Delta \log \Delta .$

The first term approximates the differential entropy, while the second term is approximately $-\log\Delta$ (since $\sum_i f(i\Delta)\Delta \approx \int f(x)\,dx = 1$). Note that this procedure suggests that the differential entropy of a discrete random variable should be $-\infty$.
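The relation $H(X_\Delta) \approx h(X) - \log\Delta$ can also be illustrated numerically. Below is a rough sketch (not from the article) assuming NumPy; it bins a standard normal density, whose differential entropy is $\tfrac12\ln(2\pi e)$ nats, and shows that $H(X_\Delta) + \log\Delta$ approaches h(X) as Δ shrinks.

```python
# Illustrative sketch: check that H(X_Delta) + ln(Delta) approaches h(X)
# for a standard normal as the bin width Delta shrinks (all in nats).
import numpy as np

def quantized_entropy(pdf, delta, lo=-12.0, hi=12.0):
    """Discrete entropy of X_Delta, with bin probabilities p_i ~ f(i*Delta)*Delta."""
    grid = np.arange(lo, hi, delta)
    p = pdf(grid) * delta
    p = p[p > 0]
    p = p / p.sum()              # renormalize away the truncated tail mass
    return -np.sum(p * np.log(p))

normal_pdf = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
h_exact = 0.5 * np.log(2 * np.pi * np.e)

for delta in (0.5, 0.1, 0.01):
    H_delta = quantized_entropy(normal_pdf, delta)
    print(delta, H_delta + np.log(delta), h_exact)   # the sum approaches h(X)
```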
Note that the continuous mutual information I(X;Y) has the distinction of retaining its fundamental significance as a measure of discrete information since it is actually the limit of the discrete mutual information of partitions of X and Y as these partitions become finer and finer. Thus it is invariant under linear transformations of X and Y, and still represents the amount of discrete information that can be transmitted over a channel that admits a continuous space of values.[1]
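This limiting behaviour can be illustrated with a plug-in (histogram) estimate of the mutual information of a correlated Gaussian pair, for which the closed form is $-\tfrac12\ln(1-\rho^2)$. The sketch below is not from the article and assumes NumPy; the sample size, correlation, and bin counts are arbitrary, and the histogram estimator is biased for finite samples, so the agreement is only approximate.

```python
# Illustrative sketch: discrete mutual information of finer and finer
# partitions of a correlated Gaussian pair, compared with -0.5*ln(1 - rho^2),
# and recomputed after an (affine) rescaling of X to illustrate invariance.
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 500_000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

def plugin_mi(x, y, bins):
    """Discrete mutual information of the binned (partitioned) pair, in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))

print("closed form:", -0.5 * np.log(1 - rho**2))
for bins in (8, 16, 32):
    # the second call uses the linearly transformed X; the estimate is unchanged
    print(bins, plugin_mi(x, y, bins), plugin_mi(3 * x + 1, y, bins))
```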
Properties of differential entropy
- For two densities f and g, $D_{KL}(f \,\|\, g) = \int f \log \frac{f}{g} \ge 0$ with equality if f = g almost everywhere. Similarly, for two random variables X and Y, $I(X;Y) \ge 0$ and $h(X|Y) \le h(X)$ with equality if and only if X and Y are independent.
- The chain rule for differential entropy holds as in the discrete case: $h(X_1, \ldots, X_n) = \sum_{i=1}^{n} h(X_i \mid X_1, \ldots, X_{i-1}) \le \sum_{i=1}^{n} h(X_i) .$
- Differential entropy is translation invariant, i.e., h(X + c) = h(X) for a constant c.
- Differential entropy is in general not invariant under arbitrary invertible maps. In particular, for a constant a, $h(aX) = h(X) + \log|a|$. For a vector-valued random variable X and an invertible matrix A, $h(AX) = h(X) + \log|\det A|$.
- In general, for a transformation from a random vector X to a random vector Y = m(X) of the same dimension, the corresponding entropies are related via $h(Y) \le h(X) + \int f(x) \log \left| \frac{\partial m}{\partial x} \right| dx$, where $\left| \frac{\partial m}{\partial x} \right|$ is the Jacobian of the transformation m.
- If a random vector $X \in \mathbb{R}^n$ has mean zero and covariance matrix K, then $h(X) \le \frac{1}{2} \log\!\big(\det(2\pi e K)\big) = \frac{1}{2} \log\!\big[(2\pi e)^n \det K\big]$ with equality if and only if X is jointly Gaussian (see the sketch after this list).
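The scaling property and the Gaussian maximum-entropy bound can be spot-checked numerically. The following is a minimal sketch (not from the article) assuming SciPy is installed; the distributions and parameter values are chosen arbitrarily for illustration, and SciPy's entropy() returns differential entropy in nats.

```python
# Illustrative sketch: (1) scaling, h(aX) = h(X) + ln|a|; (2) among zero-mean
# densities with variance sigma^2, entropy is at most 0.5*ln(2*pi*e*sigma^2).
import numpy as np
from scipy import stats

# (1) Scaling a uniform variable by a = 3: entropies differ by ln 3.
a = 3.0
h_u = stats.uniform(loc=0, scale=1).entropy()        # h(X) for Uniform(0,1)
h_au = stats.uniform(loc=0, scale=a).entropy()       # h(aX), i.e. Uniform(0,3)
print(h_au - h_u, np.log(a))

# (2) Unit-variance Laplace and uniform both fall below the Gaussian bound.
bound = 0.5 * np.log(2 * np.pi * np.e)
h_laplace = stats.laplace(scale=1 / np.sqrt(2)).entropy()                 # variance 2b^2 = 1
h_uniform = stats.uniform(loc=-np.sqrt(3), scale=2 * np.sqrt(3)).entropy()  # variance 1
print(h_laplace, h_uniform, bound)
```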
Example: Exponential distribution
Let X be an exponentially distributed random variable with parameter λ, that is, with probability density function

$f(x) = \lambda e^{-\lambda x} \quad \text{for } x \ge 0.$

Its differential entropy is then

$h_e(X) = -\int_0^\infty \lambda e^{-\lambda x} \log\!\left(\lambda e^{-\lambda x}\right) dx$
$= -\left( \int_0^\infty \lambda e^{-\lambda x} \log\lambda \, dx + \int_0^\infty \lambda e^{-\lambda x} (-\lambda x) \, dx \right)$
$= -\log \lambda \int_0^\infty f(x)\,dx + \lambda E[X]$
$= -\log\lambda + 1 .$
Here, $h_e(X)$ was used rather than h(X) to make it explicit that the logarithm was taken to base e, to simplify the calculation.
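The result $h_e(X) = 1 - \ln\lambda$ is easy to verify numerically. Below is a minimal sketch (not from the article) assuming SciPy; the rate value is arbitrary, and the second check is a Monte Carlo average of $-\ln f(X)$.

```python
# Illustrative sketch: check h(X) = 1 - ln(lambda) for the exponential
# distribution, via SciPy's closed-form entropy() and via Monte Carlo.
import numpy as np
from scipy import stats

lam = 2.5
dist = stats.expon(scale=1 / lam)        # SciPy parameterizes by scale = 1/lambda

print(dist.entropy(), 1 - np.log(lam))   # closed form from the derivation above

samples = dist.rvs(size=200_000, random_state=0)
print(np.mean(-np.log(lam) + lam * samples))   # E[-ln f(X)] = -ln(lam) + lam*E[X]
```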
Differential entropies for various distributions
In the table below, $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$ (the gamma function), $\psi(x) = \frac{d}{dx} \ln \Gamma(x)$ (the digamma function), $B(p,q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}$ (the beta function), and γ is Euler's constant.
| Distribution Name | Probability density function (pdf) | Entropy in nats |
|---|---|---|
| Uniform | $f(x) = \frac{1}{b-a}$ for $a \le x \le b$ | $\ln(b - a)$ |
| Normal | $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$ | $\frac{1}{2} \ln\!\left(2\pi e \sigma^2\right)$ |
| Exponential | $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ | $1 - \ln\lambda$ |
| Rayleigh | $f(x) = \frac{x}{\sigma^2} e^{-\frac{x^2}{2\sigma^2}}$ for $x \ge 0$ | $1 + \ln\frac{\sigma}{\sqrt{2}} + \frac{\gamma}{2}$ |
| Beta | $f(x) = \frac{x^{p-1}(1-x)^{q-1}}{B(p,q)}$ for $0 \le x \le 1$ | $\ln B(p,q) - (p-1)[\psi(p) - \psi(p+q)] - (q-1)[\psi(q) - \psi(p+q)]$ |
| Cauchy | $f(x) = \frac{\lambda}{\pi(\lambda^2 + x^2)}$ | $\ln(4\pi\lambda)$ |
| Chi | $f(x) = \frac{2}{2^{n/2}\Gamma(n/2)}\, x^{n-1} e^{-\frac{x^2}{2}}$ for $x \ge 0$ | $\ln\frac{\Gamma(n/2)}{\sqrt{2}} - \frac{n-1}{2}\psi\!\left(\frac{n}{2}\right) + \frac{n}{2}$ |
| Chi-squared | $f(x) = \frac{1}{2^{n/2}\Gamma(n/2)}\, x^{\frac{n}{2}-1} e^{-\frac{x}{2}}$ for $x \ge 0$ | $\ln\!\left(2\Gamma\!\left(\frac{n}{2}\right)\right) + \left(1 - \frac{n}{2}\right)\psi\!\left(\frac{n}{2}\right) + \frac{n}{2}$ |
| Erlang | $f(x) = \frac{\lambda^n}{(n-1)!}\, x^{n-1} e^{-\lambda x}$ for $x \ge 0$ | $(1-n)\psi(n) + \ln\frac{\Gamma(n)}{\lambda} + n$ |
| F | $f(x) = \frac{n_1^{n_1/2}\, n_2^{n_2/2}}{B\!\left(\frac{n_1}{2},\frac{n_2}{2}\right)}\, \frac{x^{\frac{n_1}{2}-1}}{(n_2 + n_1 x)^{\frac{n_1+n_2}{2}}}$ for $x \ge 0$ | $\ln\!\left(\frac{n_2}{n_1} B\!\left(\frac{n_1}{2},\frac{n_2}{2}\right)\right) + \left(1 - \frac{n_1}{2}\right)\psi\!\left(\frac{n_1}{2}\right) - \left(1 + \frac{n_2}{2}\right)\psi\!\left(\frac{n_2}{2}\right) + \frac{n_1+n_2}{2}\psi\!\left(\frac{n_1+n_2}{2}\right)$ |
| Gamma | $f(x) = \frac{x^{k-1} e^{-\frac{x}{\theta}}}{\theta^k \Gamma(k)}$ for $x \ge 0$ | $\ln(\theta\,\Gamma(k)) + (1-k)\psi(k) + k$ |
| Laplace | $f(x) = \frac{1}{2b} e^{-\frac{\vert x-\mu\vert}{b}}$ | $1 + \ln(2b)$ |
| Logistic | $f(x) = \frac{e^{-x}}{(1+e^{-x})^2}$ | $2$ |
| Lognormal | $f(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$ for $x > 0$ | $\mu + \frac{1}{2}\ln\!\left(2\pi e \sigma^2\right)$ |
| Maxwell-Boltzmann | $f(x) = \frac{1}{a^3}\sqrt{\frac{2}{\pi}}\, x^{2} e^{-\frac{x^2}{2a^2}}$ for $x \ge 0$ | $\ln\!\left(a\sqrt{2\pi}\right) + \gamma - \frac{1}{2}$ |
| Generalized normal | $f(x) = \frac{2\beta^{\frac{\alpha}{2}}}{\Gamma\!\left(\frac{\alpha}{2}\right)}\, x^{\alpha-1} e^{-\beta x^2}$ for $x \ge 0$ | $\ln\frac{\Gamma(\alpha/2)}{2\beta^{1/2}} - \frac{\alpha-1}{2}\psi\!\left(\frac{\alpha}{2}\right) + \frac{\alpha}{2}$ |
| Pareto | $f(x) = \frac{\alpha k^\alpha}{x^{\alpha+1}}$ for $x \ge k$ | $\ln\frac{k}{\alpha} + 1 + \frac{1}{\alpha}$ |
| Student's t | $f(x) = \frac{(1 + x^2/n)^{-\frac{n+1}{2}}}{\sqrt{n}\,B\!\left(\frac{1}{2},\frac{n}{2}\right)}$ | $\frac{n+1}{2}\left[\psi\!\left(\frac{n+1}{2}\right) - \psi\!\left(\frac{n}{2}\right)\right] + \ln\!\left(\sqrt{n}\,B\!\left(\frac{1}{2},\frac{n}{2}\right)\right)$ |
| Triangular | $f(x) = \begin{cases} \frac{2(x-a)}{(b-a)(c-a)} & a \le x \le c \\ \frac{2(b-x)}{(b-a)(b-c)} & c < x \le b \end{cases}$ | $\frac{1}{2} + \ln\frac{b-a}{2}$ |
| Weibull | $f(x) = \frac{k}{\lambda^k}\, x^{k-1} e^{-\left(\frac{x}{\lambda}\right)^k}$ for $x \ge 0$ | $\frac{(k-1)\gamma}{k} + \ln\frac{\lambda}{k} + 1$ |
| Multivariate normal | $f(\mathbf{x}) = \frac{1}{(2\pi)^{N/2} (\det\Sigma)^{1/2}} \exp\!\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$ | $\frac{1}{2}\ln\!\left[(2\pi e)^N \det\Sigma\right]$ |
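Several of the closed forms in the table can be spot-checked against SciPy, whose continuous distributions expose a differential entropy method (in nats). The sketch below is not from the article; the parameter values are arbitrary, chosen only for illustration.

```python
# Illustrative sketch: compare a few table entries with SciPy's entropy().
import numpy as np
from scipy import stats
from scipy.special import gamma as Gamma, digamma

sigma, lam, b, k, theta = 1.7, 2.0, 0.8, 3.5, 1.3
gamma_e = np.euler_gamma   # Euler's constant

checks = [
    ("Normal",      stats.norm(scale=sigma).entropy(),         0.5 * np.log(2 * np.pi * np.e * sigma**2)),
    ("Exponential", stats.expon(scale=1 / lam).entropy(),      1 - np.log(lam)),
    ("Rayleigh",    stats.rayleigh(scale=sigma).entropy(),     1 + np.log(sigma / np.sqrt(2)) + gamma_e / 2),
    ("Laplace",     stats.laplace(scale=b).entropy(),          1 + np.log(2 * b)),
    ("Gamma",       stats.gamma(k, scale=theta).entropy(),     np.log(theta * Gamma(k)) + (1 - k) * digamma(k) + k),
    ("Weibull",     stats.weibull_min(k, scale=lam).entropy(), (k - 1) * gamma_e / k + np.log(lam / k) + 1),
]
for name, scipy_value, table_value in checks:
    print(f"{name:12s} {scipy_value:.6f} {table_value:.6f}")
```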
References
- ^ Fazlollah M. Reza (1961, 1994). An Introduction to Information Theory. Dover Publications, Inc., New York. ISBN 0-486-68210-2.
- Thomas M. Cover, Joy A. Thomas. Elements of Information Theory. New York: Wiley, 1991. ISBN 0-471-06259-6.
- Lazo, A. and P. Rathie. On the entropy of continuous probability distributions. IEEE Transactions on Information Theory, 1978. 24(1): pp. 120-122.





![= -\log \lambda \int_0^\infty f(x)\,dx + \lambda E[X]](../../../../math/1/1/1/111f140bd026b08482d9840334508dfd.png)

for 







for 
![\ln B(p,q) - (p-1)[\psi(p) - \psi(p + q)] - (q-1)[\psi(q) - \psi(p + q)] \,](../../../../math/a/b/5/ab553119cbe12c724fa1955f79fbbc44.png)

































