User:Orimosenzon/notes


Expectation

Expectation according to the joint distribution equals expectation according to the single (marginal) distribution


E_{p(x_1,x_2)}(X_1) = \sum_{x_1,x_2}{p(x_1,x_2)\,x_1} = \sum_{x_1} \sum_{x_2} p(x_1)p(x_2|x_1)\,x_1 = \sum_{x_1}p(x_1)\,x_1 \sum_{x_2}p(x_2|x_1) = \sum_{x_1}p(x_1)\,x_1 \cdot 1 = E_{p(x_1)}(X_1)

Hence:

E_{p(x_1,x_2)}(X_1) = E_{p(x_1)}(X_1)

Also:

E_{p(x_1,x_2)}(f(X_1)) = E_{p(x_1)}(f(X_1))
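A quick numeric check (a worked example of my own, not part of the original derivation): let p(x_1,x_2) on \{0,1\}^2 be p(0,0)=0.1, p(0,1)=0.3, p(1,0)=0.2, p(1,1)=0.4. Then

E_{p(x_1,x_2)}(X_1) = 0.1 \cdot 0 + 0.3 \cdot 0 + 0.2 \cdot 1 + 0.4 \cdot 1 = 0.6

while the marginal gives p(x_1 = 1) = 0.2 + 0.4 = 0.6, so E_{p(x_1)}(X_1) = 0.6 as well.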

Linearity

E(X_1+X_2) = \sum_{x_1,x_2}{p(x_1,x_2)(x_1+x_2)} = \sum_{x_1,x_2}{p(x_1,x_2)x_1}+\sum_{x_1,x_2}{p(x_1,x_2)x_2} = \sum_{x_1}{p(x_1)x_1}+\sum_{x_2}{p(x_2)x_2} = E(X_1)+E(X_2)

(the third equality marginalizes over the other variable, as in the previous section)


hence:

E(X_1 + X_2) = E(X_1) + E(X_2)

E(λX) = \sum_{x}{p(x)λx} = λ\sum_{x}{p(x)x} = λE(X)

hence:

E(λX) = λE(X)
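A worked example of both rules (my own numbers): for a fair six-sided die X, E(X) = (1+2+3+4+5+6)/6 = 3.5. For two dice, linearity gives E(X_1 + X_2) = 3.5 + 3.5 = 7 whether or not the dice are independent (independence was never used above), and E(2X) = 2 \cdot 3.5 = 7.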

Variance and standard deviation

Definitions

V(X) =def= E([X − E(X)]^2)

 \sigma(X) =def= \sqrt{V(X)}

The meaning of standard deviation

One way to look at the standard deviation is as an approximation of the "expected drift" from the expectation. The "expected drift" could be defined as:

ED(X) =def= E(|X − E(X)|)

This quantity is not easy to manipulate algebraically (the absolute value does not expand the way a square does), which is why the squared deviation is used instead.

Suppose that X can take only the two values k and −k (k > 0) and that E(X) = 0. Then:

V(X) =def= E([X − E(X)]^2) = E(X^2) = k^2

and

 \sigma(X) = \sqrt{V(X)} = k

and

ED(X) = E(|X − E(X)|) = E(|X|) = E(k) = k = σ(X)

V, σ and ED do not change when a constant is added, so any random variable X whose drifts all have the same absolute value k satisfies σ(X) = ED(X).

Whenever the drift values are not all the same, σ gives larger weight to the larger deviations (because of the squaring), while ED takes a plain average, so ED(X) ≤ σ(X). *todo*: show why (see the sketch below).
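A sketch of the *todo* (my own argument): applying Jensen's inequality to the convex function t \mapsto t^2 with Y = |X − E(X)| gives

(E(|X − E(X)|))^2 \le E(|X − E(X)|^2) = V(X)

so ED(X) \le σ(X), with equality exactly when |X − E(X)| is constant. A numeric example where they differ: let X take the values ±1 with probability 3/8 each and ±3 with probability 1/8 each, so E(X) = 0. Then ED(X) = \frac{3}{4} \cdot 1 + \frac{1}{4} \cdot 3 = 1.5, while V(X) = \frac{3}{4} \cdot 1 + \frac{1}{4} \cdot 9 = 3 and σ(X) = \sqrt{3} \approx 1.73 > 1.5.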

Example: Suppose you perform the following experiment: you flip a coin; if it lands heads you go 5 meters to the left, if it lands tails you go 5 meters to the right. The variance in this case is 25 and the standard deviation is 5. The expected drift is also 5 (all the drift values are equal). More on this example in the "More on the last result" section below.

Alternative definition of variance

V(X) =def= E((X − E(X))^2) = E(X^2 + E^2(X) − 2XE(X)) = E(X^2) + E^2(X) − 2E^2(X) = E(X^2) − E^2(X)

hence:

V(X) = E(X^2) − E^2(X)
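A quick check on a fair die (my own numbers): E(X) = 3.5 and E(X^2) = (1+4+9+16+25+36)/6 = \frac{91}{6}, so V(X) = \frac{91}{6} − 3.5^2 = \frac{35}{12} \approx 2.92.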

Variance (and SD) do not change when adding a constant

V(X + c) = E([X + c − E(X + c)]^2) = E([X + c − E(X) − E(c)]^2) = E([X − E(X)]^2) = V(X) (using E(c) = c)

Variance of multiplication

V(λX) = E((λX)^2) − E^2(λX) = λ^2E(X^2) − λ^2E^2(X) = λ^2(E(X^2) − E^2(X)) = λ^2V(X)

hence:

V(λX) = λ^2V(X)

SD of multiplication

 \sigma(\lambda X) = \sqrt{V(\lambda X)} = \sqrt{\lambda^2V(X)} = |\lambda| \sqrt{V(X)} = |\lambda| \sigma(X)

(Note that \sqrt{\lambda^2} = |\lambda|, so a negative λ still scales σ by its magnitude; e.g. σ(−2X) = 2σ(X).)

hence:

σ(λX) = |λ|σ(X)

Variance of a sum of random variables

V(X_1 + X_2) = E((X_1 + X_2)^2) - E^2(X_1 + X_2) = E(X_1^2)+E(X_2^2)+2E(X_1 X_2) - (E^2(X_1) + E^2(X_2) + 2E(X_1)E(X_2)) = V(X_1)+V(X_2)+2Cov(X_1,X_2)

hence:

V(X_1 + X_2) = V(X_1) + V(X_2) + 2Cov(X_1,X_2)
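A quick check with a fully dependent pair (my own example): let X_1 be a fair coin taking the values 0 and 1, so V(X_1) = \frac{1}{4}, and let X_2 = X_1. Then Cov(X_1,X_2) = V(X_1) = \frac{1}{4} (see the covariance section below), and the formula gives V(X_1+X_2) = \frac{1}{4} + \frac{1}{4} + 2 \cdot \frac{1}{4} = 1, matching V(2X_1) = 4V(X_1) = 1.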

When X_1 and X_2 are independent, Cov(X_1,X_2) = 0 and hence:

X_1, X_2 independent  \Rightarrow V(X_1+X_2) = V(X_1)+V(X_2)

When X_1 and X_2 are i.i.d. (independent and identically distributed), then:

X_1, X_2 i.i.d.  \Rightarrow V(X_1+X_2) = V(X_1)+V(X_2) = 2V(X_1)

Or more generally:

 X_1, X_2, \ldots, X_n i.i.d  \Rightarrow V(\sum_{i=1}^n{X_i}) = \sum_{i=1}^n V(X_i) = n V(X_1)

hence:

 X_1, X_2, \ldots, X_n i.i.d  \Rightarrow \sigma (\sum_{i=1}^n{X_i}) = \sqrt{V(\sum_{i=1}^n{X_i})} = \sqrt{ n V(X_1) } = \sqrt{n} \cdot \sigma (X_1)


Note the difference from summing the variable with itself (identically distributed but not independent):

V(X_1 + X_1) = V(2X_1) = 4V(X_1)

and

σ(X_1 + X_1) = σ(2X_1) = 2σ(X_1)
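A minimal simulation sketch of these two cases (my own illustration; the seed and trial count are arbitrary), using a fair ±1 variable so that V(X_1) = 1:

// Checks V(X1+X2) = 2V(X1) for independent copies versus V(X1+X1) = 4V(X1).
// X1 is a fair +-1 coin, so V(X1) = 1.
#include <iostream>
#include <random>

int main() {
    std::mt19937 gen(42);                      // fixed seed for reproducibility
    std::bernoulli_distribution coin(0.5);
    const int trials = 1000000;

    double sum_indep = 0, sumsq_indep = 0;     // accumulators for X1 + X2
    double sum_self = 0, sumsq_self = 0;       // accumulators for X1 + X1

    for (int i = 0; i < trials; ++i) {
        int x1 = coin(gen) ? 1 : -1;           // first fair +-1 variable
        int x2 = coin(gen) ? 1 : -1;           // independent copy
        double s_indep = x1 + x2;
        double s_self = x1 + x1;
        sum_indep += s_indep;  sumsq_indep += s_indep * s_indep;
        sum_self  += s_self;   sumsq_self  += s_self * s_self;
    }

    // sample variance via V(S) = E(S^2) - E^2(S)
    double m1 = sum_indep / trials, v_indep = sumsq_indep / trials - m1 * m1;
    double m2 = sum_self / trials,  v_self  = sumsq_self / trials - m2 * m2;

    std::cout << "V(X1+X2) ~ " << v_indep << "  (expected 2)\n";
    std::cout << "V(X1+X1) ~ " << v_self  << "  (expected 4)\n";
}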

More on the last result

We've shown that:

 X_1, X_2, \ldots, X_n i.i.d  \Rightarrow \sigma (\sum_{i=1}^n{X_i}) =  \sqrt{n} \cdot \sigma (X_1)

Why is this important?

σ is a measure of the expected drift. The last result shows that the expected drift grows only as the square root of the number of experiments (less than linearly), which means the mean drift per experiment tends to zero:

 \lim_{n\to\infty}\frac{\sigma(\sum_{i=1}^n X_i)}{n}  = 
\lim_{n\to\infty}\frac{\sqrt{n} \cdot \sigma (X_1)}{n} = 0

Recall the ±5 random walk example. Now suppose you repeat the process n times. What is the expected drift?

The standard deviation, which can be considered a measure of that drift, is:  \sqrt{n} \cdot 5

The mean drift per step is:  5 \frac{\sqrt{n}}{n} = \frac{5}{\sqrt{n}}

For example, for 10000 iterations, the mean drift is  5 \frac{\sqrt{10000}}{10000} = 0.05  meters: instead of 5 meters per step it is 5 centimeters. The total drift is only 500 meters instead of the maximal 50,000.
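A simulation sketch of this calculation (my own illustration; the seed and walk count are arbitrary), estimating the standard deviation of the final position and the mean drift per step:

// Simulates many +-5 random walks of n steps and compares the standard
// deviation of the final position with the predicted 5*sqrt(n).
#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 gen(7);
    std::bernoulli_distribution coin(0.5);
    const int n = 10000;       // steps per walk
    const int walks = 2000;    // number of repeated walks

    double sum = 0, sumsq = 0;
    for (int w = 0; w < walks; ++w) {
        long long pos = 0;
        for (int i = 0; i < n; ++i)
            pos += coin(gen) ? 5 : -5;         // one +-5 step
        sum += pos;
        sumsq += static_cast<double>(pos) * pos;
    }

    double mean = sum / walks;
    double sd = std::sqrt(sumsq / walks - mean * mean);
    std::cout << "estimated sigma: " << sd
              << "  predicted 5*sqrt(n) = " << 5 * std::sqrt(n) << "\n";
    std::cout << "mean drift per step: " << sd / n << "  (predicted 0.05)\n";
}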

  • *todo*: add a gnuplot picture of the ±5 random walk example; the relation to the law of large numbers; is the fact that the frequency ratio converges an assumption of probability theory or a result?

Misc

 V(X) = 0 \Leftrightarrow E([X-E(X)]^2) = 0 \Leftrightarrow \forall x with p(x) > 0: x-E(X) = 0 \Leftrightarrow X is constant.

hence:

 V(X) = 0 \Leftrightarrow X is constant.

Covariance

Alternative definition

Cov(X_1,X_2) = E((X_1 − E(X_1))(X_2 − E(X_2))) = E(X_1X_2) + E(X_1)E(X_2) − 2E(X_1)E(X_2) = E(X_1X_2) − E(X_1)E(X_2)

hence:

Cov(X_1,X_2) = E(X_1X_2) − E(X_1)E(X_2)

A special case is the covariance of a random variable with itself: Cov(X,X) = E(X \cdot X) − E(X)E(X) = E(X^2) − E^2(X) = V(X)

Covariance of independent variables

Assume that X_1 and X_2 are independent:

 E(X_1 X_2) = \sum_{x_1,x_2}{p(x_1,x_2)x_1 x_2} = \sum_{x_1,x_2}{p(x_1) p(x_2)x_1 x_2} = \sum_{x_1} p(x_1) x_1 \sum_{x_2} p(x_2)x_2 = E(X_1) E(X_2)

And hence:

X1,X2 independent  \implies Cov(X_1,X_2) = 0

The converse is not true, however. For example, let X be uniform on \{−1, 0, 1\} and Y = X^2. Then E(XY) = E(X^3) = 0 and E(X) = 0, so

Cov(X,Y) = E(XY) − E(X)E(Y) = 0

But of course, X and Y are very much dependent: Y is completely determined by X.

Wiener processes

(also known as "Brownian motion")

Let Z be a stochastic process with the following properties:

1. The change δZ in a small period of time δt is

 \delta Z = \epsilon \cdot \sqrt{\delta t}

where ε \sim φ(0,1), i.e. ε is drawn from a standardized normal distribution.

2. The values of δZ for any two different short intervals of time δt are independent.
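A minimal sketch of simulating such a process by discretization (my own illustration; the step size dt = 0.01 and horizon T = 1 are arbitrary choices):

// Discretized Wiener process: repeatedly apply dZ = eps * sqrt(dt)
// with eps drawn from a standard normal distribution.
#include <cmath>
#include <iostream>
#include <random>

int main() {
    std::mt19937 gen(123);
    std::normal_distribution<double> eps(0.0, 1.0);  // standard normal draws

    double dt = 0.01;          // small time step
    double z = 0.0;            // Z starts at 0
    for (double t = 0.0; t < 1.0; t += dt)
        z += eps(gen) * std::sqrt(dt);               // dZ = eps * sqrt(dt)

    // After time T = 1, Z is approximately N(0, T), so sd of Z(1) is 1.
    std::cout << "Z(1) = " << z << "\n";
}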

Summary

Expectation

  • E_{p(x_1,x_2)}(X_1) = E_{p(x_1)}(X_1)
  • E(X_1 + X_2) = E(X_1) + E(X_2)
  • E(λX) = λE(X)

Variance and standard deviation

  • V(X) = E(X^2) − E^2(X)
  • V(λX) = λ^2V(X)
  • σ(λX) = |λ|σ(X)
  • V(X_1 + X_2) = V(X_1) + V(X_2) + 2Cov(X_1,X_2)
  • X_1, X_2 independent  \Rightarrow V(X_1+X_2) = V(X_1)+V(X_2)
  •  X_1, X_2, \ldots, X_n i.i.d  \Rightarrow V(\sum_{i=1}^n{X_i}) = n V(X_1)
  •  X_1, X_2, \ldots, X_n i.i.d  \Rightarrow  \sigma (\sum_{i=1}^n{X_i}) = \sqrt{n} \cdot \sigma (X_1)

Covariance

  • Cov(X_1,X_2) = E(X_1X_2) − E(X_1)E(X_2)


Misc

#include <iostream>

int main() {
  std::cout << "hello lord\n";
}