Standard error (statistics)

The standard error of a method of measurement or estimation is the estimated standard deviation of the error in that method. Specifically, it estimates the standard deviation of the difference between the measured or estimated values and the true values. Notice that the true value of the standard deviation is usually unknown and the use of the term standard error carries with it the idea that an estimate of this unknown quantity is being used. It also carries with it the idea that it measures, not the standard deviation of the estimate itself, but the standard deviation of the error in the estimate, and these can be very different.

In applications where a standard error is used, it is desirable to take proper account of the fact that the standard error is only an estimate. Unfortunately this is not often possible, and it may then be better to use an approach that avoids the standard error, for example maximum likelihood or a more formal approach to deriving confidence intervals. One well-known case where a proper allowance can be made arises when Student's t-distribution is used to provide a confidence interval for an estimated mean or difference of means. In other cases, the standard error may usefully provide an indication of the size of the uncertainty, but its formal or semi-formal use to provide confidence intervals or tests should be avoided unless the sample size is at least moderately large; what counts as "large enough" depends on the particular quantities being analysed.
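As an illustration of the t-based interval just mentioned, the following sketch (not part of the original article) computes a 95% confidence interval for a mean using Student's t-distribution; it assumes NumPy and SciPy are available, and the sample values are hypothetical.

import numpy as np
from scipy import stats

sample = np.array([4.2, 5.1, 3.8, 4.9, 5.3, 4.4, 4.7, 5.0])  # hypothetical data
n = sample.size
mean = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)   # estimated standard error of the mean

# The t quantile with n - 1 degrees of freedom makes proper allowance for the
# fact that the standard error is itself only an estimate.
t_crit = stats.t.ppf(0.975, df=n - 1)
lower, upper = mean - t_crit * se, mean + t_crit * se
print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")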

Standard error of the mean

The standard error of the mean (SEM) of a sample from a population is the standard deviation of the sample (sample standard deviation) divided by the square root of the sample size (assuming statistical independence of the values in the sample):

SE_{\bar{x}} = \frac{s}{\sqrt{n}}

where

s is the sample standard deviation (i.e. the sample-based estimate of the standard deviation of the population), and
n is the size (number of items) of the sample.
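A short sketch (not from the article) of this calculation in Python, assuming NumPy is available; the data values are hypothetical.

import numpy as np

def standard_error_of_mean(x):
    # Estimated standard error of the mean: s / sqrt(n), with s the sample
    # standard deviation (ddof=1 gives the n - 1 denominator).
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / np.sqrt(x.size)

print(standard_error_of_mean([2.0, 3.5, 2.8, 3.1, 2.6]))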

This may be compared with the formula for the true standard deviation of the mean:

SD_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

where

σ is the standard deviation of the population.
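The difference between the two formulas can be checked by simulation: drawing many samples of size n from a population with known σ, the standard deviation of the resulting sample means should be close to σ/√n. A minimal sketch (not from the article), assuming NumPy:

import numpy as np

rng = np.random.default_rng(0)
sigma, n, reps = 2.0, 25, 100_000

# Draw many samples of size n and record each sample mean.
means = rng.normal(loc=0.0, scale=sigma, size=(reps, n)).mean(axis=1)

print(means.std(ddof=1))      # empirical standard deviation of the sample means
print(sigma / np.sqrt(n))     # true standard deviation of the mean, sigma / sqrt(n)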


Note: Standard error may also be defined as the standard deviation of the residual error term. (Kenney and Keeping, p. 187; Zwillinger 1995, p. 626)


Assumptions and usage

If the data are assumed to be normally distributed, quantiles of the normal distribution, together with the sample mean and its standard error, can be used to calculate confidence intervals for the mean. The following expressions can be used to calculate the upper and lower 95% confidence limits, where \bar{x} is the sample mean, SE is the standard error of the mean, and 1.96 is the 0.975 quantile of the standard normal distribution.

Upper 95% Limit = \bar{x} + (1.96 × SE)
Lower 95% Limit = \bar{x} - (1.96 × SE)
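A brief sketch (not part of the article) of these limits in Python, using the 1.96 normal quantile; NumPy is assumed and the sample values are hypothetical.

import numpy as np

data = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 9.7])  # hypothetical sample
mean = data.mean()
sem = data.std(ddof=1) / np.sqrt(data.size)   # standard error of the mean

upper = mean + 1.96 * sem   # upper 95% confidence limit
lower = mean - 1.96 * sem   # lower 95% confidence limit
print(lower, upper)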

In particular, the standard error of a sample statistic (such as the sample mean) is the estimated standard deviation of the error in the process by which it was generated. In other words, it is the standard deviation of the sampling distribution of the sample statistic. The notation for standard error can be any one of SE or SEM (for standard error of measurement or mean).

Standard errors provide simple measures of uncertainty in a value and are often used because: