Hat matrix
From Wikipedia, the free encyclopedia
The hat matrix, H, is used in statistics to relate errors in residuals to experimental errors. Suppose that a linear least squares problem is being addressed. The model can be written as

y = J p + e,

where y is a vector of observations, J is a matrix of coefficients, p is a vector of parameters and e is a vector of experimental errors. The solution to the un-weighted least-squares equations is given by

\hat{p} = (J^T J)^{-1} J^T y.
The vector of un-weighted residuals, r, is given by

r = y - J \hat{p}.
The matrix

H = J (J^T J)^{-1} J^T

is known as the hat matrix. Thus, the residuals can be expressed simply as

r = (I - H) y.
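As a concrete illustration, a minimal numpy sketch (the straight-line design matrix and the data values here are made up for the example) computes the hat matrix and checks that the residuals obtained from r = (I − H) y match those obtained directly from the fitted parameters:

```python
import numpy as np

# Hypothetical example: fit a straight line to five observations.
J = np.column_stack([np.ones(5), np.arange(5.0)])  # coefficient (design) matrix
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])            # observations

# Hat matrix: H = J (J^T J)^{-1} J^T
H = J @ np.linalg.inv(J.T @ J) @ J.T

# Least-squares parameter estimates and residuals, two ways.
p_hat = np.linalg.solve(J.T @ J, J.T @ y)  # solves the normal equations
r_direct = y - J @ p_hat                   # r = y - J p_hat
r_hat = (np.eye(5) - H) @ y                # r = (I - H) y

print(np.allclose(r_direct, r_hat))  # True
```

H y reproduces the fitted values J p̂, which is the origin of the name: the hat matrix "puts the hat on y".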
The hat matrix corresponding to a linear model is symmetric and idempotent, that is, H^T = H and H^2 = H. However, this is not always the case; for example, the LOESS hat matrix is generally neither symmetric nor idempotent.
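Both properties are easy to verify numerically for the linear-model case; in this sketch the design matrix is a made-up random one, chosen only so that J^T J is invertible:

```python
import numpy as np

# Hypothetical full-rank design matrix (8 observations, 3 parameters).
rng = np.random.default_rng(0)
J = rng.standard_normal((8, 3))
H = J @ np.linalg.inv(J.T @ J) @ J.T

print(np.allclose(H, H.T))    # True: symmetric
print(np.allclose(H @ H, H))  # True: idempotent
```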
The variance-covariance matrix of the residuals is, by error propagation, equal to

M_r = (I - H) M (I - H)^T,

where M is the variance-covariance matrix of the errors (and, by extension, of the observations as well). Thus, the residual sum of squares is a quadratic form in the observations.
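In the common special case of independent errors with equal variance, M = σ²I, and because I − H is itself symmetric and idempotent the formula collapses to σ²(I − H). A quick check, again with a made-up design matrix and an assumed error variance:

```python
import numpy as np

rng = np.random.default_rng(1)
J = rng.standard_normal((10, 4))   # hypothetical design matrix
H = J @ np.linalg.inv(J.T @ J) @ J.T
I = np.eye(10)

sigma2 = 2.5                       # assumed error variance, so M = sigma2 * I
M = sigma2 * I
M_r = (I - H) @ M @ (I - H).T      # variance-covariance matrix of the residuals

# (I - H) is symmetric and idempotent, so M_r reduces to sigma2 * (I - H).
print(np.allclose(M_r, sigma2 * (I - H)))  # True
```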
The eigenvalues of an idempotent matrix are equal to 0 or 1.[1] Some other useful properties of the hat matrix are summarized in [2].
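The eigenvalue property can also be confirmed numerically; for a linear-model hat matrix built from a (made-up) n × k design matrix of full rank, k eigenvalues equal 1 and the remaining n − k equal 0, so the trace of H equals the number of parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
J = rng.standard_normal((7, 2))    # hypothetical design matrix: 7 observations, 2 parameters
H = J @ np.linalg.inv(J.T @ J) @ J.T

eigvals = np.linalg.eigvalsh(H)    # H is symmetric, so its eigenvalues are real

# Every eigenvalue is (numerically) 0 or 1, and they sum to the number of parameters.
print(np.all(np.isclose(eigvals, 0) | np.isclose(eigvals, 1)))  # True
print(np.isclose(eigvals.sum(), 2))                             # True: trace H = rank J
```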