Linear model

Linear model

In statistics the linear model is given by

:Y = X eta + varepsilon

where "Y" is an "n"×1 column vector of random variables, "X" is an "n"×"p" matrix of "known" (i.e. observable and non-random) quantities, whose rows correspond to statistical units, β is a "p"×1 vector of (unobservable) parameters, and ε is an "n"×1 vector of "errors", which are uncorrelated random variables each with expected value 0 and variance σ2.

Much of the theory of linear models is associated with inferring the values of the parameters β and σ2. Typically this is done using the method of maximum likelihood, which in the case of normal errors is equivalent to the method of least squares.

Assumptions

Multivariate normal errors

Often one takes the components of the vector of errors to be independent and normally distributed, giving "Y" a multivariate normal distribution with mean "X"β and co-variance matrix σ2 "I", where "I" is the identity matrix. Having observed the values of "X" and "Y", the statistician must estimate β and σ2.

Rank of "X"

We usually assume that "X" is of full rank "p", which allows us to invert the "p" × "p" matrix X^{ op} X. The essence of this assumption is that the parameters are not linearly dependent upon one another, which would make little sense in a linear model. This also ensures the model is identifiable.

Methods of inference

Maximum likelihood

β

The log-likelihood function (for epsilon_i independent and normally distributed) is

:l(eta, sigma^2; Y) = -frac{n}{2} log (2 pi sigma^2) - frac{1}{2sigma^2} sum_{i=1}^n left(Y_i - x_i^{ op} eta ight)^2

where x_i^{ op} is the "i"th row of "X". Differentiating with respect to βj, we get

:frac{partial l}{partial eta_j} = frac{1}{sigma^2} sum_{i=1}^n x_{ij} left( Y_i - x_i^{ op} eta ight)

so setting this set of "p" equations to zero and solving for β gives

:X^{ op} X hat{eta} = X^{ op} Y.

Now, using the assumption that "X" has rank "p", we can invert the matrix on the left hand side to give the maximum likelihood estimate for β:

: hat{eta} = (X^{ op} X)^{-1} X^{ op} Y.

We can check that this is a maximum by looking at the Hessian matrix of the log-likelihood function.

σ2

By setting the right hand side of

: frac{partial l}{partial sigma^2} = -frac{n}{2sigma^2} + frac{1}{2 sigma^4} sum_{i=1}^n left(Y_i - x_i^{ op} eta ight)^2

to zero and solving for σ2 we find that

: hat{sigma}^2 = frac{1}{n} sum_{i=1}^n left(Y_i - x_i^{ op} hat{eta} ight)^2 = frac{1}{n} | Y - X hat{eta} |^2.

Accuracy of maximum likelihood estimation

Since we have that "Y" follows a multivariate normal distribution with mean "X"β and co-variance matrix σ2 "I", we can deduce the distribution of the MLE of β:

: hat{eta} = (X^{ op} X)^{-1} X^{ op} Y sim N_p (eta, (X^{ op}X)^{-1} sigma^2 ).

So this estimate is unbiased for β, and we can show that this variance achieves the Cramér-Rao bound.

A more complicated argumentA.C. Davidson "Statistical Models". Cambridge University Press (2003).] shows that

: hat{sigma}^2 sim frac{sigma^2}{n} chi^2_{n-p};

since a chi-squared distribution with "n" − "p" degrees of freedom has mean "n" − "p", this is only asymptotically unbiased.

Generalizations

Generalized least squares

If, rather than taking the variance of ε to be σ2"I", where "I" is the "n"×"n" identity matrix, one assumes the variance is σ2"Ω", where "Ω" is a known matrix other than the identity matrix, then one estimates β by the method of "generalized least squares", in which, instead of minimizing the sum of squares of the residuals, one minimizes a different quadratic form in the residuals — the quadratic form being the one given by the matrix "Ω"−1:

::{min_{etaleft(y-Xeta ight)'Omega^{-1}left(y-Xeta ight).

This has the effect of "de-correlating" normal errors, and leads to the estimator

::widehat{eta}=left(X'Omega^{-1}X ight)^{-1}X'Omega^{-1}y

which is the best linear unbiased estimator for eta. If all of the off-diagonal entries in the matrix "Ω" are 0, then one normally estimates β by the method of weighted least squares, with weights proportional to the reciprocals of the diagonal entries. The GLS estimator is also known as the Aitken estimator, after Alexander Aitken, the Professor in the University of Otago Statistics Department who pioneered it. [ [http://web.uvic.ca/econ/aitken.html Alexander Craig Aitken ] ]

Generalized linear models

Generalized linear models, for which rather than

: E("Y") = "X"β,

one has

:"g"(E("Y")) = "X"β,

where "g" is the "link function". The variance is also not restricted to being normal.

An example is the Poisson regression model, which states that

:"Y""i" has a Poisson distribution with expected value "e"γ+δ"x""i".The link function is the natural logarithm function.Having observed "x""i" and "Y""i" for"i" = 1, ..., "n", one can estimate γ and δ by the method of maximum likelihood.

General linear model

The general linear model (or multivariate regression model) is a linear model with multiple measurements per object. Each object may be represented in a vector.

References

ee also

* ANOVA, or analysis of variance, is historically a precursor to the development of linear models. Here the model parameters themselves are not computed, but "X" column contributions and their significance are identified using the ratios of within-group variances to the error variance and applying the F test.
* Linear regression
* Robust regression


Wikimedia Foundation. 2010.

Поможем студентам написать доклад

Look at other dictionaries:

  • Generalized linear model — In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary least squares regression. It relates the random distribution of the measured variable of the experiment (the distribution function ) to the systematic (non …   Wikipedia

  • General linear model — Not to be confused with generalized linear model. The general linear model (GLM) is a statistical linear model. It may be written as[1] where Y is a matrix with series of multivariate measurements, X is a matrix that might be a design matrix, B… …   Wikipedia

  • Proper linear model — In statistics, a proper linear model is a linear model in which the weights given to the predictor variables are chosen in such a way as to optimize the relationship between the prediction and the criterion. Simple regression analysis is the most …   Wikipedia

  • hierarchical linear-model — hierarchical linear model, hierarchical linear regression See multi level models …   Dictionary of sociology

  • Model selection — is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre existing set of data is considered. However, the task can also involve the design of experiments such that the data collected is …   Wikipedia

  • Model-based design — (MBD) is a mathematical and visual method of addressing problems associated with designing complex control,[1][2] signal processing[3] and communication systems. It is used in many motion control, industrial equipment, aerospace, and automotive… …   Wikipedia

  • Linear regression — Example of simple linear regression, which has one independent variable In statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more explanatory variables denoted X. The case of one… …   Wikipedia

  • Linear least squares (mathematics) — This article is about the mathematics that underlie curve fitting using linear least squares. For statistical regression analysis using least squares, see linear regression. For linear regression on a single variable, see simple linear regression …   Wikipedia

  • Linear no-threshold model — The linear no threshold model (LNT) is a method for predicting the long term, biological damage caused by ionizing radiation and is based on the assumption that the risk is directly proportional to the dose at all dose levels. In other words, the …   Wikipedia

  • Model based design — The dawn of the electrical age brought with it various novel, innovative and advanced control systems. It was as early as 1920 s when the two strands of technology, control theory and control system, came together to produce large scale… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”