Mean and predicted response

In linear regression, the mean response and the predicted response are values of the dependent variable calculated from the regression parameters and a given value of the independent variable. The values of these two responses are the same, but their calculated variances are different.

1 Straight line regression
2 Confidence intervals
3 General linear regression
4 References

Straight line regression

In straight line fitting the model is

$y_i=\alpha+\beta x_i +\epsilon_i\,$

where $y_i$ is the response variable, $x_i$ is the explanatory variable, $\epsilon_i$ is the random error, and $\alpha$ and $\beta$ are parameters. The predicted response value for a given explanatory value, $x_d$, is given by

$\hat{y}_d=\hat\alpha+\hat\beta x_d ,$

while the actual response would be

$y_d=\alpha+\beta x_d +\epsilon_d \,$

Expressions for the values and variances of $\hat\alpha$ and $\hat\beta$ are given in linear regression.

The mean response is an estimate of the mean of the $y$ population associated with $x_d$; that is, $E(y \mid x_d)$ is estimated by $\hat{y}_d$. The variance of the mean response is given by

$\text{Var}\left(\hat{\alpha} + \hat{\beta}x_d\right) = \text{Var}\left(\hat{\alpha}\right) + \text{Var}\left(\hat{\beta}\right)x_d^2 + 2 x_d\text{Cov}\left(\hat{\alpha},\hat{\beta}\right) .$

This expression can be simplified to

$\text{Var}\left(\hat{\alpha} + \hat{\beta}x_d\right) =\sigma^2\left(\frac{1}{m} + \frac{\left(x_d - \bar{x}\right)^2}{\sum (x_i - \bar{x})^2}\right).$

Here $m$ is the number of data points and $\bar{x}$ is the mean of the $x_i$. To demonstrate this simplification, one can make use of the identity

$\sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{1}{m}\left(\sum x_i\right)^2 .$
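
As a sketch of that simplification, assume the standard ordinary least squares results (not stated in this article) $\text{Var}(\hat\beta)=\sigma^2/S_{xx}$, $\text{Var}(\hat\alpha)=\sigma^2\sum x_i^2/(m S_{xx})$ and $\text{Cov}(\hat\alpha,\hat\beta)=-\sigma^2\bar{x}/S_{xx}$, where $S_{xx}=\sum (x_i - \bar{x})^2$ is shorthand introduced here for brevity. Substituting them into the general variance expression gives

$\text{Var}\left(\hat{\alpha} + \hat{\beta}x_d\right) = \frac{\sigma^2}{S_{xx}}\left(\frac{1}{m}\sum x_i^2 + x_d^2 - 2 x_d\bar{x}\right).$

Since $\sum x_i = m\bar{x}$, the identity can be rearranged to $\frac{1}{m}\sum x_i^2 = \frac{S_{xx}}{m} + \bar{x}^2$, so that

$\text{Var}\left(\hat{\alpha} + \hat{\beta}x_d\right) = \frac{\sigma^2}{S_{xx}}\left(\frac{S_{xx}}{m} + \bar{x}^2 - 2 x_d\bar{x} + x_d^2\right) = \sigma^2\left(\frac{1}{m} + \frac{\left(x_d - \bar{x}\right)^2}{S_{xx}}\right).$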

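The predicted response distribution is the predicted distribution of the residual $y_d - \hat{y}_d$ at the given point $x_d$. Because the new error $\epsilon_d$ is independent of the data used to estimate $\hat\alpha$ and $\hat\beta$, the variances add, so the variance is given by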

$\text{Var}\left(y_d - \left[\hat{\alpha} + \hat{\beta}x_d\right]\right) = \text{Var}\left(y_d\right) + \text{Var}\left(\hat{\alpha} + \hat{\beta}x_d\right) .$

The second part of this expression was already calculated for the mean response. Since $\text{Var}\left(y_d\right)=\sigma^2$ (a fixed but unknown parameter that can be estimated), the variance of the predicted response is given by

$\text{Var}\left(y_d - \left[\hat{\alpha} + \hat{\beta}x_d\right]\right) = \sigma^2 + \sigma^2\left(\frac{1}{m} + \frac{\left(x_d - \bar{x}\right)^2}{\sum (x_i - \bar{x})^2}\right) = \sigma^2\left(1+\frac{1}{m} + \frac{\left(x_d - \bar{x}\right)^2}{\sum (x_i - \bar{x})^2}\right) .$
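
The two variance formulas can be checked numerically. The following is a minimal sketch in plain Python, assuming ordinary least squares and estimating $\sigma^2$ by the residual sum of squares divided by $m - 2$. The function names and the data are made up for illustration and do not come from the article.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares estimates (alpha_hat, beta_hat) for y = alpha + beta*x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def response_variances(x, y, x_d):
    """Return (y_hat_d, var_mean, var_pred), with sigma^2 replaced by its estimate."""
    m = len(x)
    alpha_hat, beta_hat = fit_line(x, y)
    x_bar = fmean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Residual sum of squares; a straight line leaves m - 2 degrees of freedom.
    rss = sum((yi - (alpha_hat + beta_hat * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (m - 2)
    # Common factor 1/m + (x_d - x_bar)^2 / Sxx shared by both formulas.
    factor = 1 / m + (x_d - x_bar) ** 2 / sxx
    y_hat_d = alpha_hat + beta_hat * x_d
    return y_hat_d, s2 * factor, s2 * (1 + factor)


# Made-up illustrative data, not taken from any source.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat_d, var_mean, var_pred = response_variances(x, y, x_d=3.0)
print(y_hat_d, var_mean, var_pred)
```

Both responses share the same point value `y_hat_d`; only the variances differ, and the predicted-response variance exceeds the mean-response variance by exactly the estimate of $\sigma^2$.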

Confidence intervals

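The $100(1 - \alpha)\%$ confidence intervals are computed as $\hat{y}_d \pm t_{\frac{\alpha}{2},m - n - 1} \sqrt{\text{Var}}$, where $\text{Var}$ is the variance of the mean response or of the predicted response, with $\sigma^2$ replaced by its estimate; in the straight-line case this is $m - 2$ degrees of freedom. Here $\alpha$ denotes the significance level, not the intercept. Since the variance of the predicted response exceeds that of the mean response by $\sigma^2$, the confidence interval for the predicted response is wider than the interval for the mean response. This is expected intuitively: the variance of the population of $y$ values does not shrink when one samples from it, because the variance of the random error $\epsilon_i$ does not decrease, but the variance of the estimated mean of $y$ does shrink with increased sampling, because the variances of $\hat \alpha$ and $\hat \beta$ decrease, so the mean response (predicted response value) becomes closer to $\alpha + \beta x_d$.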

This is analogous to the difference between the variance of a population and the variance of the sample mean of a population: the variance of a population is a parameter and does not change, but the variance of the sample mean decreases with increased samples.
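
The interval formula in this section can be sketched as follows. This is an illustration only: the fitted value, the variance estimate and the helper name `interval` are hypothetical. The critical value is taken from a standard $t$ table (about 3.182 for a 95% interval with 3 degrees of freedom) rather than computed, because the Python standard library has no $t$ quantile function.

```python
import math


def interval(center, variance, t_crit):
    """Two-sided interval: center +/- t_crit * sqrt(variance)."""
    half_width = t_crit * math.sqrt(variance)
    return center - half_width, center + half_width


# Hypothetical inputs for a straight-line fit to m = 5 points.
y_hat_d = 6.02          # fitted value at x_d
s2 = 0.107 / 3          # estimate of sigma^2 on m - 2 = 3 degrees of freedom
factor = 1 / 5 + 0.0    # 1/m + (x_d - x_bar)^2 / Sxx, here with x_d = x_bar
t_crit = 3.182          # tabulated t quantile, 95% two-sided, 3 degrees of freedom

ci_mean = interval(y_hat_d, s2 * factor, t_crit)        # interval for the mean response
ci_pred = interval(y_hat_d, s2 * (1 + factor), t_crit)  # interval for the predicted response
print(ci_mean, ci_pred)
```

Both intervals are centred on the same value, and the interval for the predicted response contains the interval for the mean response.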

General linear regression

The general linear model can be written as

$y_i=\sum_{j=1}^{j=n}X_{ij}\beta_j + \epsilon_i\,$

Therefore, since $\hat{y}_d=\sum_{j=1}^{j=n} X_{dj}\hat\beta_j$, the general expression for the variance of the mean response is

$\text{Var}\left(\sum_{j=1}^{j=n} X_{dj}\hat\beta_j\right)= \sum_{i=1}^{i=n}\sum_{j=1}^{j=n}X_{di}M_{ij}X_{dj},$

where $\mathbf{M}$ is the covariance matrix of the parameter estimates, given by

$\mathbf{M}=\sigma^2\left(\mathbf{X}^{\mathsf{T}}\mathbf{X}\right)^{-1} .$
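
The quadratic form above can be evaluated without forming the inverse explicitly, by solving $\mathbf{X}^{\mathsf{T}}\mathbf{X}\,z = x_d$ and taking $\sigma^2 x_d^{\mathsf{T}} z$. The sketch below does this in plain Python with a small Gauss-Jordan solver. The function names and data are illustrative assumptions, and a numerical library would normally be used for this step. The example checks the result against the straight-line formula.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mean_response_variance(X, x_d, sigma2):
    """sigma^2 * x_d^T (X^T X)^{-1} x_d for a design matrix X given as a list of rows."""
    n = len(x_d)
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    z = solve(xtx, x_d)  # z = (X^T X)^{-1} x_d
    return sigma2 * sum(xi * zi for xi, zi in zip(x_d, z))


# Straight-line check: the columns are the intercept and x (made-up values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [[1.0, x] for x in xs]
var_general = mean_response_variance(X, [1.0, 4.0], sigma2=1.0)
var_simple = 1 / 5 + (4.0 - 3.0) ** 2 / 10.0  # sigma^2 (1/m + (x_d - x_bar)^2 / Sxx)
print(var_general, var_simple)
```

With an intercept column and a single explanatory variable, the general expression reduces to the straight-line result, so the two printed values agree up to rounding.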

References

Draper, N.R., Smith, H. (1998) Applied Regression Analysis. Wiley. ISBN 0-471-17082-8


Wikimedia Foundation. 2010.
