Durbin–Watson statistic

In statistics, the Durbin–Watson statistic is a test statistic used to detect the presence of autocorrelation (a relationship between values separated from each other by a given time lag) in the residuals (prediction errors) from a regression analysis. It is named after James Durbin and Geoffrey Watson. The small-sample distribution of this ratio had, however, already been derived in a path-breaking article by John von Neumann (von Neumann, 1941). Durbin and Watson (1950, 1951) applied the statistic to the residuals from least squares regressions, and developed bounds tests for the null hypothesis that the errors are serially independent (not autocorrelated) against the alternative that they follow a first-order autoregressive process. Later, John Denis Sargan and Alok Bhargava developed several von Neumann–Durbin–Watson type test statistics for the null hypothesis that the errors in a regression model follow a process with a unit root against the alternative hypothesis that the errors follow a stationary first-order autoregression (Sargan and Bhargava, 1983).

Computing and interpreting the Durbin–Watson statistic

If e_t is the residual associated with the observation at time t, then the test statistic is

d = \frac{\sum_{t=2}^T (e_t - e_{t-1})^2}{\sum_{t=1}^T e_t^2},

where T is the number of observations. Since d is approximately equal to 2(1 − r), where r is the sample lag-1 autocorrelation of the residuals,[1] d = 2 indicates no autocorrelation. (Expanding the squared differences in the numerator gives two sums of squared residuals, each nearly equal to the denominator, minus twice the sum of the products of successive residuals.) The value of d always lies between 0 and 4. If the Durbin–Watson statistic is substantially less than 2, there is evidence of positive serial correlation: small values of d indicate that successive error terms are, on average, close in value to one another. As a rough rule of thumb, a value of d below 1.0 may be cause for alarm. If d > 2, successive error terms are, on average, very different in value from one another, i.e., negatively correlated. In regressions, this can imply an underestimation of the level of statistical significance.
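The computation can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the two residual series are made up for the example and do not come from any particular package. The second function computes r under the assumption that the residuals have mean zero, so that the approximation d ≈ 2(1 − r) can be checked numerically.

```python
def durbin_watson(residuals):
    """Return d = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=1}^T e_t^2."""
    e = list(residuals)
    if len(e) < 2:
        raise ValueError("need at least two residuals")
    denom = sum(x * x for x in e)
    if denom == 0:
        raise ValueError("residuals are all zero")
    numer = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return numer / denom


def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation r, assuming the residuals have mean zero."""
    e = list(residuals)
    return sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(x * x for x in e)


# Illustrative residuals (invented for this sketch).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # signs flip: r < 0, so d > 2
drifting = [1.0, 0.9, 0.8, -0.8, -0.9, -1.0]     # neighbours are close: r > 0, so d < 2

d_alt = durbin_watson(alternating)
d_drift = durbin_watson(drifting)
r_alt = lag1_autocorrelation(alternating)
r_drift = lag1_autocorrelation(drifting)
```

With these definitions, d differs from 2(1 − r) only by the end-effect term (e_1^2 + e_T^2) divided by the sum of squared residuals, which becomes negligible as T grows.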

To test for positive autocorrelation at significance α, the test statistic d is compared to lower and upper critical values (dL,α and dU,α):

  • If d < dL,α, there is statistical evidence that the error terms are positively autocorrelated.
  • If d > dU,α, there is no statistical evidence that the error terms are positively autocorrelated.
  • If dL,α < d < dU,α, the test is inconclusive.

Positive serial correlation is serial correlation in which a positive error for one observation increases the chances of a positive error for another observation.

Although positive serial correlation does not affect the consistency of the estimated regression coefficients, it does affect our ability to conduct valid statistical tests. First, the F-statistic used to test the overall significance of the regression may be inflated, because the mean squared error (MSE) will tend to underestimate the population error variance. Second, positive serial correlation typically causes the ordinary least squares (OLS) standard errors of the regression coefficients to underestimate the true standard errors. As a consequence, if positive serial correlation is present, standard linear regression analysis will typically yield artificially small standard errors for the regression coefficients. These small standard errors inflate the estimated t-statistics, suggesting significance where perhaps there is none. The inflated t-statistics may, in turn, lead us to reject null hypotheses about the population values of the regression parameters more often than we would if the standard errors were correctly estimated. Such Type I errors could lead to improper investment recommendations.
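The understatement of the standard errors can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: both the regressor and the errors are generated as stationary AR(1) processes with the same coefficient (0.8), the true intercept and slope are 1 and 2, and the sample size, number of replications and seed are arbitrary choices. It compares the actual sampling standard deviation of the OLS slope with the average standard error that the conventional OLS formula reports.

```python
import math
import random
import statistics


def ar1_series(n, rho, rng):
    """Generate n values of a stationary AR(1) process x_t = rho * x_{t-1} + u_t."""
    # Start from the stationary distribution so no burn-in is needed.
    x = rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho * rho)
    out = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def ols_slope_and_se(x, y):
    """Simple-regression slope and its conventional OLS standard error."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)  # sqrt(MSE / Sxx)
    return slope, se


def simulate(rho, n_obs=100, n_reps=500, seed=0):
    """Return (sampling s.d. of the slope, average reported OLS s.e.)."""
    rng = random.Random(seed)
    slopes, ses = [], []
    for _ in range(n_reps):
        x = ar1_series(n_obs, rho, rng)
        errors = ar1_series(n_obs, rho, rng)
        y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, errors)]
        slope, se = ols_slope_and_se(x, y)
        slopes.append(slope)
        ses.append(se)
    return statistics.stdev(slopes), statistics.mean(ses)


true_sd, reported_se = simulate(rho=0.8)
print(f"sampling s.d. of slope: {true_sd:.3f}, average reported s.e.: {reported_se:.3f}")
```

Under these assumptions the reported standard error falls well short of the true sampling variability, which is the mechanism behind the inflated t-statistics described above.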

To test for negative autocorrelation at significance α, the test statistic (4 − d) is compared to lower and upper critical values (dL,α and dU,α):

  • If (4 − d) < dL,α, there is statistical evidence that the error terms are negatively autocorrelated.
  • If (4 − d) > dU,α, there is no statistical evidence that the error terms are negatively autocorrelated.
  • If dL,α < (4 − d) < dU,α, the test is inconclusive.

Negative serial correlation implies that a positive error for one observation increases the chance of a negative error for another observation and a negative error for one observation increases the chances of a positive error for another.

The critical values, dL,α and dU,α, vary by level of significance (α), the number of observations, and the number of predictors in the regression equation. Their derivation is complex—statisticians typically obtain them from the appendices of statistical texts.
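The two bounds tests described above can be sketched as a single decision function. This is a minimal illustration: the function name is hypothetical, and the critical values used in the example (1.6 and 1.7) are placeholders chosen for the sketch, not values taken from a published table. In practice they would be looked up for the given α, number of observations and number of predictors.

```python
def durbin_watson_bounds_test(d, d_lower, d_upper):
    """Apply the bounds test for positive and for negative autocorrelation.

    d_lower and d_upper are the tabulated critical values dL and dU for the
    chosen significance level, sample size and number of predictors.
    """
    if not 0.0 <= d <= 4.0:
        raise ValueError("d must lie between 0 and 4")
    if d_lower > d_upper:
        raise ValueError("d_lower must not exceed d_upper")

    def classify(stat):
        if stat < d_lower:
            return "evidence of autocorrelation"
        if stat > d_upper:
            return "no evidence of autocorrelation"
        return "inconclusive"

    return {
        "positive": classify(d),        # compare d itself with the bounds
        "negative": classify(4.0 - d),  # compare 4 - d with the same bounds
    }


# Placeholder critical values for illustration only (not from a table).
D_LOWER, D_UPPER = 1.6, 1.7

low = durbin_watson_bounds_test(0.9, D_LOWER, D_UPPER)
middle = durbin_watson_bounds_test(1.65, D_LOWER, D_UPPER)
high = durbin_watson_bounds_test(3.2, D_LOWER, D_UPPER)
```

Using 4 − d for the negative test lets both tests share the same pair of tabulated bounds.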

The Durbin–Watson statistic, while displayed by many regression analysis programs, is not applicable in many situations. For instance, if the error distribution is not normal, if there is higher-order autocorrelation, or if a lagged value of the dependent variable appears among the independent variables, it is not an appropriate test for autocorrelation. A suggested test that does not have these limitations is the Breusch–Godfrey (serial correlation LM) test.

Durbin h-statistic

The Durbin–Watson statistic is biased for autoregressive moving average models, so that autocorrelation is underestimated. For large samples, however, one can easily compute the unbiased, normally distributed h-statistic:

h = \left( 1 - \frac {1} {2} d \right) \sqrt{\frac {T}  {1-T \cdot \widehat {\operatorname{Var}}(\widehat\beta_1\,)}},

using the Durbin–Watson statistic d and the estimated variance

\widehat{\operatorname{Var}}(\widehat\beta_1)

of the regression coefficient of the lagged dependent variable, provided

T \cdot \widehat{\operatorname{Var}}(\widehat\beta_1) < 1.
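The formula for h translates directly into code. The following is a minimal sketch with hypothetical function and parameter names; the inputs in the example (d = 1.5, T = 100 and an estimated variance of 0.005) are invented for illustration. The p-value helper assumes that h is compared with a standard normal distribution, as is appropriate in large samples.

```python
import math


def durbin_h(d, n_obs, var_lag_coef):
    """Durbin's h from the Durbin-Watson statistic d, the sample size T
    (n_obs) and the estimated variance of the coefficient on the lagged
    dependent variable (var_lag_coef)."""
    denom = 1.0 - n_obs * var_lag_coef
    if denom <= 0.0:
        # The statistic is undefined unless T * Var(beta_1) < 1.
        raise ValueError("h is undefined: T * Var(beta_1) must be below 1")
    return (1.0 - 0.5 * d) * math.sqrt(n_obs / denom)


def two_sided_p_value(h):
    """Two-sided p-value of h against the standard normal distribution."""
    return math.erfc(abs(h) / math.sqrt(2.0))


# Invented inputs for illustration.
h = durbin_h(d=1.5, n_obs=100, var_lag_coef=0.005)
p = two_sided_p_value(h)
```

Raising an error when the condition fails, rather than returning a placeholder, makes it explicit that a different test is needed in that case.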

Durbin–Watson test for panel data

For panel data this statistic was generalized as follows by Alok Bhargava et al. (1982):

If e_{i,t} is the residual from an OLS regression with fixed effects for each panel i, associated with the observation in panel i at time t, then the test statistic is

d_{pd} = \frac{\sum_{i=1}^N \sum_{t=2}^T (e_{i,t} - e_{i,t-1})^2}{\sum_{i=1}^N \sum_{t=1}^T e_{i,t}^2}.

This statistic can be compared with tabulated rejection values [see Alok Bhargava et al. (1982), page 537]. These values depend on T (the length of the balanced panel, i.e. the number of time periods over which the individuals were surveyed), K (the number of regressors) and N (the number of individuals in the panel). The statistic can also be used to test the null hypothesis of a unit root against stationary alternatives in fixed effects models, using another set of bounds (Tables V and VI) tabulated by Alok Bhargava et al. (1982).
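The panel statistic can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes the fixed-effects residuals are already available as one time-ordered list per individual, and the two short residual series are invented for the example. The key point is that differences are taken only within each panel, never across the boundary between two individuals.

```python
def panel_durbin_watson(residuals_by_panel):
    """Panel Durbin-Watson statistic d_pd for a balanced panel.

    residuals_by_panel: a list of N lists, each holding the T fixed-effects
    residuals of one individual in time order.
    """
    panels = [list(p) for p in residuals_by_panel]
    lengths = {len(p) for p in panels}
    if len(lengths) != 1:
        raise ValueError("a non-empty balanced panel is required")
    if lengths.pop() < 2:
        raise ValueError("each panel needs at least two time periods")
    denom = sum(x * x for p in panels for x in p)
    if denom == 0:
        raise ValueError("residuals are all zero")
    # Successive differences are formed within each panel only (t = 2, ..., T).
    numer = sum((p[t] - p[t - 1]) ** 2 for p in panels for t in range(1, len(p)))
    return numer / denom


# Invented residuals for N = 2 individuals and T = 3 periods; each series
# sums to zero, as fixed-effects residuals do within a panel.
example = [
    [1.0, -2.0, 1.0],
    [0.5, 0.5, -1.0],
]
d_pd = panel_durbin_watson(example)
```

Simply concatenating the panels and applying the single-series formula would add a spurious difference between the last residual of one individual and the first residual of the next.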

Implementations in statistics packages

  1. R: the dwtest function in the lmtest package, and the durbinWatsonTest function (formerly durbin.watson) in the car package.
  2. MATLAB: the dwtest function in the Statistics Toolbox.
  3. Mathematica: the Durbin–Watson (d) statistic is included as an option in the LinearModelFit function.
  4. SAS: the statistic is a standard output when using proc model and is an option (dw) when using proc reg.
  5. Stata: the command -estat dwatson-, following -regress- on time series data. Engle's LM test for autoregressive conditional heteroskedasticity (ARCH), a test for time-dependent volatility, the Breusch–Godfrey test, and Durbin's alternative test for serial correlation are also available. All of these except -dwatson- test separately for higher-order serial correlation. The Breusch–Godfrey test and Durbin's alternative test also allow regressors that are not strictly exogenous.
  6. Excel: although Microsoft Excel 2007 does not have a specific Durbin–Watson function, the d-statistic may be calculated using "=SUMXMY2(x_array,y_array)/SUMSQ(array)", where x_array holds residuals 2 to T, y_array holds residuals 1 to T − 1, and array holds all T residuals.
  7. Minitab: the option to report the statistic in the Session window can be found under the "Options" box under Regression and via the "Results" box under General Regression.

See also

  • Time-series regression
  • ACF / PACF

Notes

  1. ^ Gujarati (2003) p. 469

References

  • Bhargava, Alok, Franzini, L., Narendranathan, W. (1982): "Serial Correlation and the Fixed Effects Model". Review of Economic Studies, 49, p. 533–549.
  • Durbin, J., and Watson, G. S. (1950) "Testing for Serial Correlation in Least Squares Regression, I." Biometrika 37, 409–428.
  • Durbin, J., and Watson, G. S. (1951) "Testing for Serial Correlation in Least Squares Regression, II." Biometrika 38, 159–179.
  • Gujarati, D.N. (2003) Basic econometrics, 4th ed., Boston, McGraw–Hill
  • Gujarati, Damodar N. (1995): Basic Econometrics, 3rd ed., New York et al.: McGraw–Hill, p. 605f.
  • Sargan, J.D. and Alok Bhargava (1983). "Testing residuals from least squares regression for being generated by the Gaussian random walk". Econometrica, 51, p. 153–174.
  • Verbeek, Marno (2004): A Guide to Modern Econometrics, 2nd ed., Chichester: John Wiley & Sons, p. 102f.
  • von Neumann, John. (1941). "Distribution of the ratio of the mean square successive difference to the variance". Annals of Mathematical Statistics, 12, 367–395.
  • Multiple regression and issues in regression analysis, Richard A DeFusco, CFA, Denis W. Mc. Leavey, CFA, Jerald E. Pinto, CFA and David E. Runkle, CFA, CFA Curriculum Level II

Wikimedia Foundation. 2010.
