# Analysis of covariance


Analysis of covariance (ANCOVA) is a general linear model with one continuous outcome variable, one or more categorical factors, and one or more continuous covariates; it merges ANOVA and linear regression. ANCOVA tests whether the factors have an effect on the outcome variable after removing the variance for which the quantitative predictors (covariates) account. Including covariates can increase statistical power because it accounts for some of the otherwise unexplained variability.

## Assumptions

As with any statistical procedure, ANCOVA makes certain assumptions about the data entered into the model, and only if these assumptions are met, at least approximately, will it yield valid results. Specifically, ANCOVA, like ANOVA, assumes that the residuals [http://www.statsoft.com/textbook/stbasic.html] are normally distributed and homoscedastic. Further, since ANCOVA is based on linear regression, the relationship between the dependent variable and the covariate(s) must be linear in the parameters.

## Power considerations

While including a covariate in an ANOVA generally increases statistical power, by accounting for some of the variance in the dependent variable and thus increasing the proportion of variance explained by the independent variables, it also reduces the error degrees of freedom (see below). Accordingly, adding a covariate that accounts for very little variance in the dependent variable may actually reduce power.

## Equations

### One-factor ANCOVA analysis

One-factor analysis is appropriate when comparing $k$ populations. The single factor has $k$ levels, one for each population, and $n$ samples are drawn at random from each population.

#### Calculating the sums of squared deviates for the independent variable X and the dependent variable Y

The sums of squared deviates (SS) $SST_y$, $SSTr_y$, and $SSE_y$ must be calculated for the dependent variable $Y$ using the following equations. The SS for the covariate $X$ must also be calculated; the two necessary values are $SST_x$ and $SSE_x$.

The total sum of squares measures the variability of all the samples, where $n_T$ represents the total number of samples:

:$SST_y=\sum_{i=1}^n\sum_{j=1}^k Y_{ij}^2-\frac{\left(\sum_{i=1}^n\sum_{j=1}^k Y_{ij}\right)^2}{n_T}$

The sum of squares for treatments measures the variability between the populations (the factor levels), where $n_n$ represents the number of samples in each population:

:$SSTr_y=\sum_{j=1}^k\frac{\left(\sum_{i=1}^n Y_{ij}\right)^2}{n_n}-\frac{\left(\sum_{i=1}^n\sum_{j=1}^k Y_{ij}\right)^2}{n_T}$

The sum of squares for error measures the variability within each population, where $n_n$ again represents the number of samples within a given population:

:$SSE_y=\sum_{i=1}^n\sum_{j=1}^k Y_{ij}^2-\sum_{j=1}^k\frac{\left(\sum_{i=1}^n Y_{ij}\right)^2}{n_n}$

The total sum of squares is equal to the sum of squares for treatments and the sum of squares for error:

:$SST_y=SSTr_y+SSE_y$
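As a check on the decomposition above, here is a minimal plain-Python sketch on a made-up data set with $k=3$ populations of $n=4$ samples each (all data values are illustrative, not from the text):

```python
# Hypothetical data, group-major: k = 3 populations, n = 4 samples each.
Y = [[10, 12, 11, 13], [14, 15, 13, 16], [20, 19, 21, 22]]

n_T = sum(len(group) for group in Y)        # total number of samples (12)
grand = sum(sum(group) for group in Y)      # grand total of Y
CF = grand ** 2 / n_T                       # correction factor (second term above)

SST_y = sum(y ** 2 for group in Y for y in group) - CF
SSTr_y = sum(sum(group) ** 2 / len(group) for group in Y) - CF
SSE_y = sum(y ** 2 for group in Y for y in group) \
        - sum(sum(group) ** 2 / len(group) for group in Y)

# On these numbers SST_y = 183.0, SSTr_y = 168.0, SSE_y = 15.0, and 168 + 15 = 183.
assert abs(SST_y - (SSTr_y + SSE_y)) < 1e-9
```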

#### Calculating the covariance of X and Y

The total sum of cross products measures the covariance of $X$ and $Y$ over all the data samples:

:$SCT=\sum_{i=1}^n\sum_{j=1}^k X_{ij}Y_{ij}-\frac{\left(\sum_{i=1}^n\sum_{j=1}^k X_{ij}\right)\left(\sum_{i=1}^n\sum_{j=1}^k Y_{ij}\right)}{n_T}$

The error sum of cross products measures the covariance of $X$ and $Y$ within each population:

:$SCE=\sum_{j=1}^k\left(\sum_{i=1}^n X_{ij}Y_{ij}-\frac{\left(\sum_{i=1}^n X_{ij}\right)\left(\sum_{i=1}^n Y_{ij}\right)}{n_n}\right)$
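The two cross-product sums can be sketched the same way; $X$ here is a made-up covariate paired with the same illustrative $Y$ values:

```python
# Hypothetical paired data, group-major: k = 3 populations, n = 4 samples each.
X = [[3, 4, 4, 5], [5, 6, 5, 7], [8, 7, 9, 9]]
Y = [[10, 12, 11, 13], [14, 15, 13, 16], [20, 19, 21, 22]]

n_T = sum(len(g) for g in X)
sum_x = sum(x for g in X for x in g)
sum_y = sum(y for g in Y for y in g)
sum_xy = sum(x * y for gx, gy in zip(X, Y) for x, y in zip(gx, gy))

# Total sum of cross products (88.0 on these numbers)
SCT = sum_xy - sum_x * sum_y / n_T

# Error (within-population) sum of cross products (10.0 on these numbers)
SCE = sum(
    sum(x * y for x, y in zip(gx, gy)) - sum(gx) * sum(gy) / len(gx)
    for gx, gy in zip(X, Y)
)
```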

The squared correlation between $X$ and $Y$ is $r_T^2$ over all samples and $r_n^2$ within populations:

:$r_T^2=\frac{SCT^2}{SST_x\,SST_y}$

:$r_n^2=\frac{SCE^2}{SSE_x\,SSE_y}$

The proportion of the variability in $Y$ accounted for by the covariate is then removed from the dependent-variable $SS_y$ values:

:$SST_{yadj}=SST_y\left(1-r_T^2\right)$

:$SSE_{yadj}=SSE_y\left(1-r_n^2\right)$

:$SSTr_{yadj}=SST_{yadj}-SSE_{yadj}$
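Continuing with the illustrative numbers, the adjusted sums of squares follow directly; note that $SST_y(1-r_T^2)$ equals $SST_y-SCT^2/SST_x$, so either form may be used:

```python
# Hypothetical paired data, group-major: k = 3 populations, n = 4 samples each.
X = [[3, 4, 4, 5], [5, 6, 5, 7], [8, 7, 9, 9]]
Y = [[10, 12, 11, 13], [14, 15, 13, 16], [20, 19, 21, 22]]

def ss(data):
    """Return (SST, SSE) for one variable in group-major layout."""
    n_T = sum(len(g) for g in data)
    CF = sum(sum(g) for g in data) ** 2 / n_T
    total_sq = sum(v ** 2 for g in data for v in g)
    between = sum(sum(g) ** 2 / len(g) for g in data)
    return total_sq - CF, total_sq - between

SST_x, SSE_x = ss(X)                         # 44.0 and 7.5 here
SST_y, SSE_y = ss(Y)                         # 183.0 and 15.0 here
n_T = sum(len(g) for g in X)
SCT = (sum(x * y for gx, gy in zip(X, Y) for x, y in zip(gx, gy))
       - sum(v for g in X for v in g) * sum(v for g in Y for v in g) / n_T)
SCE = sum(sum(x * y for x, y in zip(gx, gy)) - sum(gx) * sum(gy) / len(gx)
          for gx, gy in zip(X, Y))

r_T2 = SCT ** 2 / (SST_x * SST_y)            # squared total correlation
r_n2 = SCE ** 2 / (SSE_x * SSE_y)            # squared within-population correlation

SST_yadj = SST_y * (1 - r_T2)                # 7.0 on these numbers
SSE_yadj = SSE_y * (1 - r_n2)                # 5/3 on these numbers
SSTr_yadj = SST_yadj - SSE_yadj              # 16/3 on these numbers
```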

#### Adjusting the mean of each population

The mean of each population is adjusted in the following manner:

:$M_{y_iadj}=M_{y_i}-\frac{SCE}{SSE_x}\left(M_{x_i}-M_{x_T}\right)$
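On the same illustrative data, the slope $SCE/SSE_x$ pools the within-group regression of $Y$ on $X$, and each group mean of $Y$ is shifted to where it would fall if that group had the grand mean of $X$:

```python
# Hypothetical paired data, group-major: k = 3 populations, n = 4 samples each.
X = [[3, 4, 4, 5], [5, 6, 5, 7], [8, 7, 9, 9]]
Y = [[10, 12, 11, 13], [14, 15, 13, 16], [20, 19, 21, 22]]

SCE = sum(sum(x * y for x, y in zip(gx, gy)) - sum(gx) * sum(gy) / len(gx)
          for gx, gy in zip(X, Y))
SSE_x = sum(sum(x ** 2 for x in g) - sum(g) ** 2 / len(g) for g in X)
b = SCE / SSE_x                               # pooled within-group slope (4/3 here)

M_xT = sum(x for g in X for x in g) / sum(len(g) for g in X)  # grand mean of X

adjusted_means = [
    sum(gy) / len(gy) - b * (sum(gx) / len(gx) - M_xT)
    for gx, gy in zip(X, Y)
]
# adjusted_means is approximately [14.17, 14.83, 17.5] on these numbers
```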

#### Analysis using adjusted sum of squares values

The mean square for treatments uses $df_{Tr}=k-1$, as in ANOVA; the mean square for error uses $df_E=N_T-k-1$, which is one less than in ANOVA to account for the covariate:

:$MSTr=\frac{SSTr_{yadj}}{df_{Tr}}$

:$MSE=\frac{SSE_{yadj}}{df_E}$

The F statistic is

:$F_{df_{Tr},\,df_E}=\frac{MSTr}{MSE}$
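Putting the steps together, here is a short end-to-end sketch on the same made-up data (the resulting F value is purely illustrative; a table of the F distribution would still be needed to judge significance):

```python
# Hypothetical paired data, group-major: k = 3 populations, n = 4 samples each.
X = [[3, 4, 4, 5], [5, 6, 5, 7], [8, 7, 9, 9]]
Y = [[10, 12, 11, 13], [14, 15, 13, 16], [20, 19, 21, 22]]

k = len(X)
N_T = sum(len(g) for g in X)

def ss(data):
    """Return (SST, SSE) for one variable in group-major layout."""
    CF = sum(sum(g) for g in data) ** 2 / N_T
    total_sq = sum(v ** 2 for g in data for v in g)
    between = sum(sum(g) ** 2 / len(g) for g in data)
    return total_sq - CF, total_sq - between

SST_x, SSE_x = ss(X)
SST_y, SSE_y = ss(Y)
SCT = (sum(x * y for gx, gy in zip(X, Y) for x, y in zip(gx, gy))
       - sum(v for g in X for v in g) * sum(v for g in Y for v in g) / N_T)
SCE = sum(sum(x * y for x, y in zip(gx, gy)) - sum(gx) * sum(gy) / len(gx)
          for gx, gy in zip(X, Y))

# Adjusted sums of squares (equivalent to multiplying by 1 - r^2)
SST_yadj = SST_y - SCT ** 2 / SST_x
SSE_yadj = SSE_y - SCE ** 2 / SSE_x
SSTr_yadj = SST_yadj - SSE_yadj

df_Tr, df_E = k - 1, N_T - k - 1              # 2 and 8 here
MSTr = SSTr_yadj / df_Tr
MSE = SSE_yadj / df_E
F = MSTr / MSE                                # 12.8 on these numbers
```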

## See also

* [http://www.southampton.ac.uk/~cpd/anovas/datasets/index.htm Examples of all ANOVA and ANCOVA models with up to three treatment factors, including randomized block, split plot, repeated measures, and Latin squares]
* [http://faculty.vassar.edu/lowry/ch17pt1.html One-Way Analysis of Covariance for Independent Samples]

Wikimedia Foundation. 2010.
