Cochran-Armitage test for trend

The Cochran-Armitage test for trend[1][2], named for William Cochran and Peter Armitage, is used in categorical data analysis to assess whether a variable with two categories is associated with a variable with k ordered categories. It modifies the Pearson chi-square test to incorporate a suspected ordering in the effects of the k categories of the second variable. For example, doses of a treatment can be ordered as 'low', 'medium', and 'high', and we may suspect that the treatment benefit cannot become smaller as the dose increases. The trend test is often used as a genotype-based test in case-control genetic association studies[3].

1 Introduction
2 Interpretation and role
3 Application to genetics
4 See also
5 References

Introduction

The trend test is applied when the data take the form of a 2 × k contingency table. For example, if k = 3 we have

	B=1	B=2	B=3
A=1	N₁₁	N₁₂	N₁₃
A=2	N₂₁	N₂₂	N₂₃

This table can be completed with the marginal totals of the two variables

	B=1	B=2	B=3	Sum
A=1	N₁₁	N₁₂	N₁₃	R₁
A=2	N₂₁	N₂₂	N₂₃	R₂
Sum	C₁	C₂	C₃	N

where R₁ = N₁₁ + N₁₂ + N₁₃, and C₁ = N₁₁ + N₂₁, etc.

The trend test statistic is

$T \equiv \sum_{i=1}^k t_i (N_{1i} R_2 - N_{2i} R_1),$

where the t_i are weights, and the difference $N_{1i} R_2 - N_{2i} R_1$ can be seen as the difference between $N_{1i}$ and $N_{2i}$ after reweighting the rows to have the same total.

The hypothesis of no association (the null hypothesis) can be expressed as:

$\Pr(A=1 \mid B=1) = \cdots = \Pr(A=1 \mid B=k).$

Assuming this holds, then, using iterated expectation,

$\operatorname{E}(T) = \operatorname{E} \left( \operatorname{E}(T|R_1,R_2) \right) = \operatorname{E} (0) = 0.$
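
The inner expectation can be checked term by term. As a sketch, assume the counts in each row are multinomial given the row totals, and write p_i for the probability of column B = i, which under the null hypothesis is the same in both rows (the symbol p_i is introduced here for illustration and does not appear elsewhere in the article):

$\operatorname{E}(N_{1i} \mid R_1, R_2) = R_1 p_i, \qquad \operatorname{E}(N_{2i} \mid R_1, R_2) = R_2 p_i,$

so that each term of T has conditional expectation zero:

$\operatorname{E}(N_{1i} R_2 - N_{2i} R_1 \mid R_1, R_2) = R_1 R_2 p_i - R_2 R_1 p_i = 0.$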

The variance can be computed by decomposition, yielding

$\mathrm{Var}(T) = \frac{R_1 R_2}{N} \left( \sum_{i=1}^k t_i^2 C_i (N - C_i) - 2 \sum_{i=1}^{k-1} \sum_{j=i+1}^k t_i t_j C_i C_j \right),$

and as a large sample approximation,

$\frac{T}{\sqrt{\mathrm{Var}(T)}} \sim \mathrm{N}(0,1).$
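
The formulas above translate fairly directly into code. The following is a minimal sketch in Python, not a reference implementation: the function name `trend_z` and the example counts are hypothetical, chosen only for illustration, and the variance is computed exactly as printed above (single sum minus twice the cross terms).

```python
from math import sqrt

def trend_z(row1, row2, weights):
    """Standardized trend statistic T / sqrt(Var(T)) for a 2 x k table."""
    k = len(weights)
    if not (len(row1) == len(row2) == k):
        raise ValueError("rows and weights must all have length k")
    r1, r2 = sum(row1), sum(row2)                # row totals R1, R2
    cols = [a + b for a, b in zip(row1, row2)]   # column totals C_i
    n = r1 + r2                                  # grand total N

    # T = sum_i t_i (N_1i R2 - N_2i R1)
    t_stat = sum(t * (a * r2 - b * r1) for t, a, b in zip(weights, row1, row2))

    # Var(T) = R1 R2 / N * (sum_i t_i^2 C_i (N - C_i) - 2 sum_{i<j} t_i t_j C_i C_j)
    single = sum(t * t * c * (n - c) for t, c in zip(weights, cols))
    cross = sum(weights[i] * weights[j] * cols[i] * cols[j]
                for i in range(k - 1) for j in range(i + 1, k))
    var_t = r1 * r2 / n * (single - 2 * cross)
    if var_t <= 0:
        raise ValueError("variance is zero (for example, all weights equal)")
    return t_stat / sqrt(var_t)

# Hypothetical 2 x 3 table: made-up counts, for illustration only.
z = trend_z([15, 10, 5], [5, 10, 15], [0, 1, 2])
print(round(z, 2))  # -3.16
```

With this sign convention the statistic is negative when row A=1 is concentrated in the columns with the smaller weights, as in the made-up table.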

The weights t_i can be chosen such that the trend test becomes locally most powerful for detecting particular types of associations. For example, if k = 3 and we suspect that B = 1 and B = 2 have similar frequencies (within each row), but that B = 3 has a different frequency, then the weights t = (1,1,0) should be used. If we suspect a linear trend in the frequencies, then the weights t = (0,1,2) should be used. These weights are also often used when the frequencies are suspected to change monotonically with B, even if the trend is not necessarily linear.
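
Only the relative spacing of the weights matters for the standardized statistic. As a short check, expanding the square shows that the variance of T can also be written as

$\mathrm{Var}(T) = \frac{R_1 R_2}{N} \left( N \sum_{i=1}^k t_i^2 C_i - \Big( \sum_{i=1}^k t_i C_i \Big)^2 \right).$

Now replace each weight t_i by a + b t_i, where a and b are arbitrary constants with b ≠ 0 (symbols introduced here for illustration). Because $\sum_{i=1}^k (N_{1i} R_2 - N_{2i} R_1) = R_1 R_2 - R_2 R_1 = 0$, the shift a drops out of T, and a direct expansion shows that it drops out of the variance as well:

$T \mapsto bT, \qquad \mathrm{Var}(T) \mapsto b^2 \, \mathrm{Var}(T),$

so $T / \sqrt{\mathrm{Var}(T)}$ is unchanged when b > 0 and only changes sign when b < 0. For instance, t = (0,1,2) and t = (1,2,3) define the same test, and t = (2,1,0) gives the same statistic with the opposite sign.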

Interpretation and role

The trend test will have higher power than the chi-square test when the suspected trend is correct, but the ability to detect unsuspected trends is sacrificed. This is an example of a general technique of directing hypothesis tests toward narrow alternatives. The trend test exploits the suspected effect direction to increase power, but this does not affect the sampling distribution of the test statistic under the null hypothesis. Thus, the suspected trend in effects is not an assumption that must hold in order for the test results to be meaningful.

Application to genetics

Suppose that there are three possible genotypes at some locus, and we refer to these as aa, Aa and AA. The distribution of genotype counts can be put in a 2 × 3 contingency table. For example, consider the following data, in which the genotype frequencies vary linearly in the cases and are constant in the controls:

	Genotype aa	Genotype Aa	Genotype AA	Sum
Controls	20	20	20	60
Cases	10	20	30	60
Sum	30	40	50	120

In genetics applications, the weights are selected according to the suspected mode of inheritance. For example, in order to test whether allele a is dominant over allele A, the choice t = (1, 1, 0) is locally optimal. To test whether allele a is recessive to allele A, the optimal choice is t = (0, 1, 1). To test whether alleles a and A are codominant, the choice t = (0, 1, 2) is locally optimal. For complex diseases, the underlying genetic model is often unknown. In genome-wide association studies, the additive (or codominant) version of the test is often used.

In the numerical example, the standardized test statistics for various weight vectors are

Weights	Standardized test statistic
1,1,0	1.85
0,1,1	-2.1
0,1,2	-2.3

and the Pearson chi-square test gives a standardized test statistic of 2. Thus, the evidence against the null hypothesis is strongest when the weights corresponding to additive (codominant) inheritance are used. Note that for the p-value to have its usual probabilistic interpretation, the weights must be specified before examining the data, and only one set of weights may be used.
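
The standardized statistics in the table can be reproduced from the genotype counts. The sketch below is one way to do so in Python, with controls as row A=1 and cases as row A=2. The helper names `trend_z` and `pearson_chi2` are hypothetical. The variance uses an algebraically equivalent compact form, N Σ t_i² C_i − (Σ t_i C_i)², of the bracketed factor in Var(T). Two-sided p-values are taken from the normal approximation. The code reports the raw Pearson chi-square statistic and its p-value on 2 degrees of freedom; it does not attempt to recompute the standardized Pearson value quoted above.

```python
from math import erfc, exp, sqrt

controls = [20, 20, 20]   # row A=1: genotypes aa, Aa, AA
cases = [10, 20, 30]      # row A=2

def trend_z(row1, row2, weights):
    """Standardized trend statistic T / sqrt(Var(T)) for a 2 x k table."""
    r1, r2 = sum(row1), sum(row2)
    n = r1 + r2
    cols = [a + b for a, b in zip(row1, row2)]
    t_stat = sum(t * (a * r2 - b * r1) for t, a, b in zip(weights, row1, row2))
    s1 = sum(t * c for t, c in zip(weights, cols))       # sum t_i C_i
    s2 = sum(t * t * c for t, c in zip(weights, cols))   # sum t_i^2 C_i
    return t_stat / sqrt(r1 * r2 / n * (n * s2 - s1 ** 2))

def pearson_chi2(row1, row2):
    """Pearson chi-square statistic for a 2 x k table."""
    r1, r2 = sum(row1), sum(row2)
    n = r1 + r2
    chi2 = 0.0
    for a, b in zip(row1, row2):
        c = a + b
        e1, e2 = r1 * c / n, r2 * c / n   # expected counts under independence
        chi2 += (a - e1) ** 2 / e1 + (b - e2) ** 2 / e2
    return chi2

results = {}
for w in [(1, 1, 0), (0, 1, 1), (0, 1, 2)]:
    z = trend_z(controls, cases, w)
    p = erfc(abs(z) / sqrt(2))            # two-sided normal p-value
    results[w] = (z, p)
    print(w, round(z, 2))                 # 1.85, -2.11, -2.28 in turn

chi2 = pearson_chi2(controls, cases)
p_pearson = exp(-chi2 / 2)                # chi-square tail, valid for 2 df only
print(round(chi2, 2))                     # 5.33
```

The printed values agree with the table to the precision shown there, and under these approximations the additive weights (0, 1, 2) give a smaller p-value than the 2-degree-of-freedom Pearson test.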

References

Agresti, Alan (2002). Categorical Data Analysis (Second Edition). Wiley. ISBN 0-471-36093-7.

Sasieni, P (1997). "From genotypes to genes: doubling the sample size". Biometrics (International Biometric Society) 53 (4): 1253–61. doi:10.2307/2533494. JSTOR 2533494. PMID 9423247.

statgen.org (2007). "A derivation for Armitage’s trend test for the 2 x 3 genotype table". http://www.statgen.org/main/images/www_statgen_org/downloads/Dana/armitage.pdf. Retrieved 2009-02-06.

[1] Cochran, WG (1954). "Some methods for strengthening the common chi-square tests". Biometrics (International Biometric Society) 10 (4): 417–451. doi:10.2307/3001616. JSTOR 3001616.
[2] Armitage, P (1955). "Tests for Linear Trends in Proportions and Frequencies". Biometrics (International Biometric Society) 11 (3): 375–386. doi:10.2307/3001775. JSTOR 3001775.
[3] Purcell S, Neale B, Todd-Brown K, et al. (September 2007). "PLINK: a tool set for whole-genome association and population-based linkage analyses". Am. J. Hum. Genet. 81 (3): 559–75. doi:10.1086/519795. PMC 1950838. PMID 17701901. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1950838.


Wikimedia Foundation. 2010.
