- False discovery rate
False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. In a list of rejected hypotheses, FDR controls the expected proportion of incorrectly rejected null hypotheses (type I errors). [Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1), 289–300. [http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_hochberg1995.pdf School of Mathematical Sciences] ] It is a less conservative procedure than familywise error rate (FWER) control and has greater power, at the cost of a higher likelihood of type I errors. [Shaffer, J.P. (1995). "Multiple hypothesis testing". Annual Review of Psychology 46: 561–584. [http://dx.doi.org/10.1146/annurev.ps.46.020195.003021 Annual Reviews] ]
The q-value is defined to be the FDR analogue of the p-value. The q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant. One approach is to directly estimate q-values rather than fixing a level at which to control the FDR.
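The idea of a q-value as "the minimum FDR at which the test may be called significant" can be sketched in code. The function below is a minimal illustration, not the estimator from the cited papers: it assumes that every null hypothesis may be true (a true-null proportion of 1), in which case the q-value reduces to the step-up adjusted p-value. The function name `bh_q_values` is hypothetical.

```python
def bh_q_values(p_values):
    """Step-up adjusted p-values, used here as a simple q-value analogue.

    Assumption (not from the article): the proportion of true nulls is
    taken to be 1, so no separate estimate of that proportion is made.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping a running minimum so
    # that the q-values are monotone in the p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


if __name__ == "__main__":
    p = [0.01, 0.045, 0.025, 0.005, 0.5]
    for p_i, q_i in zip(p, bh_q_values(p)):
        print(f"p = {p_i:.3f}  q = {q_i:.4f}")
```

Calling a test significant whenever its q-value is at most a chosen level gives the same decisions as fixing that level in advance, which is why the two approaches are interchangeable in this simplified setting.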
Classification of "m" hypothesis tests
The following list defines some random variables related to the $m$ hypothesis tests.
* $m_0$ is the number of true null hypotheses
* $m - m_0$ is the number of false null hypotheses
* $U$ is the number of true negatives
* $V$ is the number of false positives
* $T$ is the number of false negatives
* $S$ is the number of true positives
* $H_1, \ldots, H_m$ are the null hypotheses being tested
In $m$ hypothesis tests of which $m_0$ are true null hypotheses, $R = V + S$ (the number of rejected null hypotheses) is an observable random variable, and $S$, $T$, $U$, and $V$ are unobservable random variables. The false discovery rate is given by $\mathrm{E}\left[\frac{V}{V+S}\right] = \mathrm{E}\left[\frac{V}{R}\right]$, and one wants to keep this value below a threshold $\alpha$.
($\frac{V}{V+S}$ is defined to be 0 when $V + S = 0$.)
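As a small illustration of the quantity inside the expectation, the sketch below computes the proportion of false positives among all rejections for a single experiment, using the convention that the ratio is 0 when nothing is rejected. It assumes the truth of each null hypothesis is known, which is only the case in simulations; the function and argument names are hypothetical.

```python
def false_discovery_proportion(is_true_null, rejected):
    """Return V / R for one experiment, or 0.0 when R == 0.

    is_true_null[i] -- True if the i-th null hypothesis is actually true
    rejected[i]     -- True if the i-th null hypothesis was rejected
    """
    if len(is_true_null) != len(rejected):
        raise ValueError("both sequences must have one entry per test")
    # R: total number of rejections (observable).
    r = sum(1 for rej in rejected if rej)
    # V: rejections of true nulls, i.e. false positives (unobservable in practice).
    v = sum(1 for null, rej in zip(is_true_null, rejected) if null and rej)
    return v / r if r > 0 else 0.0


if __name__ == "__main__":
    truth = [True, False, False, True]
    decisions = [True, True, True, False]
    print(false_discovery_proportion(truth, decisions))
```

The FDR is the average of this proportion over repeated experiments, so a simulation would call the function many times and take the mean.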
Controlling procedures
Independent tests
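The "Simes" procedure ensures that the expected value $\mathrm{E}\left[\frac{V}{V+S}\right]$ is less than a given $\alpha$ (Benjamini and Hochberg 1995). This procedure is valid when the $m$ tests are independent. Let $H_1, \ldots, H_m$ be the null hypotheses and $P_1, \ldots, P_m$ their corresponding p-values. Order these values in increasing order and denote them by $P_{(1)}, \ldots, P_{(m)}$. For a given $\alpha$, find the largest $k$ such that
: $P_{(k)} \le \frac{k}{m}\,\alpha$
Then reject (i.e. declare positive) all $H_{(i)}$ for $i = 1, \ldots, k$.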
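The step-up rule described above (sort the p-values, find the largest rank k whose p-value is at most (k/m) times the chosen level, then reject the k smallest) can be sketched as follows. This is a minimal pure-Python illustration under the independence assumption stated in the text; the function name `benjamini_hochberg` and the default level of 0.05 are choices made for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with P_(k) <= (k / m) * alpha.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank
    # Reject the hypotheses with the k smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p = [0.01, 0.045, 0.025, 0.005, 0.5]
    print(benjamini_hochberg(p, alpha=0.05))
```

Note that the loop keeps scanning after a failed comparison: a p-value that misses its own threshold is still rejected if a larger one meets its threshold, which is what "the largest k" implies.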
Note: the mean of the thresholds $\frac{k}{m}\alpha$ over these $m$ tests is $\frac{\alpha(m+1)}{2m}$, which could be used as a rough FDR (RFDR), or "$\alpha$ adjusted for $m$ independent tests". The RFDR calculation shown here is not part of the Benjamini and Hochberg method.
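The averaging step behind the RFDR note can be written out explicitly. The derivation below assumes that the quantities being averaged are the per-rank thresholds $\frac{k}{m}\alpha$ for $k = 1, \ldots, m$; the numerical values are illustrative only.

```latex
\[
\frac{1}{m}\sum_{k=1}^{m}\frac{k}{m}\,\alpha
  = \frac{\alpha}{m^{2}}\cdot\frac{m(m+1)}{2}
  = \frac{\alpha\,(m+1)}{2m}
\]
% Example: m = 20 tests at alpha = 0.05 give
% 0.05 * 21 / 40 = 0.02625.
```

For large $m$ this mean approaches $\alpha/2$.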
Dependent tests
The "Benjamini and Yekutieli" procedure controls the false discovery rate under dependence assumptions. This refinement modifies the threshold and finds the largest $k$ such that:
: $P_{(k)} \le \frac{k}{m \cdot c(m)}\,\alpha$
* If the tests are independent: $c(m) = 1$ (same as above)
* If the tests are positively correlated: $c(m) = 1$
* If the tests are negatively correlated: $c(m) = \sum_{i=1}^{m} \frac{1}{i}$
In the case of negative correlation, $c(m)$ can be approximated by using the Euler-Mascheroni constant $\gamma \approx 0.57721$:
: $\sum_{i=1}^{m} \frac{1}{i} \approx \ln(m) + \gamma$
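The dependent-test variant differs from the independent-test procedure only in the extra factor c(m) in the threshold. The sketch below is a minimal illustration that always uses the harmonic sum 1 + 1/2 + ... + 1/m for c(m), i.e. the most conservative of the cases listed above; the function name and the default level are assumptions made for this example.

```python
def benjamini_yekutieli(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    Uses c(m) = 1 + 1/2 + ... + 1/m in the threshold k / (m * c(m)) * alpha.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic sum used as the correction factor c(m).
    c_m = sum(1.0 / i for i in range(1, m + 1))
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with P_(k) <= k / (m * c(m)) * alpha.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * alpha:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_yekutieli([0.001, 0.008, 0.5], alpha=0.05))
```

Because c(m) is at least 1, every threshold is no larger than in the independent case, so this variant never rejects more hypotheses at the same level.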
Using the RFDR above, an approximate FDR (AFDR) is the min(mean $\alpha$) for $m$ dependent tests = RFDR / ($\ln(m)$ + 0.57721...).
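To see how close the logarithmic approximation is to the exact harmonic sum, and what the rough and approximate figures look like numerically, a short sketch follows. It assumes the rough FDR is the level times (m + 1) / (2m) and the approximate FDR is that value divided by ln(m) plus the Euler-Mascheroni constant; all function names are hypothetical.

```python
import math

# Euler-Mascheroni constant (0.57721...).
EULER_MASCHERONI = 0.5772156649015329


def harmonic_number(m):
    """Exact sum 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))


def harmonic_approx(m):
    """Approximation ln(m) + gamma of the harmonic sum."""
    return math.log(m) + EULER_MASCHERONI


def rough_fdr(alpha, m):
    """Mean of the thresholds (k / m) * alpha over k = 1..m (assumed RFDR)."""
    return alpha * (m + 1) / (2 * m)


def approximate_fdr(alpha, m):
    """RFDR divided by ln(m) + gamma (assumed AFDR)."""
    return rough_fdr(alpha, m) / harmonic_approx(m)


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, harmonic_number(m), harmonic_approx(m),
              rough_fdr(0.05, m), approximate_fdr(0.05, m))
```

The gap between the exact sum and the approximation shrinks roughly like 1/(2m), so the approximation is already close for a few dozen tests.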
References
* Benjamini, Yoav; Hochberg, Yosef (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1): 289–300. MathSciNet 1325392. http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_hochberg1995.pdf
* Benjamini, Yoav; Yekutieli, Daniel (2001). "The control of the false discovery rate in multiple testing under dependency". Annals of Statistics 29 (4): 1165–1188. MathSciNet 1869245. doi:10.1214/aos/1013699998. http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_yekutieli_ANNSTAT2001.pdf
* Storey, John D. (2002). "A direct approach to false discovery rates". Journal of the Royal Statistical Society, Series B (Statistical Methodology) 64 (3): 479–498. MathSciNet 1924302. doi:10.1111/1467-9868.00346. http://www.blackwell-synergy.com/links/doi/10.1111%2F1467-9868.00346
* Storey, John D. (2003). "The positive false discovery rate: A Bayesian interpretation and the q-value". Annals of Statistics 31 (6): 2013–2035. MathSciNet 2036398. doi:10.1214/aos/1074290335. http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aos/1074290335
External links
* [http://strimmerlab.org/notes/fdr.html False Discovery Rate Analysis in R] - Lists links with popular R packages
Wikimedia Foundation. 2010.