- False discovery rate
**False discovery rate** (**FDR**) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. In a list of rejected hypotheses, FDR control bounds the expected proportion of incorrectly rejected null hypotheses (type I errors). [*Benjamini, Y., and Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) **57**(1), 289–300. http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_hochberg1995.pdf School of Mathematical Sciences*] It is a less conservative procedure than familywise error rate (FWER) control and has greater power, at the cost of increasing the likelihood of type I errors. [*Shaffer, J.P. (1995). "Multiple hypothesis testing". Annual Review of Psychology 46:561-584. http://dx.doi.org/10.1146/annurev.ps.46.020195.003021 Annual Reviews*]

The q-value is defined to be the FDR analogue of the p-value. The q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant. One approach is to estimate q-values directly rather than fixing a level at which to control the FDR.
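As a rough illustration of the q-value idea, the sketch below computes step-up adjusted p-values: for the p-value of rank $k$ among $m$ sorted p-values, it takes the minimum of $m \, P_{(j)} / j$ over all ranks $j \geq k$. This is a simplification that assumes every null hypothesis may be true; Storey's q-value additionally estimates the proportion of true nulls, which is not done here. The function name `bh_q_values` is hypothetical and not taken from any library.

```python
def bh_q_values(p_values):
    """Step-up adjusted p-values, a conservative stand-in for q-values.

    Assumption: the proportion of true null hypotheses is taken to be 1,
    so no estimate of that proportion is made.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of m * P_(j) / j over all ranks j at or above the current rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q[idx] = running_min
    return q


if __name__ == "__main__":
    print(bh_q_values([0.01, 0.04, 0.03, 0.045]))
```

A test is then called significant at FDR level $\alpha$ when its adjusted value is at most $\alpha$.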

**Classification of "m" hypothesis tests**

The following definitions introduce some random variables related to the $m$ hypothesis tests.

* $m_0$ is the number of true null hypotheses

* $m - m_0$ is the number of false null hypotheses

* $U$ is the number of true negatives

* $V$ is the number of false positives

* $T$ is the number of false negatives

* $S$ is the number of true positives

* $H_1, \ldots, H_m$ are the null hypotheses being tested

In $m$ hypothesis tests of which $m_0$ are true null hypotheses, $R = V + S$, the number of rejected null hypotheses, is an observable random variable, and $S$, $T$, $U$, and $V$ are unobservable random variables. The false discovery rate is given by $\mathrm{E}\!\left[\frac{V}{V+S}\right] = \mathrm{E}\!\left[\frac{V}{R}\right]$, and one wants to keep this value below a threshold $\alpha$.

($\frac{V}{R}$ is defined to be 0 when $R = 0$.)
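To make the ratio concrete, here is a minimal sketch that computes $V/R$ for a single experiment in which the truth of each null hypothesis is known (as in a simulation). The FDR is the expected value of this quantity over repeated experiments. The function name `false_discovery_proportion` and its arguments are hypothetical, chosen only for illustration.

```python
def false_discovery_proportion(is_true_null, rejected):
    """Return V / R for one experiment, using V / R = 0 when R = 0.

    is_true_null[i] is True when the i-th null hypothesis is actually true;
    rejected[i] is True when the i-th null hypothesis was rejected.
    """
    if len(is_true_null) != len(rejected):
        raise ValueError("both sequences must have one entry per hypothesis")
    # R: total number of rejections.
    r = sum(1 for rej in rejected if rej)
    # V: rejections of hypotheses that were in fact true (false positives).
    v = sum(1 for null, rej in zip(is_true_null, rejected) if null and rej)
    return v / r if r > 0 else 0.0


if __name__ == "__main__":
    truth = [True, True, False, False, False]
    calls = [True, False, True, True, False]
    # Here V = 1 and R = 3.
    print(false_discovery_proportion(truth, calls))
```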

**Controlling procedures**

**Independent tests**

The Benjamini-Hochberg step-up procedure, which builds on the "Simes" procedure, ensures that the FDR, $\mathrm{E}\!\left[\frac{V}{V+S}\right]$, is at most a given $\alpha$ (Benjamini and Hochberg 1995). This procedure is valid when the $m$ tests are independent. Let $H_1, \ldots, H_m$ be the null hypotheses and $P_1, \ldots, P_m$ their corresponding p-values. Sort the p-values in increasing order and denote them by $P_{(1)}, \ldots, P_{(m)}$. For a given $\alpha$, find the largest $k$ such that $P_{(k)} \leq \frac{k}{m} \alpha.$ Then reject (i.e. declare positive) all $H_{(i)}$ for $i = 1, \ldots, k$.
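The step-up rule described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than a reference implementation; the function name `benjamini_hochberg` and the default `alpha=0.05` are assumptions made for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with P_(k) <= (k / m) * alpha.
    # k stays 0 when no p-value meets its threshold.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.045], alpha=0.05))
```

In this example the sorted p-values 0.03 and 0.04 exceed their own thresholds (0.025 and 0.0375), yet they are still rejected because the largest p-value, 0.045, meets its threshold of 0.05. The loop deliberately keeps scanning after a failure, since the rule asks for the largest qualifying $k$, not the first failure.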

Note that the mean of the $m$ thresholds $\frac{k}{m} \alpha$, $k = 1, \ldots, m$, is $\frac{\alpha(m+1)}{2m}$, which could be used as a rough FDR (RFDR), or "$\alpha$ adjusted for $m$ independent tests."

NOTE: The RFDR calculation shown here is not part of the Benjamini and Hochberg method.
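One way to read the RFDR value, assuming it is meant as the average of the per-rank thresholds $\frac{k}{m} \alpha$ used in the procedure, is the following short derivation:

```latex
\frac{1}{m}\sum_{k=1}^{m}\frac{k}{m}\,\alpha
  = \frac{\alpha}{m^{2}}\cdot\frac{m(m+1)}{2}
  = \frac{\alpha(m+1)}{2m}
```

For example, with $m = 10$ and $\alpha = 0.05$ this gives $0.05 \cdot 11 / 20 = 0.0275$.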

**Dependent tests**

The "Benjamini and Yekutieli" procedure controls the false discovery rate under dependence assumptions. This refinement modifies the threshold and finds the largest $k$ such that:

:$P_{(k)} \leq \frac{k}{m \cdot c(m)} \alpha$

* If the tests are independent: $c(m) = 1$ (same as above)

* If the tests are positively correlated: $c(m) = 1$

* If the tests are negatively correlated: $c(m) = \sum_{i=1}^{m} \frac{1}{i}$

In the case of negative correlation, $c(m)$ can be approximated using the Euler-Mascheroni constant $\gamma$:

:$\sum_{i=1}^{m} \frac{1}{i} \approx \ln(m) + \gamma.$
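The modified threshold can be sketched as follows. The function names and the `use_harmonic_correction` flag are hypothetical, introduced only for this example: setting the flag to `True` applies $c(m) = \sum_{i=1}^{m} \frac{1}{i}$, and setting it to `False` uses $c(m) = 1$.

```python
import math

# Euler-Mascheroni constant, used only for the ln(m) + gamma approximation.
EULER_GAMMA = 0.5772156649015329


def harmonic_number(m):
    """Exact value of c(m) = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))


def benjamini_yekutieli(p_values, alpha=0.05, use_harmonic_correction=True):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    c_m = harmonic_number(m) if use_harmonic_correction else 1.0
    # Indices of the p-values, from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with P_(k) <= k / (m * c(m)) * alpha.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * alpha:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.045]
    print(benjamini_yekutieli(p, alpha=0.05))
    print(benjamini_yekutieli(p, alpha=0.05, use_harmonic_correction=False))
    # Compare the exact sum with the logarithmic approximation for m = 100.
    print(harmonic_number(100), math.log(100) + EULER_GAMMA)
```

With the correction applied, every threshold shrinks by the factor $c(m)$, so the procedure rejects fewer hypotheses than the uncorrected rule on the same p-values.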

Using the RFDR above, an approximate FDR (AFDR), the minimum of the mean $\alpha$ for $m$ dependent tests, is AFDR = RFDR / ($\ln(m)$ + 0.57721...).
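The two rough quantities can be evaluated numerically with a short sketch. The function names `rough_fdr` and `approximate_fdr` are hypothetical, and the code simply evaluates the formulas as stated in the text; it does not add any justification for them.

```python
import math

# Euler-Mascheroni constant (0.57721...).
EULER_GAMMA = 0.5772156649015329


def rough_fdr(alpha, m):
    """RFDR = alpha * (m + 1) / (2 * m), as stated for m independent tests."""
    return alpha * (m + 1) / (2 * m)


def approximate_fdr(alpha, m):
    """AFDR = RFDR / (ln(m) + gamma), as stated for m dependent tests."""
    return rough_fdr(alpha, m) / (math.log(m) + EULER_GAMMA)


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, rough_fdr(0.05, m), approximate_fdr(0.05, m))
```

As $m$ grows, the RFDR approaches $\alpha / 2$ while the AFDR keeps shrinking, because the divisor grows like $\ln(m)$.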

**References**

* Benjamini, Yoav; Hochberg, Yosef (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) **57**(1), 289–300. MathSciNet 1325392. http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_hochberg1995.pdf

* Benjamini, Yoav; Yekutieli, Daniel (2001). "The control of the false discovery rate in multiple testing under dependency". Annals of Statistics **29**(4), 1165–1188. MathSciNet 1869245. doi:10.1214/aos/1013699998. http://www.math.tau.ac.il/~ybenja/MyPapers/benjamini_yekutieli_ANNSTAT2001.pdf

* Storey, John D. (2002). "A direct approach to false discovery rates". Journal of the Royal Statistical Society, Series B (Methodological) **64**(3), 479–498. MathSciNet 1924302. doi:10.1111/1467-9868.00346. http://www.blackwell-synergy.com/links/doi/10.1111%2F1467-9868.00346

* Storey, John D. (2003). "The positive false discovery rate: A Bayesian interpretation and the q-value". Annals of Statistics **31**(6), 2013–2035. MathSciNet 2036398. doi:10.1214/aos/1074290335. http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aos/1074290335

**External links**

* [*http://strimmerlab.org/notes/fdr.html False Discovery Rate Analysis in R*] - Lists links to popular R packages

*Wikimedia Foundation.
2010.*