False discovery rate
From Wikipedia, the free encyclopedia
False discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. Among the hypotheses that are rejected, FDR procedures control the expected proportion of incorrectly rejected null hypotheses (type I errors).[1] FDR control is less conservative than familywise error rate (FWER) control and has greater power, at the cost of a higher likelihood of type I errors.[2]
The q-value is defined to be the FDR analogue of the p-value. The q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant. One approach is to estimate q-values directly rather than fixing a level at which to control the FDR.
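As a rough illustration of the q-value idea, the sketch below computes Benjamini and Hochberg adjusted p-values: for each test, the smallest level α at which the step-up rule P_(k) ≤ (k/m)α would reject it. This is a stand-in under a stated assumption, not the direct estimator of Storey (2002): it implicitly takes the proportion of true null hypotheses to be 1. The function name `bh_adjusted_p_values` and the example p-values are hypothetical.

```python
def bh_adjusted_p_values(p_values):
    """Return BH-adjusted p-values, in the same order as the input.

    Assumption: the proportion of true nulls is taken as 1, so these are
    conservative q-value estimates rather than Storey's direct estimates.
    """
    m = len(p_values)
    # Indices of the p-values in increasing order: P_(1) <= ... <= P_(m)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down so the result is monotone in rank
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from four tests
example_p = [0.02, 0.03, 0.035, 0.5]
example_q = bh_adjusted_p_values(example_p)
print(example_q)
```

A test whose adjusted value is at most α is one that the step-up rule would reject at level α, which is why the adjusted value can be read as a minimum FDR level.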
Classification of m hypothesis tests
The following table defines some random variables related to the m hypothesis tests.
| | # declared non-significant | # declared significant | Total |
|---|---|---|---|
| # true null hypotheses | U | V | m0 |
| # non-true null hypotheses | T | S | m − m0 |
| Total | m − R | R | m |
- m0 is the number of true null hypotheses
- m − m0 is the number of false null hypotheses
- U is the number of true negatives
- V is the number of false positives
- T is the number of false negatives
- S is the number of true positives
- H1, ..., Hm are the null hypotheses being tested
- In m hypothesis tests of which m0 are true null hypotheses, R is an observable random variable, and S, T, U, and V are unobservable random variables.
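The false discovery rate is given by
$$\mathrm{FDR} = \mathrm{E}\left[\frac{V}{R}\right] = \mathrm{E}\left[\frac{V}{V+S}\right]$$
and one wants to keep this value below a threshold α.
($V/R$ is defined to be 0 when $R = 0$)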
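To make the quantities in the table concrete, the following minimal sketch computes V, R and the ratio V/R for a single experiment, using the convention that the ratio is 0 when R = 0. The FDR itself is the expectation of this ratio over repeated experiments. The sketch assumes the truth of each null hypothesis is known, which holds only in simulations since V is unobservable in practice; the function name and the example data are hypothetical.

```python
def false_discovery_proportion(is_true_null, rejected):
    """Return (V, R, V/R) for one experiment; V/R is 0 when R == 0."""
    # R: number of hypotheses declared significant
    r = sum(1 for rej in rejected if rej)
    # V: number of true null hypotheses that were rejected (false positives)
    v = sum(1 for null, rej in zip(is_true_null, rejected) if null and rej)
    ratio = v / r if r > 0 else 0.0
    return v, r, ratio


# Hypothetical experiment: 6 tests, the first 4 null hypotheses are true
is_true_null = [True, True, True, True, False, False]
rejected = [True, False, False, False, True, True]
v, r, ratio = false_discovery_proportion(is_true_null, rejected)
print(v, r, ratio)  # 1 false positive among 3 rejections
```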
Controlling procedures
Independent tests
The Simes procedure ensures that the expected value $\mathrm{E}[V/R]$ is less than a given α (Benjamini and Hochberg 1995). This procedure is valid when the m tests are independent. Let $H_1, \ldots, H_m$ be the null hypotheses and $P_1, \ldots, P_m$ their corresponding p-values. Order these values in increasing order and denote them by $P_{(1)}, \ldots, P_{(m)}$. For a given α, find the largest k such that
$$P_{(k)} \leq \frac{k}{m} \alpha$$
Then reject (i.e. declare positive) all $H_{(i)}$ for $i = 1, \ldots, k$.
Note that the mean α for these m tests is
$$\frac{\alpha (m+1)}{2m}$$
which could be used as a rough FDR (RFDR) or "α adjusted for m independent tests".
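The step-up rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `benjamini_hochberg` and `mean_threshold` and the example p-values are hypothetical. The second helper averages the m thresholds (k/m)α, which is the "mean α" referred to in the text.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values in increasing order: P_(1) <= ... <= P_(m)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest k (1-based) with P_(k) <= (k / m) * alpha; stays 0 if none qualifies
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k = rank
    # Reject the hypotheses with the k smallest p-values
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


def mean_threshold(m, alpha=0.05):
    """Average of the thresholds (k / m) * alpha for k = 1, ..., m."""
    return sum(k / m * alpha for k in range(1, m + 1)) / m


# Hypothetical p-values: the two smallest miss their own thresholds,
# but the third passes, so the first three are all rejected.
print(benjamini_hochberg([0.02, 0.03, 0.035, 0.5], alpha=0.05))
# -> [True, True, True, False]
```

The example shows why the rule looks for the largest k rather than stopping at the first failure: a hypothesis can be rejected even when its own p-value exceeds its own threshold, provided a later ordered p-value falls under its threshold.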
Dependent tests
The Benjamini and Yekutieli procedure controls the false discovery rate under dependence assumptions. This refinement modifies the threshold and finds the largest k such that:
- $P_{(k)} \leq \frac{k}{m \cdot c(m)} \alpha$
- If the tests are independent: c(m) = 1 (same as above)
- If the tests are positively correlated: c(m) = 1
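- If the tests are negatively correlated: $c(m) = \sum_{i=1}^{m} \frac{1}{i}$

In the case of negative correlation, c(m) can be approximated by using the Euler-Mascheroni constant γ:
$$\sum_{i=1}^{m} \frac{1}{i} \approx \ln(m) + \gamma, \qquad \gamma = 0.57721\ldots$$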
Using the RFDR above, an approximate FDR (AFDR) is the minimum mean α for m dependent tests: AFDR = RFDR / (ln(m) + 0.57721...).
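A minimal sketch of the modified threshold with c(m) set to the harmonic sum, together with a check of the logarithmic approximation, might look as follows. The function names and example p-values are hypothetical, and the sketch covers only the harmonic-sum case; with c(m) = 1 the rule reduces to the procedure for independent tests.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def harmonic_sum(m):
    """c(m) = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / i for i in range(1, m + 1))


def benjamini_yekutieli(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    if m == 0:
        return []
    c_m = harmonic_sum(m)
    # Indices of the p-values in increasing order: P_(1) <= ... <= P_(m)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest k (1-based) with P_(k) <= k / (m * c(m)) * alpha
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / (m * c_m) * alpha:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values; the stricter thresholds reject only the two smallest
print(benjamini_yekutieli([0.001, 0.01, 0.5, 0.9], alpha=0.05))

# Compare the exact harmonic sum with ln(m) + gamma for m = 1000
m = 1000
print(harmonic_sum(m), math.log(m) + EULER_GAMMA)
```

For large m the two values agree closely, which is what makes the ln(m) + 0.57721... form convenient in the AFDR expression.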
References
- [1] Benjamini, Y.; Hochberg, Y. (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1): 289–300.
- [2] Shaffer, J. P. (1995). "Multiple hypothesis testing". Annual Review of Psychology 46: 561–584. Annual Reviews.
- Benjamini, Yoav; Hochberg, Yosef (1995). "Controlling the false discovery rate: a practical and powerful approach to multiple testing". Journal of the Royal Statistical Society, Series B (Methodological) 57 (1): 289–300. MR1325392.
- Benjamini, Yoav; Yekutieli, Daniel (2001). "The control of the false discovery rate in multiple testing under dependency". Annals of Statistics 29 (4): 1165–1188. MR1869245.
- Storey, John D. (2002). "A direct approach to false discovery rates". Journal of the Royal Statistical Society, Series B (Methodological) 64 (3): 479–498. MR1924302.
- Storey, John D. (2003). "The positive false discovery rate: A Bayesian interpretation and the q-value". Annals of Statistics 31 (6): 2013–2035. MR2036398.
External links
- False Discovery Rate Analysis in R - Lists links with popular R packages


