Abstract
New molecular biological techniques can measure large numbers of mRNA transcripts, proteins, metabolites, and single nucleotide polymorphisms in a single tissue sample. The interpretation of the ensuing hypothesis tests must therefore account for the number of comparisons. For example, cDNA microarray experiments generate large multiplicity problems in which thousands of hypotheses are tested simultaneously. In this context, the false discovery rate (FDR) and the false non-discovery rate (FNR) are used to account for multiple comparisons. In this study, we propose non-parametric estimates of FDR and FNR that are conceptually and computationally straightforward. To illustrate their properties, and their use in a procedure for selecting an optimal subset of significant tests, we present an example from a functional genomics study.
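As a rough illustration of how FDR and FNR can be estimated from the empirical cumulative distribution function (ECDF) of p-values, the sketch below uses a generic plug-in approach in the spirit of Storey (2002). It is not the estimator proposed in this article, whose details are not given in the abstract. The function names, the tuning parameter `lam`, and the formulas FDR(t) ≈ π0·t / F(t) and FNR(t) ≈ [(1 − F(t)) − π0·(1 − t)] / (1 − F(t)) are assumptions made for illustration, where F is the ECDF and π0 is the estimated proportion of true null hypotheses.

```python
import numpy as np


def ecdf_at(pvalues, t):
    """Empirical CDF of the p-values evaluated at threshold t."""
    p = np.asarray(pvalues, dtype=float)
    return float(np.mean(p <= t))


def estimate_pi0(pvalues, lam=0.5):
    """Assumed plug-in estimate of the proportion of true nulls.

    Under the null, p-values are uniform, so the fraction above `lam`
    divided by (1 - lam) approximates pi0 (capped at 1).
    """
    p = np.asarray(pvalues, dtype=float)
    return min(1.0, float(np.mean(p > lam)) / (1.0 - lam))


def estimate_fdr_fnr(pvalues, t, lam=0.5):
    """Illustrative ECDF-based estimates of FDR and FNR at threshold t."""
    pi0 = estimate_pi0(pvalues, lam)
    rejected = ecdf_at(pvalues, t)   # fraction of tests declared significant
    accepted = 1.0 - rejected        # fraction of tests not declared significant

    # Expected fraction of null p-values at or below t, relative to all rejections.
    fdr = min(1.0, pi0 * t / rejected) if rejected > 0 else 0.0

    # Non-rejected tests minus the expected non-rejected nulls, relative to all non-rejections.
    if accepted > 0:
        fnr = max(0.0, (accepted - pi0 * (1.0 - t)) / accepted)
    else:
        fnr = 0.0
    return fdr, fnr


# Synthetic example (not data from the study): 900 evenly spaced "null"
# p-values plus 100 very small "alternative" p-values.
null_p = (np.arange(900) + 0.5) / 900
alt_p = np.full(100, 1e-4)
pvals = np.concatenate([null_p, alt_p])

fdr, fnr = estimate_fdr_fnr(pvals, t=0.01)
print(f"pi0 = {estimate_pi0(pvals):.3f}, FDR = {fdr:.4f}, FNR = {fnr:.4f}")
```

Scanning such estimates over a grid of thresholds `t` is one way to trade false discoveries against false non-discoveries when choosing a subset of significant tests; the criterion actually used in the article may differ.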
Cite this article
Delongchamp, R.R., Razzaghi, M. & Lee, T. Estimating false discovery rate and false non-discovery rate using the empirical cumulative distribution function of p-values in ‘omics’ studies. Genes Genom 33, 461–466 (2011). https://doi.org/10.1007/s13258-011-0052-y
DOI: https://doi.org/10.1007/s13258-011-0052-y