The online version of this article (doi:10.1186/1471-2288-14-26) contains supplementary material, which is available to authorized users.
The authors declare that they have no competing interests.
AZ, MK, and OK implemented the confidence intervals. AZ and MK performed the simulation study and wrote the article. OK revised the manuscript. All authors read and approved the final manuscript.
The area under the receiver operating characteristic (ROC) curve, referred to as the AUC, is an appropriate measure for describing the overall accuracy of a diagnostic test or a biomarker in early phase trials without having to choose a threshold. Many approaches exist for estimating a confidence interval for the AUC, but all are relatively complicated to implement, and many perform poorly for large AUC values or small sample sizes.
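The threshold-free reading of the AUC can be made concrete: it equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that empirical (Mann-Whitney type) estimate. The function name `empirical_auc` and the marker values are made up for illustration and do not come from the article.

```python
def empirical_auc(cases, controls):
    """Share of case-control pairs in which the case scores higher.

    Ties contribute 1/2, so the result is the Mann-Whitney estimate
    of P(case > control) + 0.5 * P(case == control).
    """
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical marker values, for illustration only.
cases = [2.1, 3.4, 1.9, 4.0]      # diseased subjects
controls = [1.2, 2.0, 1.9, 0.7]   # non-diseased subjects
auc_hat = empirical_auc(cases, controls)
print(auc_hat)  # 14.5 of 16 pairs are ordered correctly -> 0.90625
```

No cut-off on the marker is needed at any point, since only the ordering of case-control pairs enters the estimate.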
Because the AUC is a probability, we propose a modified Wald interval for a single proportion, which can be calculated on a pocket calculator. We performed a simulation study to compare this modified Wald interval (with and without continuity correction) with other intervals in terms of coverage probability and statistical power.
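To illustrate the pocket-calculator character of such an interval, here is a sketch of the ordinary textbook Wald interval for a single proportion, applied to an AUC estimate. It is not the authors' modified interval: this section does not state the specific modification, so the denominator is exposed as a hypothetical parameter `n_eff` (an assumed effective sample size), and the continuity correction shown is the usual 1/(2·n_eff) term.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n_eff, alpha=0.05, continuity=False):
    """Textbook Wald interval for a proportion p_hat.

    n_eff is an ASSUMED effective sample size; the article's own
    modification of this denominator is not reproduced here.
    """
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    if n_eff <= 0:
        raise ValueError("n_eff must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # e.g. about 1.96 for alpha = 0.05
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n_eff)
    if continuity:
        half_width += 1.0 / (2.0 * n_eff)         # standard continuity correction
    # Truncate to the admissible range of a probability.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Only the estimated AUC and a sample size are needed as input.
lower, upper = wald_interval(0.8, n_eff=100)
lower_cc, upper_cc = wald_interval(0.8, n_eff=100, continuity=True)
```

The continuity-corrected interval is always at least as wide as the uncorrected one, which is why it tends to be the more conservative choice for small samples.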
The main result is that the proposed modified Wald intervals maintain the type I error level and exploit it much better than the intervals of Agresti-Coull, Wilson, and Clopper-Pearson. The interval suggested by Bamber, the Mann-Whitney interval without transformation, and the interval of the binormal AUC are very liberal. For small sample sizes, the Wald interval with continuity correction has a coverage probability comparable to that of the LT interval and higher power. For large sample sizes, the results of the LT interval and of the Wald interval without continuity correction are comparable.
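Coverage probability, the criterion used in these comparisons, can be estimated by simulation: generate many data sets with a known true AUC, compute the interval each time, and count how often it contains the true value. The sketch below does this for a plain Wald interval under an assumed binormal model with unit variances in both groups. The simulation settings, the choice of the total sample size as denominator, and all function names are illustrative assumptions, not the design of the authors' simulation study.

```python
import random
from math import sqrt
from statistics import NormalDist


def auc_hat(cases, controls):
    """Empirical AUC: share of correctly ordered pairs, ties count 1/2."""
    pairs = [(x > y) + 0.5 * (x == y) for x in cases for y in controls]
    return sum(pairs) / len(pairs)


def wald(p, n_eff, z):
    """Plain Wald interval; n_eff is an assumed effective sample size."""
    half = z * sqrt(p * (1.0 - p) / n_eff)
    return p - half, p + half


def coverage(auc0, n_per_group, reps=2000, alpha=0.05, seed=1):
    """Estimated coverage of the Wald interval for a true AUC of auc0."""
    rng = random.Random(seed)
    norm = NormalDist()
    z = norm.inv_cdf(1.0 - alpha / 2.0)
    # Binormal model with unit variances: AUC = Phi(delta / sqrt(2)).
    delta = sqrt(2.0) * norm.inv_cdf(auc0)
    hits = 0
    for _ in range(reps):
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        cases = [rng.gauss(delta, 1.0) for _ in range(n_per_group)]
        lo, hi = wald(auc_hat(cases, controls), 2 * n_per_group, z)
        hits += lo <= auc0 <= hi
    return hits / reps


cov = coverage(auc0=0.8, n_per_group=20, reps=500)
print(f"estimated coverage: {cov:.3f}")
```

An interval is called liberal when its estimated coverage falls below the nominal level 1 − α, and conservative when it lies above it.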
If individual patient data are not available, but only the estimated AUC and the total sample size, the modified Wald intervals can be recommended as confidence intervals for the AUC. For small sample sizes, the continuity correction should be used.
Additional file 3: Figure S1. Box plot of the interval length for n = (40, 100, 200) with a 1:1 case-control ratio and AUC₀ = (0.7, 0.8, 0.9) (cross = median, box = 25%-75%, whiskers = min - max). (TIFF 857 KB)
Additional file 4: Figure S2. Box plot of the coverage probability for continuous data and for ordinal data with five categories (n = (40, 100, 200) and AUC₀ = (0.7, 0.8, 0.9), cross = median, box = 25%-75%, whiskers = min - max). (TIFF 857 KB)
Additional file 5: Figure S3. Box plot of the interval length for increasing variance of the cases σ₁ (variance of the controls σ₀ = 1, n = (40, 100, 200) and AUC₀ = (0.7, 0.8, 0.9), cross = median, box = 25%-75% quantile, whiskers = min - max). (TIFF 857 KB)
EMA: Guideline on clinical evaluation of diagnostic agents. Doc. Ref. CPMP/EWP/1119/98/Rev. 1. 2010
Wang L, Fahim M, Hayen A, Mitchell R, Baines L, Lord S, Craig J, Webster A: Cardiac testing for coronary artery disease in potential kidney transplant recipients. Cochrane Database Syst Rev. 2011, 12: 1-105.
Ziegler A, König I, Schulz-Knappe M: Challenges in planning and conducting diagnostic studies with molecular biomarkers. Dtsch Med Wochenschr. 2013, 138: 2-13.
Ostroff R, Mehan M, Stewart A, Ayers D, Brody E, Williams S, Levin S, Black B, Harbut M, Carbone M, Gobaraju C, Pass H: Early detection of malignant pleural mesothelioma in asbestos-exposed individuals with a non-invasive proteomics-based surveillance tool. PLoS One. 2012, 7 (10): e46091. 10.1371/journal.pone.0046091.
Bamber D: The area above the ordinal dominance graph and the area below receiver operating characteristic graph. J Math Psychol. 12: 387-415.
Qin G, Hotilovac L: Comparison of non-parametric confidence intervals for the area under the ROC curve of a continuous-scale diagnostic test. Stat Methods Med Res. 2008, 17: 207-221.
Pepe M: The Statistical Evaluation of Medical Tests for Classification and Prediction. 2003, Oxford: Oxford University Press
Brunner E, Puri M: Nonparametric methods in factorial designs. Stat Papers. 2001, 42: 1-52. 10.1007/s003620000039.
Newcombe R: Confidence Intervals for Proportions and Related Measures of Effect Size. 2013, London: Chapman & Hall/CRC Biostatistics Series
He X, Wu S: Confidence intervals for the binomial proportion with zero frequency. Pharma SUG. 2009, 10-2009.
Agresti A, Coull B: Approximate is better than "exact" for interval estimations of binomial proportions. Am Stat. 1998, 52 (2): 119-126.
Ruymgaart F: A unified approach to the asymptotic distribution theory of certain midrank statistics. Lecture Notes on Mathematics, Statistique Non Parametrique Asymptotique, No 821. 1980, Berlin: Springer, 1-18.
Brunner E, Munzel U, Puri M: The multivariate nonparametric Behrens-Fisher problem. J Stat Plan Inference. 2002, 108: 37-53. 10.1016/S0378-3758(02)00269-0.
SAS Institute Inc.: SAS/STAT® 9.3 User’s Guide. 2011, Cary, North Carolina: SAS Institute Inc.
Wilson E: Probable inference, the law of succession, and statistical inference. J Am Stat Assoc. 1927, 22: 209-212. 10.1080/01621459.1927.10502953.
Clopper C, Pearson E: The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934, 26 (4): 404-413. 10.1093/biomet/26.4.404.
Birnbaum Z, Klose O: Bounds for the variance of the Mann-Whitney statistic. Ann Math Stat. 1957, 28: 933-945.
Wieand S, Gail M, James B, James K: A family of non-parametric statistics for comparing diagnostic markers with paired and unpaired data. Biometrika. 1989, 76: 585-592. 10.1093/biomet/76.3.585.
- A modified Wald interval for the area under the ROC curve (AUC) in diagnostic case-control studies
- BioMed Central