Abstract
Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011, and included all studies evaluating simulation-based education for health professionals in comparison with no intervention or another simulation intervention. Reviewers working in duplicate abstracted information to calculate standardized mean differences (SMDs). We included 897 original research studies. Among the 627 no-intervention-comparison studies, the median sample size was 25. Only two studies (0.3 %) had ≥80 % power to detect a small difference (SMD > 0.2 standard deviations), and 136 (22 %) had power to detect a large difference (SMD > 0.8). Of the no-intervention-comparison studies, 110 failed to find a statistically significant difference; none of these excluded a small difference, and only 47 (43 %) excluded a large difference. Among the 297 studies comparing alternate simulation approaches, the median sample size was 30. Only one study (0.3 %) had ≥80 % power to detect a small difference, and 79 (27 %) had power to detect a large difference. Of the 128 studies that did not detect a statistically significant effect, 4 (3 %) excluded a small difference and 91 (71 %) excluded a large difference. In conclusion, most education research studies are powered only to detect effects of large magnitude. For most studies that do not reach statistical significance, the possibility of large and important differences still exists.
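To illustrate the power calculations underlying these figures, the sketch below estimates the power of a two-sided, two-sample t-test for a given standardized mean difference, using the standard normal approximation (power = Φ(d·√(n/2) − z₁₋α/₂)). This is an illustration only, not the authors' exact method: the function name and the reading of the median sample size as roughly 25 participants per group are assumptions for the example, and the normal approximation slightly overstates power relative to an exact noncentral-t calculation.

```python
from statistics import NormalDist

def power_two_sample_smd(smd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample t-test for a given
    standardized mean difference (SMD), via the normal approximation.

    Power ≈ Phi(smd * sqrt(n/2) - z_crit), where z_crit is the two-sided
    critical value. Slightly optimistic versus the exact noncentral t.
    """
    z = NormalDist()  # standard normal
    z_crit = z.inv_cdf(1 - alpha / 2)           # e.g. ≈1.96 for alpha = 0.05
    noncentrality = smd * (n_per_group / 2) ** 0.5
    return z.cdf(noncentrality - z_crit)

# With ~25 participants per group (an assumption about how the median
# sample size splits across arms):
small = power_two_sample_smd(0.2, 25)   # ≈0.11 — far below the 80% convention
large = power_two_sample_smd(0.8, 25)   # ≈0.81 — just reaches 80% power
```

Under this approximation, a typical study in the review has adequate power only for large effects (SMD > 0.8), consistent with the abstract's conclusion; detecting a small effect (SMD = 0.2) with 80 % power would require on the order of hundreds of participants per group.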
Cook, D.A., Hatala, R. Got power? A systematic review of sample size adequacy in health professions education research. Adv in Health Sci Educ 20, 73–83 (2015). https://doi.org/10.1007/s10459-014-9509-5