
Quality of Life Research


13 July 2016 | Review

A review of empirical research related to the use of small quantitative samples in clinical outcome scale development

Authors: Carrie R. Houts, Michael C. Edwards, R. J. Wirth, Linda S. Deal

Published in: Quality of Life Research | Issue 11/2016


Abstract

Introduction

There has been a notable increase in advocacy for small-sample designs as an initial quantitative assessment of item and scale performance during scale development. This is particularly true in the development of clinical outcome assessments (COAs), where Rasch analysis has been advanced as an appropriate statistical tool for evaluating COAs under development with a small sample.

Methods

We review the practical and statistical benefits such methods are purported to offer, and we detail several practical and statistical-theory concerns about using quantitative methods, including Rasch-consistent methods, with small samples.

Conclusions

The feasibility of obtaining accurate information and the potential negative impacts of misusing large-sample statistical methods with small samples during COA development are discussed.

The parameters from this article were selected simply as representative of “real-world” values from a recently published COA analysis. Their use here is one of convenience and should not be taken as a judgement of the analyses conducted or obtained parameter estimates, which were psychometrically sound and found using a sample of over 200 observations.


Title: A review of empirical research related to the use of small quantitative samples in clinical outcome scale development
Authors: Carrie R. Houts
Michael C. Edwards
R. J. Wirth
Linda S. Deal
Publication date: 13 July 2016
Publisher: Springer International Publishing
Published in: Quality of Life Research / Issue 11/2016
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI: https://doi.org/10.1007/s11136-016-1364-9

