Published in: Quality of Life Research, Issue 3/2017

1 December 2016

Impact of IRT item misfit on score estimates and severity classifications: an examination of PROMIS depression and pain interference item banks

Author: Yue Zhao


Abstract

Purpose

In patient-reported outcome research that uses item response theory (IRT), model-data fit evaluations usually focus on statistical significance tests for detecting misfit. Such evaluations rarely address the consequences of using misfitting items for the intended clinical applications. This study was designed to evaluate the impact of IRT item misfit on score estimates and severity classifications and to demonstrate a recommended process of model-fit evaluation.

Methods

Using secondary data from the wave 1 testing phase of the Patient-Reported Outcomes Measurement Information System (PROMIS), analyses were conducted on the PROMIS depression (28 items; 782 cases) and pain interference (41 items; 845 cases) item banks. Misfitting items were identified using Orlando and Thissen’s summed-score item-fit statistics and graphical displays. The impact of misfit was evaluated by comparing the IRT-derived T-scores and severity classifications obtained with and without the misfitting items and assessing their agreement.
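The agreement check described in the Methods can be illustrated with a minimal sketch. It assumes the usual T-score metric (T = 50 + 10θ) and uses hypothetical severity cut points (55, 60, 70) and hypothetical θ values. The abstract does not state the study's cut points or its agreement indices, so the mean absolute T-score difference, exact agreement, and Cohen's kappa are illustrative choices only, and the names `theta_to_t`, `classify`, and `agreement` are invented for this example.

```python
from collections import Counter

def theta_to_t(theta):
    """Convert an IRT theta estimate to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta

# Hypothetical cut points (assumption, not taken from the study):
# T < 55 none, 55 to <60 mild, 60 to <70 moderate, >= 70 severe.
CUTS = [(55.0, "none"), (60.0, "mild"), (70.0, "moderate")]

def classify(t_score):
    """Map a T-score to a severity label using the hypothetical cut points."""
    for upper, label in CUTS:
        if t_score < upper:
            return label
    return "severe"

def cohen_kappa(a, b):
    """Chance-corrected agreement between two lists of category labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)

def agreement(theta_full, theta_reduced):
    """Compare scores from the full bank with scores after dropping misfitting items."""
    t_full = [theta_to_t(x) for x in theta_full]
    t_red = [theta_to_t(x) for x in theta_reduced]
    c_full = [classify(t) for t in t_full]
    c_red = [classify(t) for t in t_red]
    n = len(t_full)
    return {
        "mean_abs_t_diff": sum(abs(a - b) for a, b in zip(t_full, t_red)) / n,
        "exact_agreement": sum(a == b for a, b in zip(c_full, c_red)) / n,
        "kappa": cohen_kappa(c_full, c_red),
    }

# Hypothetical theta estimates for five respondents.
theta_full = [-1.2, 0.1, 0.6, 1.1, 2.3]
theta_reduced = [-1.1, 0.2, 0.4, 1.1, 2.2]
result = agreement(theta_full, theta_reduced)
print(result)
```

In this toy data only the third respondent changes category (mild with all items, none after exclusion), which is the kind of individual-level disagreement such a comparison is meant to surface.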

Results

The examination of the presence and impact of misfit suggested that item misfit had a negligible impact on the T-score estimates and severity classifications in the general population sample for both the PROMIS depression and pain interference item banks, implying that the misfit was of little practical significance.

Conclusions

Findings support the T-score estimates in the two item banks as robust against item misfit at both the group and individual levels and add confidence to the use of T-scores for severity diagnosis in the studied sample. Recommendations on approaches for identifying item misfit (statistical significance) and assessing the misfit impact (practical significance) are given.
Footnotes
1
The overall alpha level of .05 was adjusted for the total number of items in the respective PROMIS item bank. The adjusted alpha values, from smallest to largest, ranged from .0018 (.05/28) to .05 for the PROMIS-DEP and from .0012 (.05/41) to .05 for the PROMIS-PI.
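The range of adjusted alpha values in this footnote matches the rank-dependent thresholds of a step-up procedure such as Benjamini-Hochberg (reference 24), where the i-th smallest p-value is compared with α·i/m. The footnote itself does not name the procedure, so this reading is an assumption. The sketch below shows the arithmetic under that assumption, with the function names `bh_thresholds` and `bh_flags` invented for this example.

```python
def bh_thresholds(m, alpha=0.05):
    """Rank-dependent critical values alpha * i / m for i = 1..m."""
    return [alpha * (i / m) for i in range(1, m + 1)]

def bh_flags(p_values, alpha=0.05):
    """Flag items as misfitting with the Benjamini-Hochberg step-up rule.

    Returns a list of booleans in the original item order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    thresholds = bh_thresholds(m, alpha)
    # Find the largest rank whose p-value is at or below its threshold.
    largest = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= thresholds[rank - 1]:
            largest = rank
    # Every item ranked at or below that position is flagged.
    flagged = [False] * m
    for rank, j in enumerate(order, start=1):
        if rank <= largest:
            flagged[j] = True
    return flagged

dep = bh_thresholds(28)  # PROMIS-DEP: 28 items
pi = bh_thresholds(41)   # PROMIS-PI: 41 items
print(round(dep[0], 4), dep[-1])
print(round(pi[0], 4), pi[-1])
```

The smallest threshold equals the Bonferroni value α/m and the largest equals α, which is why the footnote reports a range rather than a single adjusted level.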
 
References
1.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage.
2.
Cella, D., Riley, W., Stone, A., Rothrock, N., Reeve, B., Yount, S., Hays, R. D., on behalf of the PROMIS Cooperative Group. (2010). Initial item banks and first wave testing of the Patient-Reported Outcomes Measurement Information System (PROMIS) network: 2005–2008. Journal of Clinical Epidemiology, 63, 1179–1194. doi:10.1016/j.jclinepi.2010.04.011.
3.
Swaminathan, H., Hambleton, R. K., & Rogers, H. J. (2007). Assessing fit in item response models. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics: Psychometrics. London: Elsevier.
4.
Reeve, B. B., Hays, R. D., Bjorner, J. B., Cook, K. F., Crane, P. K., Teresi, J. A., et al. (2007). Psychometric evaluation and calibration of health-related quality of life item banks: Plans for the patient-reported outcome measurement information system (PROMIS). Medical Care, 45(5), S22–S31. doi:10.1097/01.mlr.0000250483.85507.04.
5.
Hambleton, R. K., & Han, N. (2005). Assessing the fit of IRT models to educational and psychological test data: A five step plan and several graphical displays. In W. R. Lenderking & D. Revicki (Eds.), Advances in health outcomes research methods, measurement, statistical analysis, and clinical applications (pp. 57–78). Washington: Degnon Associates.
6.
Box, G. E. P., & Draper, N. R. (1987). Empirical model building and response surfaces. New York, NY: Wiley.
7.
Sinharay, S., & Haberman, S. J. (2014). How often is the misfit of item response theory models practically significant? Educational Measurement: Issues and Practice, 33(1), 23–35. doi:10.1111/emip.12024.
8.
Zhao, Y. (2008). Approaches for addressing the fit of item response theory models to educational test data. Dissertation Abstracts International, 69, 12A. (UMI No. 3337019).
10.
Pilkonis, P. A., Choi, S. W., Reise, S. P., Stover, A. M., Riley, W. T., & Cella, D. (2011). Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS®): Depression, anxiety, and anger. Assessment, 18(3), 263–283. doi:10.1177/1073191111411667.
15.
Cleeland, C. S., Gonin, R., Hatfield, A. K., Edmonson, J. H., Blum, R. H., Stewart, J. A., et al. (1994). Pain and its treatment in outpatients with metastatic cancer. New England Journal of Medicine, 330(9), 592–596. doi:10.1056/NEJM199403033300902.
16.
17.
Muthén, L. K., & Muthén, B. O. (2006). Mplus [Computer software]. Los Angeles, CA: Muthén & Muthén.
20.
Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. doi:10.1080/10705519909540118.
21.
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9(2), 202–220. doi:10.1177/1094428105284919.
23.
Cai, L., Thissen, D., & du Toit, S. (2015). IRTPRO [Computer software]. Lincolnwood, IL: Scientific Software International.
24.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57, 289–300. doi:10.2307/2346101.
26.
Thissen, D., Chen, W.-H., & Bock, R. D. (2003). Multilog 7.03 [Computer software]. Lincolnwood, IL: Scientific Software International.
28.
Kim, S., & Kolen, M. J. (2004). STUIRT: A computer program for scale transformation under unidimensional item response theory models (Version 1.0). Iowa Testing Programs, University of Iowa.
29.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.
31.
Orlando, M., & Thissen, D. (2003). Further investigation of the performance of S-X²: An item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27(4), 289–298. doi:10.1177/0146621603027004004.
35.
Choi, S. W., Reise, S. P., Pilkonis, P. A., Hays, R. D., & Cella, D. (2010). Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms. Quality of Life Research: An International Journal of Quality of Life Aspects of Treatment, Care and Rehabilitation, 19(1), 125–136. doi:10.1007/s11136-009-9560-5.
Metadata
Title
Impact of IRT item misfit on score estimates and severity classifications: an examination of PROMIS depression and pain interference item banks
Author
Yue Zhao
Publication date
1 December 2016
Publisher
Springer International Publishing
Published in
Quality of Life Research, Issue 3/2017
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-016-1467-3
