Published in: Quality of Life Research 1/2007

August 1, 2007 | Original Paper

IRT health outcomes data analysis project: an overview and summary

Authors: Karon F. Cook, Cayla R. Teal, Jakob B. Bjorner, David Cella, Chih-Hung Chang, Paul K. Crane, Laura E. Gibbons, Ron D. Hays, Colleen A. McHorney, Katja Ocepek-Welikson, Anastasia E. Raczek, Jeanne A. Teresi, Bryce B. Reeve



Abstract

Background

In June 2004, the National Cancer Institute and the Drug Information Association co-sponsored the conference “Improving the Measurement of Health Outcomes through the Applications of Item Response Theory (IRT) Modeling: Exploration of Item Banks and Computer-Adaptive Assessment.” One component of the conference was the presentation of a psychometric and content analysis of a secondary dataset.

Objectives

To conduct a thorough psychometric and content analysis of two primary domains within a cancer health-related quality of life (HRQOL) dataset.

Research design

HRQOL scales were evaluated using factor analysis for categorical data, IRT modeling, and differential item functioning analyses. In addition, computerized adaptive administration of HRQOL item banks was simulated, and various IRT models were applied and compared.
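
As an illustration of the kind of computation this design involves, the following minimal sketch implements Samejima's graded response model (one IRT model for ordered response categories) and a maximum-information item selection step of the sort a simulated computerized adaptive administration might use. It is not the project's code: the function names, the item bank, and all parameter values are hypothetical, and the selection rule is only one of several possible choices.

```python
import math


def _cumulative_probs(theta, a, thresholds):
    """P(X >= k) for each category boundary, bracketed by 1.0 and 0.0."""
    inner = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return [1.0] + inner + [0.0]


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta      -- latent trait level of the respondent
    a          -- item discrimination (slope)
    thresholds -- increasing category thresholds b_1 < ... < b_(K-1)
    """
    cum = _cumulative_probs(theta, a, thresholds)
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]


def grm_item_information(theta, a, thresholds):
    """Fisher information of one item: sum over categories of P_k'^2 / P_k."""
    cum = _cumulative_probs(theta, a, thresholds)
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]
        # Derivative of the category probability with respect to theta.
        dp = a * (cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1]))
        if p > 0.0:
            info += dp * dp / p
    return info


def select_next_item(theta, bank, administered):
    """Return the id of the unadministered item most informative at theta."""
    candidates = [item_id for item_id in bank if item_id not in administered]
    return max(candidates, key=lambda i: grm_item_information(theta, *bank[i]))


# Hypothetical two-item bank: item id -> (discrimination, thresholds).
bank = {
    "easy_item": (1.0, [-2.0, -1.0, 0.0]),
    "hard_item": (1.0, [1.0, 2.0, 3.0]),
}
probs = grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])
next_item = select_next_item(2.0, bank, administered=set())
```

In a full simulation, the trait estimate would be updated after each simulated response and `select_next_item` called again until a stopping rule (for example, a target standard error) is met.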

Subjects

The original data were collected as part of the NCI-funded Quality of Life Evaluation in Oncology (Q-Score) Project. A total of 1,714 patients with cancer or HIV/AIDS were recruited from 5 clinical sites.

Measures

Items from 4 HRQOL instruments were evaluated: Cancer Rehabilitation Evaluation System–Short Form, European Organization for Research and Treatment of Cancer Quality of Life Questionnaire, Functional Assessment of Cancer Therapy, and Medical Outcomes Study Short-Form Health Survey.

Results and conclusions

Four lessons learned from the project are discussed: the importance of good developmental item banks, the ambiguity of model fit results, the limits of our knowledge regarding the practical implications of model misfit, and the importance of construct definition in the measurement of HRQOL. Areas for future research are suggested with respect to each of these lessons. The feasibility of developing item banks for broad definitions of health is also discussed.
Metadata
Publication date
August 1, 2007
Publisher
Springer Netherlands
Published in
Quality of Life Research / Special Issue 1/2007
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI
https://doi.org/10.1007/s11136-007-9177-5
