Patient-reported outcome measures (PROMs) are frequently used in heterogeneous patient populations. PROM scores may lead to biased inferences when sources of heterogeneity (e.g., gender, ethnicity, and social factors) are ignored. Latent variable mixture models (LVMMs) can be used to examine measurement invariance (MI) when sources of heterogeneity in the population are not known a priori. The goal of this article is to discuss the use of LVMMs to identify invariant items within the context of test construction.
The Draper-Lindley-de Finetti (DLD) framework for the measurement of latent variables provides a theoretical context for the use of LVMMs to identify the most invariant items in test construction. In an expository analysis of 39 items measuring daily activities, LVMMs were used to compare 1- and 2-class item response theory (IRT) models. If the 2-class model had better fit, item-level logistic regression differential item functioning (DIF) analyses were conducted to identify items that were not invariant. These items were removed, and the LVMMs and DIF testing were repeated until all remaining items showed MI.
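The iterative procedure described above can be sketched as a simple loop. This is a minimal illustration only, not the authors' implementation: the function name `select_invariant_items` and the two callables `two_class_fits_better` and `flag_dif_items` are hypothetical placeholders for the model comparison and the logistic regression DIF testing, which in practice would be carried out in dedicated statistical software. The stopping rule when no item is flagged and the `max_rounds` cap are also assumptions.

```python
from typing import Callable, List, Sequence, Set


def select_invariant_items(
    items: Sequence[str],
    two_class_fits_better: Callable[[List[str]], bool],  # hypothetical model comparison
    flag_dif_items: Callable[[List[str]], Set[str]],     # hypothetical DIF test
    max_rounds: int = 50,                                # assumed safety cap
) -> List[str]:
    """Drop items flagged for DIF, round by round, until none remain flagged."""
    retained = list(items)
    for _ in range(max_rounds):
        if not retained:
            break
        # Step 1: compare 1- and 2-class models on the retained items.
        # If the 2-class model does not fit better, stop and keep the set.
        if not two_class_fits_better(retained):
            break
        # Step 2: item-level DIF testing across the latent classes.
        flagged = flag_dif_items(retained) & set(retained)
        if not flagged:
            # Assumption: stop when nothing is flagged, to avoid looping forever.
            break
        # Step 3: remove the flagged items and repeat.
        retained = [item for item in retained if item not in flagged]
    return retained


# Toy stand-ins: two items are "non-invariant" by construction (made-up data).
noninvariant = {"item03", "item07"}
items = [f"item{i:02d}" for i in range(1, 11)]
result = select_invariant_items(
    items,
    two_class_fits_better=lambda kept: bool(noninvariant & set(kept)),
    flag_dif_items=lambda kept: noninvariant & set(kept),
)
print(result)
```

With these toy stand-ins, the loop removes the two flagged items in the first round and stops in the second, because the 2-class model no longer fits better. Passing the two steps in as callables keeps the control flow separate from any particular estimation software.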
The 39 items had an essentially unidimensional measurement structure. However, a 1-class IRT model resulted in many statistically significant bivariate residuals, indicating suboptimal fit due to remaining local dependence. A 2-class LVMM had better fit. Through subsequent rounds of LVMMs and DIF testing, nine items were identified as being most invariant.
The DLD framework and the use of LVMMs have significant potential for advancing theoretical developments and research on item selection and the development of PROMs for heterogeneous populations.
- The use of latent variable mixture models to identify invariant items in test construction
Lara B. Russell
Tolulope T. Sajobi
Lisa M. Lix
Bruno D. Zumbo
- Springer International Publishing