Original Article
The EORTC computer-adaptive tests measuring physical functioning and fatigue exhibited high levels of measurement precision and efficiency

https://doi.org/10.1016/j.jclinepi.2012.09.010

Abstract

Objectives

The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group is developing a computer-adaptive test (CAT) version of the EORTC Quality of Life Questionnaire (QLQ-C30). We evaluated the measurement properties of the CAT versions of physical functioning (PF) and fatigue (FA) and compared these with the corresponding QLQ-C30 scales.

Study Design and Setting

Based on international samples of more than 1,000 cancer patients, we simulated CAT administration of varying numbers of items and compared the resulting scores with those based on all items in the respective item pools. Furthermore, the relative validity (RV) of CATs was compared with that of the QLQ-C30 scales using known groups validity.

Results

For both dimensions, CATs of all lengths resulted in unbiased score estimates. CATs consisting of five or more items had reliability >0.90, correlated ≥0.97 with the full scale, and had root mean square error <0.25. The average RVs for these CATs ranged from 1.02 to 1.33, indicating possible savings in sample size requirements of 3–42% using CAT.

Conclusion

The CAT versions of PF and FA exhibited high levels of measurement precision and efficiency. The potential savings in sample size requirements using CATs compared with those using the original QLQ-C30 scales were typically 20% or more.

Introduction

With the widespread access to and use of computers, tablets, smartphones, and the Internet, the assessment of patient-reported outcomes (PROs) is increasingly carried out electronically. Computer-adaptive testing (CAT) is a sophisticated method for assessing PROs electronically [1], [2]. CAT tailors the item set to the individual patient. This is achieved by repeatedly estimating the patient's symptom or functional level based on responses to previous questions and then selecting and presenting the most appropriate item for that symptom/functional level. CAT has several theoretical advantages including higher measurement precision and/or reduced response burden compared with traditional fixed-length measures requiring that all patients respond to the same set of questions.
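The adaptive loop described above (estimate the level, pick the most appropriate item, repeat) can be sketched as follows. This is a minimal illustration and not the EORTC implementation: it assumes a dichotomous two-parameter logistic model, maximum-information item selection, and an expected a posteriori estimate with a standard normal prior. The item parameters in `EXAMPLE_BANK` and all function names are hypothetical.

```python
import math

# Hypothetical item bank: (discrimination a, location b) per item.
EXAMPLE_BANK = [(1.5, -2.0), (1.2, -1.0), (1.8, 0.0), (1.4, 1.0), (1.6, 2.0)]

# Grid of candidate theta values from -4 to 4 for numerical integration.
GRID = [g / 10.0 for g in range(-40, 41)]


def prob(theta, a, b):
    """Probability of endorsing an item under the assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at the given theta."""
    p = prob(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(items, responses):
    """Posterior mean of theta given the responses (standard normal prior)."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalised prior density
        for (a, b), x in zip(items, responses):
            p = prob(t, a, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


def run_cat(bank, answer, n_items):
    """Administer up to n_items adaptively; answer(index) returns 0 or 1."""
    theta, asked, responses = 0.0, [], []
    for _ in range(min(n_items, len(bank))):
        remaining = [i for i in range(len(bank)) if i not in asked]
        # Select the unused item that is most informative at the current estimate.
        best = max(remaining, key=lambda i: item_information(theta, *bank[i]))
        asked.append(best)
        responses.append(answer(best))
        # Re-estimate theta from all responses collected so far.
        theta = eap_estimate([bank[i] for i in asked], responses)
    return theta, asked
```

Calling `run_cat(EXAMPLE_BANK, answer, 3)` with a callback that returns the patient's response yields the score estimate and the indices of the items that were presented. Polytomous items, as used in QLQ-C30, would need a different response model than the one assumed here.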

The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group is currently developing a CAT version of the EORTC Quality of Life Questionnaire (QLQ-C30) [3], one of the most widely used health-related quality of life (HRQOL) questionnaires in cancer research [4], [5]. The aim is to construct a more precise, efficient, and flexible instrument that will allow for the precise measurement of individuals, adaptation to different patient populations, and so forth. Once this developmental work is completed, the resulting CAT version can be used as an alternative to QLQ-C30 during a transition period. In the long run, the CAT version may replace the original QLQ-C30 as the primary core EORTC quality of life instrument. Note that, because the QLQ-C30 scales are short (mostly just one or two items), for most QLQ-C30 dimensions we do not expect the new instrument to result in shorter scales but rather in better and more precise measurement. The first two EORTC CAT item banks that have been developed cover physical functioning (PF) and fatigue (FA) [6], [7], [8], [9].

Although CAT theoretically has clearly superior measurement properties compared with traditional measures such as the fixed-length sum scales of QLQ-C30, the extent to which these properties translate into practical advantages in conducting PRO research will differ across instruments, dimensions, and patient populations. With CAT measurement, the number of items (the length of the questionnaire) is selected for each study. This choice usually involves a trade-off between speed (few items) and precision (many items). Hence, information about measurement efficiency and precision (i.e., how reliable and valid the CAT is with a given number of items) is vital for optimizing CAT for a specific study. Furthermore, many users of QLQ-C30 may be particularly interested in knowing whether they can expect savings in study time and/or expenses using the CAT version rather than the existing and familiar fixed-length and fixed-format versions. Evaluation based on the CAT item banks for PF and FA may give valuable information about the measurement quality of the EORTC QLQ-C30 CAT and what may be gained from using CAT.

The aims of the present study were to assess (1) the measurement precision/efficiency of the CAT versions of the EORTC QLQ-C30 PF and FA scales, the first two CATs that have been developed for this questionnaire, and (2) the potential reduction in sample size requirements using various CAT versions compared with using the original QLQ-C30 PF and FA scales.
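The precision summaries reported in the abstract (bias, correlation with the full-length score, and root mean square error) might be computed along the following lines when CAT scores and full-length scores are available for the same patients. This is a sketch under stated assumptions, not the authors' analysis code: the function name and the example scores are hypothetical, and bias is taken here to be the mean difference between the two scores.

```python
import math


def precision_summary(cat_scores, full_scores):
    """Compare CAT-based scores with full-length scores for the same patients."""
    n = len(cat_scores)
    diffs = [c - f for c, f in zip(cat_scores, full_scores)]

    # Bias: mean difference between CAT score and full-length score.
    bias = sum(diffs) / n
    # Root mean square error of the CAT score around the full-length score.
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation between the two sets of scores.
    mean_c = sum(cat_scores) / n
    mean_f = sum(full_scores) / n
    cov = sum((c - mean_c) * (f - mean_f) for c, f in zip(cat_scores, full_scores))
    var_c = sum((c - mean_c) ** 2 for c in cat_scores)
    var_f = sum((f - mean_f) ** 2 for f in full_scores)
    correlation = cov / math.sqrt(var_c * var_f)

    return {"bias": bias, "rmse": rmse, "correlation": correlation}


# Hypothetical scores for four patients (not data from the study).
example_full = [-1.0, 0.0, 1.0, 2.0]
example_cat = [-0.9, -0.1, 1.1, 1.9]
example_summary = precision_summary(example_cat, example_full)
```

Repeating the computation for each CAT length would show how quickly the short versions approach the full-length score.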

Development of the EORTC CAT item pools

The aim of EORTC CATs is to measure the same HRQOL dimensions as measured with QLQ-C30 but with higher efficiency and precision. For each dimension, the item pool development can be divided into four phases: (1) a literature search to gain knowledge about the dimension and identify existing items used to measure it; (2) formulation, based on (1), of new items that measure the relevant aspects of the dimension and follow the item style of QLQ-C30; (3) interviewing cancer patients from at

Evaluation of measurement precision

Fig. 1 shows the median and percentiles for the differences between θ’s estimated with CATs of 1, 2, …, all − 1 items, respectively, and the full-length θ. For both PF and FA, the median differences were very close to 0 for all CAT lengths. The percentiles indicated, however, that for very short CATs, there were some deviations for most patients. For example, when only two items were used, the CAT scores deviated about 0.2 or more for 50% of the patients. However, with five or more PF items, the

Discussion

One of the most important rationales advocating the use of CAT is that by tailoring the item set to the individual patient, more precise estimates of the patient's symptom burden, functional capacity, or health status can be obtained. To evaluate the precision and efficiency of the EORTC CAT measures of PF and FA, we compared scores obtained using CATs of varying lengths with the full-length scores based on all items. These evaluations confirmed that the CAT measures can be highly efficient and

References (15)

  • M.Aa. Petersen et al.

    Development of computerised adaptive testing (CAT) for the EORTC QLQ-C30 dimensions—general approach and initial results for physical functioning

    Eur J Cancer

    (2010)
  • W.J. van der Linden et al.

    Computerized adaptive testing: theory and practice

    (2000)
  • H. Wainer

    Computerized adaptive testing: a primer

    (2000)
  • N.K. Aaronson et al.

    The European Organization for Research and Treatment of Cancer QLQ-C30: a quality-of-life instrument for use in international clinical trials in oncology

    J Natl Cancer Inst

    (1993)
  • P. Fayers et al.

    Quality of life research within the EORTC-the EORTC QLQ-C30. European Organisation for Research and Treatment of Cancer

    Eur J Cancer

    (2002)
  • A. Garratt et al.

    Quality of life measurement: bibliographic study of patient assessed health outcome measures

    BMJ

    (2002)
  • M.Aa. Petersen et al.

    Development of computerized adaptive testing (CAT) for the EORTC QLQ-C30 physical functioning dimension

    Qual Life Res

    (2011)

Funding: The study was funded by grants from the EORTC Quality of Life Group. The work of J.M.G. was funded by a grant from the Austrian Science Fund #502. The work of W.-C.C. was supported by grants from the National Science Council, Taiwan (Nos. 95-2314-B-002-266-MY2 and 97-2314-B-002-020-MY3).

Conflict of interest statement: There were no financial relationships, personal relationships, academic competition, intellectual commitments, or other conflicts of interest that might have biased the work.

View full text