The EORTC computer-adaptive tests measuring physical functioning and fatigue exhibited high levels of measurement precision and efficiency

doi:10.1016/j.jclinepi.2012.09.010

Journal of Clinical Epidemiology

Volume 66, Issue 3, March 2013, Pages 330-339

https://doi.org/10.1016/j.jclinepi.2012.09.010

Abstract

Objectives

The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group is developing a computer-adaptive test (CAT) version of the EORTC Quality of Life Questionnaire (QLQ-C30). We evaluated the measurement properties of the CAT versions of physical functioning (PF) and fatigue (FA) and compared these with the corresponding QLQ-C30 scales.

Study Design and Setting

Based on international samples of more than 1,000 cancer patients, we simulated CAT administration of varying numbers of items and compared the resulting scores with those based on all items in the respective item pools. Furthermore, the relative validity (RV) of CATs was compared with that of the QLQ-C30 scales using known groups validity.

Results

For both dimensions, CATs of all lengths resulted in unbiased score estimates. CATs consisting of five or more items had reliability >0.90, correlated ≥0.97 with the full scale, and had root mean square error <0.25. The average RVs for these CATs ranged from 1.02 to 1.33, indicating possible savings in sample size requirements of 3–42% using CAT.

Conclusion

The CAT versions of PF and FA exhibited high levels of measurement precision and efficiency. The potential savings in sample size requirements using CATs compared with those using the original QLQ-C30 scales were typically 20% or more.

Introduction

With the widespread access to and use of computers, tablets, smartphones, and the Internet, the assessment of patient-reported outcomes (PROs) is increasingly carried out electronically. Computer-adaptive testing (CAT) is a sophisticated method for assessing PROs electronically [1], [2]. CAT tailors the item set to the individual patient. This is achieved by repeatedly estimating the patient's symptom or functional level based on responses to previous questions and then selecting and presenting the most appropriate item for that symptom/functional level. CAT has several theoretical advantages including higher measurement precision and/or reduced response burden compared with traditional fixed-length measures requiring that all patients respond to the same set of questions.
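The adaptive loop described above (estimate the patient's level, pick the most appropriate remaining item, repeat) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EORTC implementation: it assumes a dichotomous two-parameter logistic (2PL) model, a made-up item bank (ITEM_BANK), maximum-information item selection, and expected a posteriori (EAP) scoring on a grid with a standard normal prior. All names and parameter values are hypothetical.

```python
import math
import random

# Hypothetical 2PL item bank: (discrimination a, difficulty b).
# Values are illustrative only and are not EORTC item parameters.
ITEM_BANK = [(1.2, -2.0), (1.5, -1.0), (1.8, -0.5), (2.0, 0.0),
             (1.6, 0.5), (1.4, 1.0), (1.1, 2.0), (1.7, 1.5)]

# Grid of theta values from -4 to 4 used for EAP estimation.
GRID = [-4.0 + 0.1 * k for k in range(81)]


def prob_endorse(theta, a, b):
    """2PL probability of a positive response at level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """EAP estimate of theta under a standard normal prior.

    responses: list of (item_index, response) pairs with response 0 or 1.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior
        for idx, x in responses:
            p = prob_endorse(theta, *ITEM_BANK[idx])
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total


def run_cat(answer, n_items):
    """Administer n_items adaptively; answer(idx) returns the 0/1 response."""
    responses = []
    theta = 0.0  # start at the prior mean
    for _ in range(n_items):
        used = {idx for idx, _ in responses}
        remaining = [i for i in range(len(ITEM_BANK)) if i not in used]
        # Select the unused item that is most informative at the current estimate.
        best = max(remaining,
                   key=lambda i: item_information(theta, *ITEM_BANK[i]))
        responses.append((best, answer(best)))
        theta = eap_estimate(responses)  # re-estimate after every response
    return theta, [idx for idx, _ in responses]


def simulate_patient(true_theta, rng):
    """Return an answer function that simulates a patient at true_theta."""
    def answer(idx):
        p = prob_endorse(true_theta, *ITEM_BANK[idx])
        return 1 if rng.random() < p else 0
    return answer


if __name__ == "__main__":
    rng = random.Random(1)
    estimate, order = run_cat(simulate_patient(0.8, rng), n_items=5)
    print("administered items:", order)
    print("theta estimate:", round(estimate, 2))
```

Stopping after a fixed number of items mirrors the fixed-length CATs evaluated in this study; a precision-based stopping rule would be an equally plausible design choice.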

The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Group is currently developing a CAT version of the EORTC Quality of Life Questionnaire (QLQ-C30) [3], one of the most widely used health-related quality of life (HRQOL) questionnaires in cancer research [4], [5]. The aim is to construct a more precise, efficient, and flexible instrument that will allow for the precise measurement of individuals, adaptation to different patient populations, and so forth. Once this developmental work is completed, the resulting CAT version can be used as an alternative to QLQ-C30 during a transition period. In the long run, the CAT version may replace the original QLQ-C30 as the primary core EORTC quality of life instrument. Note that, as the QLQ-C30 scales are short (mostly just one or two items), for most QLQ-C30 dimensions we do not expect the new instrument to result in shorter scales but rather in better and more precise measurement. The first two EORTC CAT item banks that have been developed cover physical functioning (PF) and fatigue (FA) [6], [7], [8], [9].

Although CAT theoretically has clearly superior measurement properties compared with traditional measures such as the fixed-length sum scales of QLQ-C30, how these properties translate into practical advantages in PRO research will differ across instruments, dimensions, and patient populations. Using CAT measurement, the number of items (the length of the questionnaire) is selected for each study. This choice usually involves a trade-off between speed (few items) and precision (many items). Hence, information about measurement efficiency and precision (i.e., how reliable and valid the CAT is with a given number of items) is vital to be able to optimize CAT for a specific study. Furthermore, it may be of particular interest for many users of QLQ-C30 to know whether they can expect savings in study time and/or expenses using the CAT version rather than the existing and familiar fixed-length and fixed-format versions. Evaluation based on the CAT item banks for PF and FA may give valuable information about the measurement quality of the EORTC QLQ-C30 CAT and what may be gained from using CAT.
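One way to quantify how precise a CAT of a given length is would be to compare its scores with scores based on the full item pool, using the kind of summaries reported in the abstract (bias, correlation with the full-length score, root mean square error). The following Python sketch shows such a comparison. The function name precision_summary and the example scores are hypothetical, and the study's own computations may differ in detail.

```python
import math
import statistics


def precision_summary(cat_scores, full_scores):
    """Compare short-CAT scores with full-item-pool scores for the same patients.

    Returns the mean difference (bias), median difference, root mean square
    error (RMSE), and Pearson correlation between the two sets of scores.
    """
    if len(cat_scores) != len(full_scores) or len(cat_scores) < 2:
        raise ValueError("need two equally long score lists with at least 2 patients")

    n = len(cat_scores)
    diffs = [c - f for c, f in zip(cat_scores, full_scores)]

    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_c = sum(cat_scores) / n
    mean_f = sum(full_scores) / n
    cov = sum((c - mean_c) * (f - mean_f)
              for c, f in zip(cat_scores, full_scores))
    ss_c = sum((c - mean_c) ** 2 for c in cat_scores)
    ss_f = sum((f - mean_f) ** 2 for f in full_scores)
    correlation = cov / math.sqrt(ss_c * ss_f)

    return {
        "bias": bias,
        "median_difference": statistics.median(diffs),
        "rmse": rmse,
        "correlation": correlation,
    }


if __name__ == "__main__":
    # Made-up theta estimates for four patients, for illustration only.
    cat = [0.1, -0.4, 1.2, 0.6]
    full = [0.0, -0.5, 1.0, 0.8]
    for name, value in precision_summary(cat, full).items():
        print(name, round(value, 3))
```

Repeating this summary for CATs of 1, 2, 3, ... items would show how quickly the short-form scores approach the full-length scores as items are added.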

The aims of the present study were to assess (1) the measurement precision and efficiency of the CAT versions of the EORTC QLQ-C30 PF and FA scales, the first two CATs that have been developed for this questionnaire, and (2) the potential reduction in sample size requirements using various CAT versions compared with using the original QLQ-C30 PF and FA scales.

Section snippets

Development of the EORTC CAT item pools

The aim of EORTC CATs is to measure the same HRQOL dimensions as measured with QLQ-C30 but with higher efficiency and precision. For each dimension, the item pool development can be divided into four phases: (1) literature search to gain knowledge about the dimension and identify existing items used to measure the dimension; (2) formulation, based on (1), of new items measuring the relevant aspects of the dimension and following the item style of QLQ-C30; (3) interviewing cancer patients from at

Evaluation of measurement precision

Fig. 1 shows the median and percentiles for the differences between θ’s estimated with CATs of 1, 2, …, all − 1 items, respectively, and the full-length θ. For both PF and FA, the median differences were very close to 0 for all CAT lengths. The percentiles indicated, however, that for very short CATs, there were some deviations for most patients. For example, when only two items were used, the CAT scores deviated about 0.2 or more for 50% of the patients. However, with five or more PF items, the

Discussion

One of the most important rationales advocating the use of CAT is that by tailoring the item set to the individual patient, more precise estimates of the patient's symptom burden, functional capacity, or health status can be obtained. To evaluate the precision and efficiency of the EORTC CAT measures of PF and FA, we compared scores obtained using CATs of varying lengths with the full-length scores based on all items. These evaluations confirmed that the CAT measures can be highly efficient and

References (15)

M.Aa. Petersen et al.
Development of computerised adaptive testing (CAT) for the EORTC QLQ-C30 dimensions—general approach and initial results for physical functioning
Eur J Cancer
(2010)
W.J. van der Linden et al.
Computerized adaptive testing: theory and practice
(2000)
H. Wainer
Computerized adaptive testing: a primer
(2000)
N.K. Aaronson et al.
The European Organization for Research and Treatment of Cancer QLQ-C30: a quality-of-life instrument for use in international clinical trials in oncology
J Natl Cancer Inst
(1993)
P. Fayers et al.
Quality of life research within the EORTC-the EORTC QLQ-C30. European Organisation for Research and Treatment of Cancer
Eur J Cancer
(2002)
A. Garratt et al.
Quality of life measurement: bibliographic study of patient assessed health outcome measures
BMJ
(2002)
M.Aa. Petersen et al.
Development of computerized adaptive testing (CAT) for the EORTC QLQ-C30 physical functioning dimension
Qual Life Res
(2011)

There are more references available in the full text version of this article.

Cited by (31)

Patient-reported outcome measures in cancer care: Integration with computerized adaptive testing
2023, Asia-Pacific Journal of Oncology Nursing
Thresholds for clinical importance were defined for the European Organisation for Research and Treatment of Cancer Computer Adaptive Testing Core—an adaptive measure of core quality of life domains in oncology clinical practice and research
2020, Journal of Clinical Epidemiology
The aim of this article was to establish thresholds for clinical importance (TCIs) for the European Organisation for Research and Treatment of Cancer (EORTC) Computer Adaptive Testing (CAT) Core measure, the new adaptive version of the EORTC QLQ-C30.
For our diagnostic study, we recruited cancer patients with mixed diagnoses and treatments from six European countries. Patients completed the EORTC CAT Core and a questionnaire with anchor items assessing criteria for clinical importance (limitations in everyday life, need for help/care, and worries by the patient/family/partner) for each EORTC CAT Core domain. We used a binary variable summarizing the anchor items for determining TCIs and for calculating the area under the curve (AUC) in receiver operating characteristic analysis as a measure of diagnostic accuracy.
Using data from 498 cancer patients (mean age 60.4 years, 55.2% women), we established TCIs for the 14 domains of the EORTC CAT Core. Median AUC across domains was 0.93 (range 0.84–0.94). Median sensitivity and specificity of the TCIs were 0.91 (range 0.80–0.96) and 0.77 (range 0.66–0.84), respectively. TCIs and AUCs were largely consistent across patient groups.
We have generated TCIs for the 14 functional health and symptom domains of the EORTC CAT Core. The EORTC CAT Core showed high diagnostic accuracy in identifying clinically important symptoms and functional impairments.
New insights into early recovery after robotic surgery for endometrial cancer
2019, Gynecologic Oncology
Citation Excerpt :
Hence, the use of generic questionnaires is particularly challenging within this population. The European Organisation of Research and Treatment of Cancer has developed and validated a computer adaptive test core questionnaire (EORTC CAT Core) that enables adaptation to the individual while maintaining comparability across patients [7–11]. The objective of the present study was to assess the individual early recovery to baseline of physical health among women with early-stage endometrial cancer following RMIS.
To assess early recovery of physical health after robotic minimally invasive surgery (RMIS) for early-stage endometrial cancer using the European Organisation of Research and Treatment of Cancer Computer Adaptive Test Core questionnaire (EORTC CAT Core). The EORTC CAT Core provides individualised measurements while maintaining comparability. A hypothesis of individual complete recovery to baseline within three post-surgical weeks was evaluated.
Ninety-four women who underwent RMIS for early-stage endometrial cancer were included consecutively. The EORTC CAT Core was distributed before surgery and prospectively every week during the first post-operative month. Repeated measures models were fitted for each of the four domains (physical functioning, role function, fatigue, and pain) and tested for impact of age, ASA score, minor/major surgery, and the individual baseline scores (poorest, intermediate, best).
Women with the lowest physical functioning, lowest role function, highest fatigue level, and highest pain level at baseline all recovered within three weeks. Women with the highest physical functioning, highest role function, lowest level of fatigue, and lowest level of pain at baseline did not reach their individual baselines within the first post-operative month but had the most favourable domain-scores three weeks post-operatively.
The individual woman's physical health baseline score is predictive for her postoperative recovery following RMIS for early-stage endometrial cancer. Women with the best physical health had the best postoperative functions and lowest level of symptoms; however their recovery to baseline was prolonged. Computer adaptive testing may be a valuable tool for individualised pre-operative information and supportive care during surveillance.
Establishing the European Norm for the health-related quality of life domains of the computer-adaptive test EORTC CAT Core
2019, European Journal of Cancer
Citation Excerpt :
In the item bank development process, different sources of information were collated, including literature reviews, qualitative input from various stakeholders and psychometric analyses of large international samples of cancer patients [14]. Item bank development for all domains was completed in 2016 [12,16–22]. To calibrate items of each bank, IRT models were estimated using data obtained from these clinical samples [22].
The computer-adaptive test (CAT) of the European Organisation for Research and Treatment of Cancer (EORTC), the EORTC CAT Core, assesses the same 15 domains as the EORTC QLQ-C30 health-related quality of life questionnaire but with increased precision, efficiency, measurement range and flexibility. CAT parameters for estimating scores have been established based on clinical data from cancer patients. This study aimed at establishing the European Norm for each CAT domain based on general population data.
We collected representative general population data across 11 European Union (EU) countries, Russia, Turkey, Canada and the United States (n ≥ 1000/country; stratified by sex and age). We selected item subsets from each CAT domain for data collection (totalling 86 items). Differential item functioning (DIF) analyses were conducted to investigate cross-cultural measurement invariance. For each domain, means and standard deviations from the EU countries (weighted by country population, sex and age) were used to establish a T-metric with a European general population mean = 50 (standard deviation = 10).
A total of 15,386 respondents completed the online survey (n = 11,343 from EU countries). EORTC CAT Core norm scores for all 15 countries were calculated. DIF had negligible impact on scoring. Domain-specific T-scores differed significantly across countries with small to medium effect sizes.
This study establishes the official European Norm for the EORTC CAT Core. The European CAT Norm can be used globally and allows for meaningful interpretation of scores. Furthermore, CAT scores can be compared with sex- and age-adjusted norm scores at a national level within each of the 15 countries.
Assessing Emotional Functioning with Increased Power: Relative Validity of a Customized EORTC Short Form in the International ACTION Trial
2019, Value in Health
There is a need to improve the assessment of emotional functioning (EF). In the international Advance Care Planning: an Innovative Palliative Care Intervention to Improve Quality of Life in Cancer Patients - a Multi-Centre Cluster Randomized Clinical Trial (ACTION) trial involving patients with advanced cancer, EF was assessed by a customized 10-item short form (EF10). The EF10 is based on the European Organisation for Research and Treatment of Cancer (EORTC) EF item bank and has the potential for greater precision than the common EORTC Quality of Life Questionnaire Core 30 four-item scale (EF4). We assessed the relative validity (RV) of EF10 compared with EF4.
Patients from Belgium, Denmark, Italy, the Netherlands, Slovenia, and the United Kingdom completed EF10 and EF4, and provided data on generic quality of life, coping, self-efficacy, and personal characteristics. Based on clinical and sociodemographic variables and questionnaire responses, 53 “known groups” that were expected to differ were formed, for example, females versus males. The EF10 and EF4 were first independently compared within this known group, for example, the EF10 score of females vs the EF10 score of males. When these differences were significant, the RV was calculated for the comparison of the EF10 with the EF4.
A total of 1028 patients (57% lung, 43% colorectal cancer) participated. Forty-five of the 53 known-groups comparisons were significantly different and were used for calculating the RV. In 41 of 45 (91%) comparisons, the RV was more than 1, meaning that EF10 had a higher RV than EF4. The mean RV of EF10 compared with that of EF4 was 1.41, indicating superior statistical power of EF10 to detect differences in EF.
Compared with EF4, EF10 shows superior power, allowing a 20% to 34% smaller sample size without reducing power, when used as a primary outcome measure.
The EORTC CAT Core—The computer adaptive version of the EORTC QLQ-C30 questionnaire
2018, European Journal of Cancer
Citation Excerpt :
The resulting data set formed the basis for the final psychometric evaluations. These included evaluating dimensionality using factor analysis for ordinal variables; calibrating the IRT model (the generalised partial credit model) and evaluating item fit; evaluating differential item functioning (DIF) to explore whether items function similarly across different groups of patients (e.g. men and women, patients from different countries) and evaluating the measurement precision of the CATs based on the resulting item banks using both observed and simulated data [14]. Items not fitting the unidimensional IRT model or exhibiting DIF were candidates for exclusion.
To optimise measurement precision, relevance to patients and flexibility, patient-reported outcome measures (PROMs) should ideally be adapted to the individual patient/study while retaining direct comparability of scores across patients/studies. This is achievable using item banks and computerised adaptive tests (CATs). The European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire Core 30 (QLQ-C30) is one of the most widely used PROMs in cancer research and clinical practice. Here we provide an overview of the research program to develop CAT versions of the QLQ-C30's 14 functional and symptom domains.
The EORTC Quality of Life Group's strategy for developing CAT item banks consists of: literature search to identify potential candidate items; formulation of new items compatible with the QLQ-C30 item style; expert evaluations and patient interviews; field-testing and psychometric analyses, including factor analysis, item response theory calibration and simulation of measurement properties. In addition, software for setting up, running and scoring CAT has been developed.
Across eight rounds of data collections, 9782 patients were recruited from 12 countries for the field-testing. The four phases of development resulted in a total of 260 unique items across the 14 domains. Each item bank consists of 7–34 items. Psychometric evaluations indicated higher measurement precision and increased statistical power of the CAT measures compared to the QLQ-C30 scales. Using CAT, sample size requirements may be reduced by approximately 20–35% on average without loss of power.
The EORTC CAT Core represents a more precise, powerful and flexible measurement system than the QLQ-C30. It is currently being validated in a large independent, international sample of cancer patients.


Funding: The study was funded by grants from the EORTC Quality of Life Group. The work of J.M.G. was funded by a grant from the Austrian Science Fund #502. The work of W.-C.C. was supported by grant National Science Council, Taiwan, No. 95-2314-B-002-266-MY2, 97-2314-B-002-020-MY3.

Conflict of interest statement: There were no financial relationships, personal relationships, academic competition, intellectual commitments, or other conflicts of interest that might have biased the work.
