Introduction
Prostate cancer is the most common form of cancer among males in North America and Europe[
1], and its incidence rate has increased rapidly in Asia in the past few years[
2]. In Taiwan, the incidence rate rose from 1.78 per 100,000 persons in 1982 to 24.55 per 100,000 persons in 2008[
3], making prostate cancer a significant public health concern. Although the survival time for patients with prostate cancer has increased due to early detection and improved treatments[
4], prostate cancer-related symptoms and treatment-associated side effects (e.g., urinary, sexual, and bowel dysfunction) have been shown to significantly impact a patient’s health-related quality of life (HRQOL)[
4‐
6]. It is therefore important to collect and use reliable and valid patient-reported HRQOL information to document patients’ responsiveness to specific treatments or interventions and to guide clinical decisions.
Patient-reported outcomes (PROs), including HRQOL, are becoming increasingly important in clinical research and practice, and therefore much effort has been directed toward the development of more objective methods of assessment[
7,
8]. For example, the European Organization for Research and Treatment of Cancer (EORTC) Quality of Life Study Group developed the Core Questionnaire (EORTC QLQ-C30) to measure generic aspects of HRQOL for patients with various types of cancer. In order to provide more detailed information specific to prostate cancer, a 25-item supplementary module, the EORTC QLQ-PR25, was further developed. The EORTC QLQ-PR25 has become a widely used HRQOL questionnaire for prostate cancer patients[
9‐
11]. It includes four domains assessing urinary symptoms, bowel symptoms, treatment-related symptoms, and sexual activity and functioning. The results of the international field validation were published in 2008[
9]. In Taiwan, Chie was authorized by the EORTC to translate the Core Questionnaire (QLQ-C30) and its supplementary modules into Chinese (in traditional Chinese characters) and to validate them, and the study results have been published in recent years[
12‐
16]. However, the psychometric analyses were conducted primarily at the domain level and with a small sample size.
In this study, we used the same Taiwan Chinese Version of the EORTC QLQ-PR25 questionnaire published by Chie et al. in 2010[
16] to address issues relevant to traditional psychometric analysis and small sample sizes. We employed both confirmatory factor analysis and Rasch analysis to thoroughly examine and understand its psychometric properties in a larger patient population. Rasch analysis has become increasingly popular in assessing the quality of existing outcome assessment tools and in developing new ones[
17,
18], as it offers a methodologically rigorous way to evaluate and enhance the measurement properties of the assessment tools[
19,
20]. Specifically, this study aimed to: 1) examine the stability of item calibrations (i.e., item parameter invariance) within each of the four domains across different times of assessment; 2) evaluate whether the item coverage was adequate to reliably assess the person traits along the latent construct; and 3) determine whether the response category thresholds were in the intended sequence (from less to more).
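For reference, the third aim can be written in terms of a polytomous Rasch model. The expression below is a sketch that assumes the Andrich rating scale formulation; this section does not name the specific model used, and the symbols are introduced here only for illustration. For person n and item i, with response categories scored 0 to m, the probability of responding in category x is:

```latex
P(X_{ni}=x) \;=\;
\frac{\exp\left[\sum_{k=1}^{x}\left(\theta_n-\delta_i-\tau_k\right)\right]}
     {\sum_{j=0}^{m}\exp\left[\sum_{k=1}^{j}\left(\theta_n-\delta_i-\tau_k\right)\right]},
\qquad x = 0, 1, \dots, m,
```

where an empty sum (x = 0 or j = 0) is taken as zero, \theta_n is the person measure, \delta_i is the item calibration, and \tau_k is the k-th category threshold. Under this notation, a four-category item (m = 3) has thresholds in the intended sequence when \tau_1 < \tau_2 < \tau_3, and item parameter invariance means that the estimates of \delta_i remain comparable across times of assessment.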
Discussion
The results of our analyses of the Taiwan Chinese Version of the EORTC QLQ-PR25 showed that each of the four domains satisfied the unidimensionality assumption and that the items in each domain had a good fit to the Rasch model. Overall, the item hierarchy was found to be consistent, and item stability (item parameter invariance) was observed in all four domains across the three time periods. The items in the urinary symptoms (US) domain spread satisfactorily along the latent trait continuum (coverage rate, 71.3%). The significant ceiling effect in both the bowel symptoms (BS) and treatment-related symptoms (TS) domains, together with the noticeable floor effect in the sexual activity and functioning (SX) domain, suggested inadequate item coverage at the extremes of the continuum in these three domains. The thresholds for all domains except the BS domain were ordered from less to more, as intended.
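The ceiling and floor effects mentioned above can be illustrated with a short sketch. It assumes that the effect is computed as the percentage of respondents at the lowest or highest attainable domain score. The 0 to 100 score range, the function name, and the example values are hypothetical, and which end corresponds to "no symptoms" depends on the scoring direction used.

```python
from typing import Sequence, Tuple


def floor_ceiling_rates(
    scores: Sequence[float],
    lowest: float = 0.0,
    highest: float = 100.0,
) -> Tuple[float, float]:
    """Return (floor %, ceiling %): the share of respondents whose score
    equals the lowest / highest attainable value (assumed range)."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == lowest) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == highest) / n
    return floor_pct, ceiling_pct


# Hypothetical domain scores for ten respondents (illustrative, not study data).
example_scores = [100, 100, 100, 91.7, 100, 83.3, 100, 100, 0, 100]
floor_pct, ceiling_pct = floor_ceiling_rates(example_scores)
print(f"floor: {floor_pct:.1f}%, ceiling: {ceiling_pct:.1f}%")
```

When most respondents sit at one extreme, as in this made-up sample, the items cannot distinguish among them, which is what inadequate item coverage at that end of the continuum means in practice.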
Our findings of low alpha coefficients in the BS and TS domains of the EORTC QLQ-PR25 are similar to those previously reported. The EORTC official version[
9], Spanish version[
10] and the Taiwan Chinese version[
16] all reported low reliability (< 0.6) and high ceiling effects in the BS and TS domains. van Andel et al. indicated that Cronbach’s alpha coefficient reflects the ratio of the variances of the individual scale items to the variance of the total scale. A restricted range of responses will have a greater impact on the total scale score than on the individual items, resulting in a lower reliability estimate[
9].
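The variance-ratio reading of Cronbach's alpha can be made concrete with a minimal sketch. It uses the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of the total score). The function name and both data sets are hypothetical and are not taken from the study.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table.

    responses: one row per respondent, one column per item.
    """
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    # Sample variance of each item (column) and of the summed scale score.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total score has zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item scale whose items move together across respondents.
consistent = [[1, 1], [2, 2], [3, 3], [1, 2]]
alpha_consistent = cronbach_alpha(consistent)

# Hypothetical three-item scale where almost everyone answers 1 ("Not at all")
# and the rare endorsements never co-occur in the same respondent.
sparse = [[1, 1, 1]] * 4 + [[2, 1, 1], [1, 2, 1], [1, 1, 2], [1, 1, 1]]
alpha_sparse = cronbach_alpha(sparse)

print(round(alpha_consistent, 3), round(alpha_sparse, 3))
```

In the second data set the variance of the total score is small relative to the summed item variances, so the estimate is very low even though the respondents answered almost identically.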
The initial development of the EORTC QLQ-PR25 items nearly 20 years ago was based on the selection of important items by both prostate cancer patients and clinicians. However, recent improvements in treatment (e.g., three-dimensional conformal radiotherapy, intensity-modulated radiotherapy[
34]) and symptom control for prostate cancer may have contributed to the low symptom frequency in the items assessed in the BS and TS domains. For example, over 90% of patients reported not having “Fecal blood”, “Fecal incontinence” or “Breast tenderness”. In this study, the large ceiling effects (87% and 61.5% in the BS and TS domains, respectively) are consistent with the findings previously reported[
9,
10]. These items may no longer be as important or relevant as when they were originally selected and should be considered for revision or even removal. More clinically relevant items, based on clinicians’ recommendations, should be developed and added to measure contemporary patients’ symptoms and concerns. For example, items like “Difficulties with bowel function” may be added to the BS domain and “Bone dysfunction” (osteoporosis, etc.) may be added to the TS domain. Adding and validating new clinically relevant and psychometrically sound items in future studies can potentially eliminate the ceiling or floor effects and item content gaps, and may improve the performance of the items within the same domain.
Sexual dysfunction can occur and impact patients regardless of the treatment modalities they receive[
35,
36], and the time to recover from it is usually much longer than that from other side effects. In general, about 38% to 48% of patients had not recovered from sexual dysfunction one year after receiving treatment[
6]. Although there are six SX items in the EORTC QLQ-PR25, four of them (SX52-SX55) are conditional and only applicable to those who are sexually active, which may lead to less precise measurement in this domain[
30]. Adding items on more commonly experienced sexual functioning problems, such as the impact of “Loss of libido”, may improve the domain’s measurement precision and clinical relevance. Using items from other questionnaires, such as the 15-item version of the International Index of Erectile Function (IIEF-15)[
37] and the Male Sexual Health Questionnaire (MSHQ)[
38], may help physicians to better measure and monitor changes in the sexual functioning aspect of their patients’ HRQOL.
The first two thresholds were very close to each other but far from the third threshold, as shown in the item-person maps of the US and TS domains, which suggests that a binary (two-category) response format may be practical for improving readability and measurement precision[
39]. Furthermore, the out-of-sequence thresholds in the BS domain and the noticeable item coverage gaps in the SX domain suggest that further improvement is still needed. Hsueh et al. reported that, when the polytomous items of an index exhibited disordering of the step difficulties, the middle categories were never the most likely responses of any patient and were thus redundant. In that study, the psychometric properties of the dichotomous items were equivalent to those of the polytomous items. A scale with only dichotomous items is much more convenient and efficient to administer[
40]. Maio and Perrone pointed out that HRQOL assessment in the elderly is complicated by several unresolved methodological problems (higher frequency of illiteracy, worse compliance with the questionnaires, concomitant diseases, use of instruments not validated in the aged population)[
41]. A binary response format (“No” for “Not at all” vs. “Yes” for the combined “A little”, “Quite a bit” and “Very much”) may be a practically feasible way to improve readability and measurement precision for the Taiwan Chinese Version of the EORTC QLQ-PR25, and may be easier to respond to for prostate cancer patients, who are typically older (70% of all prostate cancers are diagnosed in men over the age of 70 in Taiwan)[
3] and less educated (half of the patients had an education level of less than 9 years in this study)[
42].
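The proposed collapsing of response categories can be sketched as a simple recoding step. The sketch assumes the four options are coded 1 = “Not at all”, 2 = “A little”, 3 = “Quite a bit” and 4 = “Very much”; the function name and the example responses are hypothetical.

```python
from typing import List


def dichotomize(response: int) -> int:
    """Collapse a four-category response into a binary one.

    Assumed coding: 1 = "Not at all" -> 0 ("No");
    2, 3, 4 = "A little", "Quite a bit", "Very much" -> 1 ("Yes").
    """
    if response not in (1, 2, 3, 4):
        raise ValueError(f"unexpected response code: {response!r}")
    return 0 if response == 1 else 1


# Hypothetical responses of one patient to five items (not study data).
original: List[int] = [1, 2, 4, 1, 3]
binary = [dichotomize(r) for r in original]
print(binary)  # [0, 1, 1, 0, 1]
```

Any such recoding would need to be re-calibrated and re-validated before use, since collapsing categories changes the information each item carries.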
Besides the thorough psychometric evaluation made possible by the many statistics available from the Rasch analysis, an additional strength of this study is that our data came from a large group of prostate cancer patients with varying levels of severity, receiving different treatment modalities and assessed at various times. As shown in Table
2, the assessments were divided into three groups based on the assessment time period. The data were first analyzed separately by time period to validate the EORTC QLQ-PR25. The stability of item calibrations within each domain was then compared across the different time periods. The data from these three periods were then combined and validated again. The combined data potentially maximize patient diversity and representativeness, supporting the generalizability of the study results in Taiwan. Our results showed that item stability held across the three time periods, satisfying an important measurement property for making meaningful HRQOL score comparisons for prostate cancer patients in a longitudinal study[
43].
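One common way to compare item calibrations between two time periods is a standardized difference between the two estimates. The sketch below illustrates that general technique and is not necessarily the procedure used in this study; the function name, the example calibrations, and the standard errors are all hypothetical.

```python
import math
from typing import Tuple


def calibration_shift(
    d1: float, se1: float, d2: float, se2: float
) -> Tuple[float, float]:
    """Compare one item's calibration (in logits) across two time periods.

    Returns (difference, z), where z is the difference divided by the
    joint standard error sqrt(se1^2 + se2^2).
    """
    joint_se = math.sqrt(se1 ** 2 + se2 ** 2)
    if joint_se == 0:
        raise ValueError("standard errors must not both be zero")
    diff = d1 - d2
    return diff, diff / joint_se


# Hypothetical calibrations of one item at two assessment periods.
diff, z = calibration_shift(d1=0.5, se1=0.3, d2=0.1, se2=0.4)
print(f"difference = {diff:.2f} logits, z = {z:.2f}")
```

A small difference relative to its joint standard error, as in this made-up example, would be read as consistent with a stable item calibration; the cut-off used to flag instability is an analyst's choice and is not specified here.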
Some limitations of this study should also be noted. First, this study was limited in scope to the assessment of stability across different time periods; larger-scale studies with adequate sample sizes are therefore still needed to examine the stability of item calibrations across different age groups, cancer stages, and treatment groups. Second, only outpatients in Central Taiwan were sampled, which might limit the generalizability of the results to all Chinese-speaking prostate cancer patients in Taiwan or other regions. Patients attending outpatient clinics are normally expected to have milder symptoms than those in inpatient settings, which may have produced a higher ceiling effect in our study. Third, this study did not include stratified analyses by type of treatment or disease stage. However, since patients with prostate cancer often receive multiple treatment modalities and have a long disease duration, our study cohort appears to be fairly representative, and the results from this pooled sample can therefore be of practical clinical value.
Competing interests
All authors declare that they have no competing interests.
Authors’ contributions
CHC and WML designed the study, wrote the protocol and revised the manuscript. HCW was the coordinator of this research and conducted the field work. YJC performed the statistical analyses and drafted the manuscript. HCL, JYW and TCL designed the study, wrote the protocol and managed the field work. YCY was responsible for data collection and interpretation. All authors contributed to and have approved the final manuscript.