1 Introduction
Quality-adjusted life-years (QALYs) are used to concurrently quantify morbidity and mortality within a single parameter [
1]. For this reason, QALYs can facilitate the discussion of risks and benefits during patient counselling regarding treatment options [
2]. To help make funding decisions, policy makers may also combine QALYs with cost estimates to calculate the incremental cost-effectiveness ratio [
3]. QALYs are calculated using “utilities,” or health-related quality-of-life (HRQoL) weights, which are obtained by direct valuation or from generic health status measures [
4].
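As a minimal sketch of the arithmetic described above, the following Python snippet weights time in each health state by its utility to obtain QALYs and then forms an incremental cost-effectiveness ratio. The function names, health profiles, and costs are hypothetical illustrations rather than values from this study, and discounting is deliberately omitted.

```python
def qalys(segments):
    """Sum utility-weighted time; segments is a list of (utility, years)."""
    return sum(utility * years for utility, years in segments)


def icer(cost_new, cost_old, qalys_new, qalys_old):
    """Incremental cost per QALY gained by the new option over the old one."""
    delta_q = qalys_new - qalys_old
    if delta_q == 0:
        raise ValueError("ICER is undefined when QALYs are equal")
    return (cost_new - cost_old) / delta_q


# Hypothetical profiles: (utility weight, years spent in that state)
treatment_a = [(0.5, 2.0), (0.25, 1.0)]  # 1.0 + 0.25 = 1.25 QALYs
treatment_b = [(0.5, 1.0), (0.25, 1.0)]  # 0.5 + 0.25 = 0.75 QALYs

q_a = qalys(treatment_a)
q_b = qalys(treatment_b)

# Hypothetical costs: 10000 extra cost / 0.5 extra QALYs = 20000 per QALY
ratio = icer(30000.0, 20000.0, q_a, q_b)
```

In this illustration, the utility weights are the only inputs that must be elicited from people; the remaining quantities are durations and costs.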
The choice of utility valuation approach is driven by available data. Direct valuation is the classical approach in which individuals rate hypothetical health state descriptions using the time trade-off or standard gamble procedures [
5]. These procedures can be used to measure utilities for very specific and uncommon health states. However, it can be cumbersome to develop valid health state descriptions for particular diseases. Alternatively, techniques have been developed to convert generic health status measures (e.g. EuroQol-5 Dimensions [EQ-5D], Short Form-6 Dimensions [SF-6D], or Health Utilities Index 3) to utilities [
1]. Conversion of generic health state measures is advantageous because custom health state descriptions are not required. However, utilities can only be obtained for health states actually observed in a cohort of patients involved in the generic health survey.
Unfortunately, generic health scores have not been collected for many diseases, meaning direct valuation is necessary for measuring utilities. Best practice in economic evaluation is to recruit a sample of healthy individuals from the general population for utility valuation [
6,
7]. Traditionally, general population utility valuation has been conducted using face-to-face interviews, phone interviews, or postal surveys [
8]. These forms of survey administration are time intensive and costly, so web-based surveys are increasingly being used [
9‐
22]. Typically, these studies are conducted using proprietary software, which limits application to other disease contexts. Furthermore, the psychometric properties of these proprietary software programs have not been assessed [
23].
It is important to determine whether web-based utility valuation has acceptable psychometric properties. If it does, investigators would benefit from being able to build disease-specific modules on a common platform that has already yielded modules with acceptable psychometric properties, rather than building custom software for each new utility valuation study. To meet this need, we developed a new open-source (non-proprietary), web-based, self-directed utility valuation platform, usable on major computer systems (including touch-screen devices), called the Self-directed Online Assessment of Preferences (SOAP) (Appendices 1 and 2 in the Electronic Supplementary Material [ESM]). SOAP was designed with flexibility in mind and can accept new health state descriptions (modules) with minimal programming.
We decided to first create a SOAP module for metastatic epidural spinal cord compression (MESCC), a condition for which HRQoL data are limited. MESCC can be treated with surgery or radiotherapy, but few high-quality studies compare these interventions using generic health status measures for patients. However, surgery and radiotherapy outcomes could be compared using utilities obtained by direct valuation of hypothetical probe health state descriptions. The European Organisation for Research and Treatment of Cancer (EORTC) MESCC working group has developed an HRQoL questionnaire for MESCC [
24]. Items from this questionnaire could be used to generate health state descriptions for a SOAP module.
The objective of this study was to determine whether the SOAP platform can be used to develop a valid, reproducible, and responsive module for MESCC. For this first application of the SOAP platform, we developed a MESCC module based on the work of the EORTC and measured psychometric properties in a general population sample.
4 Discussion
Utility valuation studies are traditionally conducted using face-to-face interviews, phone interviews, or postal surveys. These modes of administration have undergone psychometric validation. Web surveys are increasingly used for utility valuation and usually use custom and proprietary valuation tools that have not been psychometrically validated. It would be beneficial and efficient for investigators to be able to build disease-specific modules on a common platform that has been used to develop modules with acceptable psychometric properties.
We developed a new platform called SOAP (Appendices 1 and 2 in the ESM). For the first application of this platform, we developed a module for MESCC health states. The SOAP platform met published benchmarks for reproducibility (both agreement and reliability) and responsiveness for utility measurement. This study demonstrated that the SOAP platform can be used to develop modules with acceptable psychometric properties.
In total, 81.4% of participants provided valid responses on the first test, and 66.4% of participants provided valid responses on both the test and the retest. These results should be considered in the context of other ex ante valuation studies reported in the literature. We classified a participant’s responses as valid if their utility valuations decreased with increasing dysfunctional attributes in the health state. For example, if a participant valued the fully dysfunctional health state higher than the single dysfunctional health state, their responses were classified as invalid. This definition of validity is termed “logical consistency” and has been used in traditional general population ex ante utility valuation studies of EQ-5D-3L health states.
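The logical consistency rule described above can be sketched in Python as follows. This is an illustrative reading of the rule, not the SOAP implementation: each health state is represented by its set of dysfunctional attributes, one state is treated as logically worse than another when its dysfunctions strictly include the other's, and a pair is flagged only when the worse state receives the strictly higher utility (treating ties as consistent is an assumption). The attribute names and utilities are hypothetical.

```python
from itertools import combinations


def inconsistent_pairs(valuations):
    """Return (better, worse) pairs where the worse state got the higher utility.

    valuations maps a frozenset of dysfunctional attributes to a utility.
    State A is logically better than state B when A's dysfunctions are a
    strict subset of B's.
    """
    flagged = []
    for (s1, u1), (s2, u2) in combinations(valuations.items(), 2):
        if s1 < s2 and u1 < u2:
            flagged.append((s1, s2))
        elif s2 < s1 and u2 < u1:
            flagged.append((s2, s1))
    return flagged


def is_logically_consistent(valuations):
    """A participant is valid when no pair of valuations is flagged."""
    return not inconsistent_pairs(valuations)


# Hypothetical attributes and utilities for two participants
single = frozenset({"pain"})
double = frozenset({"pain", "mobility"})
full = frozenset({"pain", "mobility", "continence"})

consistent = {single: 0.8, double: 0.6, full: 0.3}
inconsistent = {single: 0.8, double: 0.6, full: 0.9}  # full state valued highest
```

States whose dysfunction sets do not nest (for example, one state with only pain and another with only reduced mobility) are not comparable under this rule and are never flagged.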
Logical consistency rates for face-to-face valuations have been reported for the UK and the Netherlands [
41,
42]. In the UK study, 12 pairs of health states per participant could be evaluated for logical consistency. The median rate of logical consistency, per participant, ranged from 83.8 to 91.7%. In the Dutch study, 87.6% of participants provided at least one pair of logically inconsistent valuations. Postal surveys conducted in the USA and New Zealand reported at least one logically inconsistent pairing in 88 and 79% of participants, respectively [
43,
44]. With 81.4% of participants providing a valid response (18.6% providing a logically inconsistent response), the logical consistency rate for the SOAP MESCC module was similar to that of traditional population studies. Logical consistency has also been assessed for other self-administered general population ex ante utility valuation studies of EQ-5D-3L health states over the internet [
19,
45,
46]. Each study reported a logical consistency rate < 70%.
Compared with the SOAP MESCC module, the face-to-face, postal, and web-based EQ-5D-3L utility valuation studies required greater cognitive effort because participants rated more health states (between five and ten) that were also more complex (five attributes and three levels of dysfunction). Furthermore, these studies did not provide error checking, whereas the SOAP MESCC module notified participants of a logical error if they rejected a lottery with a 100% chance of success. Considering these differences, a logical consistency rate of 81.4% on the first test with the SOAP MESCC module is consistent with the literature.
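The kind of error check described above might look like the following Python sketch. It is not taken from the SOAP source code: the function name, arguments, and message wording are hypothetical, and the only rule encoded is the one stated in the text, namely that rejecting a gamble that is certain to succeed is a logical error.

```python
def check_gamble_response(p_success, accepted):
    """Return a warning message for a logically erroneous choice, else None.

    p_success is the stated probability that the gamble yields the top
    anchor; accepted is True when the participant chose the gamble over
    remaining in the probe health state.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    if p_success == 1.0 and not accepted:
        return ("This gamble is certain to succeed, so it cannot be worse "
                "than the health state. Please review your answer.")
    return None


warning = check_gamble_response(1.0, accepted=False)   # logical error flagged
no_warning = check_gamble_response(0.6, accepted=False)  # legitimate refusal
```

Refusing a gamble with any probability of success below 100% is a legitimate preference, so only the certain-success case returns a message.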
Valuing MESCC health states using the classical standard gamble is problematic for two reasons. First, the classical standard gamble uses perfect health as a top anchor, which is an unrealistic outcome for metastatic cancer. Second, the classical standard gamble considers timeless (i.e. perpetual) health states, which is incongruent with the metastatic cancer disease process. To make the standard gamble more realistic, we characterised perfect health as the absence of dysfunctions and restricted all health states (including the top anchor) to a survival period of 5 years. These modifications may affect the interpretation of our results relative to classic utility assessment.
Utilities are typically estimated for specific health states and are used to weight the time in such health states. Consequently, a utility value for a specific state is typically considered “timeless,” that is, utilities are usually assumed not to change with time spent in a health state [
47]. As a reflection of this, the duration of time spent in a probe health state is not specified in the classical standard gamble [
5]. For MESCC health states, we were concerned that the most severe health states would connote poor survival and therefore confound the measurement of HRQoL using the standard gamble with quantity of life. To alleviate this difficulty, we explicitly stated a 5-year duration for each health state, which was the longest survival observed in a randomised controlled trial of treatments for MESCC [
27]. This approach has also been used in other utility valuation studies for cancer health states [
48]. This modification to health state descriptions should not affect results because the standard gamble (and all other utility-elicitation methods) relies on the “utility independence” assumption [
49]. Under this assumption, if a health state has a utility of \(x\), the utility of this health state for 5 years should still be \(x\). Unfortunately, a systematic review concluded that individuals tend not to satisfy the utility independence assumption with no consistent pattern of violation [
50]. We are unaware of any algorithm to convert utilities for a fixed period of time to “timeless” utilities. Consequently, the utilities measured in this study may not be directly comparable to utilities obtained using the classical standard gamble.
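The reasoning behind the utility independence argument can be written out as a short derivation. The multiplicative form below is one common formalisation and is an assumption made here for illustration, as is the notation: \(h\) is the probe health state, \(t\) its duration, \(h^{+}\) the top anchor (no dysfunctions), and \(p^{*}\) the probability at which a participant is indifferent between the probe state and the gamble.

```latex
% Assumed notation (not from the source): h = probe health state, t = duration,
% h^{+} = top anchor (no dysfunctions), p^{*} = indifference probability.
\[
U(h, t) = u(h)\, v(t), \qquad u(h^{+}) = 1, \qquad U(\text{death}) = 0
\]
\[
U(h, t) = p^{*}\, U(h^{+}, t) + (1 - p^{*})\, U(\text{death})
\;\Longrightarrow\;
u(h)\, v(t) = p^{*}\, v(t)
\;\Longrightarrow\;
u(h) = p^{*} \quad \text{for any } t \text{ with } v(t) > 0
\]
```

Under these assumptions the duration term cancels, so a fixed 5-year horizon would return the same utility as any other horizon. When participants violate utility independence, \(U(h, t)\) no longer factorises in this way and the elicited \(p^{*}\) may depend on \(t\).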
A strength of our study is that we built on the work conducted by the EORTC MESCC working group to ensure the attributes in the MESCC module were appropriate and representative of the MESCC disease process. A limitation of our study is that we did not assess criterion validity by comparing utilities obtained with the SOAP MESCC module against a “gold standard” [
32]. This could be done by having patients with MESCC value their own health using the SOAP MESCC module and comparing these utility valuations with those derived from a generic health questionnaire. We did not have the resources to conduct such a study. Furthermore, measures of logical validity, reproducibility, and responsiveness are more relevant than MESCC criterion validity to investigators considering developing modules for new diseases.
To our knowledge, this is the first validated open-source, web-based, self-directed utility valuation module. For the first application of the SOAP platform, we developed a module for MESCC health states. We have demonstrated that the SOAP MESCC module is valid, reproducible, and responsive for obtaining ex ante utilities. Considering the successful psychometric validation of the SOAP MESCC module, other investigators can consider developing modules for other diseases where direct utility valuation is needed.
Compliance with Ethical Standards
All procedures performed in studies involving human participants were in accordance with the ethical standards of The Ottawa Hospital Research Ethics Board and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.