1 Background
Cutaneous T-cell lymphomas (CTCLs) are a group of rare subtypes of non-Hodgkin lymphomas that primarily involve the skin and account for approximately 2% of all lymphomas. Mycosis fungoides (MF) is a low-grade cutaneous lymphoma encompassing more than half of primary CTCL cases, with an incidence rate of around 5.6 per million persons and a median age at diagnosis of 55–60 years. The choice of treatment depends on the patient’s comorbidities and disease staging [
1]. In MF-CTCL patients with limited/localised skin involvement, the National Comprehensive Cancer Network (NCCN) Guidelines recommend topical mechlorethamine hydrochloride (MCH, or nitrogen mustard) as a primary skin-directed treatment option [
2]. However, there is currently no curative treatment for MF-CTCL, and the main treatment objective is to reach effective palliation with symptom improvement and enhance the patient’s quality of life (QoL) [
1]. Indeed, patients with CTCLs experience several symptoms affecting their daily life, such as skin sensitivity, itching, annoyance about the disease, worry that it could worsen, and impairment in sexual life [
3]. Therefore, the use of patient-reported outcome measures (PROMs) to measure the self-perceived health status and QoL is essential in CTCLs [
3].
The only instrument measuring QoL specifically in the MF or Sézary syndrome (SS) subtypes of CTCLs (MF/SS-CTCLs) is the MF/SS-CTCL QoL, for which a total score is calculated by summing the patient’s scores on the 12 MF/SS-CTCL QoL items [
4]. Other PROMs, either skin-specific, pruritus-specific, or cancer-specific, are also suitable to address CTCL symptomatology [
4]. Among skin-specific questionnaires, the Dermatology Life Quality Index (DLQI) is a simple 10-item questionnaire for routine clinical use in dermatology [
5]. The more recent Skindex is an instrument that studies the effects of a wide variety of skin diseases on patients’ QoL. The original 29-item version (Skindex-29) inquires how often (never, rarely, sometimes, often, all the time) during the previous 4 weeks the patient experienced the effect described in each item. Seven items address the ‘symptoms domain’, 10 items address the ‘emotional domain’, and 12 items address the ‘functioning domain’. All responses are transformed to a linear scale ranging from 0 (no effect) to 100 (effect experienced all the time) [
6]. Skindex-29 showed high correlation with MF/SS-CTCL QoL [
4]. A shorter 16-item version (Skindex-16) was developed to measure bother rather than frequency of symptoms, and to reduce respondent burden [
6]. Among pruritus-specific questionnaires, the Visual Analogue Scale (VAS) is considered a valuable technique for assessing pruritus [
7], in addition to the 22-item ItchyQoL [
8] and the 5-D itch scale [
9], which both measure QoL in patients with chronic pruritus. Lastly, the European Organisation for Research and Treatment of Cancer (EORTC) questionnaires (https://qol.eortc.org/) and the Functional Assessment of Cancer Therapy-General (FACT-G; https://www.facit.org/FACITOrg) can be applied to patients with CTCLs to investigate cancer-specific issues.
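The Skindex-29 scoring described above can be illustrated with a minimal sketch. It assumes that the five response options are equally spaced on the 0 to 100 scale and that each domain score is the mean of its transformed items. Only the domain sizes (7, 10, and 12 items) come from the text; the assignment of item positions to domains and the function names are hypothetical.

```python
from statistics import mean

# Response options in the order given in the text.
RESPONSES = ["never", "rarely", "sometimes", "often", "all the time"]

# Hypothetical item-to-domain assignment: only the domain sizes (7, 10, 12)
# are taken from the text; the real questionnaire orders its items differently.
DOMAINS = {
    "symptoms": range(0, 7),
    "emotional": range(7, 17),
    "functioning": range(17, 29),
}


def to_linear(response: str) -> float:
    """Map a verbal response to the 0-100 scale (never -> 0, all the time -> 100)."""
    return RESPONSES.index(response) * 25.0


def skindex29_scores(answers: list[str]) -> dict[str, float]:
    """Return one score per domain, assumed here to be the mean of its items."""
    if len(answers) != 29:
        raise ValueError("Skindex-29 expects exactly 29 answers")
    linear = [to_linear(a) for a in answers]
    return {name: mean(linear[i] for i in items) for name, items in DOMAINS.items()}


# Illustrative respondent (invented data).
example = ["never"] * 7 + ["sometimes"] * 10 + ["all the time"] * 12
scores = skindex29_scores(example)
```

Under these assumptions, a higher domain score indicates a more frequent effect of the skin disease on that domain.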
However, none of these PROMs comes with a preference-based algorithm that converts responses into health state utility values (HSUVs) for quality-adjusted life-year (QALY) calculations. In several jurisdictions, cost-effectiveness analysis is the technique most commonly used to inform drug coverage and reimbursement decisions, and it generally expresses results as incremental cost per QALY gained. Therefore, the failure to collect preference-based PROMs in a clinical study can be a problem. In the UK, the National Institute for Health and Care Excellence (NICE) recommends that QALYs are used as a measure of outcome for economic evaluation, and that the EuroQol-5 Dimension (EQ-5D) is the preferred measure of health-related utility to calculate QALYs. However, the institution recognises that EQ-5D data may not always be available to manufacturers producing submissions and reports, and thus ‘mapping’ can be used to predict them from other measures of health. Mapping is defined as the development and use of an algorithm (or algorithms) to predict HSUVs through regression analyses using data from any indicator or measure of health [10, 11].
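As a minimal sketch of what mapping involves, the example below fits a simple linear regression of utility on a source measure in an estimation sample and then applies the fitted equation to a target sample that collected only the source measure. All data are invented for illustration, and the single-predictor ordinary least squares model is an assumption; published mapping algorithms often use more covariates and other model forms.

```python
from statistics import mean


def fit_ols(x: list[float], y: list[float]) -> tuple[float, float]:
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def predict(intercept: float, slope: float, x: list[float]) -> list[float]:
    """Apply the fitted mapping equation to new source-measure scores."""
    return [intercept + slope * xi for xi in x]


# Hypothetical estimation sample: a 0-10 symptom score and an observed utility.
source_est = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
utility_est = [0.95, 0.90, 0.80, 0.75, 0.65, 0.60]

intercept, slope = fit_ols(source_est, utility_est)

# Hypothetical target study that collected only the source measure.
source_target = [1.0, 5.0, 9.0]
mapped = predict(intercept, slope, source_target)
```

Nothing in a linear equation of this kind bounds the predictions, so mapped values can fall outside the range of the target utility scale.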
The PROVe study is a prospective, observational, US-based study conducted in patients diagnosed with MF-CTCL and treated with Valchlor®. Valchlor® gel is a new formulation of MCH (or nitrogen mustard) that has been shown to be well tolerated and effective in a clinical trial [12]. The PROVe study collected information in a ‘real-world’ clinical setting on the management and outcomes of MF-CTCL patients treated with Valchlor®. In detail, 301 adult patients (≥18 years of age) actively using Valchlor® were enrolled at 41 US sites (March 2015–July 2017) and were monitored for up to 2 years [
13]. Data collected included clinical data, healthcare utilisation, adverse events, and treatment patterns. The primary endpoint was the proportion of patients with ≥ 50% reduction from baseline in the percentage of body surface area affected by disease. QoL was assessed as a secondary endpoint by using Pruritus-VAS (scale 0–10, where 0 indicates no pruritus and ≥ 9 indicates very severe pruritus), Skindex-29, and the newly developed MF/SS-CTCL QoL, none of which is preference-based or yields HSUVs. Thus, the aims of the current study were to derive HSUVs in MF-CTCL by applying any available mapping algorithm based on one of the three PROMs adopted in the PROVe study, and to assess the feasibility of this approach by comparing mapped utilities with the HSUVs estimated in the literature for MF-CTCL patients.
4 Discussion
The use of mapping is becoming popular in estimating HSUVs for cost-effectiveness analyses [
10]. Overall, mapping introduces a degree of uncertainty in the estimated HSUVs and should be considered as a second-best approach compared with the direct collection of preference-based PROMs [
10,
27]. However, generic PROMs yielding HSUVs are considered not sensitive enough to capture relevant changes in symptomatology over a treatment period, and disease-specific PROMs are usually preferred to measure QoL in patients recruited in clinical studies [
10]. Moreover, the administration of multiple questionnaires within the same study may be too burdensome. The use of generic PROMs is particularly unlikely in studies on rare diseases such as MF-CTCL, which has an incidence of 0.59 per 100,000 [
28]. Indeed, in rare diseases, the symptoms experienced by patients are usually more severe and heterogeneous than in common conditions, and EQ-5D has been shown to miss relevant patients’ concerns, such as fatigue, relationship/social life, and comorbidities [
29]. In the absence of the collection of preference-based PROMs, the mapping technique has been increasingly accepted to inform reimbursement decisions of novel drugs and has recently been explored in the literature on rare diseases [
30]. For example, in 2017, NICE recommended the use of carfilzomib in multiple myeloma, which is another rare cancer with an incidence of 6 per 100,000 [
28], based on HSUVs derived from the application of a mapping algorithm [
31] to trial EORTC data.
In this study, we used mapping algorithms to derive HSUVs for a US-based clinical study (PROVe) in MF-CTCL. The HERC database yielded only one study mapping Pruritus-VAS onto EQ-5D-3L [
17]. From this study, we selected two (of three) algorithms to be applied to patient-level data collected in the PROVe study and converted Pruritus-VAS scores into EQ-5D-3L utilities. As expected, higher VAS scores (indicating worse pruritus) resulted in lower HSUVs. In subgroup analyses, we observed significant differences in average mapped utilities by age, race, and cancer stage, and no significant differences by visit number or sex. However, the applied algorithms largely overestimated HSUVs and predicted utilities above 1; Model 2 did so to a larger extent than Model 3, although the former was the algorithm preferred by Park et al. [
17]. In the CC analysis, 51.5% and 57.2% of all mapped utility values were above 1 when applying Model 3 and Model 2, respectively, and 42.8% and 54.1% after MI. The average mapped EQ-5D utilities ranged between 0.950 and 0.999, depending on the algorithm applied and on whether missing values were imputed. Such values are considerably higher than the mean HSUVs reported by previous studies, which ranged from 0.51 to 0.87 depending on the CTCL type (MF or SS) and stage (early or advanced), and likely on the technique adopted to estimate them [22–26]. For example, it has been shown that direct methods such as TTO or standard gamble tend to provide higher HSUVs than preference-based instruments such as the EQ-5D and HUI [
32]. This phenomenon has also been observed in our review, where the only study using TTO provided the highest mean value (0.87) among the studies retrieved, although estimated from a very small sample [
22].
The tendency of available algorithms to predict HSUVs above 1 has been previously reported in the mapping literature, but there is no consensus on how to deal with this issue and some studies have simply used the unadjusted mapped utility data [
27,
33]. The application of algorithms developed in common diseases to their rare variants has shown even more inaccuracies due to the greater severity of the latter [
30]. For example, Arnold et al. [
34] showed that the available algorithms tended to overpredict HSUVs in patients with pleural mesothelioma, who are generally in poorer health than patients with more common neoplasms (e.g., lung cancer) for which the original algorithms were developed.
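The issue of predictions above 1 can be made concrete with a short sketch that counts the share of mapped utilities exceeding 1 and compares the unadjusted mean with the mean after capping values at 1. The linear coefficients below are invented and are not those of any published algorithm, and capping is shown only as one possible adjustment, not as a recommended or consensus method.

```python
from statistics import mean

# Hypothetical linear mapping equation (coefficients invented for illustration).
INTERCEPT = 1.05
SLOPE = -0.02


def map_score(score: float) -> float:
    """Predict a utility from a 0-10 symptom score with the hypothetical equation."""
    return INTERCEPT + SLOPE * score


def share_above_one(utilities: list[float]) -> float:
    """Proportion of mapped utilities that exceed full health (1.0)."""
    return sum(u > 1.0 for u in utilities) / len(utilities)


def cap_at_one(utilities: list[float]) -> list[float]:
    """One possible adjustment: truncate predictions at 1.0."""
    return [min(u, 1.0) for u in utilities]


# Invented symptom scores, skewed towards mild symptoms.
scores = [0.0, 1.0, 2.0, 3.0, 5.0, 8.0]
mapped = [map_score(s) for s in scores]

proportion_over = share_above_one(mapped)
mean_unadjusted = mean(mapped)
mean_capped = mean(cap_at_one(mapped))
```

In a sample dominated by mild symptoms, the capped mean is lower than the unadjusted mean, which shows how the choice of adjustment can shift the HSUVs entering an economic model.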
The results obtained in this study should be interpreted in light of some considerations. First, of the three PROMs collected in the PROVe study, the Pruritus-VAS might be the least suitable to be mapped onto EQ-5D, because the VAS is unidimensional, unlike the other two scales (i.e., Skindex-29 and MF/SS-CTCL QoL).
Second, the Park et al. study used data and the EQ-5D-3L value set from the general population in South Korea, which may not be representative of the US population since HSUVs are likely to be affected by cultural differences among countries. In addition, the general population may have no experience of pruritus and may therefore under- or overestimate the HSUVs of those affected by this chronic symptom [
17], such as MF-CTCL patients. However, the unavailability of a specific mapping algorithm for MF-CTCL patients is not surprising, given the rarity of this condition. Lastly, the study by Park et al. did not follow any specific recommendations (e.g., the MAPS Statement [
35]) for generating the algorithms and we did not perform any quality assessment of the mapping exercise.
Third, the mapping exercise was performed using data from the PROVe study, which mainly recruited patients with early-stage MF-CTCL. Therefore, the mapped utilities from this analysis may not be comparable with those obtained from other types of MF-CTCL patients, such as those diagnosed with advanced-stage disease or who progressed after initial treatment, as included in some of the studies retrieved [24, 26]. Given the slow progression of MF-CTCL, the maximum follow-up period of 2 years in the PROVe study limited the amount of QoL data collected from patients who had progressed.
Fourth, we observed a large proportion (31.3%) of missing Pruritus-VAS data across all visits, which limited the application of the available algorithms to only part of the dataset. In clinical studies, missing data are often MAR, in which case MI is the preferred technique to overcome this issue. If the MAR assumption were violated, this could lead to biased results [18], but since findings from the CC and MI analyses almost overlapped, we were reassured of the robustness of the technique adopted for imputing missing data.
Lastly, since the PROVe study did not collect EQ-5D, we could not compare original and mapped utilities resulting from the same database, or calculate related differences, for example, through mean absolute error (MAE) and root mean squared error (RMSE), as recommended by existing guidelines [
35].
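For reference, the two error measures named above can be computed as in the following sketch, which compares observed utilities with mapped utilities for the same patients. The numbers are invented; as noted, such a comparison was not possible in PROVe because EQ-5D was not collected.

```python
from math import sqrt


def mae(observed: list[float], predicted: list[float]) -> float:
    """Mean absolute error between observed and mapped utilities."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed: list[float], predicted: list[float]) -> float:
    """Root mean squared error; penalises large individual errors more than MAE."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Invented paired values for three patients.
observed_utilities = [0.80, 0.60, 0.70]
mapped_utilities = [0.90, 0.60, 0.50]

mae_value = mae(observed_utilities, mapped_utilities)
rmse_value = rmse(observed_utilities, mapped_utilities)
```

Lower values of both measures indicate closer agreement between mapped and directly observed utilities.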
5 Conclusions
This study derived HSUVs for patients with MF-CTCL enrolled in a clinical study and, as already observed in the literature, especially in rare diseases, showed the poor applicability of mapping algorithms developed in different conditions or populations. Indeed, the algorithms of Park et al. mapping Pruritus-VAS onto EQ-5D yielded HSUVs that were largely overestimated compared with the values reported in previous studies on MF-CTCL. Therefore, the mapped HSUVs cannot be used in future cost-effectiveness analyses of treatments for MF-CTCL.
Overall, we encourage future clinical studies to collect EQ-5D directly from patients to avoid the use of mapping algorithms for deriving HSUVs. However, in conditions where the use of preference-based PROMs is challenging, the application of mapping algorithms can represent a valuable alternative. The development of mapping algorithms using disease-specific PROMs (i.e., MF/SS-CTCL QoL) is required to increase the precision of mapping estimates in CTCLs. Moreover, studies with a longer follow-up period that recruit more patients with advanced-stage disease would allow algorithms to be generated (or tested) on a more representative MF-CTCL patient population. More research is also required to identify the most appropriate techniques to deal with the overestimation of mapped utilities.