Introduction
In recent years, there has been an increase in the use of economic evaluations for the appraisal and strengthening of healthcare programs at national and multi-national levels [
1]. Economic evaluations commonly use the quality-adjusted life-year (QALY) as a measure of benefit to value health outcomes. Estimating QALYs requires the application of preference weights (utilities) for health states, which can differ between populations, countries or regions. These preferences are measured in preference elicitation studies and subsequently captured in a so-called ‘value set’. Currently, the most widely used instrument for generating preference weights for these value sets is the EQ-5D [
2,
3]. Ideally, each country has its own set of national values for the EQ-5D to ensure that resource allocation decisions based on economic evaluations reflect the health state preferences of its own population. Value sets based on the time trade-off (TTO) valuation technique for the 3-level version of the EQ-5D (EQ-5D-3L) now exist for twelve European countries [
4‐
13]. For countries that do not yet have a national value set, it is common practice to use another country’s value set as a proxy. Currently, no definitive criteria exist for choosing such a proxy value set.
An alternative to using the proxy values of neighbouring or otherwise culturally related countries might be the use of a pooled value set. Such a pooled value set would mitigate, to some extent, the variance due to methodological differences, but it would also eliminate the possibility of accounting for differences in cultural values. After an extensive study using several available international EQ-5D value sets, Roudijk et al. concluded that although differences between national value sets exist, these differences could not be explained by national cultural values [
14]. The main difficulty the authors encountered was that the possible influence of national cultural values is confounded with possible methodological variation between the national studies. In addition, that study included both 3L and 5L studies in its analyses, which made it hard to distinguish whether variation between value sets was driven by national cultural values or by specific methodological differences at the level of national investigations.
Within the European context, a pooled value set may also inform health care policy and decision making at the European level, for example in determining the value of a vaccine in a multi-Member State procurement setting through health technology assessment (HTA) methods, or for use in multi-national trials. Using one pooled European value set could be seen as a simple way to standardise health care evaluation with respect to health-related quality of life (HRQoL). Indeed, organizations such as EUnetHTA and Beneluxa are promoting networks between European countries in order to facilitate reliable, timely, transparent, and transferable information to contribute to HTA [
1,
15]. A pooled European value set can support these efforts.
The aim of the present study is to derive a pooled value set from the published coefficients of TTO valuation studies of the EQ-5D-3L within Europe. We will refer to this pooled value set as a ‘pan-European’ value set. Our reasons for using published coefficients to create such a value set are two-fold: methodological and practical. Methodologically, when the raw national data are used and internationally uniform inclusion and exclusion criteria are applied, the newly selected data are no longer the same as those used in the original national valuation studies. This means that decisions to exclude or include values at the national level are ignored, although they might be based on relevant local knowledge and preferences of the researchers involved. Moreover, model specifications that are considered relevant at the national level are ignored. An alternative is to generate data from the published coefficients of the national studies, which are based on locally informed inclusion and exclusion criteria and reflect local considerations about the choice of model. Practically, the developed methodology allows for an easy update of the pan-European value set when new national valuation studies become available. Even though new valuation studies focus on generating health state preferences for the newly developed five-level version of the EQ-5D instrument, the 3-level version is still in use, and EuroQol does not recommend one over the other. Therefore, we start by developing the pan-European value set for the EQ-5D-3L; the resulting methodology can also accommodate the estimation of a pooled value set for the EQ-5D-5L [
16].
Discussion
This study compared methodological, procedural and analytical characteristics of the twelve EQ-5D-3L TTO valuation studies. Differences existed in sample size, the number of health states valued and exclusion criteria. All except the Hungarian and Romanian valuation studies were based on the MVH protocol. All studies used the additive 10-parameter model, which includes terms for levels 2 and 3 of each dimension, except for the Slovenian study, which used a constrained 6-parameter model that assumed the relative severity of level 2, “moderate problems”, to be similar across dimensions. This approach was used for the Slovenian value set because of concerns about the relatively small sample size and the limited number of valued health states. Furthermore, in the Polish, Dutch and Italian studies, the translations chosen to describe the levels of severity in health states may give rise to differences from other value sets. For instance, in the Polish value set, mobility level 3, “confined to bed”, implied being bedridden; the Polish values may therefore be lower for health states that include level 3 of mobility. However, these differences in valuation techniques and methodologies did not prevent us from pooling the utility values that each country uses for its HTA. Based on the published coefficients, we were able to simulate a dataset on which we could estimate the ‘pan-European’ value set. The resulting coefficients can be applied when national values are absent. The pan-European value set would also be an optimal choice when decisions need to consider a European perspective, for instance, for reimbursement decisions at the European level. This contributes towards cross-country harmonization of outcome measures for economic evaluations [
31].
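The pooling step described above can be illustrated with a minimal sketch. It assumes two made-up countries, "A" and "B", whose coefficient values are hypothetical and not taken from any published tariff; the names `COUNTRY_COEFS`, `design_row`, `solve` and `pooled_ols` are likewise illustrative. Each country's additive 10-parameter model is used to simulate disutilities (1 minus the value) for all 242 health states other than 11111, and an unweighted OLS is then fitted to the stacked data.

```python
from itertools import product

DIMS = ("MO", "SC", "UA", "PD", "AD")
TERMS = ["const"] + [f"{d}{lvl}" for d in DIMS for lvl in (2, 3)]

# Hypothetical disutility coefficients for two made-up countries (NOT published tariffs).
COUNTRY_COEFS = {
    "A": dict(zip(TERMS, [0.08, 0.07, 0.31, 0.10, 0.21, 0.04, 0.09, 0.12, 0.39, 0.07, 0.24])),
    "B": dict(zip(TERMS, [0.04, 0.05, 0.45, 0.06, 0.15, 0.02, 0.13, 0.08, 0.27, 0.05, 0.18])),
}

def design_row(state):
    """Intercept plus level-2 and level-3 dummies for a state such as (1, 2, 3, 1, 1)."""
    row = [1.0]
    for level in state:
        row += [float(level == 2), float(level == 3)]
    return row

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def pooled_ols(country_coefs):
    """Simulate disutilities per country from its coefficients, stack them, fit OLS."""
    states = [s for s in product((1, 2, 3), repeat=5) if s != (1, 1, 1, 1, 1)]
    rows, y = [], []
    for coefs in country_coefs.values():
        beta = [coefs[t] for t in TERMS]
        for s in states:
            x = design_row(s)
            rows.append(x)
            y.append(sum(b * xi for b, xi in zip(beta, x)))  # simulated disutility
    k = len(TERMS)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return dict(zip(TERMS, solve(xtx, xty)))  # normal equations X'X b = X'y

def value(state, coefs):
    """Predicted value of a health state; full health is fixed at 1."""
    if state == (1, 1, 1, 1, 1):
        return 1.0
    return 1.0 - sum(coefs[t] * x for t, x in zip(TERMS, design_row(state)))

pooled = pooled_ols(COUNTRY_COEFS)
print({t: round(v, 4) for t, v in pooled.items()})
```

Under these simplifying assumptions (identical saturated designs and purely additive national models), the pooled coefficients coincide with the simple average of the national coefficients. National models with extra terms or constraints would not reduce to such an average, which is one reason to simulate the data rather than average coefficients directly.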
As this study aims to provide a means for standardizing multi-country evaluations by combining valuation tariffs from different countries in a particular region (e.g. Europe), one obvious factor to consider is the varying population size of different European countries included in this analysis. In order to account for differences in population size, we applied population size weights, adjusted for clustering at the country level. We found that including these weights for population size complicated the modelling. This may be related to the coefficients of the German value set, which is known to have the highest values, and is weighted with the highest population size [
9,
32]. This weighting therefore introduces considerable variance, while it is unclear whether these high values truly represent higher values of the German population or are an artefact of the sampling technique employed in the study. Indeed, when values were elicited for the new EQ-5D-5L a decade later, the German values converged with those of other European value sets, which suggests that the first attempt with the EQ-5D-3L had methodological issues [
33].
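The effect of population size weights discussed above can be sketched with a toy example. The populations and the mobility level 3 decrements below are hypothetical, and the names `POPULATION`, `MO3` and `weighted_mean` are illustrative only. The sketch relies on the assumption that every country contributes the same set of simulated health states; in that case, weighted least squares with country-level weights yields the population-weighted mean of the national coefficients.

```python
# Hypothetical populations (millions) and mobility level 3 decrements; not published figures.
POPULATION = {"A": 80.0, "B": 10.0, "C": 5.0}
MO3 = {"A": 0.10, "B": 0.30, "C": 0.35}

def weighted_mean(values, weights):
    """Mean of country-level coefficients, weighted by population size."""
    total = sum(weights[c] for c in values)
    return sum(values[c] * weights[c] for c in values) / total

unweighted = sum(MO3.values()) / len(MO3)
weighted = weighted_mean(MO3, POPULATION)

# Same result from a one-regressor weighted least squares fit (x = 1 for every row):
# beta = sum(w * x * y) / sum(w * x * x)
wls = sum(POPULATION[c] * 1.0 * MO3[c] for c in MO3) / sum(POPULATION[c] * 1.0 for c in MO3)

print(f"unweighted: {unweighted:.4f}, population-weighted: {weighted:.4f}")
```

In this toy case, the largest country also has the smallest decrement (the highest values), so weighting pulls the pooled coefficient strongly towards that single country. This mirrors the concern raised above about one heavily weighted value set dominating the result.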
Given the reasoning above, the application of population size weights in our study should be considered as an illustrative example. When it comes to weighting value sets of different countries, other factors such as socio-demographic, societal, religious, economic and linguistic factors can be included as weights as they may further explain inter-country differences [
34]. The need for a flexible modelling technique that can easily incorporate such weights helped guide our choice of the OLS model to predict the pan-European value set for the EQ-5D-3L. Nevertheless, given the small changes in the coefficients found in this study, it needs to be investigated whether incorporating weights for background variables increases validity or rather complicates interpretation. Therefore, the application of weights in the analyses, and their interpretation, should be treated with caution.
We used the EQ-5D-3L as an illustrative example in the exercise of estimating a pan-European value set because of its widespread application in Europe. The same methodology can be applied to the new 5-level version of the EQ-5D, or to any other utility questionnaire that uses regression techniques to estimate a value set.
Various previous studies have compared different EQ-5D valuations in an attempt to unify EQ-5D data and generate preference weights for regional general populations. Greiner et al. were among the first to derive European weights, using EQ-5D visual analogue scale (VAS) data from 11 European countries [
35]. Time trade-off data are preferred over VAS data because this valuation method asks respondents to make a trade-off between length of life and HRQoL, much in the same way as a QALY can be interpreted. Olsen et al. compared time trade-off valuations in four Western countries and three non-Western countries. They concluded that there is less variance among the value sets of the four European countries than between Western and non-Western value sets [
36]. Another study compared three EQ-5D valuations in Central and Eastern European countries and further estimated a population norm for this region [
37]. These studies thus suggest that a pooled value set depicting averaged European health state values may indeed be a feasible and sensible way forward in health economics research.
Some strengths and limitations merit consideration. Despite the differences among the included valuation studies, we present a flexible approach using published coefficients, which can accommodate more value sets as soon as they become available. This pragmatic approach suggests that coefficients from existing published valuation studies could be combined to generate health state preferences for any specific region, be it Europe, any other geographical area or a subset of countries.
One can argue that a starting point for deriving a pooled value set should be the raw data of each country’s national valuation study [
14]. However, the major disadvantage of this approach is that data collection would depend on the willingness of authors and institutes to share their data. Moreover, data sharing could be limited by constraints imposed by the informed consent, as the data would be used for purposes other than those described in the informed consent and transferred to parties other than the original research team, which may infringe on privacy.
In this study we applied and compared OLS regression, gamma regression, and FMM to find the best fit for the pooled saturated data. We present the pan-European value set using OLS, which was the most pragmatic choice according to goodness of fit, prediction error and model convergence. Even though the FMM performed slightly better than the OLS model based on the penalized likelihood criterion (AIC), it did not achieve convergence after the application of population weights. Therefore, further research into advanced analytic techniques is needed to test various FMM specifications, which is beyond the scope of the current paper. Future research could also test different hypotheses, for instance regarding the probability of belonging to a particular group (class).
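Two of the comparison criteria mentioned above, AIC and prediction error, can be sketched for a Gaussian (OLS-type) model as follows. The observed and predicted values are hypothetical, the function names are illustrative, and the AIC uses the standard Gaussian log-likelihood with the maximum likelihood variance estimate. Statistical packages may apply different conventions, and the gamma and FMM likelihoods are not covered here.

```python
import math

def gaussian_aic(observed, predicted, n_coefs):
    """AIC = 2k - 2 ln L for a Gaussian model; k counts coefficients plus the variance."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * (n_coefs + 1) - 2 * log_lik

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical health state values and predictions from two candidate models.
observed = [0.9, 0.5, 0.1, -0.2]
model_a = [0.85, 0.55, 0.05, -0.1]
model_b = [0.7, 0.7, 0.3, -0.5]

for name, pred in (("A", model_a), ("B", model_b)):
    aic = gaussian_aic(observed, pred, n_coefs=2)
    mae = mean_absolute_error(observed, pred)
    print(f"model {name}: AIC = {aic:.2f}, MAE = {mae:.4f}")
```

Lower values are better on both criteria. As noted above, a model that scores slightly better on AIC may still be set aside if it fails to converge.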
We included a sensitivity analysis in which the interaction terms identified in the existing valuation studies were added to the OLS model. Various interaction terms, such as N3, I2, I3², and D1, are used in some of the older EQ-5D-3L valuation studies. However, such interaction terms are not recommended for inclusion in models in recent valuation studies, as they could increase misprediction errors [
38]. Furthermore, the use of the D1 interaction term has been heavily criticized as it may complicate the model [
39].
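For readers unfamiliar with these terms, the sketch below shows how they are commonly constructed from a five-digit EQ-5D-3L health state. The definitions are assumptions based on how the terms are usually described in the valuation literature, not definitions taken from this paper: N3 indicates any dimension at level 3, D1 counts dimensions with any problems beyond the first, and I2 and I3 count dimensions at level 2 and level 3 beyond the first. The function name `interaction_terms` is illustrative.

```python
def interaction_terms(state):
    """Derive commonly used interaction terms from a state string such as '21231'.

    Definitions are assumptions (see lead-in), not taken from the paper itself.
    """
    levels = [int(ch) for ch in state]
    n2 = levels.count(2)  # dimensions at level 2
    n3 = levels.count(3)  # dimensions at level 3
    i2 = max(n2 - 1, 0)
    i3 = max(n3 - 1, 0)
    return {
        "N3": int(n3 > 0),           # any dimension at level 3
        "D1": max(n2 + n3 - 1, 0),   # dimensions with problems beyond the first
        "I2": i2,                    # level-2 dimensions beyond the first
        "I2sq": i2 ** 2,
        "I3": i3,                    # level-3 dimensions beyond the first
        "I3sq": i3 ** 2,
    }

for s in ("11111", "21231", "33333"):
    print(s, interaction_terms(s))
```

Because these terms are functions of the same ten level dummies, adding them makes each coefficient harder to interpret in isolation, which is consistent with the criticism cited above.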
The UK is no longer a part of the European Union (EU). This also entails that the UK is no longer subject to the regulations regarding therapeutic products, interventions, and evaluations of their effectiveness within the European Economic Area. Therefore, taking Brexit into consideration, we re-ran the OLS model as a scenario analysis excluding the UK value set. The resulting pan-EU value set can be used for economic evaluations of drugs within the EU context (see Additional file: Table
7).
A limitation of this study is that the samples included in each valuation set were not entirely representative of the general population of the corresponding country [
35]. Since some of the value sets are quite old, it is also questionable whether they are still representative of the values of the general population, as the population structure in the respective countries has changed over the years. Furthermore, societal differences such as educational status, culture, norms and wealth, as well as methodological differences such as elicitation methods, modelling and quality of data, may have influenced the health state valuations at the individual country level. We identified that each valuation study had its unique characteristics, its own methodological framework and its own reasons for inclusions and exclusions. We also recognize that the quality of some value sets may be questionable. For instance, there are inconsistencies within the Portuguese value set, where the value of health state 33331 (-0.536) is lower than the value of 33333 (-0.496). One approach to account for such differences would be to derive a quality score and adjust the analyses for it. However, we argue against this approach because the identified differences between studies might more often reflect a thorough understanding of the respective country’s preferences than differences in quality.
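The kind of inconsistency noted for the Portuguese value set can be checked mechanically. The sketch below flags every pair of states in which one state is at least as good on all five dimensions, and better on at least one, yet has a lower value. Only the two values quoted above are taken from the text; the function names are illustrative, and a full check would require the complete value set.

```python
def dominates(better, worse):
    """True if 'better' is no worse on every dimension and differs from 'worse'."""
    return better != worse and all(b <= w for b, w in zip(better, worse))

def find_inconsistencies(values):
    """Return (better_state, worse_state) pairs where the better state has a lower value."""
    return [
        (a, b)
        for a in values
        for b in values
        if dominates(a, b) and values[a] < values[b]
    ]

# The two values quoted in the text, plus full health fixed at 1.
values = {"11111": 1.0, "33331": -0.536, "33333": -0.496}

print(find_inconsistencies(values))
```

For the values quoted above, this flags the pair (33331, 33333), since 33331 is better on the fifth dimension but carries the lower value.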