Based on the opinions of the distinct stakeholders’ groups involved in the Web-Delphi process, this study first explored the relevance of 34 aspects for the evaluation of each of two types of medical devices, for the panel as a whole and per group of stakeholders; second, analysed the level of agreement within stakeholders’ groups; and, finally, identified which aspects were assigned a common relevance level across the two types of medical devices. This work was developed to form a basis for discussing HTA processes and the construction of value models to evaluate medical devices. The findings are discussed below in terms of stakeholders’ views on the aspects to consider in medical devices evaluation, implications for policy, and limitations of the study.
Stakeholders’ views on the aspects to consider in medical devices evaluation
The Delphi panel was composed of distinct stakeholders’ groups. The distribution of answers, for the panel and per group, shows one of four situations: (1) there is a panel majority opinion and all groups present a majority on the same relevance level, (2) there is a panel majority opinion but not all groups present a majority on the same relevance level, (3) there is no panel majority opinion but some groups present a majority on a relevance level, and (4) there is neither a panel majority opinion nor a majority in any group. Considering the meaning of the relevance levels (presented in
Overview of the Web‑Delphi process, in the
Methods section), these results suggest that participants consider that some aspects must always be part of the basis of evaluation: this is the case for aspects rated ‘Fundamental’ or ‘Critical’ by the panel, for instance, ‘User-friendliness for the healthcare professional’ for ‘implantable’ devices and ‘Time between procedure and results’ for ‘in vitro’ devices. Furthermore, participants consider that some of these aspects, namely those rated ‘Critical’, can even preclude the evaluation if there are no data for assessing them, for instance, ‘Specific features of the medical device’ for ‘implantable’ devices and ‘Sensitivity and Specificity’ for ‘in vitro’ devices. Additionally, participants’ answers suggest that there are ‘Complementary’ aspects, i.e., aspects that can add some value but will not always be part of the basis of evaluation, for instance, ‘Environmental impact of the production and use of the medical device’ for ‘implantable’ devices and ‘Learning curve of the healthcare professional’ for ‘in vitro’ devices. This can be seen in Table
3, which presents the panel majority opinion and systematizes the group majorities aligned with the panel. In general, stakeholders’ groups did not present obviously contradictory opinions, as even for the aspects with no panel majority opinion most groups’ answers clustered around the same two consecutive relevance levels (as presented in Table
2). The Kruskal–Wallis test followed by Dunn’s-Bonferroni post hoc method identified only eight aspects (six for ‘implantable’ and two for ‘in vitro’ devices) with statistically significant differences across groups, and in each case the difference was observed between only one pair of groups. In addition, the inter-rater reliability calculated with Gwet’s AC2 agreement coefficient showed moderate to substantial agreement within each group. Together, these results suggest alignment across the panel and within groups. Despite the general alignment of the groups, the reasons underlying the observed differences may be of interest for further research [
46].
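The four situations described above can be illustrated with a minimal sketch. It assumes that a "majority" means more than 50% of the answers on a single relevance level (this threshold, the function names, and the group names and answers in the usage note are hypothetical and are not taken from the study's data or analysis code).

```python
from collections import Counter


def majority_level(answers, threshold=0.5):
    """Return the level chosen by more than `threshold` of the answers, else None.

    Assumption: a majority is a share strictly above `threshold` (default 50%).
    """
    if not answers:
        return None
    level, count = Counter(answers).most_common(1)[0]
    return level if count / len(answers) > threshold else None


def classify_aspect(answers_by_group, threshold=0.5):
    """Map one aspect's answers (dict: group name -> list of levels) to situations 1-4."""
    # The panel is the pooled set of answers from every group.
    panel = [a for answers in answers_by_group.values() for a in answers]
    panel_major = majority_level(panel, threshold)
    group_majors = [majority_level(a, threshold) for a in answers_by_group.values()]

    if panel_major is not None:
        # Situation 1: every group has a majority on the panel's level.
        if all(g == panel_major for g in group_majors):
            return 1
        # Situation 2: panel majority, but some group differs or has no majority.
        return 2
    # Situation 3: no panel majority, but at least one group has a majority.
    if any(g is not None for g in group_majors):
        return 3
    # Situation 4: no majority in the panel or in any group.
    return 4
```

For example, with two hypothetical groups, `classify_aspect({"Group A": ["Critical", "Critical"], "Group B": ["Fundamental", "Fundamental"]})` falls into situation 3: each group has its own majority, but the pooled panel is split evenly and so has none.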
This Web-Delphi process collected opinions not only from different stakeholders’ groups but also on two types of devices, one therapeutic and one diagnostic. Comparing the opinions across both types of devices, the panel attributed a common relevance level to 16 aspects: five ‘Critical’ (two agreed by all stakeholders’ groups) and 11 ‘Fundamental’ (one agreed by all groups) (see Table
3). Examples are ‘Clinical efficacy and/or effectiveness’, considered ‘Critical’ (with a majority in all groups), and ‘Comfort for the patient’, considered ‘Fundamental’ (with a panel majority opinion but not a majority in all groups). The former aspect is aligned with the economic evaluation literature [
6], which centres on the effectiveness and costs of technologies, whereas the latter is not explicitly considered in such methods. Moreover, many other aspects were recognised as relevant by the participants of our study, suggesting the need to formally consider a larger number of aspects in the evaluation of medical devices. This need has been recognised in the literature [
3,
47], by authors advocating for the use of MCDA in HTA [
2], such as the ISPOR (The Professional Society for Health Economics and Outcomes Research) Medical Devices and Diagnostics Special Interest Group [
48], by authors developing value framework models for evaluating medical devices, such as in the HTA Core Model from EUnetHTA (European network for Health Technology Assessment) [
49], and also by the review on value assessment frameworks of Zhang et al. [
22], which covered 19 studies addressing health technologies in general and 38 addressing specific types of health technologies (mainly drugs). Four of the frameworks reported in that review targeted diagnostic or genetic tests and one targeted nondrug health technologies; the number of evaluation aspects they included varied between three and 16, covering different device features, namely medical benefit, adverse effects, the patient’s quality of life and satisfaction, and costs. Our study, besides validating this need with a large and diverse panel of stakeholders, adds value aspects not identified in the existing frameworks literature, e.g., environmental impact and aspects related to the use of devices by the healthcare professional, such as user-friendliness, the learning curve, training and workload.
The list of aspects included in our study is purposefully inclusive, which has the advantage of being as complete as possible but the disadvantage of potential overlap between some aspects. To evolve towards the construction of a multidimensional framework or of multicriteria models, the identified aspects would require further work and restructuring, possibly combining and clarifying aspects and exploring how to measure them in practice [
32] (e.g., understanding what participants have in mind when considering the sensitivity and specificity of an implantable medical device as relevant). Nonetheless, our work provides important insights to inform the development of such a framework. In 44 frameworks reviewed by Zhang et al. [
22], value aspects were identified through literature review, engagement of stakeholders, or a combination of both, but only four frameworks involved patients or citizens in the identification of aspects. Our study explored a way to collect the wide range and diversity of stakeholders’ perspectives, including those of patients and citizens, contributing to the discussion on how to bring these insights into HTA in order to standardise, guide and add transparency to the evaluation of medical devices [
19,
28,
50] and how to include stakeholders’ views to inform HTA and adoption decisions [
23,
26].
Methodologically, the Web-Delphi made it possible, first, to involve a large and heterogeneous group of HTA stakeholders, enabling them to interact, learn from each other by sharing their views, and build agreement about the relevance of most aspects; second, to draw conclusions about differences in opinion between stakeholders and across types of devices; and third, to show for which aspects there is a panel majority opinion. All of this provides input for additional research on how to develop multidimensional evaluation models and frameworks, and assists in planning future directions of research.
Implications for policy
This study shows that it is possible to gather the views of distinct stakeholders’ groups in a structured format, producing results that can be more widely used within HTA processes, which several authors deem relevant [
24‐
27,
51]. All aspects were considered relevant to some extent, and some aspects were assigned the same relevance level irrespective of the type of medical device under analysis. Accordingly, approaches to assessing the value of medical devices need to consider a broader range of aspects and the specificities of distinct types of devices. Despite the heterogeneity of this type of technology [
12], it seems possible to attain some systematization and common standards, as called for in the literature [
4,
9,
20,
52]. Nevertheless, one should recall that the evaluation may be affected by the context [
53], and that ‘Implantable medical devices’ and ‘In vitro tests based on biomarkers’ still comprise diverse devices, which needs to be considered when interpreting results.
Limitations of the study
Several limitations of this study should be acknowledged. Firstly, this study took place in Portugal with only national participants, and thus results can be context- and/or country-dependent. Nevertheless, the list of aspects was based on international peer-reviewed literature, which provides useful information to inform the discussion on HTA for medical devices beyond the considered country and context. Secondly, as the Delphi process is highly dependent on the availability and commitment of participants [
54], participation was not balanced across stakeholder groups, with the Buyers and policymakers group and the Industry group having lower representation. This imbalance is fairly common in Delphi processes, as panels are purposive or convenience samples that do not aim to be representative samples of populations [
54]. Furthermore, Delphi literature does not present unequivocal recommendations for the sample size, with studies suggesting ranges from five to more than one thousand participants [
54,
55]. To mitigate the imbalance as much as possible, invitations and reminders for participation were sent. Thirdly, and still related to the Delphi process, it is important to acknowledge shortcomings of the method, namely the possibility of cognitive biases and other behavioural influences occurring during the process (such as egocentric discounting or the influence of majority positions) due to the free online interactions among participants, which can also lead to answers not completely clarified by participants [
56], or the possibility of information overload due to the high number of aspects to be analysed by the participants, which could become tiresome and cognitively challenging for them [
57]. To limit such shortcomings, not only was the panel of participants heterogeneous, but the aspects were also organised, during the validation with experts, so that the exercise would be, to the best of the experts’ knowledge, easier to follow. Additionally, participants could rate each type of medical device at different times by re-accessing the platform, or even answer for only one type of medical device if they felt more comfortable doing so. Finally, the initial list of aspects could be biased, as it resulted from a literature search followed by validation by experts of the HTA agency. To address this possibility, participants could suggest additional aspects during the process through the comments option, although no such suggestions were observed.