Introduction
Pulmonary arterial hypertension (PAH) is a progressive, severely debilitating and incurable disease characterised by increased pulmonary vascular resistance and pulmonary arterial pressure, which ultimately leads to right heart failure and death. Although survival has improved in the modern management era [1–3] compared with historical data [4], the prognosis remains poor [5]. Therefore, assessment of a patient’s risk of disease progression and mortality has become an essential part of PAH management and is now centred around risk-based treatment algorithms [6–9].
Based on a growing body of evidence, the current guidance [8, 9] from leading experts recommends regular multi-parametric risk assessment whereby patients are classified as being at low (< 5%), intermediate (5–10%) or high (> 10%) risk of 1-year mortality [6–9]. It is recommended that a comprehensive assessment should be conducted at baseline (i.e. diagnosis/treatment initiation); in the event of clinical worsening; at least every 6–12 months routinely; and 3–6 months after a change in therapy [7–9]. To aid risk classification, the 2015 European Society of Cardiology (ESC)/European Respiratory Society (ERS) pulmonary hypertension (PH) guidelines outline 9 determinants of prognosis, comprising 13 variables and their corresponding low-, intermediate- and high-risk thresholds. These variables include clinical and functional assessments, biochemical markers, and imaging and haemodynamic parameters [9]. The patient’s overall risk at a given timepoint is a composite measure of the individual variables, which may not all fall into the same risk category. It is the patient’s overall risk that should drive decisions on whether and how to escalate treatment, with the goal of achieving/maintaining a low-risk status [6–9].
Translating all available measures and indicators of patient risk into a single risk category can pose a significant challenge for the treating physician and, as a result, research has focused on the development of standardised tools and protocols for risk stratification. Several groups have retrospectively analysed data from prospective PAH registries to develop scoring systems that can aid risk assessment in clinical practice. For example, the Registry to Evaluate Early and Long-Term PAH Disease Management (REVEAL) has been used to develop a calculator in which measurements for 12–14 variables are entered and their prognostic values weighted in order to derive a final risk score [10, 11]. Other studies have also taken a systematic approach to accurately predict risk of mortality, with the aim of developing a tool/scoring system that could be easily implemented in clinical practice without placing too much burden on the patient. These include analyses performed using data from the Swedish PAH Registry [12] and the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) study [13], which assigned risk to patients newly diagnosed with PAH using a scoring system whereby an overall risk score was obtained by averaging scores based on at least two measurements across 6–8 of the 13 variables outlined in the ESC/ERS guidelines. Each of these risk assessment strategies was able to accurately stratify patients according to their 1-year mortality estimates [10–13]. In another study, data from the French PH Registry were used to calculate patients’ risk according to the number of low-risk criteria present from a total of 3, 4 or 5 variables, and found that a greater number of low-risk criteria was indicative of better survival [14]. While the approaches taken in each of the studies described above differed, all point to the same conclusion: that regular risk assessment can and should be performed.
While there is now a clear mandate for risk assessment in the management of PAH, how this is implemented in clinical practice remains unclear. To the best of our knowledge, this is the first international study conducted to investigate how physicians currently assess risk of clinical worsening or death in patients with PAH, and how closely they adhere to the recommendations on risk assessment in the 2015 ESC/ERS guidelines. The aim of this study was to investigate how physicians assess PAH patient risk in clinical practice and to explore differences and similarities between the risk category assigned to patients by the treating physician (gestalt judgement of risk) and the risk category calculated using a published algorithm for risk assessment (calculated risk) [13].
Methods
Participants
The study included respondents from France, Germany, Italy and the United States (US). Data were collected between October 9 and November 6, 2017. Respondents were cardiologists and pulmonologists who had worked in their specialty for between 2 and 30 years, and who managed at least 7 patients with PAH at the time of the survey. Full eligibility criteria for survey respondents are described in Supplementary Table 1. Respondents were predominantly recruited via panel databases (membership of which was reliant on relevant privacy permission) maintained by GLocalMind Inc. (Texas, US). Respondents were remunerated for their participation (equivalent range US$60–80 per respondent, in accordance with fair market value rates).
Table 1
Threshold values from the 2015 ESC/ERS guidelines to aid risk assessment [9]
Determinant of prognosis | Low risk | Intermediate risk | High risk |
Clinical signs of right heart failure | Absent | Absent | Present |
Progression of symptoms | No | Slow | Rapid |
Syncope | No | Occasional syncope | Repeated syncope |
WHO FC | FC I, II | FC III | FC IV |
6MWD | > 440 m | 165–440 m | < 165 m |
CPET | Peak VO2 > 15 mL/min/kg (> 65% predicted) | Peak VO2 11–15 mL/min/kg (35–65% predicted) | Peak VO2 < 11 mL/min/kg (< 35% predicted) |
 | VE/VCO2 slope < 36 | VE/VCO2 slope 36–44.9 | VE/VCO2 slope ≥ 45 |
NT-proBNP plasma levels | BNP < 50 ng/L OR NT-proBNP < 300 ng/L | BNP 50–300 ng/L OR NT-proBNP 300–1400 ng/L | BNP > 300 ng/L OR NT-proBNP > 1400 ng/L |
Imaging (CMR imaging, echocardiography) | RA area < 18 cm2 | RA area 18–26 cm2 | RA area > 26 cm2 |
 | No pericardial effusion | No or minimal pericardial effusion | Pericardial effusion |
Haemodynamics | RAP < 8 mmHg | RAP 8–14 mmHg | RAP > 14 mmHg |
 | CI ≥ 2.5 L/min/m2 | CI 2.0–2.4 L/min/m2 | CI < 2.0 L/min/m2 |
 | SvO2 > 65% | SvO2 60–65% | SvO2 < 60% |
Development of Questionnaire
The questionnaire (Supplementary appendix) was developed by Cello Health Insight (London, UK) in collaboration with Actelion Pharmaceuticals (Allschwil, Switzerland), and the survey was conducted by Cello Health Insight. The questionnaire was developed in English, with French, German and Italian translations prepared by GlobaLexicon (London, UK). The accuracy of the translations was confirmed by Cello Health Insight’s language team in partnership with Actelion Pharmaceuticals. A pilot survey was conducted in the US with 6 of the 90 total respondents to confirm that data could be collected accurately, without programming errors, and that respondents understood the questions without ambiguity.
Execution of Questionnaire
The questionnaire was completed by respondents online. Respondents first had to answer some initial screening questions, to determine if they met the study’s eligibility criteria (Supplementary Table 1). Eligible respondents were then asked to complete 3 tasks, relating to the 9 parameters defined as “determinants of prognosis” in the 2015 ESC/ERS guidelines (Table 1) [9]. Respondents with missing or invalid entries were removed from the final analysis set.
The first task consisted of several general questions to determine (1) which parameters respondents used to assess their PAH patients when evaluating prognosis, severity, clinical worsening and/or response to therapy (hereafter referred to as risk), (2) the timepoint(s) at which they performed the assessments, and (3) how often they performed the assessments. Note that physicians were not required to report a minimum number of variables for their data to be included in this survey; however, at least 2 variables were required for the subsequent risk calculation.
The second task was a maximum-difference scaling survey used to rank the 9 parameters on a common scale in terms of their importance to respondents for assessing risk. The survey comprised 9 evaluation rounds. Each round presented a different choice set of 4 of the 9 parameters, so that each parameter was presented 4 times in total. From each choice set, respondents were asked to select the most and the least important parameter in their practice for risk assessment in patients with PAH. The maximum-difference means were calculated from the difference between the number of times each parameter was chosen as the most important and the number of times it was chosen as the least important (the ‘count’), and parameters were ranked based on these differences. Important parameters were defined as those that scored between 1 and 49% higher than the average for all parameters.
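The count step described above can be sketched as follows. This is a minimal illustration only, assuming each evaluation round is recorded as the choice set shown plus the respondent's "most" and "least" important picks; the function names, the data layout and the example rounds are hypothetical, and the sketch does not reproduce the Hierarchical Bayes model described under Statistical Analysis.

```python
from collections import Counter


def maxdiff_counts(rounds):
    """Return {parameter: times chosen most important - times chosen least important}.

    `rounds` is a list of (choice_set, most, least) tuples (hypothetical layout).
    """
    best = Counter()
    worst = Counter()
    shown = set()
    for choice_set, most, least in rounds:
        if most not in choice_set or least not in choice_set or most == least:
            raise ValueError("picks must be two different parameters from the choice set")
        shown.update(choice_set)
        best[most] += 1
        worst[least] += 1
    return {p: best[p] - worst[p] for p in shown}


def rank_parameters(rounds):
    """Rank parameters by count difference, highest first; ties broken alphabetically."""
    scores = maxdiff_counts(rounds)
    return sorted(scores, key=lambda p: (-scores[p], p))


# Illustrative rounds only; these are not survey data.
example_rounds = [
    (("WHO FC", "6MWD", "Syncope", "Haemodynamics"), "WHO FC", "Syncope"),
    (("WHO FC", "Imaging", "CPET", "Haemodynamics"), "Haemodynamics", "CPET"),
    (("6MWD", "Imaging", "Syncope", "WHO FC"), "WHO FC", "Syncope"),
]
example_scores = maxdiff_counts(example_rounds)
example_ranking = rank_parameters(example_rounds)
```

A parameter that is never picked as most or least important scores 0, so the count difference separates parameters that respondents actively favour from those they actively discount.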
In a third task, all respondents were asked to provide details of the 5–7 most recent adult patients with PAH they had managed in patient record forms (PRFs). Patients were included if they were currently receiving an endothelin receptor antagonist as mono- or combination therapy and were not taking part in a clinical trial. Quality checks were performed to ensure that all reported variables had a corresponding and appropriate value entered. Respondents were asked “In your opinion, how would you describe the patient’s current level of risk in terms of clinical worsening or death”, hereafter referred to as gestalt judgement.
Calculation of Risk
In the PRFs, respondents were asked to provide each patient’s measurements, where available, from the last clinic visit for the 13 variables (across the 9 parameters/determinants) specified in the ESC/ERS guidelines (Table 1). ‘Calculated risk’ refers to patient risk, as calculated using the strategy published by Hoeper et al. [13], which provided accurate 1-year mortality estimates for patients with PAH. Risk scores were calculated from all PRFs that included measurements for ≥ 2 of the following variables at the most recent clinic visit: New York Heart Association/World Health Organization functional class (WHO FC), 6-min walk distance (6MWD), brain natriuretic peptide (BNP) or N-terminal pro-BNP (NT-proBNP) plasma levels, right atrial pressure (RAP), cardiac index (CI) and mixed venous oxygen saturation (SvO2). All variables were assigned a score of 1, 2 or 3 according to whether the measurement was within the ESC/ERS guideline thresholds for the low-, intermediate- or high-risk categories, respectively. The average score (rounded to the nearest integer) was calculated and this represented the patient’s calculated risk category, i.e. average scores of ≥ 1 to < 1.5 were considered low risk, average scores of ≥ 1.5 to < 2.5 were considered intermediate risk and ≥ 2.5 to 3 were considered high risk [13].
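The averaging strategy described above can be sketched in code. This is a hedged illustration, not the published implementation: the function and key names are hypothetical, the cut-offs are the ESC/ERS guideline thresholds for the six variables listed, the high-risk cardiac index cut-off is taken as < 2.0 L/min/m2 so that the three bands do not overlap, halves are assumed to round up, and NT-proBNP is assumed to take precedence when both natriuretic peptides are reported.

```python
from typing import Dict, Optional

# Each grader maps a measurement to 1 (low), 2 (intermediate) or 3 (high risk).
# Natriuretic peptide values are expected in the units used by the guideline thresholds.
GRADERS = {
    "who_fc": lambda v: 1 if v <= 2 else (2 if v == 3 else 3),
    "six_mwd_m": lambda v: 1 if v > 440 else (2 if v >= 165 else 3),
    "bnp": lambda v: 1 if v < 50 else (2 if v <= 300 else 3),
    "nt_probnp": lambda v: 1 if v < 300 else (2 if v <= 1400 else 3),
    "rap_mmhg": lambda v: 1 if v < 8 else (2 if v <= 14 else 3),
    # Assumption: CI below 2.0 is high risk, 2.0 to below 2.5 is intermediate.
    "ci_l_min_m2": lambda v: 1 if v >= 2.5 else (2 if v >= 2.0 else 3),
    "svo2_pct": lambda v: 1 if v > 65 else (2 if v >= 60 else 3),
}


def calculated_risk(measurements: Dict[str, float]) -> Optional[str]:
    """Return 'low', 'intermediate' or 'high', or None if fewer than 2 variables."""
    values = dict(measurements)
    # BNP and NT-proBNP count as one variable; prefer NT-proBNP (assumption).
    if "bnp" in values and "nt_probnp" in values:
        del values["bnp"]
    grades = [GRADERS[name](value) for name, value in values.items() if name in GRADERS]
    n = len(grades)
    if n < 2:
        return None
    total = sum(grades)
    # Integer comparisons avoid float rounding: mean < 1.5 <=> 2*total < 3*n.
    if 2 * total < 3 * n:
        return "low"
    if 2 * total < 5 * n:
        return "intermediate"
    return "high"


# Hypothetical patients for illustration only.
risk_a = calculated_risk({"who_fc": 2, "six_mwd_m": 450, "nt_probnp": 250})
risk_b = calculated_risk({"who_fc": 3, "six_mwd_m": 450})
risk_c = calculated_risk({"who_fc": 4, "rap_mmhg": 16, "ci_l_min_m2": 2.2, "svo2_pct": 55})
risk_d = calculated_risk({"who_fc": 3})
```

The explicit comparisons are used instead of Python's built-in `round`, which rounds halves to the nearest even integer and would therefore place a mean of 2.5 in the intermediate rather than the high-risk category.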
Statistical Analysis
For the maximum-difference scaling, the R language was used to process the raw data and a pre-compiled Stan Hierarchical Bayes model was used to determine the importance value of each parameter, scaled to a mean importance value of 100. To ensure the stability and consistency of the results, the analyses were run several times with varied parameters. No other formal statistical comparisons were made, and all data are presented descriptively.
Compliance with Ethics Guidelines
Cello Health Insight is a member of the British Healthcare Business Intelligence Association and the European Pharmaceutical Market Research Association, and this research was conducted in accordance with their guidelines on market research. All methods performed in studies involving human participants were in accordance with the ethical standards of the national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Discussion
In recent years, the assessment of clinical worsening or death in patients with PAH has become increasingly important for informing treatment decisions, with experts now recommending physicians perform multi-parametric risk assessment regularly at routine appointments [6–9]. To our knowledge, this is the first international survey of physicians to investigate how risk assessment is currently being implemented in clinical practice and the results show that there are differences between physicians’ gestalt judgement of patient risk and patient risk as calculated using an objective algorithm. It is important to note that 41% of the patients were excluded from the risk calculation analysis due to insufficient measurements, indicating that risk assessment as recommended in the ESC/ERS guidelines may often not be performed.
This study captured which parameters physicians measure and how often they do so, based on both physicians’ recollection of their usual practice and on which measurements were present in the PRFs. These 2 sets of information were broadly aligned and demonstrate that most physicians evaluate progression of symptoms, clinical signs of right heart failure and WHO FC, with NT-proBNP levels, CPET and haemodynamics being less frequently assessed. Overall, haemodynamics and CPET were the parameters least likely to be measured at follow-up, which is not in contravention of the guidelines, as they suggest assessment of these every 6–12 months [8, 9], nor is it unexpected, given that many centres do not have access to CPET and many do not perform right heart catheterisation routinely to avoid potentially unnecessary invasive procedures [15]. However, it does seem to disagree with the results from the maximum-difference scaling survey, which demonstrated that physicians consider haemodynamics as the second-most important parameter for assessing prognosis. This apparent misalignment may be partly explained by the finding that physicians stated that they measure these 2 parameters more often at baseline than at follow-up visits. Given the inclusion criterion that patients must currently be receiving an endothelin receptor antagonist and that, for all patients with a value reported, the time from diagnosis was at least 13 months, the most recent clinic visit for these patients was likely to have been a follow-up visit. In addition, the exact timing of the last visit was not captured in this survey, so another possible reason for misalignment may be that it did not fall within the timeframe suggested for these assessments.
Our study demonstrates a discordance between physicians’ gestalt judgement of risk and calculated risk. For example, of the patients judged to be at low risk (n = 118), 80% were calculated to be at higher risk and of the patients judged to be at high risk (n = 81), 69% were calculated to be at lower risk. The reasons for this lack of agreement are not clear, but could be due to several factors. First, physicians may not be measuring enough parameters to produce an accurate algorithm-based estimate of risk. In the Hoeper analysis, 95.3% of patients had at least 4 variables measured and 55.4% had all 6 [13], compared with 12.2% and 6.3% in this study, respectively. In addition, it is possible that physicians may not agree with, or strictly apply, the parameter thresholds given in the guidelines. Finally, a physician’s clinical gestalt is based on more than just the assessments used in the algorithm to calculate risk; it also draws on factors such as the patient’s disease aetiology, age, gender and comorbidities, as well as the physician’s overall feeling of how the patient is doing compared with previous consultations. Importantly, as this study did not capture outcome data, we do not know how well the physicians’ gestalt judgement of risk would stratify patients according to their 1-year risk of mortality and how that compares to the methods implemented by Hoeper et al. [13]. The potential for under- and over-estimation of risk should be further explored, as this can have profound detrimental impacts on patients’ lives, such as increasing the likelihood of clinical worsening and death in patients at intermediate or high risk, and potentially subjecting low-risk patients to unnecessary treatments and/or assessments, further impacting their quality of life.
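The kind of comparison reported above, i.e. the share of patients in each gestalt category whose calculated risk was higher or lower, can be tabulated with a few lines of code. The sketch below is illustrative only: the function names and the example pairs are hypothetical and are not the study data.

```python
from collections import Counter, defaultdict

# Ordering of the risk categories, used to tell "higher" from "lower".
LEVEL = {"low": 0, "intermediate": 1, "high": 2}


def compare_risk(pairs):
    """Tabulate (gestalt, calculated) pairs as agreement counts per gestalt category."""
    table = defaultdict(Counter)
    for gestalt, calculated in pairs:
        diff = LEVEL[calculated] - LEVEL[gestalt]
        if diff == 0:
            outcome = "agree"
        elif diff > 0:
            outcome = "calculated_higher"
        else:
            outcome = "calculated_lower"
        table[gestalt][outcome] += 1
    return {g: dict(c) for g, c in table.items()}


def share(table, gestalt, outcome):
    """Proportion of patients in one gestalt category with the given outcome."""
    counts = table.get(gestalt, {})
    total = sum(counts.values())
    return counts.get(outcome, 0) / total if total else None


# Hypothetical pairs for illustration only.
example_pairs = [
    ("low", "intermediate"),
    ("low", "low"),
    ("low", "high"),
    ("intermediate", "intermediate"),
    ("high", "intermediate"),
]
example_table = compare_risk(example_pairs)
low_higher_share = share(example_table, "low", "calculated_higher")
```

Because no outcome data were captured, a table of this kind describes disagreement between the two approaches only; it does not indicate which of them is closer to the patient's true risk.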
Limitations of this study include selection bias and recall bias, which are inherent to all surveys, and the possibility that the results may differ from those that would be obtained in countries not studied. The main limitation of this study is the lack of clinical outcome data, which prevents evaluation of the effectiveness of gestalt judgement in stratifying patients according to their 1-year risk of mortality. Furthermore, more qualitative data would be required to better understand why physicians choose to assess certain variables and not others, and what potential barriers they face, in order to reconcile the observed differences between perceived and calculated risk. Finally, it would be interesting to see how physicians’ gestalt compares with risk scores derived using other methods, such as the approach published by Boucly et al. [14]. However, the French approach stratifies patients according to the number of variables defined as low risk in the ESC/ERS guidelines and does not directly categorise patients as ‘low’, ‘intermediate’ or ‘high’ risk. Given this, a comparison could not be performed here.
Acknowledgements
The authors would like to thank the participants of this study for completing the survey, and Jordan Mullen for his support in designing and conducting the survey and collecting data.
Open Access This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.