Introduction
Chronic obstructive pulmonary disease (COPD) is a complex, chronic condition characterised by progressive airflow limitation that is not fully reversible. The major symptoms of COPD, such as dyspnoea, cough and sputum production, are disabling and have a substantial impact on both patients' health status and the health care system [1, 2]. Although treatment involves several approaches, bronchodilator medications are central to the management of COPD, improving both lung function and symptoms [1].
The complex nature of COPD means that it is important to assess treatment effectiveness in terms of patient-reported outcomes, including symptoms or health status scores [3]. Clinicians and policy makers have recognised the importance of measuring health status, in order to make informed patient management and policy decisions [4], and clinician-led guidelines recommend this approach for COPD [1, 2]. However, regulatory authorities continue to emphasise airflow obstruction, measured by spirometry, as the primary outcome required for registration trials of new bronchodilators. It is therefore relevant to establish if and how changes in lung function may translate into patient-reported outcomes.
Although primary studies with bronchodilators frequently report both spirometry and patient-reported outcomes, the relationships between outcome measures are poorly understood. A study by Stahl et al., published in 2001, showed weak correlations between the St George's Respiratory Questionnaire (SGRQ) and cough, breathlessness, forced expiratory volume in 1 second (FEV1) and walking distance, but reported only limited supporting patient-level data [5]. Study-level meta-analysis is a meaningful and cost-effective approach to addressing a clinical research question, particularly where individual patient data are difficult to obtain [6]. We are unaware of any study-level analysis which has specifically addressed how lung function is related to outcomes.
The present study was a systematic review of randomised controlled trials (RCTs) of inhaled bronchodilators in adult patients with stable COPD, which reported change in trough FEV1, the primary physiological outcome in most studies of long-acting bronchodilators, alongside patient-reported outcomes. The primary objective was to assess at a study level the relationship between FEV1 change and health status change, as measured by the SGRQ, and to estimate the increase in mean FEV1 associated with a clinically important improvement in health status. As secondary objectives, we assessed the relationship between change in FEV1 and SGRQ domains, the influence of study duration, and the relationship between change in FEV1 and change in other patient-reported outcomes, such as dyspnoea, as measured by the Transition Dyspnea Index (TDI), and COPD exacerbations.
Discussion
Our study-level analysis demonstrated a relationship between improved lung function (as measured by FEV1) and improvements in health status (as measured by SGRQ) in adult patients with stable COPD who are treated with long-acting inhaled bronchodilators. Results of random-effects regression modelling indicated that a 100 mL increase in FEV1 was associated with a reduction in SGRQ total score of 2.5 units. This equates to a clinically meaningful reduction of 4 units in SGRQ being associated with an estimated improvement in FEV1 of 160.6 mL. These results were supported by correlation analyses which demonstrated a moderate negative correlation between change in total SGRQ score and change in trough FEV1, when all treatment arms were considered. When the placebo arms were excluded from the analyses, the relationship was not significant. This may be due in part to the reduction in sample size, but is principally because the placebo arms, whose results clustered around zero for both change in FEV1 and change in SGRQ, widened the spread of the data when they were included, which allowed correlations to emerge. It should be emphasised that the principal objective of our review was to investigate the relationship between trough FEV1 and outcomes rather than to test differential effects of treatment, so the use of all treatment arms, including placebo arms, was appropriate. It is important to note that our analysis focussed on studies including long-acting bronchodilators. Relationships between FEV1 and outcomes may be different for anti-inflammatory treatments. Further, different results may have been obtained had we assessed the relationship between peak FEV1 and outcomes. However, we selected the trough measurement since it was the primary endpoint and therefore the best documented outcome in most studies.
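The conversion between the regression slope and the FEV1 change associated with a clinically meaningful SGRQ change can be reproduced with simple arithmetic. The sketch below is illustrative only: it uses the rounded slope quoted above (2.5 SGRQ units per 100 mL), ignores any intercept term, and the function and constant names are hypothetical. With the rounded slope the result is 160 mL; the reported figure of 160.6 mL presumably reflects the unrounded coefficient, which is an assumption on our part.

```python
# Illustrative arithmetic only, based on the rounded slope quoted in the text.
# Names are hypothetical; the intercept of the regression model is ignored.
SLOPE_SGRQ_PER_100ML = -2.5  # change in SGRQ total score per 100 mL increase in trough FEV1
SGRQ_MEANINGFUL_CHANGE = -4.0  # reduction in SGRQ regarded as clinically meaningful


def sgrq_change_for_fev1_change(fev1_change_ml, slope_per_100ml=SLOPE_SGRQ_PER_100ML):
    """SGRQ change that the slope alone associates with a given FEV1 change (mL)."""
    return slope_per_100ml * fev1_change_ml / 100.0


def fev1_change_for_sgrq_change(sgrq_change, slope_per_100ml=SLOPE_SGRQ_PER_100ML):
    """FEV1 change (mL) that the slope alone associates with a given SGRQ change."""
    return 100.0 * sgrq_change / slope_per_100ml


if __name__ == "__main__":
    print(sgrq_change_for_fev1_change(100.0))                   # -2.5
    print(fev1_change_for_sgrq_change(SGRQ_MEANINGFUL_CHANGE))  # 160.0
```

Because this is a study-level slope, the conversion applies to mean changes across treatment arms and should not be read as a prediction for an individual patient.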
Despite the discrepancy in outcome measures required to demonstrate clinical effectiveness between the regulatory authorities and reimbursement agencies, such as the National Institute for Health and Clinical Excellence in the UK and the Institute for Quality and Efficiency in Health Care in Germany, few studies have investigated the relationship between change in lung function and change in patient-reported outcomes. We are aware of no other analysis addressing this issue at a study level. However, our data are consistent with the results of patient-level analyses [5, 48], although in these studies the strength of the relationship between change in SGRQ and FEV1 was too weak to allow health status gains to be inferred from spirometric changes [48]. This is not a limitation, but rather reflects how different individuals with the same physiological limitations may experience differing effects on their health status.
Our study indicated that the correlation between change in trough FEV1 and change in SGRQ total score appears to strengthen with increasing study duration from 3 to 6 to 12 months. Over an intermediate and longer term period, the impact of an improvement in lung function may have a greater effect on patient well-being, although in our analysis, the limited data reported in the included studies did not allow us to assess whether changes in FEV1 at 3 months correlated with longer term changes in outcomes. There was also a trend to increasing mean change in SGRQ, across all study arms, with longer study duration. When data were analysed by SGRQ domain, the association between change in FEV1 and change in SGRQ scores was still present for the Activity and Impacts domains. A weak correlation between SGRQ Symptoms domain and FEV1 has been reported ever since the first validation of this instrument [3].
Another important issue to be addressed is the "meaning" of the 100 mL increase in FEV1 associated with a reduction in SGRQ total score of 2.5 units, and an estimated improvement in FEV1 of 160 mL in relation to a clinically meaningful reduction of 4 units in SGRQ. There is no universally accepted approach for determining the clinically important difference of a measurement. As a measure, SGRQ reflects aspects of COPD beyond lung function alone [48]. In our analysis, the corresponding increase in health status in treatment arms with larger improvement in FEV1 enhances the ability to interpret lung function changes at a study level, but not at a patient level. Depending on the intervention under study, FEV1 may offer the perspective of an intermediate end point in assessing likely treatment effectiveness. However, treatment effectiveness cannot be based exclusively on spirometry; it also requires assessment of other relevant clinical parameters such as patient-reported health status.
It is interesting to note that a zero change in FEV1 still resulted in a reduction in SGRQ score of 2.5. This effect has been noted in many clinical trials in COPD and appears to relate to a 'Hawthorne effect', whereby patients receive better care by participating in the trial [49]. It could relate to a number of different factors, including improved compliance with treatments which may not all have bronchodilator effects.
There was also some evidence of a positive relationship between change in FEV1 and other outcomes, i.e., improvements in TDI score and reduction in the proportion of patients experiencing at least one exacerbation. These associations were weaker than those observed with SGRQ. However, correlation data for TDI versus trough FEV1 were limited by the relatively small number of studies (n = 8) reporting both outcome measures. For data on exacerbations, longer study durations would have been required to fully assess the apparent negative correlation with change in FEV1.
Our review has limitations. We did not explicitly seek primary studies assessing the correlation between outcome measures, and because our search strategy was restricted to RCTs in order to enhance the quality of the analysis, observational studies of this type would not have been identified. In addition, the objectives of included studies differed from those of the review: included studies were generally designed to measure the effects of treatment upon COPD outcomes, whereas we were interested in the relationships between outcome measures. Included studies tended to present full results for their primary outcome measure only; reporting of additional outcomes was poor and measures of variance were often absent. Thus, standard deviations had to be imputed for a high proportion of the data sets included in our analyses. In addition, many studies did not report numerical data and values were estimated from graphs, although such approaches are consistent with established systematic review methodology.
Although our review did not address treatment effect sizes, our objectives did include an assessment of the relationships between treatment effects upon treatment effect sizes (data addressing this objective were sparse and not included in this article). For this reason, only RCTs of long-acting bronchodilators which included a placebo arm or which compared different classes of bronchodilator were included.
Finally, the correlation analyses, which were used to assess the relationships between patient-reported outcomes and FEV1 where data were insufficient to support regression modelling, combined treatment arms from different studies. Thus the data were essentially treated as observational cohorts and the strengths of the RCT design were lost. Combining the data in this way does not take account of differences between studies, such as treatment and dose, and participant baseline characteristics, which may affect estimates of correlation. In theory, this limitation can be overcome using random-effects regression modelling. However, even where such modelling was possible, the number of explanatory variables which could be included was constrained by both the reporting of these variables in the primary studies and the size of the data set; both poor reporting and small data sets were factors in this review.
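A minimal sketch may help to show what pooling treatment arms across studies involves, and why placebo arms clustered near zero can drive the pooled correlation. The arm-level values below are entirely hypothetical and are not taken from the included studies; the use of an unweighted Pearson coefficient and the names `ARMS`, `pearson_r` and `pooled_correlation` are likewise assumptions made for illustration, not a description of the method used in the review.

```python
import math

# Hypothetical arm-level data, NOT from the review:
# (arm type, mean change in trough FEV1 in mL, mean change in SGRQ total score)
ARMS = [
    ("placebo", -10, -1.0),
    ("placebo", 5, -2.5),
    ("placebo", 0, -1.5),
    ("active", 110, -4.0),
    ("active", 140, -5.0),
    ("active", 90, -4.5),
    ("active", 160, -4.5),
    ("active", 120, -4.5),
]


def pearson_r(xs, ys):
    """Unweighted Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_correlation(arms, include_placebo=True):
    """Pool arms as if they were one observational cohort, ignoring study membership."""
    kept = [arm for arm in arms if include_placebo or arm[0] != "placebo"]
    fev1_changes = [arm[1] for arm in kept]
    sgrq_changes = [arm[2] for arm in kept]
    return pearson_r(fev1_changes, sgrq_changes)


if __name__ == "__main__":
    print("all arms:   ", round(pooled_correlation(ARMS, include_placebo=True), 2))
    print("active only:", round(pooled_correlation(ARMS, include_placebo=False), 2))
```

In this made-up data set the negative correlation is considerably stronger when the placebo arms are included, because they extend the range of FEV1 change towards zero. The sketch also makes the stated limitation visible: each arm is treated as an independent observation, with no adjustment for study, treatment, dose or baseline characteristics.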
The results of this review give important new insight into the relationship between FEV1, a key primary outcome required by regulatory authorities for COPD clinical trials, and patient-reported outcomes such as health status, dyspnoea and exacerbations, which are of greater interest to clinicians, patients and reimbursement agencies. Our analyses were limited by the size and quality of the available data set. The results are encouraging, but should be considered hypothesis-generating and warrant further investigation.
This study-level analysis indicated that improvement in trough FEV1 with inhaled bronchodilators may be associated with improvement in health status and may also be associated with improvements in other patient-reported outcomes. Although the strength of the association was modest, improvements in both FEV1 and SGRQ, relative to changes likely to be clinically relevant, were of similar magnitude. FEV1 may offer the perspective of an intermediate endpoint in assessing treatment effectiveness at a study level.
Competing interests
AC and GCN are employees of Novartis. MW and GW have no competing interests related to the content of this paper. JB has received fees for speaking at conferences and for serving as an expert on advisory boards for AstraZeneca, BI, GSK, Novartis, Nycomed and Pfizer. JB's MUHC Research Institute received research grants for investigator-initiated research and unrestricted educational grants from AstraZeneca, BI, GSK, Novartis and Pfizer. PJ has received advisory board and consulting fees from Novartis, GSK, AZ, Boehringer, Roche, Almirall and Spiration. He has received speaker's fees from GSK.
Authors' contributions
MW developed the design, concept of the study and analysis, and carried out the systematic review. JB participated in the design and analysis planning and advised on the interpretation of the study. PWJ participated in the design and advised on the interpretation of the study. AC conceived of the study, participated in its design and analysis planning and contributed to its interpretation. GCN conceived of the study, participated in its design and analysis planning and contributed to its interpretation. GW developed the design and concept of the study, carried out the systematic review and performed the statistical analysis. All authors had full access to the data and were involved in drafting the manuscript. All authors read and approved the final manuscript.