Background
Idiopathic pulmonary fibrosis (IPF) is a progressive interstitial lung disease (ILD) with a poor prognosis [1]. Patients with IPF experience both physical and psychological deficits including dyspnea, reduced exercise capacity, social isolation and loss of mental well-being [2]. These symptoms inevitably affect the quality of life of patients with IPF.
Health-related quality of life (HRQL) expresses the impact of a patient’s health status on his or her quality of life. As the current treatments for IPF do not significantly reduce mortality [3, 4], improving HRQL is becoming an important outcome in both clinical trials and daily clinical practice. HRQL can be measured using both generic and disease-specific instruments [5]. Disease-specific instruments have been designed to assess aspects of health status particularly relevant to the disease of interest. This improves the relevance of the items to patients and probably makes such instruments more responsive to change than generic instruments [5].
Instruments that are not specific to IPF, e.g. the St. George’s Respiratory Questionnaire (SGRQ), have often been used to assess HRQL in patients with IPF [3, 6]. SGRQ was originally developed for patients with obstructive lung diseases [7, 8], but due to a lack of disease-specific HRQL instruments, it has been widely used in patients with IPF. Even though SGRQ holds acceptable validity and reliability in patients with IPF, some items are less relevant to this patient group and possess weaker psychometric properties [7]. In particular, the symptoms domain, which includes questions about attacks of chest trouble and wheezing, is less relevant to patients with IPF.
An IPF-specific version of the SGRQ (SGRQ-I) was developed based on a cohort of patients with IPF [9]. Of the 50 items in SGRQ, the 34 items which were most reliable for measuring HRQL in patients with IPF were retained in SGRQ-I. However, important aspects of validity have not been assessed in SGRQ-I. To our knowledge, no previous studies have examined the ability of SGRQ-I to distinguish between patients with different stages of disease severity. This is a substantial part of validity, as the instrument should be able to discriminate patients with advanced disease from patients in early disease stages. Nor has SGRQ-I been compared to a dyspnea instrument validated for use in patients with IPF or to another ILD-specific HRQL instrument. A number of instruments are used to measure dyspnea, but the University of California, San Diego Shortness of Breath Questionnaire (SOBQ) is one of the best validated instruments for use in patients with IPF [10, 11]. The King’s Brief Interstitial Lung Disease questionnaire (K-BILD) is an ILD-specific instrument measuring HRQL that has high validity in patients with IPF [12]. By comparing SGRQ-I to such instruments, the validity of the questionnaire can be strengthened. Furthermore, test-retest reliability of SGRQ-I has only been examined in a small study of 23 patients with IPF [13]. It is essential that the results of the instrument are repeatable with minimal variation in stable patients.
To increase the generalizability and reliability of SGRQ-I, the results of the initial validation should be repeatable in other cohorts of patients with IPF. Also, the validity should be examined both in patients with a recent diagnosis of IPF and in patients with longer disease durations. Another aspect of generalizability is the use of instruments in other languages. So far, SGRQ-I has only been translated into Spanish [13], and no IPF-specific HRQL instruments are available in Danish. Translation of valid and reliable HRQL instruments is important to support international research in new IPF treatments and studies aiming at uncovering determinants of HRQL in patients with IPF. This is needed to make effective interventions targeted at improving HRQL in patients living with this burdensome disease. Such efforts might include discussing advance care planning and palliation at an early stage in patients with this progressive disease, which is also recommended by the World Health Organization (WHO) [14, 15].
The aim of this study was to evaluate the known-groups validity and test-retest reliability of SGRQ-I, assess the validity of SGRQ-I in patients with different disease durations, translate SGRQ-I into Danish and examine the correlations to SOBQ and K-BILD.
Discussion
SGRQ-I was translated into Danish and proved to be a valid tool to measure HRQL with a good internal consistency, solid concurrent validity, high test-retest reliability and a good ability to discriminate between patients with different stages of disease. SGRQ-I was also equally valid in patients with different disease durations on almost all parameters.
The known-groups validity of SGRQ-I has not previously been investigated. An important aspect of measurement validity is the ability of the instrument to distinguish between patients with different stages of disease, as HRQL worsens with increasing disease severity [28]. Our results show that SGRQ-I is very good at differentiating patients with respect to pulmonary function measured by FVC and DLCO. When stratifying patients into groups according to the GAP index or use of LTOT, SGRQ-I was also able to differentiate between these groups. These novel results add further weight to the validity of SGRQ-I and emphasize the utility of the instrument.
Reliability was not assessed during the development of the instrument and was only evaluated in a small group of 23 patients in another study [13]. Reliability is central to an instrument’s ability to supply trustworthy results. SGRQ-I proved to be very reliable when completed twice within a short period of time in stable patients. Apart from the limited sample size, patients in the previous study were only asked about worsening of symptoms upon completing SGRQ-I the second time [13]. We excluded patients with both improvement and deterioration to ensure that only truly stable patients were included in the analysis of reliability.
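To illustrate how test-retest reliability of this kind can be quantified, the following is a minimal sketch of an intraclass correlation coefficient computed from two completions of a questionnaire by the same stable patients. The form of ICC used in this study is not stated in this section; the sketch assumes a two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)). The function name `icc_2_1` and the scores are hypothetical and are not study data.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one row per patient, each row holding the k
    repeated total scores for that patient (k = 2 for test-retest).
    """
    n = len(scores)        # number of patients
    k = len(scores[0])     # number of completions per patient
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-patient mean square
    msc = ss_cols / (k - 1)               # between-occasion mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical total scores (0-100) at baseline and at retest
baseline_and_retest = [[42.0, 44.5], [61.3, 59.8], [25.1, 27.0], [78.4, 76.9]]
print(round(icc_2_1(baseline_and_retest), 3))
```

Because this form of ICC measures absolute agreement, a systematic shift between the two completions lowers the coefficient even when patients keep the same rank order, which is why restricting the analysis to stable patients matters.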
To examine the concurrent validity, we compared SGRQ-I to SGRQ and correlated SGRQ-I to other HRQL instruments and measurements of disease severity relevant to IPF. The ICCs were high for both domain and total scores, indicating very good agreement between SGRQ-I and SGRQ. The Bland-Altman plots supported these findings, even though there was a tendency towards slightly higher scores in SGRQ-I compared to SGRQ with increasing average scores. As such, SGRQ-I scores indicate a broader spectrum of HRQL, as patients have better HRQL measured by SGRQ-I than by SGRQ at low average scores and worse HRQL at higher average scores. This may be due to the removal of selected items with poor fit to the Rasch model or many missing answers in patients with IPF [9]. If the two instruments had very similar results, the justification for SGRQ-I would only lie in face and content validity. Based on these results, one could argue that SGRQ-I should be used instead of SGRQ in patients with IPF, as the results differ slightly and SGRQ-I is targeted at IPF. The validity of SGRQ-I is also supported by the strong correlations to K-BILD. After all, comparing SGRQ-I to an ILD-specific HRQL instrument provides better evidence of validity than comparisons to instruments developed for other lung diseases.
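The agreement analysis described above can be illustrated with a minimal Bland-Altman sketch. It computes the mean difference (bias), the conventional 95% limits of agreement (bias ± 1.96 SD), and the least-squares slope of the differences on the pairwise averages; a positive slope corresponds to the tendency towards higher SGRQ-I scores with increasing average scores. The function name `bland_altman` and the scores are hypothetical and are not study data, and the analysis in the study itself may have differed in detail.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Bias, 95% limits of agreement and proportional-bias slope.

    scores_a, scores_b: paired scores from two instruments in the same
    patients (here thought of as SGRQ-I and SGRQ total scores).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    avgs = [(a + b) / 2 for a, b in zip(scores_a, scores_b)]

    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd

    # Least-squares slope of difference on average (proportional bias)
    avg_mean = mean(avgs)
    sxx = sum((v - avg_mean) ** 2 for v in avgs)
    sxy = sum((v - avg_mean) * (d - bias) for v, d in zip(avgs, diffs))
    slope = sxy / sxx

    return {"bias": bias, "lower": lower, "upper": upper, "slope": slope}


# Hypothetical paired total scores (0-100)
sgrq_i = [18.0, 35.5, 52.0, 70.5, 84.0]
sgrq = [20.0, 36.0, 51.0, 68.0, 80.5]
print(bland_altman(sgrq_i, sgrq))
```

With differences defined as SGRQ-I minus SGRQ, a slope near zero would indicate that the two instruments differ by a roughly constant amount across the score range, whereas a clearly positive slope indicates that SGRQ-I spreads patients over a wider range.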
Compared to SGRQ, SGRQ-I holds a pronounced advantage as it consists of only 34 items compared to 50 items in SGRQ. It is easier to complete and has the same validity and reliability as SGRQ. Nevertheless, both instruments are more suitable for research purposes than clinical assessments. A Tool to Assess Quality of life in IPF (ATAQ-IPF) is another IPF-specific HRQL instrument containing 74 items [29]. ATAQ-IPF covers more domains than SGRQ-I but is also more time-consuming to complete, which may limit its use. As such, SGRQ-I should be considered as an IPF-specific HRQL instrument in future clinical trials. Other HRQL questionnaires validated for IPF and other ILDs include K-BILD and the COPD Assessment Test (CAT). K-BILD consists of 15 items and has validity and reliability comparable to SGRQ-I [12]. CAT was developed for patients with COPD, but has subsequently been validated in IPF and other ILDs [30–32]. However, as SGRQ-I is more comprehensive than both K-BILD and CAT, doctors and healthcare professionals will have a better impression of the disabilities and limitations experienced by the patients in their daily living. Hence, it will be easier to intervene and assist the patients in an attempt to improve their everyday HRQL.
Dyspnea is a major symptom in IPF and correlations to SOBQ were generally strong, demonstrating a good reflection of this symptom in SGRQ-I. In the original version, dyspnea was measured using the Borg dyspnea index and the baseline dyspnea index (BDI). The correlation of SGRQ-I total score to SOBQ was stronger than the correlations to Borg scale and BDI (0.80 vs 0.46 and −0.67, respectively). SOBQ has been validated for use in patients with IPF [10, 11] and covers dyspnea associated with a wide range of daily activities. As such, SOBQ may be a better measure of dyspnea in IPF than Borg and BDI, and SGRQ-I seems to capture the severity of dyspnea very well.
Correlations to the generic SF-36 confirmed the concurrent validity of SGRQ-I, although the correlations were mainly weaker than correlations to the other HRQL instruments. This is probably caused by the generic nature of the SF-36, which has to be applicable across a wide range of conditions and is not tailored to reflect the symptoms and implications of living with, for instance, IPF in the same way as disease-specific HRQL instruments. The mental component score had weaker correlations than the physical component score. Comparable results were obtained in the initial development and validation of SGRQ-I [9]. As the psychological domain of K-BILD also had weaker correlations to SGRQ-I, the psychological impact of living with IPF may be more diffuse and difficult to incorporate into an HRQL instrument than the physical symptoms accompanying IPF.
Correlations to FVC were weaker than correlations to DLCO and 6MWD. Even though SGRQ-I had moderate correlations to PFT results and 6MWD, these measures only estimate the physiological limitations of IPF and not the full impact of IPF on the patients’ lives. Similarly, moderate to weak correlations have been demonstrated in other HRQL questionnaires including SGRQ, K-BILD and ATAQ-IPF [7, 12, 29]. Therefore, HRQL instruments are important supplements for both clinical trials and daily clinical practice to get a full picture of the current state of patients with IPF.
This study included the largest number of patients in a translation and validation study of SGRQ-I, which has previously only been translated into Spanish in a population of only 23 patients [13]. By including a larger cohort of patients, the generalizability of our results increases, as the study population is more likely to reflect the background population in terms of disease severity, socio-economic status and views on life. Our results support the former findings indicating that SGRQ-I is a valid and reliable measure of HRQL [9, 13]. Also, SGRQ-I proved to be equally valid in patients with different disease durations, which is a novel finding. The weaker correlations of the activities and impacts domains to DLCO and 6MWD in incident patients do not significantly change these results.
SGRQ-I is currently the only tool in Danish to measure HRQL explicitly developed for patients with IPF. The questionnaire was both well-received and perceived as relevant by patients with IPF. The Danish version of SGRQ-I was comparable to the original English version and as such, SGRQ-I performed well in a non-English speaking population.
Responders and non-responders were comparable regarding demographics, LTOT, medical treatment and PFTs in the missing data analyses at baseline. After two weeks, the only significant differences were smoking status and 6MWD. Although these results could indicate some degree of healthy volunteer bias, the differences between the two groups were minimal, and we presume that no significant selection bias was introduced.
The large number of participants is a clear strength of our study. Also, the fact that the patients were recruited in a multicenter setup increased the generalizability of the results with a better reflection of the background IPF population. Furthermore, we assessed many different aspects of validity and reliability, including comparisons to both other HRQL instruments and measures of disease severity. A limitation of our study is the single measurement of pulmonary function and level of physical activity. Symptoms can vary from day to day, and repeated measurements at home, e.g. with home spirometry or accelerometers, might give a better impression of the true physical functional state of the patients.