Introduction
Idiopathic pulmonary fibrosis (IPF) is a progressive and generally fatal disease. Several retrospective studies have suggested that the condition is associated with a median survival time of only two to three years after diagnosis [1‐5].
The 2011 IPF guidelines proposed by the ATS/ERS/JRS/ALAT provide updated and simplified diagnostic criteria for IPF [6], which may give high-resolution computed tomography (HRCT) a central role in the diagnosis of the disease. The guidelines substantially change the diagnostic process: in patients not subjected to surgical lung biopsy (SLB), the detection of the usual interstitial pneumonia (UIP) pattern on HRCT, together with the exclusion of other known causes of interstitial lung disease, is adequate to diagnose IPF.
Furthermore, the 2011 IPF guidelines state that disease progression manifests as worsening respiratory symptoms, worsening pulmonary function, progressive fibrosis on HRCT and acute respiratory decline, and that monitoring patients with IPF is necessary to follow the course of the disease and proactively identify those with progressive disease. Additionally, pulmonary function tests are considered to be the most standardized approach for objectively monitoring and quantifying disease progression. In clinical practice, however, it is sometimes difficult to evaluate disease progression based only on worsening pulmonary function, because some patients are unable to cooperate with testing.
On the other hand, technological advances in HRCT have shortened examination times and made it possible to obtain clearer images of the secondary pulmonary lobules without the need for patient cooperation, unlike spirometry. Therefore, HRCT is an accurate, sensitive and objective technique for evaluating IPF. In addition, physicians often encounter patients in clinical practice whose worsening HRCT findings are associated with a poor prognosis. However, the use of regular follow-up with chest HRCT remains controversial in routine clinical practice, and the procedure is not currently recommended in clinically stable patients.
The aim of our study was to assess the prognostic value of changes in HRCT findings using a new HRCT scoring system based on the grading scale published in the guidelines.
Methods
Study subjects
Our institutional review board approved this retrospective study (approval number H24-174, February 20, 2013), with a waiver of informed consent due to the retrospective study design. All consecutive patients diagnosed with IPF between January 2008 and January 2012 were enrolled in this study. All patients with IPF diagnosed using HRCT alone according to the 2011 IPF guidelines (the presence of a UIP pattern, including all four of the following features: subpleural features, basal predominance, reticular abnormalities and honeycombing with or without traction bronchiectasis, and the absence of features inconsistent with the UIP pattern) were included. The patients were also diagnosed based on the exclusion of a possible UIP pattern associated with other known causes of interstitial lung disease, such as chronic hypersensitivity pneumonia, occupational or environmental exposure, connective tissue disease and drug-induced pneumonia. HRCT examinations, pulmonary function tests and serological studies were performed every six months from the initial diagnosis (baseline). Patients with missing or failed inspiratory chest CT scans were excluded from this study. An acute exacerbation was defined as the acute onset of increased dyspnea and hypoxia with progressive infiltrates on HRCT within the preceding 30 days in the absence of infection, pulmonary embolism or cardiac failure [7].
HRCT assessment and HRCT fibrosis score
HRCT scans were obtained with 1-mm collimation and a 1-mm slice thickness at 10-mm intervals from the lung apices to the bases with the patient in the supine position at full inspiration. Two observers [H.I, K.N.] who were unaware of the clinical data and lung function of the patients (all the HRCT images were assessed in random order) evaluated the data independently. The observers made a subjective assessment of the overall extent of normal attenuation, reticular abnormalities, honeycombing and traction bronchiectasis.
A reticular abnormality was defined as a collection of innumerable areas of small linear opacity [1]. Honeycombing was defined as the presence of a cystic airspace measuring 3–10 mm in diameter, with 1- to 3-mm thick walls [8]. Traction bronchiectasis was defined as irregular bronchial dilatation within the surrounding areas showing parenchymal abnormalities. The morphological criteria on HRCT scans included bronchial dilatation with respect to the accompanying pulmonary artery, a lack of tapering of the bronchi and the identification of bronchi within 10 mm of the pleural surface [8].
The HRCT findings were graded on a scale of 1–4 based on the classification system: 1. normal attenuation; 2. reticular abnormality; 3. traction bronchiectasis; and 4. honeycombing. The assessments of the two observers were averaged. This grading scale and assessed zones were determined based on the previous reports by Ichikado et al. [9,10] with minor changes for this study. The presence of each of the above four HRCT findings was assessed independently in three (upper, middle and lower) zones of each lung. The upper lung zone was defined as the area of the lung above the level of the tracheal carina, the lower lung zone was defined as the area of the lung below the level of the inferior pulmonary vein and the middle lung zone was defined as the area of the lung between the upper and lower zones. The extent of each HRCT finding was determined by visually estimating the percentage (to the nearest 5%) of parenchymal involvement in each zone. The score for each zone was calculated by multiplying the percentage of the area by the grading scale score [1‐4]. The six zone scores were averaged to determine the total score for each patient. The highest score was 400 points and the lowest score was 100 points using this calculation method. We named the total score the “HRCT fibrosis score”. The HRCT fibrosis score was recorded at the initial diagnosis and after six and 12 months in a similar manner, and an investigation was conducted regarding the chronological changes in these values.
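The scoring arithmetic described above can be illustrated with a minimal sketch. It is not the authors' software: the function names, the finding labels and the example extents are hypothetical, and it assumes that the four extents in each zone add up to 100%, which is what the stated range of 100 to 400 points implies. Averaging the two observers' assessments is left out for brevity.

```python
# Grades taken from the text: 1 normal attenuation, 2 reticular abnormality,
# 3 traction bronchiectasis, 4 honeycombing. The label strings are hypothetical.
GRADES = {
    "normal": 1,
    "reticular": 2,
    "traction_bronchiectasis": 3,
    "honeycombing": 4,
}


def zone_score(extent):
    """Score one lung zone.

    `extent` maps each finding to its estimated percentage of the zone.
    Assumption: the percentages of the four findings total 100.
    """
    if sum(extent.values()) != 100:
        raise ValueError("extents in a zone are assumed to total 100%")
    return sum(GRADES[finding] * pct for finding, pct in extent.items())


def hrct_fibrosis_score(zones):
    """Average the six zone scores (upper, middle, lower zone of each lung)."""
    if len(zones) != 6:
        raise ValueError("expected six lung zones")
    return sum(zone_score(z) for z in zones) / len(zones)


# Hypothetical patient: every zone shows the same distribution of findings.
example_zone = {
    "normal": 50,
    "reticular": 30,
    "traction_bronchiectasis": 10,
    "honeycombing": 10,
}
example_total = hrct_fibrosis_score([example_zone] * 6)
```

For the hypothetical zone above, the zone score is 50 × 1 + 30 × 2 + 10 × 3 + 10 × 4 = 180, and a patient whose lungs are entirely normal or entirely honeycombed would score 100 or 400, respectively.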
Physiological testing
Pulmonary function tests, including spirometry and an assessment of the diffusing capacity of the lungs for carbon monoxide (DLCO), were performed using a standardized spirometry procedure [11] on the same day as the HRCT examination. The degree of improvement was defined based on 10% absolute changes in the forced vital capacity (FVC) from the baseline values as “improved (≥ 10% increase),” “stable (< 10% change)” or “worsened (≥ 10% decrease)” using the FVC values measured at six and 12 months after the initial diagnosis. In the present study, disease progression was defined as the presence of acute exacerbation, a ≥10% absolute decrease in the FVC and/or a ≥15% decrease in the %DLCO from the baseline value, as determined according to pulmonary function tests [12].
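The categories and the progression rule above can be written as a short sketch. The function and parameter names are hypothetical, and the sketch assumes that an "absolute" change means the difference in percentage points between the follow-up and baseline percent-predicted values; the text does not spell this out, so other readings are possible.

```python
def fvc_status(baseline_pct, followup_pct):
    """Classify the FVC course as improved, stable or worsened.

    Assumption: both arguments are percent-predicted FVC values and the
    change is measured in percentage points.
    """
    change = followup_pct - baseline_pct
    if change >= 10:
        return "improved"
    if change <= -10:
        return "worsened"
    return "stable"


def disease_progression(acute_exacerbation, fvc_change, dlco_change):
    """Apply the progression rule: any one of the three criteria suffices.

    `fvc_change` and `dlco_change` are follow-up minus baseline values
    (assumed to be percentage points of %FVC and %DLCO).
    """
    return bool(acute_exacerbation or fvc_change <= -10 or dlco_change <= -15)
```

Under these assumptions, a hypothetical patient whose %FVC falls from 80 to 68 is classified as "worsened", while a fall from 80 to 75 with a 16-point drop in %DLCO is "stable" by FVC but still counts as disease progression.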
Statistical analysis
The mean ± standard deviation (SD) values of the pulmonary function parameters, HRCT fibrosis score and other continuous variables were determined at baseline and at six and 12 months. The paired t-test was performed to evaluate the changes in the variables from the baseline to six and 12 months, respectively. The interobserver variation with respect to the presence/absence of HRCT findings at baseline was evaluated using the kappa statistic based on the diagnosis made prior to the assessment by consensus [13]. The interobserver agreement was categorized as “poor (κ ≤ 0.20),” “fair (0.21 ≤ κ ≤ 0.40),” “moderate (0.41 ≤ κ ≤ 0.60),” “substantial (0.61 ≤ κ ≤ 0.80)” or “almost perfect (0.81 ≤ κ ≤ 1.00).” The interobserver variation regarding the extent of the HRCT findings at baseline and the changes in the extent of honeycombing from baseline to follow-up at six and 12 months were evaluated using Fleiss’s intraclass correlation coefficient (ICC) [14].
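As an illustration of the agreement analysis, the following sketch computes Cohen's kappa for two observers and maps it onto the categories listed above. It is a generic textbook calculation, not the SPSS procedure used in the study; the function names and the example ratings are hypothetical.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two observers rating the same cases."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("both observers must rate the same non-empty set of cases")
    # Proportion of cases on which the observers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each observer's marginal frequencies.
    labels = set(ratings_a) | set(ratings_b)
    expected = sum(
        (ratings_a.count(label) / n) * (ratings_b.count(label) / n)
        for label in labels
    )
    if expected == 1:
        return 1.0  # both observers used a single identical label throughout
    return (observed - expected) / (1 - expected)


def agreement_category(kappa):
    """Map kappa onto the categories used in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical presence (1) / absence (0) ratings of one finding in four cases.
observer_1 = [1, 1, 0, 0]
observer_2 = [1, 0, 0, 0]
kappa = cohen_kappa(observer_1, observer_2)
```

In this hypothetical example the observers agree on three of four cases (0.75), chance agreement is 0.5, and kappa is therefore 0.5, which falls in the "moderate" band.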
Univariate Cox’s proportional hazard models were used to determine the ability of each variable to predict mortality. Additionally, the stepwise multivariate Cox’s proportional hazards model was used for variables found to be significant (p < 0.05) in the univariate model in order to identify more significant variables.
To analyze the changes in the HRCT fibrosis score (ΔHRCT fibrosis score) from baseline to follow-up (after six or 12 months) as a predictor of disease progression (as stated previously) within one year, we used receiver operating characteristic (ROC) curves and the corresponding area under the curve. The cut-off value for the test was selected based on an analysis of the tabular ROC curve data in order to obtain the best possible sensitivity and specificity. Furthermore, the cut-off value was used to investigate whether the presence of an increase, as determined using the cut-off value, after six months or 12 months was related to the overall survival.
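One common way to choose a cut-off from tabulated ROC data is to maximize Youden's index (sensitivity + specificity − 1). The text does not state which criterion was applied, so the sketch below uses Youden's index purely as an assumption; the function name and the example values are hypothetical and are not study data.

```python
def best_cutoff(deltas, progressed):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index.

    `deltas` are changes in the HRCT fibrosis score; `progressed` flags
    whether each patient met the disease-progression definition.
    A patient is called test-positive when delta >= cutoff.
    """
    n_pos = sum(1 for p in progressed if p)
    n_neg = len(progressed) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one progressor and one non-progressor")
    best = None
    for cutoff in sorted(set(deltas)):
        true_pos = sum(1 for d, p in zip(deltas, progressed) if d >= cutoff and p)
        false_pos = sum(1 for d, p in zip(deltas, progressed) if d >= cutoff and not p)
        sensitivity = true_pos / n_pos
        specificity = 1 - false_pos / n_neg
        youden = sensitivity + specificity - 1
        if best is None or youden > best[0]:
            best = (youden, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical score changes and progression flags.
example_deltas = [2, 5, 8, 12, 15, 20]
example_progressed = [False, False, False, True, True, True]
example_result = best_cutoff(example_deltas, example_progressed)
```

With the perfectly separated hypothetical data above, the selected cut-off is 12 with a sensitivity and specificity of 1.0; real data would rarely separate this cleanly.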
The rates of overall survival were estimated using the Kaplan-Meier method and compared using the log-rank test. The patients were divided into groups based on %FVC at six months and the degree of disease progression according to the HRCT fibrosis score, and the overall survival was studied. All statistical analyses were performed using the Statistical Package for Social Sciences (SPSS, version 19). All tests were performed at a significance level of p < 0.05.
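The Kaplan-Meier estimate mentioned above can be sketched in a few lines. The study used SPSS, so this is only a generic illustration of the product-limit calculation; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    `times` are follow-up times (for example, months) and `events` are 1 for
    death and 0 for censoring. Returns a list of (time, survival probability)
    pairs, one for each time at which a death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical cohort of five patients (months of follow-up, death flag).
example_curve = kaplan_meier([6, 12, 12, 18, 24], [1, 0, 1, 1, 0])
```

For this hypothetical cohort the estimated survival is 0.8 at six months, about 0.6 at 12 months and about 0.3 at 18 months; comparing two such curves would additionally require the log-rank test, which is not shown here.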
Discussion
In the present study, we first clarified that changes over time in HRCT findings in combination with the %FVC predict the prognosis of patients with IPF. In addition, our data demonstrated that the new HRCT scoring system is helpful for identifying patients with an adverse prognosis when used in combination with pulmonary function examinations. These results indicate that the new HRCT scoring system is an appropriate and subjective method for monitoring IPF patients.
In the present study, the %FVC was found to be a baseline factor predicting the prognosis of patients with IPF. In addition, the patients with a decline of ≥ 10% in the absolute value of FVC six months after the initial diagnosis (the “worsened” status group) had a poor prognosis in this study. Furthermore, these patients demonstrated a tendency toward further declines in the %FVC over the subsequent six months and to have a worse prognosis. On the other hand, the vast majority of the patients in the present study (93.9%) exhibited an “improved” or “stable” status six months after the initial diagnosis and tended to not show any changes in the %FVC over the following six months. Changes in the %FVC are an indicator of disease progression as a surrogate endpoint for overall survival [15‐17]; therefore, pulmonary function examinations are commonly used to monitor patients with IPF. However, the clinical course of IPF varies widely [18,19], and some patients may exhibit a sudden decrease in their pulmonary function [20]. The variety in the clinical course of patients with IPF makes it difficult to identify those with a poor prognosis based on the results of pulmonary function tests alone, and it is therefore necessary to establish a proper monitoring method enabling clinicians to identify the patients most likely to have a poor prognosis from a different perspective.
Previous reports regarding the relationship between HRCT findings and survival rates in IPF patients have been published [21‐23]. The CT visual score [24] and fibrotic score [2,25] determined using various software tools [26] to evaluate findings of reticular abnormalities and honeycombing are useful for assessing the prognosis of IPF. Furthermore, some reports [27,28] have evaluated changes in HRCT findings in patients with IPF over time using a scoring system. In the present study, we used the new HRCT scoring system-based grading scale. The new HRCT scoring system was designed to reflect all findings of the UIP pattern described in the 2011 IPF guidelines. The grading scale reflects the progression of pulmonary fibrosis. Additionally, we graded the HRCT findings in order of priority from honeycombing to traction bronchiectasis, reticular abnormalities and normal attenuation. Furthermore, we set the monitoring period at six-month intervals in order to accurately investigate the appropriate duration for monitoring disease progression in patients with IPF.
The HRCT fibrosis score on the initial diagnosis was not found to be a factor predicting the prognosis in this study, although the extent of changes in the HRCT fibrosis score within at least the first six months after the initial diagnosis did reflect the prognosis, and acceleration in the findings of pulmonary fibrosis on radiology had an impact on the final outcome. In terms of the extent of acceleration of fibrosis, the number of fibroblastic foci has previously been shown to have an impact on the prognosis in pathological studies [5,29,30] and is an important factor associated with the clinical state of IPF. Additionally, the combined evaluation of radiological assessment within the first six months after the initial diagnosis with a change in %FVC facilitated the extraction of patients with a poor prognosis, who were otherwise hidden within the group with stable %FVC values. In other words, among the patients with IPF diagnosed using only HRCT, the %FVC was found to be a baseline factor indicating the prognosis, and pulmonary function examinations (as a monitoring method) in combination with the assessment of HRCT findings were useful for inferring the detailed prognosis.
From the standpoint of early detection of concomitant lung cancer [31,32], it is important to avoid overlooking small lung cancer lesions, even on routine HRCT (instead of incremental CT). Furthermore, in line with the 2011 revisions to the guidelines, it is anticipated that diagnosing IPF using only HRCT will become more common in the future, as this modality also facilitates the determination of responsiveness to treatment [33], and it is believed that HRCT examinations will continue to play a pivotal role in the diagnosis and management of IPF.
There are several limitations associated with this study. First, the study design was retrospective, and the treatments given to the patients were not identical, which may have influenced the assessments of the prognosis. Second, only 19.1% of the patients were allocated to the group in which a decrease of ≥ 10% in the absolute value of FVC was observed 12 months after the initial diagnosis (the “worsened” status group). This proportion is considerably smaller than that observed in previous studies, including 36.9% in a Japanese study of pirfenidone [34] and 41.9% in a study of etanercept [35]. The fact that all patients who died within one year of the initial diagnosis were excluded from the present study may be a further limitation. Third, the diagnosis of IPF was restricted to patients in whom the UIP pattern was diagnosed using HRCT and did not include those with a possible UIP pattern. In other words, IPF patients requiring pathological consideration were not included, and the study therefore does not reflect the entity of IPF as a whole. Fourth, this work was carried out jointly across multiple facilities but was restricted to Japanese IPF patients. For this reason, potential biases such as racial selection bias must be considered when interpreting the results. Finally, this study was primarily based on HRCT findings obtained using a visual score, with good interobserver agreement for the assessment of the presence/absence of each HRCT finding. However, among the various radiological factors, honeycombing has previously been reported [36] to have an insufficient rate of concordance, and radiological changes in this feature over a one-year period are small; thus, it is necessary to pay attention to inaccuracies in the manual scoring system used for the radiological CT examinations.
Acknowledgements
This research was partly supported by a grant to the Diffuse Lung Diseases Research Group from the Ministry of Health, Labour and Welfare, Japan and was a Ministry of Education, Science, Sports and Culture Grant-in-Aid for Scientific Research (B), 2013–2014 (25860665, Keishi Oda).
The authors thank all test personnel for their work during the data collection at the four institutions involved in the study: University of Occupational and Environmental Health, Japan, Steel Memorial Yawata Hospital, Kirigaoka Tsuda Hospital and Kyushu Rosai Hospital. The authors also thank Drs. Takeshi Johkoh, Chiharu Yoshii, Yukiko Kawanami, Koji Kamada, Shingo Noguchi, Naoyuki Inoue, Kentarou Akata, Kanako Hara, Kaori Kato and Tetsuya Hanaka for their cooperation in this research.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
KO, HI and KYat made substantial contributions to the conception and design of the study. KO, TO, TI and TT acquired the data. TO, KN and HN analyzed and interpreted the data. KO, HI, KYat, KYam, TK and HM participated in drafting the article and critically revising it for important intellectual content. All authors have read and approved the final manuscript.