Background
Idiopathic pulmonary fibrosis (IPF) is a chronic interstitial lung disease characterized by progressive scarring of the lung parenchyma, accompanied by a steady worsening of respiratory symptoms and a decline in pulmonary function, ultimately leading to death [1]. IPF is currently treated with systemic antifibrotic drugs, such as pirfenidone and nintedanib, which have been shown to delay the progressive decline of lung function and to reduce mortality [2]. However, neither pirfenidone nor nintedanib blocks or reverses the progression of IPF. Although therapeutic interventions targeting the immune response, inflammation, or oxidative stress have been attempted, none has proven successful, and no effective pharmacotherapy for IPF exists yet [3-5]. These unsatisfactory results in drug development might be partially ascribed to the complexity and heterogeneity of IPF. The active cellular components and mechanistic features of inflammation and fibrosis are mixed, and their levels differ depending on the disease status or anatomical lesions [1, 6, 7]. Further, IPF follows diverse clinical courses, from an asymptomatic stable state to gradually progressive respiratory failure or rapid deterioration of respiratory function through acute exacerbation [8].
Molecular subgroups exist within some fibrotic diseases in which different clinical phenotypes or outcomes are presented [9-12]. Gene expression profiling has provided insights into the pathogenesis of IPF [6, 8, 13, 14]. In previous pilot studies, molecular signatures from lung parenchyma proved helpful in predicting the likelihood of disease progression and therapeutic responsiveness [13, 15]. However, because all previous studies have been performed with cohorts of variable sizes and with different patient backgrounds, technical protocols, and technologies, direct comparisons between datasets and results are not feasible. A comprehensive integrated analysis of a compendium compiled using genome-wide datasets could reduce dataset bias, capture features missing from previous studies, and detect key factors driving the disease.
In this study, we compiled lung tissue transcriptome datasets from public data repositories to establish an IPF compendium and characterized the cellular and molecular features in detail. The samples were separated using data-driven, unsupervised clustering methods, and the clustered subgroups were subjected to prognostic profiling. Finally, we employed an integrative network-based approach and Bayesian inference to identify key drivers of the disease and evaluated the impact of current and investigational drugs in the context of the disease module.
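The data-driven separation of samples described above can be illustrated with a minimal sketch. It assumes k-means with two clusters as a stand-in; this section does not state which clustering algorithm the study used, and the function name `two_means`, the two-feature layout, and all toy values are hypothetical.

```python
import math


def two_means(points, max_iter=100):
    """Split samples into two clusters with Lloyd's k-means (k = 2).

    Assumption: k-means is only a stand-in for the study's unsupervised
    clustering method, which is not specified in this section.
    Initialisation is deterministic: the two samples farthest apart.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    # Use the most distant pair of samples as the initial centroids.
    pairs = [
        (dist(p, q), i, j)
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i < j
    ]
    _, i0, j0 = max(pairs)
    centroids = [list(points[i0]), list(points[j0])]

    labels = None
    for _ in range(max_iter):
        # Assign each sample to its nearest centroid.
        new_labels = [
            0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels


# Hypothetical toy samples: two made-up activity scores per sample.
toy_samples = [
    [0.9, 0.1], [1.0, 0.2], [0.8, 0.0],
    [0.1, 0.9], [0.2, 1.0], [0.0, 0.8],
]
toy_labels = two_means(toy_samples)
```

In practice the input would be a normalized, batch-corrected expression matrix with thousands of genes per sample, and the number of clusters would be chosen from the data rather than fixed in advance.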
Discussion
In the present study, we built a comprehensive transcriptomic compendium of IPF lung tissue and performed an integrative analysis to better understand the relationship between cellular and molecular expression patterns and clinical parameters. An unsupervised cluster analysis of the IPF transcriptomic profiles yielded two subgroups with different cellular and pathologic activities and prognostic risk profiles. Finally, using network-based Bayesian inference, we identified key driver genes (KDGs) and molecules that may serve as promising targets for therapeutic intervention.
Biological functions in living organisms are orchestrated by the cooperative interactions of genes, proteins, and chemical compounds. Likewise, a complex disease is rarely the consequence of an abnormality in a single gene but rather results from the aberrant activation of pathways or disease modules by dysregulated genes and their linked neighbors. We identified the p53 signaling pathway as the most significantly dysregulated pathway in IPF, and its enrichment score correlated closely with cellular senescence and apoptosis in the lung tissue of IPF patients. This finding aligns with the previously reported upregulation of p53 and activation of the p53 signaling pathway in response to the proliferation and hyperactivity of alveolar epithelial cells (AECs), leading to AEC apoptosis and senescence, the pathologic hallmarks of IPF lungs [1, 54, 55]. IL-17 has also been localized to active areas of IPF, and the profibrotic roles of IL-17 in pre-clinical models are well documented [56]. Given that the p53 and IL-17 signaling pathways were closely correlated with FVC and DLCO, p53 and IL-17 signaling pathway activity could be an intriguing biomarker bridging the mechanistic features and clinical condition in IPF.
We identified two novel molecular subtypes of IPF using an unsupervised clustering method. The C1 subgroup was more enriched in the p53 signaling pathway and the ciliated cell signature, indicating further transformation toward fibrotic structural change, that is, the honeycombing state. Histologically, honeycomb cysts are lined with ciliated cells that express a variety of epithelial markers [57], and this result is consistent with previous results [13, 48]. Interestingly, C1 highly expressed MMP7, MUC16, and SPP1, which were powerful predictors of IPF progression [15, 50]. This underscores a close relationship between the p53 signaling pathway, AEC senescence, and progressive disease in IPF, suggesting the need for a stratified approach to patient management based on the molecular signature of lung tissue. Cellular senescence, oxidative stress, and profibrotic signaling pathways, including the FoxO, PI3K-Akt, TGFβ, and Wnt signaling pathways, were more enriched in the C2 subgroup, and molecular signatures of AECs were also stronger, indicating that the C2 subgroup is less advanced and undergoing active fibrosis. If the fibrotic processes are effectively controlled, the C2 subgroup is presumed to be partially reversible.
Current IPF therapies cannot effectively modify the disease’s clinical course, and their efficacy is inconsistent, although the anti-fibrotic drugs pirfenidone and nintedanib have demonstrated the ability to significantly slow respiratory deterioration in some IPF patients [1, 2]. Their limited and heterogeneous efficacy might be partly ascribed to a failure to optimally target pathways that will disrupt the IPF disease module. Therefore, we constructed differentially expressed and probabilistic causal gene networks to model molecular interactions and causal gene relationships and applied a Bayesian network-based analysis to identify key drivers of the IPF disease module.
CDK1, CDK2, CDKN1A, CDKN2A, and MDM2 were identified as KDGs involved in both the p53 signaling pathway and cellular senescence. To better understand the KDGs in the disease module network, we calculated their diffusion scores. Fibronectin (FN1) is responsible for mediating cell–matrix adhesion and is essential in driving myofibroblast differentiation. Inhibition of FN1 deposition attenuated fibrosis in hepatic and cardiac fibrosis models [58, 59]. FN1 was highly ranked in our analysis but seemed to be of low druggability because it is an end product of multiple fibrosing pathways that is deposited in the extracellular matrix. In contrast, CDKs and HDACs are attractive targets because they are high-priority drivers, and drugs targeting these genes are currently in use or in clinical trials as anti-cancer agents. The unbalanced proliferation and profibrotic activity of AEC2 and fibroblasts is a key initial event in the pathogenesis of IPF [1]. CDKN1A, also known as p21, is a physiologic CDK antagonist under the control of p53 and was also identified as a KDG. In the bleomycin-induced pulmonary fibrosis model, the forced expression of p21 exerted both anti-apoptotic and anti-fibrotic effects [55, 60]. HDAC inhibitors are known to cause cell-cycle arrest by inducing CDKN1A or inhibiting CDKs and effectively suppress profibrotic fibroblast phenotypes in IPF, notably outperforming pirfenidone [61]. These results could be more applicable to the high-risk C1 subgroup than to the C2 subgroup.
To evaluate the use of current drug therapies in the disease module, we compared the diffusion scores of their target molecules. N-acetylcysteine, imatinib, and nintedanib ranked higher than other drugs and were assumed to be more effective owing to their multiple targets. The performance of pirfenidone was likely underestimated because its exact targets and mechanisms are not clearly defined. Notably, the investigational drug UCN-01 (7-hydroxystaurosporine) showed good diffusion scores for its targets. UCN-01 targets CDK1, CDK2, and CHEK1, the main components of the p53 signaling pathway and cellular senescence, and reactivates FoxO3 to control inappropriate proliferation and differentiation [52, 53]. In particular, UCN-01 showed great promise in pre-clinical IPF models by reverting the IPF myofibroblast phenotype in vitro and blocking bleomycin-induced lung fibrosis in vivo [53].
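The idea of comparing drugs by the diffusion scores of their targets can be sketched as follows. This is a minimal illustration under stated assumptions: the diffusion is modeled as a random walk with restart, which this section does not confirm as the study's method, and the network, seed nodes, drug names, and target lists are all hypothetical placeholders rather than data from the study.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Diffuse probability mass from seed nodes over an undirected network.

    Assumption: random walk with restart stands in for the diffusion
    scoring used in the study. `adj` maps each node to its neighbours;
    every node in this sketch is expected to have at least one neighbour.
    """
    nodes = sorted(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(start)
    for _ in range(max_iter):
        # Restart term: part of the mass jumps back to the seeds.
        nxt = {n: restart * start[n] for n in nodes}
        # Walk term: the rest is split evenly among each node's neighbours.
        for u in nodes:
            share = (1.0 - restart) * score[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - score[n]) for n in nodes)
        score = nxt
        if delta < tol:
            break
    return score


def drug_score(scores, targets):
    """Average the diffusion scores of one drug's targets."""
    return sum(scores[t] for t in targets) / len(targets)


# Hypothetical toy network: a chain g1-g2-g3-g4-g5 plus one extra edge g2-g4.
toy_adj = {
    "g1": ["g2"],
    "g2": ["g1", "g3", "g4"],
    "g3": ["g2", "g4"],
    "g4": ["g2", "g3", "g5"],
    "g5": ["g4"],
}
toy_seeds = {"g2", "g3"}  # stand-ins for disease-module genes
toy_scores = random_walk_with_restart(toy_adj, toy_seeds)

# Hypothetical drugs and target lists.
toy_drugs = {"drug_X": ["g2", "g3"], "drug_Y": ["g5"]}
ranking = sorted(
    toy_drugs, key=lambda d: drug_score(toy_scores, toy_drugs[d]), reverse=True
)
```

Here `drug_X`, whose targets sit inside the seeded module, ranks above `drug_Y`, whose single target lies at the periphery. The mean is one possible way to combine target scores; a sum would favor drugs with many targets, so the choice of aggregation affects how multi-target drugs compare.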
This study had several limitations. First, the combination of multiple datasets inevitably caused the loss of genes that overlapped only among some datasets, and the correction of the batch effect was not ideal. Second, we did not address the association with clinical factors, such as radiographic pattern or fibrosis score, due to the lack of this information. Third, minority signatures of specific cell subsets might have been diluted because the gene expression signature was at the bulk tissue level. Fourth, the datasets did not provide detailed background medications for individual patients. According to surgical lung biopsy protocols [62, 63], the fresh lung area that best represents the disease should be biopsied for reliable results. Lung tissue undergoing active fibrosis reflects the current pathologic status rather than the response to treatment. Although some molecular signatures could be susceptible to the effects of current or past treatments and therefore potentially biased, we consider it unlikely that such effects would be large enough to overturn our finding of the overwhelming significance of the p53 signaling pathway.
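The first limitation above, the loss of genes present in only some datasets, can be made concrete with a small sketch. It assumes the compendium keeps only genes measured in every dataset; the function name `shared_genes`, the dataset names, and the gene lists are hypothetical and chosen only for illustration.

```python
def shared_genes(datasets):
    """Return genes present in every dataset and per-dataset loss counts.

    Assumption: merging keeps only the intersection of gene sets, so any
    gene missing from even one dataset is dropped from the compendium.
    """
    gene_sets = [set(genes) for genes in datasets.values()]
    common = set.intersection(*gene_sets)
    # Count how many genes each dataset loses when restricted to the overlap.
    lost = {name: len(set(genes) - common) for name, genes in datasets.items()}
    return sorted(common), lost


# Hypothetical gene lists for three made-up datasets.
toy_datasets = {
    "dataset_A": ["MMP7", "SPP1", "CDK1", "MUC16"],
    "dataset_B": ["MMP7", "SPP1", "CDK1", "FN1"],
    "dataset_C": ["MMP7", "CDK1", "FN1", "MUC16"],
}
common, lost = shared_genes(toy_datasets)
```

In this toy case only two of the five distinct genes survive the merge, even though each of the others is measured in two of the three datasets.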