Introduction
Chronic obstructive pulmonary disease (COPD) is a progressive lung disease characterized by chronic inflammation, airway obstruction, and destruction of the parenchyma. It is the fourth leading cause of death globally, and is projected to be the third leading cause of mortality by 2030 [20]. Current therapies for patients with advanced COPD mainly treat symptoms such as chronic cough and excessive sputum production, and aim to prevent disease progression. However, 46–91% of adults still suffer from persistent and disabling breathlessness at rest and on minimal exertion [25]. To date, no therapy has been developed that slows disease progression or lowers mortality rates. Therefore, additional approaches for the accurate diagnosis of advanced COPD are urgently needed.
In 2011, the Society for Qualification of Biomarkers for COPD was established to accelerate research and development of biomarkers. To date, however, only a handful of biomarkers associated with COPD have been discovered [21]. Integrated multi-omics data analysis can provide insights into the pathological mechanisms of COPD. Analysis of the proteome allows the study of disease-related mechanisms and diagnostic biomarkers that reflect the disease phenotype [21]. Compared to traditional proteomic techniques, tandem mass tag labeling coupled with liquid chromatography–tandem mass spectrometry (TMT-LC–MS/MS) is a more comprehensive and efficient method for capturing and quantifying proteins, with a smaller sample requirement and without offset. In addition, the metabolome, defined as the total collection of small-molecule metabolites present in a given type of cell or organism, is the final downstream product of metabolism. In particular, it closely reflects the current metabolic status of the organism. To date, some progress has been made in the fields of functional proteomics and metabolomics. For example, researchers have applied proteomic approaches to identify novel biomarkers, such as plasma sRAGE for detecting the presence and progression of emphysema [33], whereas others have adopted metabolomic approaches to identify potential disease severity markers or therapeutic candidates such as purines [8], sphingolipids [2], and glycerol phospholipids [4]. However, no discovery-based approach has yet resulted in validated clinical biomarkers. Although findings from these omics-centric studies have added to the existing knowledge base, several gaps remain to be filled. We hypothesize that integrating contemporary proteomic and metabolomic approaches can effectively evaluate metabolic pathways and diagnostic biomarkers in advanced COPD. Moreover, most previous multi-omics studies have focused on patients from European, American, and African populations [24]. Therefore, it is important to systematically analyze the metabolic and proteomic profiles of Chinese patient cohorts to generate new insights for this region.
Patients with COPD are often predisposed to various co-morbidities, such as cardiovascular disease, metabolic syndrome, and bronchiectasis [15, 20, 22]. Additionally, smoking is a risk factor for such co-morbidities, with previous evidence showing that some smokers develop a predominantly emphysematous phenotype, characterized by alveolar damage, while others develop predominantly airway disease. Evidence from other studies has shown that proteases, inflammation, oxidative stress, immune defects, and infections play a role in the development and progression of COPD [25]. Since COPD is a heterogeneous disease, grading its severity and identifying phenotypes (i.e., subpopulations of subjects with similar disease characteristics) according to concomitant diseases can expand our understanding of the biological mechanisms underlying the development and progression of the disease. This will facilitate accurate diagnosis. In particular, lowering mortality rates in patients with advanced COPD relies on early and accurate diagnosis and on the differentiation of subtypes using simple and objective diagnostic assessments. The heterogeneity of COPD also exists at the molecular level, and thus molecular sub-phenotyping is the first and crucial step in the identification and classification of these subgroups. Previous studies have shown that omics approaches, based on appropriate sample sizes, can not only efficiently reveal the heterogeneity of these subtypes but also facilitate diagnosis and reveal the mechanisms underlying COPD subgroups [22, 24]. Proteomic techniques based on mass spectrometry have proven powerful for detecting disease phenotypes.
In this study, we hypothesized that changes in the proteomic and metabolic profiles of patients with stable COPD would produce a unique pattern of molecules compared to those without COPD, and that these molecular profiles would change with disease complications. Therefore, we first performed quantitative shotgun proteomic analyses to draw a molecular portrait of COPD-related proteins and to reveal COPD-related functional modulation. Next, we applied a targeted proteomics approach to validate selected dysregulated proteins in an independent sample set. In addition, untargeted metabolomics was performed on the same participants as the proteomic analysis. Our findings not only reveal the profiles of COPD biomarkers and molecular subtypes, but also provide data that will guide future studies seeking to develop tools for clinical application.
Discussion
Globally, COPD kills more than 3 million people every year. Although several advances have been achieved in the symptomatic treatment and prevention of acute clinical cases, there are few interventions that ameliorate disease progression or decrease mortality. Therefore, it is important to identify biomarkers that can predict disease occurrence or aid in the diagnosis of advanced COPD. This will facilitate early intervention and prevent progression. In this study, we found that a combination of theophylline, palmitoylethanolamide, hypoxanthine, and CDH5 provides high diagnostic accuracy. Proteomics facilitates the differentiation of COPD from COPD with co-morbidities. We also found that basophil count could effectively distinguish COPD from COPD-BE or COPD-MD. Moreover, hypoxanthine levels still differed significantly between patients with mild-to-moderate COPD and controls.
In clinical practice, plasma or serum is the most widely used specimen for biomarker discovery because proteins and metabolites in the circulatory system likely reflect disease pathophysiology. Our dataset can be used to identify potential predictive biomarkers of advanced COPD. Theophylline and three other methylxanthine derivatives (aminophylline, etophylline, and caffeine) are the first four compounds to have been approved for use in clinical practice [12]. Among these bronchodilators, theophylline is the most effective and is widely used for the treatment of asthma and COPD. Evidence shows that corticosteroids and theophylline, both in low doses, have synergistic and clinically useful anti-inflammatory effects in COPD [26]. Mechanistically, theophylline appears to act by increasing the activity of the nuclear enzyme histone deacetylase-2 (HDAC2); HDAC2 activity is decreased in COPD, which prevents the anti-inflammatory effect of corticosteroids [1]. Low-dose theophylline, particularly at doses below those that lead to bronchodilation, can reverse corticosteroid insensitivity in COPD [9, 26]. Another study has demonstrated an effect of low-dose theophylline on the forced expiratory volume in one second (FEV1) as well as on exacerbations [37]. The metabolic disposition of theophylline in humans was first reported by Brodie et al. [3]. Following a therapeutic dose, only 85% has been accounted for by measurement of known metabolites and of unchanged drug excreted in urine, and about 10% of the theophylline administered to humans appears in urine in unchanged form. Therapeutic administration is therefore likely to be one of the main sources of theophylline in the body, and the main reason for the differences between patients and controls. It may also explain why there were no differences in theophylline between mild-to-moderate COPD and controls in the present study. In addition, theophylline, as a methylxanthine, is both a natural and a synthetic compound and is found in tea; most of it is metabolized by certain types of bacteria and fungi, some of which exist in the human blood circulation [35]. However, information on tea drinking was lacking in this study. This needs to be investigated in the future.
Hypoxanthine is a product of ATP degradation, and its conversion to uric acid is facilitated by the enzyme xanthine oxidase, generating free oxygen radicals [5]. It is a metabolite involved in purine biosynthesis and nucleotide metabolism, and often serves as a biomarker. For instance, hypoxanthine is a potential marker for oxidative stress in cystic fibrosis [31]; a combination of eight metabolites, including uric acid, stearic acid, threitol, acetylgalactosamine, heptadecanoic acid, aspartic acid, xanthosine, and hypoxanthine, was found to accurately diagnose asthma while discriminating between healthy controls and asthma subgroups. In preschool children with cystic fibrosis, hypoxanthine concentrations were found to be elevated in BALF from lobes of the lung containing localized bronchiectasis and were correlated with neutrophil counts and important clinical outcomes [7, 32]. Elevated hypoxanthine concentrations in various body fluids result from hypoxia in vital tissues. Higher hypoxanthine levels have also been demonstrated in mild-to-moderate COPD, which might indicate that tissue hypoxia is already present at an early stage of COPD. Suppressed serum hypoxanthine levels have been reported in lung cancer [14] and cystic fibrosis lung disease [17]. Increased conversion to uric acid during exacerbation may reduce the concentration of hypoxanthine while generating superoxide and hydroxyl radicals, which cause cellular damage. However, this phenomenon needs to be investigated in COPD.
Cadherin 5 (CDH5), also known as vascular endothelial cadherin, is an endothelial-specific cell–cell adhesion molecule that plays important roles in the formation, maturation, and remodeling of the vascular wall [10]. RAB26 is a newly identified small GTPase involved in the regulation of endothelial cell (EC) permeability [6]. It confers protective effects on EC permeability, which depend in part on autophagic targeting of active SRC, and the resultant CDH5 dephosphorylation maintains adherens junction stabilization. During inflammation, CDH5 phosphorylation at tyrosine residues induces opening of endothelial adherens junctions [30]. Post-translational modifications of CDH5 at tyrosine residues are involved in vascular permeability and leukocyte transmigration. Moreover, cell surface CDH5 phosphorylation is directly linked to EC barrier integrity. These results suggest that any change in CDH5 will impact endothelial barrier functions at multiple levels and that CDH5 inhibition may lead to a marked increase in permeability [11, 27]. Enhanced permeability is an early step in the angiogenic process, enabling endothelial migration out of the primary vessel in order to form the tumor neovasculature in the next step [18]. Moreover, induction of CDH5 during epithelial-mesenchymal transition accentuates breast cancer progression via TGF-β signaling, indicating that in certain tumor cells, CDH5 can induce cellular responses that counteract its role in contact inhibition of cell growth in ECs [16]. CDH5 therefore plays a dual role in angiogenesis and cancer progression. Smoking, a key risk factor for COPD development, causes hypoxia, an important driver of angiogenesis, which in turn participates in the pathogenesis of COPD.
COPD is a heterogeneous condition that presents the opportunity for precision therapy based on more precise disease subtypes. Subtype-directed therapies, such as inhaled corticosteroids for patients with frequent exacerbations, have had only moderate success. This is likely due to imprecise phenotype categorization, the limited number of drugs for treating COPD, and the generally modest effects of most of these drugs. It is, therefore, crucial to provide precise therapies for patients with specific COPD subtypes based on specific biomarkers. Since comorbidities have a tremendous impact on the prognosis and severity of COPD, the 2015 American Thoracic Society/European Respiratory Society (ATS/ERS) Research Statement on COPD urgently called for studies to elucidate the pathological mechanisms involved in the association between COPD and its comorbidities. Because comorbidities also influence the clinical outcomes of COPD, identifying the mechanisms linking COPD to its comorbidities is key to developing effective therapies. Presently, it has not been established whether BE or MD is an independent co-existing condition or a direct consequence of progressive lung pathology in COPD patients. In this study, we developed a pipeline for proteomics-dominated subtyping of COPD, which complements subtyping approaches based on clinical or imaging data [23, 29], as well as omics-based clustering in Chinese patients. In particular, based on the proteomics results, COPD patients were grouped into three clusters according to prominent molecular features: simplex COPD, COPD-BE, and COPD-MD. To further differentiate the disease subtypes, we found that COPD-MD is highly involved in complement and coagulation cascade processes and is enriched in various proteins, including HP, LBP, SERPINA1, SERPINA3, SAA1, ORM1, ORM2, and CRP. COPD-BE also participates in complement and coagulation cascade processes, and is enriched in various pathways, including metabolic pathways, biosynthesis of antibiotics, carbon metabolism, biosynthesis of amino acids, and glycolysis/gluconeogenesis. Moreover, SOD1, PRDX2, CAT, PRDX6, HBB, GSTO1, and HBA1 were highly expressed in COPD-MD. Since advanced COPD involves unique metabolic pathways and typically features protein isoforms that may have special functions, proteomic approaches to the study of metabolic pathways are especially important.
This study has some limitations. First, no follow-up investigation of the same participants was carried out. Further multi-center and longitudinal studies are needed to assess the prediction performance of the identified biomarkers in advanced COPD. Second, this was a retrospective study; therefore, laboratory tests might be underreported in medical records, making it difficult to explore their effects on outcomes. Moreover, information on medication, disease control status, and disease phenotypes before admission was incomplete. The impact of these factors on disease expression should be further evaluated. Third, the study population was relatively small. Thus, large prospective studies should be performed to validate the present findings. Finally, although traditional methods, such as the logistic regression used in this study, are often used to establish prediction models, it has been suggested that artificial intelligence (AI)-based machine learning (ML) approaches may be more accurate than traditional logistic regression. This is because AI-based ML can overcome many of the disadvantages of conventional statistical approaches used for the analysis of high-volume next-generation sequencing data. For instance, ML does not require full details of sequencing measurements and can extract features from sequences [28]. Therefore, ML approaches should be considered in further studies.