Background
Chronic obstructive pulmonary disease (COPD) is currently the third leading cause of death [1]. The disease is under genetic and environmental control, with cigarette smoking being the major modifiable risk factor in the Western world [2]. COPD is characterized by chronic irreversible airflow limitation that is often accompanied by systemic inflammation [3, 4]. The two main morphologic phenotypes of COPD are small airway obstruction and emphysematous destruction and enlargement of airspaces. While the molecular mechanisms underlying the two processes may be different, COPD is diagnosed and assessed using lung function parameters; the most commonly used are the forced expiratory volume in 1 s (FEV1) and its ratio with the forced vital capacity (FEV1/FVC).
There is a large unmet need to identify clinically useful biomarkers for COPD [5]. To this end, blood biomarkers would be highly desirable since blood is very accessible. However, the main limitation of blood as a source for biomarker discovery is that its signals may not reflect the disease process in the lungs, which are the predominant site of disease in COPD. Recently, a number of studies have evaluated the relationship of gene expression profiles in peripheral blood with COPD endpoints and have demonstrated some signal [6, 7]. One major limitation of using gene expression data for biomarker discovery is the statistical stringency required to declare an expression change significant when many genes are tested individually. Moreover, this traditional gene-by-gene approach is biologically unintuitive, since genes are expressed (and function) in clusters or networks rather than as independent entities.
To address this limitation, in this study, we used weighted gene co-expression network analysis (WGCNA) [8] to identify “modules” of co-expressed genes in peripheral blood of former smokers with COPD. We then used these modules to discover novel molecular pathways that are related to FEV1.
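The weighted network idea can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes the unsigned variant of the weighted adjacency, in which the absolute Pearson correlation between two genes is raised to a soft-thresholding power. The power `beta=6`, the gene names and the expression values are illustrative assumptions and do not come from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency a_ij = |cor(x_i, x_j)| ** beta.

    `expr` maps a gene name to its expression values across samples.
    The diagonal is set to 0 so a gene is not its own neighbour.
    `beta` is an assumed, illustrative soft-thresholding power.
    """
    genes = list(expr)
    adj = {g: {} for g in genes}
    for g in genes:
        for h in genes:
            if g == h:
                adj[g][h] = 0.0
            else:
                adj[g][h] = abs(pearson(expr[g], expr[h])) ** beta
    return adj


# Hypothetical toy data: four genes measured in six samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8, 12.1],  # tracks geneA closely
    "geneC": [6.0, 5.1, 3.9, 3.2, 1.8, 1.1],   # anti-correlated with geneA
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],   # largely unrelated to geneA
}
adjacency = soft_threshold_adjacency(toy_expr)
```

Raising the correlation to a power keeps strong co-expression (geneA with geneB, and in this unsigned form also geneA with geneC) while pushing weak correlations (geneA with geneD) towards zero. In WGCNA, modules are then typically derived by clustering genes on a dissimilarity built from such an adjacency; that step is omitted here.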
Discussion
COPD is an inflammatory lung disease, which has a significant systemic component that contributes to its overall morbidity and mortality. Because inflammation is thought to play a central role in the pathogenesis of COPD, there has been a tremendous surge of interest in studying circulating immune and inflammatory cells as potential biomarkers for the disease. There is a pressing need to identify genomic signatures of disease severity and activity that can guide therapeutic decisions and address the growing burden of COPD worldwide. In this study, we used modules of co-expressed genes in a highly accessible tissue, peripheral blood, to identify genomic signatures of COPD severity using FEV1 as the readout.
The main findings of the present study were that: 1) at the gene level, only one gene was associated with FEV1 (FDR < 0.1); 2) the 18,892 genes expressed in peripheral blood mapped to 17 modules of co-expressed genes; 3) three of the modules were associated with FEV1; 4) in a second and larger cohort of current and former smokers with COPD and controls, all of the modules were preserved at the co-expression level; 5) the three modules in the discovery cohort that were statistically associated with FEV1 showed the strongest associations with FEV1 in the replication cohort (P < 0.05); and 6) the two modules that were negatively related to FEV1 were enriched in IL10 and IL8 pathways and were strongly correlated with neutrophil-specific expression, while the positively related module was enriched in DNA transcription pathways and strongly correlated with T cell-specific expression.
Previous studies investigating differential expression in COPD have mainly tested genes and probesets individually; however, in vivo, genes are co-expressed in networks. By leveraging co-expression patterns, networks of closely co-expressed genes can be identified, often revealing novel functional pathways. The resulting network modules can then be tested for association with FEV1. Another major advantage of network analyses is that this approach can significantly decrease false negatives (Type II error) by markedly reducing the number of features that are tested. In the present study, the three modules reproducibly associated with FEV1 were enriched in biological pathways, suggesting that co-expressed genes within a particular module share biological functions.
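The effect of reducing the number of tested features can be made concrete with a short sketch. It assumes that the false discovery rate is controlled with the Benjamini-Hochberg procedure, which this section does not specify; the function names are hypothetical. The feature counts (18,892 genes, 17 modules) and the FDR level of 0.1 are taken from the text above.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def rank_one_threshold(n_tests, fdr=0.1):
    """Raw p-value cutoff that Benjamini-Hochberg applies to the smallest p-value."""
    return fdr / n_tests


gene_level = rank_one_threshold(18892)  # testing every gene individually
module_level = rank_one_threshold(17)   # testing module summaries instead

# Small illustrative input (made-up p-values, not study results).
example_adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

Under these assumptions, the smallest p-value must fall below roughly 5.3e-6 when 18,892 genes are tested, but only below roughly 5.9e-3 when 17 modules are tested, which is why a module-level analysis can detect associations that a gene-by-gene analysis misses.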
In each of the co-expressed networks, driver or “hub” genes can be identified, which can additionally inform the biology of these modules as they relate to FEV1. The top hub gene for the yellow module was DOCK5, which is a member of the DOCK family of guanine-nucleotide exchange factors that activate Rho-family GTPases by exchanging bound GDP for free guanosine triphosphate (GTP) [23]. DOCK5 has been shown to interact with the regulatory and catalytic subunits of protein phosphatase 2, encoded by PPP2R1A/B/C [24]. In mice, protein phosphatase 2A has been shown to regulate innate immune and proteolytic responses to cigarette smoke exposure in the lung [25]. The top hub gene for the green module was GAB2, which was negatively correlated with FEV1. GAB2 is a member of the growth factor receptor-bound protein 2 (GRB2) associated binding protein (GAB) gene family, which acts as an adapter molecule in signal transduction of cytokine and growth factor receptors, and T and B cell antigen receptors [26]. GAB2 is the principal activator of phosphatidylinositol-3 kinase in response to activation of the high affinity IgE receptor [27]. In a previous study, the expression of GAB2 in sputum was significantly increased in patients with severe emphysema compared to those who had minimal emphysema [28]. In the brown module, DDB1 and CUL4 associated factor 16 (DCAF16) and eukaryotic translation initiation factor 2 alpha kinase 3 (EIF2AK3) were the top two FEV1 hub genes. Little is known about DCAF16, and EIF2AK3 encodes a protein that functions as an endoplasmic reticulum stress sensor [29].
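One common way to rank candidate hub genes is by intramodular connectivity, the sum of a gene's weighted adjacencies to the other genes in its module. The sketch below illustrates that idea only; this section does not state which hub criterion the authors applied, and the gene names, expression values and `beta=6` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def intramodular_connectivity(module_expr, beta=6):
    """Sum of |cor| ** beta between each gene and the other module genes."""
    genes = list(module_expr)
    connectivity = {}
    for g in genes:
        connectivity[g] = sum(
            abs(pearson(module_expr[g], module_expr[h])) ** beta
            for h in genes
            if h != g
        )
    return connectivity


# Hypothetical module of four genes measured in five samples.
# gene2 to gene4 each follow gene1 with their own independent deviations.
toy_module = {
    "gene1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene2": [1.5, 1.5, 3.0, 4.5, 4.5],
    "gene3": [0.5, 2.5, 3.5, 3.5, 5.0],
    "gene4": [1.0, 2.5, 2.0, 4.5, 5.0],
}
connectivity = intramodular_connectivity(toy_module)
hub_gene = max(connectivity, key=connectivity.get)
```

In this toy module, gene1 has the highest connectivity because every other gene correlates more strongly with it than with any of the remaining genes, which is the pattern a hub gene is expected to show.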
Although the present study is one of the largest to date to have evaluated peripheral gene expression signatures in COPD [6], at the gene level, only one gene, butyrophilin subfamily 2 member A1 (BTN2A1), was significantly associated with FEV1. Butyrophilin has been shown to regulate immune function [30]. In contrast to the gene-by-gene comparison approach, the use of network-based modules identified a larger number of genes within the three significant modules that were related to FEV1, highlighting the value of network approaches in identifying gene signatures. Previous work on exacerbations in COPD demonstrated similar findings [31].
It is notable that adjustments for cell count had a large impact on the relationship between gene expression signatures and FEV1. This is not surprising given that peripheral whole blood is a heterogeneous tissue composed of many different immune cell subsets. Moreover, its cellular composition varies in response to physiological or pathological processes. These processes often involve cell differentiation and/or transit of specific cell types between blood and tissues, resulting in important shifts in the cellular makeup of samples under different conditions, which in turn affect blood-derived gene expression data. Disentangling causal from reactive relationships is challenging in observational studies. Although it is common practice to statistically adjust for peripheral blood cell composition by including CBC and differential cell counts as covariates, regression methods do not fully take into account cell-specific gene expression and thus may obscure important cell-specific signatures. To explore this possibility, in the present study, in addition to the standard regression analysis, we interrogated cell-specific gene expression in three external studies that contained cell-specific gene expression data generated using cell isolation methods. Using this approach, we found that the two modules that were negatively associated with FEV1 contained strong neutrophil-specific gene expression, suggesting that an increased number and/or activation of peripheral neutrophils is associated with airway obstruction. The role of neutrophils in the pathogenesis of COPD is well established [32, 33]. The module that was positively related to FEV1, on the other hand, contained gene expression signals that were T and B cell specific. Previous studies have highlighted the role of the adaptive immune response in COPD [34–37].
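Why covariate adjustment can remove a cell-driven signal may be easier to see in a toy example. The sketch below uses a first-order partial correlation, which is equivalent to correlating the residuals after regressing both variables on a single covariate. It is a one-covariate simplification, not the regression model used in this study, and all variable names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical values for eight subjects. Module expression and FEV1 are both
# constructed to depend on neutrophil fraction, plus unrelated noise.
neutrophil_fraction = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85]
module_expression = [5.3, 5.2, 5.7, 6.8, 7.3, 7.2, 7.7, 8.8]
fev1_percent_predicted = [73.0, 62.0, 63.0, 52.0, 47.0, 48.0, 37.0, 38.0]

unadjusted = pearson(module_expression, fev1_percent_predicted)
adjusted = partial_correlation(
    module_expression, fev1_percent_predicted, neutrophil_fraction
)
```

In this constructed example, the unadjusted correlation between module expression and FEV1 is strongly negative, whereas the adjusted correlation is close to zero, because the shared dependence on neutrophil fraction carries the whole association. If the change in neutrophil abundance is itself part of the disease process, such an adjustment removes a signal of interest rather than a confounder.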
The current study has a number of limitations. First, gene expression signatures in peripheral blood may not reflect the disease process in the lungs of COPD patients. However, peripheral blood is more accessible than lung tissue and may provide information on biological processes, such as immune responses, that may be relevant in COPD. Second, FEV1 may not fully capture disease activity in COPD and could reflect different pathological processes (emphysema or airway disease). Finally, the cell count adjustment had a large effect on the relationship between modules and FEV1. Given that changes in cell abundance can be causally related to changes in FEV1 and disease status [38, 39], and given the strong correlations with cell-specific expression in external datasets, the regression methods used for adjustment may have been overly conservative. Most published studies to date on peripheral blood in COPD do not adjust for cell count [6, 31, 40]. Future studies that incorporate differences in cell counts and/or measurement of cell-specific expression changes are warranted.
Acknowledgement
Ma’en Obeidat is a Postdoctoral Fellow of the Michael Smith Foundation for Health Research (MSFHR) and the Canadian Institutes of Health Research (CIHR) Integrated and Mentored Pulmonary and Cardiovascular Training program (IMPACT). He is also a recipient of a British Columbia Lung Association Research Grant.
Competing interests
BEM is an employee and shareholder of GSK.
SR has served as a consultant, participated in advisory boards, or received honoraria for speaking or grant support from: American Board of Internal Medicine, Advantage Healthcare, Almirall, American Thoracic Society, AstraZeneca, Baxter, Boehringer Ingelheim, Chiesi, ClearView Healthcare, Cleveland Clinic, Complete Medical Group, CSL, Daiichi Sankyo, Decision Resources, Forest, Gerson Lehrman, Grifols, GroupH, Guidepoint Global, Haymarket, Huron Consulting, Inthought, Johnson and Johnson, Methodist Health System – Dallas, NCI Consulting, Novartis, Pearl, Penn Technology, Pfizer, PlanningShop, PSL FirstWord, Qwessential, Takeda, Theron and WebMD. Since August 10, 2015 he has served as chief clinical scientist, new clinical development, AstraZeneca, UK.
DDS: Over the past 3 years, DDS has served as a consultant on AstraZeneca (AZ) and Novartis Advisory Boards for COPD. He has been a consultant with Amgen and Almirall. He has received research funding from AZ and Boehringer Ingelheim (BI). He has given lectures sponsored by BI and AZ.