Background
Crohn’s disease (CD) is a type of chronic and relapsing inflammatory bowel disease (IBD) that affects the gastrointestinal tract and is accompanied by extraintestinal manifestations and perianal diseases [1]. Although the etiology of CD remains unclear, a complex interplay between genetic variation, environmental factors, immune dysfunction, and intestinal microbiota is believed to underlie disease pathogenesis [2]. Unraveling the complexity behind this interplay may provide crucial insights into CD pathogenesis and expose potential targets for therapeutic interventions and disease prevention.
Oxidative stress (OS) is defined as an imbalance between oxidants and antioxidants in favor of the oxidants, leading to a disruption of redox signaling and control and/or molecular damage [3]. Multiple OS-related genes contribute to the complex multifactorial pathophysiology of CD [4, 5]. For example, genetic polymorphisms of NOS2A, the gene encoding inducible nitric oxide synthase, are associated with IBD susceptibility accompanied by increased gene expression, suggesting a vital role of genetic effects on OS genes in CD [6]. The nicotinamide adenine dinucleotide phosphate oxidase genes NOX1 and DUOX2 also play a key role in mediating reactive oxygen species (ROS) generation. Overexpression of these genes is involved in impaired intestinal barrier integrity, microbial dysbiosis, and bacterial invasion, highlighting the association between host OS signaling and gut microbiota in CD [7–10]. Moreover, DNA methylation (DNAm) modulates redox homeostasis by regulating the expression of NRF2, HIF1A, and related proteins [11–14]. However, few studies have addressed whether OS has a causative role in triggering CD or merely inflicts collateral tissue damage alongside intestinal inflammation. Studying the underlying disease mechanisms of OS-related genes may help identify potential pathogenetic factors and redox-related therapeutic targets for IBD [15].
Although a growing number of studies have suggested relevant OS genes in CD, no study has comprehensively and systematically identified their potential causal association with this disease. Genome-wide association studies (GWASs) have been employed to identify genomic loci containing OS genes associated with CD [16, 17]. However, the top associated variants may not be causal because of the complicated linkage disequilibrium (LD) structure of genomes [18, 19]. Moreover, these genetic variants can potentially regulate DNAm, gene expression, protein levels, and the abundance of gut microbiota [20, 21]. Integration of multi-omics data is an emerging approach in the post-GWAS era to identify critical regulators and explore therapeutic targets in CD [22]. For example, summary data-based Mendelian randomization (SMR), which integrates IBD GWAS data with expression quantitative trait loci (eQTLs), has been developed to prioritize causal variants mediated by gene expression in the blood [20]. However, the causal OS genes in CD-affected tissues and their interactions with gut microbiota are poorly understood [23–25].
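To make the SMR idea above concrete, the following is a minimal sketch of the standard SMR point estimate and approximate test statistic computed from summary statistics for a single top cis-QTL variant. The function name `smr_estimate` and the numeric inputs are hypothetical and chosen only for illustration; they are not values or code from this study, and the sketch omits the accompanying heterogeneity testing that SMR software normally performs.

```python
import math


def smr_estimate(b_zx, se_zx, b_zy, se_zy):
    """Illustrative SMR estimate for one instrument (hypothetical helper).

    b_zx, se_zx: QTL effect of the variant on the exposure (e.g. expression).
    b_zy, se_zy: GWAS effect of the same variant on the outcome (e.g. CD).
    Returns (b_xy, se_xy, t_smr, p_smr).
    """
    if b_zx == 0 or se_zx <= 0 or se_zy <= 0:
        raise ValueError("need a non-zero QTL effect and positive standard errors")

    z_zx = b_zx / se_zx
    z_zy = b_zy / se_zy

    # Ratio estimate of the effect of the exposure on the outcome.
    b_xy = b_zy / b_zx

    # Approximate chi-square statistic with 1 degree of freedom.
    t_smr = (z_zy ** 2 * z_zx ** 2) / (z_zy ** 2 + z_zx ** 2)

    # Standard error implied by the test statistic.
    se_xy = abs(b_xy) / math.sqrt(t_smr) if t_smr > 0 else math.inf

    # Upper-tail p value of a chi-square(1) variable.
    p_smr = math.erfc(math.sqrt(t_smr / 2.0))
    return b_xy, se_xy, t_smr, p_smr


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    b_xy, se_xy, t_smr, p_smr = smr_estimate(b_zx=0.5, se_zx=0.05,
                                             b_zy=0.35, se_zy=0.07)
    print(f"beta_SMR={b_xy:.3f} se={se_xy:.3f} T={t_smr:.2f} p={p_smr:.2e}")
```

A positive estimate in this sketch would correspond to higher genetically predicted expression being associated with higher disease risk, and a negative estimate to a protective direction.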
This study presents a multi-omics-based Mendelian randomization (MR) analysis to identify the putative causal effects and molecular mechanisms of OS genes in CD using blood and intestinal tissues. A sizable intestinal transcriptome meta-analysis was performed to identify differentially expressed CD-related OS genes. Using SMR, we integrated the largest CD GWAS summary statistics with eQTLs and DNA methylation QTLs (mQTLs) in the blood. Furthermore, up-to-date intestinal eQTLs and fecal microbial QTLs (mbQTLs) were integrated for the first time to uncover the potential interactions between host OS genes and gut microbiota. Two additional MR methods were used as sensitivity analyses to test for heterogeneity. Finally, the putative results were partially replicated in an independent multi-omics cohort.
Discussion
To the best of our knowledge, this study is the first to leverage a multi-omics integration method to detect putative causal OS genes and the underlying mechanisms in CD using blood and intestinal tissues. We identified 438 OS-related differentially expressed genes (DEGs) out of 817 potential genes in CD in a sizable meta-analysis of intestinal transcriptome data and successfully validated these DEGs in our cohort. Integration of GWAS with the eQTLs and mQTLs of these DEGs from the peripheral blood prioritized five putative OS genes and their regulatory elements associated with CD onset: BAD, SHC1, STAT3, MUC1, and GPX3. Moreover, the integration of intestinal eQTL data also identified five candidate causal genes, of which MUC1, CD40, and PRKAB1 were involved in intestinal gene–microbiota interactions through further colocalization analysis. Differentially expressed OS genes in CD can be either a cause or a collateral effect of intestinal inflammation; our study therefore represents a fundamental attempt to fill this gap by distinguishing causally involved OS genes from remotely involved ones and pinpointing the relevant interactions in CD in a genomic context.
An increasing number of blood-based biomarkers are being used to diagnose, monitor, and predict IBD activity, and blood may serve as a valuable proxy for characterizing genetic effects on gene expression and understanding the complex etiology of IBD. Using an SMR analysis with blood tissue, we detected five putative causal associations between OS genes and CD susceptibility through genetically driven epigenomic and transcriptomic regulation, suggesting a vital role of epigenetic factors and gene expression in disease onset. Among these genes (BAD, SHC1, STAT3, MUC1, and GPX3), the causal roles of STAT3 and MUC1 have been extensively characterized in CD [69–71]. For instance, STAT3 plays a crucial role in many cellular processes, including cell growth and apoptosis in response to cellular stimuli, and has been regarded as a CD susceptibility gene according to previous GWASs [72–74]. Moreover, a T cell-specific STAT3 deletion has been reported to ameliorate dextran sulfate sodium-induced colitis in mice by reducing the inflammatory response [75]. Importantly, our study confirmed that an increased transcript level of STAT3 may lead to an increased CD risk (beta_SMR = 0.70). Additionally, we revealed that DNAm in enhancer regions negatively regulated STAT3 expression, suggesting a link between DNAm, STAT3 expression, and CD risk. More importantly, three further candidate genes that have not been studied intensively were identified from blood tissue as potentially causal to CD: GPX3, SHC1, and BAD. GPX3 is involved in the redox-sensitive KEAP1-NRF2/ARE signaling system, which is considered a pivotal target in maintaining cellular homeostasis under OS, inflammatory conditions, and pro-apoptotic conditions [76]. To date, studies on the GPX3 gene, which is a target gene of NRF2, have mainly focused on cancers, including colitis-associated carcinoma; for example, Gpx3-deficient mice exhibited increased tumor number and inflammation, suggesting a protective role of GPX3 in colitis-associated carcinoma [77]. Similarly, our findings indicated a negative (protective) effect of GPX3 expression on CD susceptibility (beta_SMR = −0.15). SHC1 is a signaling adapter molecule that is heavily understudied in CD. This gene encodes three main isoforms, and the longest isoform (p66Shc) is a central regulator of OS in mitochondria and cells across multiple diseases [78–80]. This work showed that four DNAm sites near the promoter regions were significantly associated with SHC1 and CD, indicating a co-regulatory pattern involving multiple epigenetic regulatory elements [81]. BAD protein is a key participant in mitochondria-dependent apoptosis and pathophysiological processes that involve the regulation of OS [82]. Although previous CD GWAS data have identified genetic variants located near OS genes such as SHC1 and BAD [83], it remains unclear whether these genes have a causal effect on CD. Based on our SMR analysis, we hypothesize that genetic variants could regulate the expression of these genes through DNA methylation, thereby affecting CD pathogenesis.
Tissue- and cell-specific gene expression has been shown to elucidate different biological molecular mechanisms [55, 84]. CD is a gastrointestinal disease, and studying genetic effects on OS gene expression in the intestine (the most pertinent tissue type) using intestinal eQTLs may be more meaningful than doing so in the blood. As the intestinal barrier is in direct contact with luminal microbes and oxidized compounds from external environmental factors and senses recurrent oxidative changes, OS genes in the intestine may be associated with CD through host–microbiota interactions. Our SMR-based analysis pinpointed MUC1, CD40, PARK7, PRKAB1, and NDUFS1 as putative causal genes in intestinal tissue, of which MUC1 was also of interest in the blood. However, the association between MUC1 expression and CD differed in the blood compared to that in the intestine, suggesting tissue-specific effects during the onset of CD. Moreover, we identified novel genes in this context that might contribute to CD pathogenesis, such as PRKAB1 and NDUFS1. Three genes, MUC1, CD40, and PRKAB1, were further prioritized when considering the interactions between host genetics and microbiota.
MUC1 encodes a vital constituent of mucus and is overexpressed and hypo-glycosylated in the development of inflammation and IBD given its role in regulating intestinal barrier function upon multiple stimuli, including OS [71, 85–87]. Furthermore, Muc1 knockout mice are resistant to dextran sulfate sodium-induced acute intestinal injury [70]. This is consistent with our findings, which confirmed that high expression of MUC1 increases the risk of developing CD. Additionally, our study colocalized the genetic regulations of MUC1 expression and gut microbiota. Microbial creatinine degradation and myo-inositol degradation shared genetic effects with MUC1 expression, suggesting potential interactions between the gene and microbiota. Creatinine supplementation is identified as a potential therapeutic treatment for IBD; creatinine can be degraded to creatine by gut microbiota [88]. Creatinine clearance is associated with reduced inflammation and decreased fibrosis [89]. In addition, myo-inositol derived from dietary phytate can be converted to short-chain fatty acids through colonic bacterial phytase activity [90]. B. aciditolerans, one of the predicted pathway-related taxa, was negatively correlated with MUC1 expression in the FAH-SYS cohort. This suggests that high MUC1 expression accompanied by decreased beneficial microbial activities could confer an increased risk of CD. Our study also inferred that PRKAB1, which encodes the regulatory subunit of AMP-activated protein kinase that monitors cellular energy status and responds to ROS [91], was a CD-protective OS gene. We suggest that genetic regulation of PRKAB1 expression is associated with CD onset (beta_SMR = −0.30). Moreover, E. coli and related purine nucleobase degradation pathways may interact with host PRKAB1 expression. IBD-associated E. coli strains have been reported to facilitate IBD flares [92]. A negative correlation between PRKAB1 expression and E. coli was consistently observed in the FAH-SYS cohort. Interestingly, a recent study reported the role of PRKAB1 agonists as barrier-protective therapeutic agents in IBD [93]. However, further evidence based on genetic background (such as knockout mouse models) is needed to precisely explain the potential role of PRKAB1 in CD.
Integrating multi-omics data from multiple tissues enables researchers to dissect GWAS signals, for example by prioritizing genes and disease mechanisms. Peripheral blood tissue has a less direct and less significant effect on CD than intestinal tissue; however, its value in generating epigenomic, transcriptomic, and proteomic evidence for identifying causally involved genes and therapeutically relevant targets is well recognized [19, 94, 95]. We prioritized a list of novel genes and DNAm sites for follow-up functional studies using the largest up-to-date CD GWAS and an OS-targeted approach. More importantly, this is the first study providing evidence to support a causal role of OS genes interacting with the gut microbiota in intestinal tissue. Despite the moderate associations between host genetics and gut microbiota [46], we observed common genetic regulations of intestinal gene expression and bacterial metabolic potentials. Different bacteria harboring shared genomic contents can participate in the same metabolic functions [96, 97]. However, no individual taxa were significantly colocalized with intestinal gene expression in the current analysis. This is likely owing to the low statistical power to detect associations in zero-inflated taxa data in mbQTL studies [44]. Nevertheless, we used an external multi-omics cohort to confirm the association between the expression of these genes and pathway-related bacterial abundance. In addition, microbiota detected from fecal and intestinal tissues showed considerable differences [98–100], which might explain the small effect sizes of the intestinal gene–fecal microbiota associations.
Some limitations of this study warrant recognition. First, the meta-analysis of intestinal DEGs included different data resources (microarray and bulk RNA-seq with varying sample sizes), which could introduce heterogeneity. However, we successfully replicated over 80% of the DEGs in an independent cohort with pronounced transcriptomic alterations of the OS gene family in patients with CD compared with the controls. Second, cell type-dependent eQTLs vary with disease progression [101, 102]. The use of eQTLs from bulk RNA-seq limited the identification of key molecular mechanisms at the intestinal cell level (enterocytes, immunocytes, fibrocytes) related to CD. Third, we only focused on the cis-regions for OS genes in the analysis, despite the possibility that trans-eQTL SNPs (distance between the SNP and the center of the gene > 5 Mb) may have a widespread impact on regulatory networks [41]. Fourth, we used a Bayesian colocalization method, which relies on the assumption that two traits share a single causal variant, whereas the case of multiple causal variants is under-explored [103]. Finally, functional experiments are still needed to validate our findings. Moreover, as multiple factors can influence the expression of OS genes, we believe that integrating other omics data at different molecular levels (such as those of proteins and metabolites) with large sample sizes may lead to novel discoveries and improve the characterization of putatively involved causal mechanisms of OS in CD.
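The single-causal-variant assumption noted in the fourth limitation can be illustrated with a minimal sketch of how Bayesian colocalization posteriors are commonly derived from per-variant approximate Bayes factors. This is not the code used in this study: the function names, the prior variance `w`, the priors `p1`, `p2`, `p12`, and all z-scores below are assumptions chosen for illustration only.

```python
import math


def log_abf(z, v, w):
    """Log approximate Bayes factor for one variant (Wakefield-style).

    z: association z-score; v: variance of the effect estimate;
    w: assumed prior variance of the true effect (illustrative choice).
    """
    r = w / (v + w)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def logsumexp(xs):
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 for two traits.

    H3: both traits associated, different causal variants.
    H4: both traits associated, one shared causal variant.
    Assumes at most one causal variant per trait in the region.
    """
    s1 = logsumexp(labf1)
    s2 = logsumexp(labf2)
    s12 = logsumexp([a + b for a, b in zip(labf1, labf2)])

    # log(sum1 * sum2 - sum12), guarded against round-off.
    if s12 >= s1 + s2:
        log_h3_core = -math.inf
    else:
        log_h3_core = (s1 + s2) + math.log1p(-math.exp(s12 - (s1 + s2)))

    logs = [
        0.0,                                        # H0: no association
        math.log(p1) + s1,                          # H1: trait 1 only
        math.log(p2) + s2,                          # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_h3_core,  # H3
        math.log(p12) + s12,                        # H4
    ]
    total = logsumexp(logs)
    return {f"H{i}": math.exp(x - total) for i, x in enumerate(logs)}


if __name__ == "__main__":
    # Made-up z-scores for three variants; both traits peak at variant 2.
    v, w = 0.01, 0.04
    labf_expr = [log_abf(z, v, w) for z in (0.5, 6.0, 1.0)]
    labf_micro = [log_abf(z, v, w) for z in (0.3, 5.5, 0.8)]
    print(coloc_posteriors(labf_expr, labf_micro))
```

Because every hypothesis in this sketch allows at most one causal variant per trait, a region in which either trait has several independent signals is not represented by any of H0 to H4, which is the limitation described above.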