Implicating type 2 diabetes effector genes in relevant metabolic cellular models using promoter-focused Capture-C
verfasst von:
Nicholas A. Wachowski, James A. Pippin, Keith Boehm, Sumei Lu, Michelle E. Leonard, Elisabetta Manduchi, Ursula W. Parlin, Martin Wabitsch, Alessandra Chesi, Andrew D. Wells, Struan F. A. Grant, Matthew C. Pahl
Genome-wide association studies (GWAS) have identified hundreds of type 2 diabetes loci, with the vast majority of signals located in non-coding regions; as a consequence, it remains largely unclear which ‘effector’ genes these variants influence. Determining these effector genes has been hampered by the relatively challenging cellular settings in which they are hypothesised to confer their effects.
Methods
To implicate such effector genes, we elected to generate and integrate high-resolution promoter-focused Capture-C, assay for transposase-accessible chromatin with sequencing (ATAC-seq) and RNA-seq datasets to characterise chromatin and expression profiles in multiple cell lines relevant to type 2 diabetes for subsequent functional follow-up analyses: EndoC-BH1 (pancreatic beta cell), HepG2 (hepatocyte) and Simpson–Golabi–Behmel syndrome (SGBS; adipocyte).
Results
The subsequent variant-to-gene analysis implicated 810 candidate effector genes at 370 type 2 diabetes risk loci. Using partitioned linkage disequilibrium score regression, we observed enrichment for type 2 diabetes and fasting glucose GWAS loci in promoter-connected putative cis-regulatory elements in EndoC-BH1 cells as well as fasting insulin GWAS loci in SGBS cells. Moreover, as a proof of principle, when we knocked down expression of the SMCO4 gene in EndoC-BH1 cells, we observed a statistically significant increase in insulin secretion.
Conclusions/interpretation
These results provide a resource for comparing tissue-specific data in tractable cellular models as opposed to relatively challenging primary cell settings.
Data availability
Raw and processed next-generation sequencing data for EndoC-BH1, HepG2, SGBS_undiff and SGBS_diff cells are deposited in GEO under the Superseries accession GSE262484. Promoter-focused Capture-C data are deposited under accession GSE262496. Hi-C data are deposited under accession GSE262481. Bulk ATAC-seq data are deposited under accession GSE262479. Bulk RNA-seq data are deposited under accession GSE262480.
Nicholas A. Wachowski, James A. Pippin, Struan F. A. Grant and Matthew C. Pahl contributed equally to this study.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Abkürzungen
ATAC-seq
Assay for transposase-accessible chromatin with sequencing
cRE
Cis-regulatory element
eQTL
Expression quantitative trait locus
1frag
One fragment; individual DpnII restriction fragment
4frag
Four fragments; four sequential DpnII restriction fragments concatenated in silico
FXR2
Fragile X syndrome-related protein 2
GWAS
Genome-wide association study
IBMX
3-Isobutyl-1-methylxanthine
LD
Linkage disequilibrium
OCR
Open chromatin region
PIR
Promoter interacting region
PIR-OCR
Promoter-interacting region OCR
RNAi
RNA interference
RIN
RNA Integrity Number
RT-qPCR
Reverse transcriptase quantitative PCR
SGBS
Simpson–Golabi–Behmel syndrome preadipocyte cell line
SGBS_diff
Differentiated SGBS cells
SGBS_undiff
Undifferentiated SGBS cells
SMCO4
Single-pass membrane and coiled-coil domain-containing protein 4
TF
Transcription factor
TPM
Transcripts per million mapped reads
TSS
Transcription start site
V2G
Variant-to-gene
×
Introduction
Genome-wide association studies (GWAS) have identified hundreds of common genetic variants robustly associated with increased susceptibility to type 2 diabetes. The overwhelming majority of these GWAS signals are located in non-coding regions and are thought to act by altering the activity of cis-regulatory elements (cREs) and subsequently affecting gene expression either locally or through distant chromatin contacts [1, 2].
Anzeige
There are several key challenges in elucidating the mechanism by which disease-associated variants contribute to type 2 diabetes risk. The GWAS approach does not unambiguously identify causal variants, i.e. the SNP(s) at a given locus molecularly responsible for the disease-associated risk. This is due to the fact that GWAS signals are typically in linkage disequilibrium (LD) with multiple proxy SNPs, any of which could be the causal variant(s). Additionally, GWAS does not identify the cellular context in which the variant confers its effect. While many type 2 diabetes variants are thought to act principally via pancreatic islets, other metabolic tissues such as liver and adipose are also considered critical in the disease aetiology. Finally, GWAS results in themselves do not connect putatively causal non-coding variants to their corresponding effector genes.
Given that GWAS-implicated SNPs do not always act via the nearest gene, mapping the epigenomic landscape in relevant cell types and cellular models presents an opportunity to improve causal variant nominations and gene selection for downstream functional studies [3, 4]. To this end, methods such as expression quantitative trait locus (eQTL) analysis, chromatin conformation/state profiling and RNA interference (RNAi)/CRISPR screening approaches have yielded a degree of success in identifying putative effector genes at type 2 diabetes GWAS loci [3, 5‐9].
EndoC-BH1 cells have been shown to be a particularly powerful model for pancreatic beta cells, by recapitulating insulin secretion in response to glucose and sharing similar regulatory elements to human islets [6, 10]. Simpson–Golabi–Behmel syndrome (SGBS) cells provide an equally useful model that recapitulates adipocyte biology [11], while HepG2 cells are a well-characterised and widely leveraged carcinoma-derived tractable liver cell line. While in vitro cell lines represent a simplification of the in vivo situation, they provide a highly consistent and modifiable setting for the initial validation of type 2 diabetes variant–gene pairs in a human genetic context.
Prior work has examined the epigenomic similarities and potential causal role of type 2 diabetes-associated SNPs in EndoC-BH1 and pancreatic islets using Hi-C-based approaches [3, 6]. Complementary work with eQTLs has nominated target genes for type 2 diabetes via colocalisation and the impact of non-coding variants on transcription factor (TF) binding [5]. Moreover, EndoC-BH1 cells have been used as an in vitro platform for high-throughput perturbations of beta cell physiology, through measures of insulin secretion and content [12], providing a powerful approach for experimental validation of genome-wide predictions [8].
Anzeige
Building on previous reports, we describe the integration of GWAS signals for type 2 diabetes and related metabolic traits with high-resolution DpnII-based promoter-focused Capture-C in the EndoC-BH1 setting along with several other cellular models relevant to type 2 diabetes.
Methods
Cell culture
We used a modified protocol from the Human Cell Design (France) direct protocol for culturing EndoC-BH1 cells. Cells were expanded in DMEM low glucose (1 g/ml) (Gibco, USA) with Glutamax, 2% BSA, 50 μmol/l 2(β)-mercaptoethanol, 10 mmol/l nicotinamide, 5.5 μmol/l transferrin, 6.7 ng/ml sodium selenite, 100 U penicillin/100 μg/ml streptomycin/0.25 μg/ml amphotericin B (1×, Gibco). Plates were coated with either Matrigel or MaxGel ECM (Sigma, USA) and 2 μg/ml fibronectin according to the manufacturer’s instructions: Cells were passaged when confluence was observed or otherwise passaged by two-thirds every 2 weeks. Cells were negative for mycoplasma and validated by expression of insulin. The other cell lines were maintained under standard conditions. See the electronic supplementary material (ESM) Methods ‘Cell culture’ for details for each cell type and for the SGBS cell differentiation.
Sequencing library preparation, sequencing and analysis
The generation of promoter-focused Capture-C, assay for transposase-accessible chromatin (ATAC-seq) and RNA-seq libraries for EndoC-BH1 was in line with methods described previously [13]. The libraries for HepG2 and a preadipocyte cell line (SGBS_undiff) that was differentiated in vitro (SGBS_diff) were generated in line with our previous reports [14]. For full details see ESM Methods ‘ATAC-seq’. We considered the set of open chromatin regions (OCRs) as the ATAC-seq peaks identified in at least two replicates in each cell line (ESM Fig. 1). We used standard methods for generation of 3C libraries [14‐20]. See ESM Methods ‘Promoter-focused Capture-C’ and ‘Hi-C library preparation and analysis’ for details. The capture library was re-annotated under Gencode V30 (https://www.gencodegenes.org/) at both 1-fragment and 4-fragment resolution. We defined promoter OCRs as the set of OCRs overlapping the region within −1500/+500 bp of the gene transcription start site (TSS). This range may include both proximal and distal promoter elements but is generally not in the range where a chromatin contact at our resolution is able to distinguish these [15].
RNA-seq library generation and analysis
RNA was isolated from ~1 million cells of each cell type using TRIzol reagent (Invitrogen, USA), purified using the Directzol RNA Miniprep Kit (Zymo Research, USA) and depleted of contaminating genomic DNA using DNAse I. Purified RNA was checked for quality on a Bioanalyzer 2100 (Agilent, USA) using the Nano RNA Chip (catalogue no. 5067-1511, Agilent) and samples with RNA integrity number (RIN) >7 were used for RNA-seq library preparation. RNA samples were depleted of ribosomal RNA using the QIAseq Fastselect RNA removal kit (Qiagen, Germany). Samples were then processed for the preparation of libraries using the SMARTer Stranded Total RNA Sample Prep Kit (Takara Bio, USA) according to the manufacturer’s instructions. See ESM Methods ‘RNA-seq’ for details.
EndoC-βH1 cells and reagents, including the maintenance medium (OPTIβ1), starvation medium (OPTIβ2) and coating matrix (βCoat), were purchased from Human Cell Design (France). First, culture dishes were prepared following the manufacturer’s instructions and coated with βCoat. EndoC-BH1 cells were thawed in a 37°C water-bath for 1–2 min, resuspended and plated. For resuspension, 1 ml of OPTIβ1 medium was slowly added to the cells before transferring the cell suspension to a separate tube of 8 ml of OPTIβ1 medium. Following this, the cell suspension was then centrifuged for 5 min at 500 g at room temperature. The supernatant was discarded, and the cell pellet was resuspended with 1 ml of OPTIβ1 medium by carefully detaching the cell pellet from the bottom of the tube by gently pipetting up and down. Then, 1 ml of OPTIβ1 was added to the cell suspension and gently homogenised. Next, live cells were counted using Countess 3 (Thermo Fisher Scientific, USA). Then, 1.6 × 106 cells or 4.2 × 106 cells were plated in 6 cm dishes or 10 cm dishes, respectively. EndoC-βH1 cells were routinely split every 7 days to a density of 70,000 cells per cm2. The growth rate of EndoC-βH1 cells stabilised after the second passage of cells; therefore, all experiments were at a minimum of the third passage. The replicates were each within ±2 passages of each other.
Glucose-stimulated insulin secretion
Prior to cell seeding, four 12-well plates were coated with 500 µl of βCoat coating matrix (Human Cell Design), following the manufacturer’s instructions. These plates were then prepared in a 37°C, 5% CO2 incubator for 3 h prior to cell seeding. Approximately 125,000 cells were uniformly added to each well and allowed to grow at 37°C, 5% CO2 for 4 days. On the fifth day the appropriate siRNA treatments (non-targeting 50 nmol/l, fragile X syndrome-related protein 2 (FXR2; 50 nmol/l) or single-pass membrane and coiled-coil domain-containing protein 4 (SMCO4; 37.5 nmol/l) were performed. For details see ESM Methods ‘EndoC-BH1 siRNA transfection’. Then, 48 h later, the cells were transferred from the siRNA/OPTIβ1 medium to OPTIβ2 starvation medium (Human Cell Design) and placed in the incubator for 24 h. Following this, the cells were preincubated for the glucose stimulation by removing the OPTIβ2 and adding 1 ml/well of βKrebs (Human Cell Design)/BSA for 60 min. Next, the stimulation step began by removing the βKrebs/BSA and adding the corresponding treatment (0 mmol/l glucose, 20 mmol/l glucose, 0 mmol/l glucose + 45 µmol/l 3-isobutyl-1-methylxanthine [IBMX] or 20 mmol/l glucose + 45 µmol/l IBMX) following manufacturer’s instructions for 40 min. For collecting the secreted insulin, 800 µl of supernatant was collected from each well and these samples were stored at 4°C until the centrifugation step. For cell insulin content, the remaining medium was removed and 1 ml of cold lysis buffer (5 mol/l NaCl, 0.2 mol/l EGTA, 1 mol/l Tris pH 8.0, glycerol, Triton X-100 and H2O) was added to each well for 5 min, then cells were manually sheared by pipetting up and down to complete the lysis. Next, these samples were kept at 4°C until centrifugation at 700 g for 5 min at 4°C. Finally, 300 µl from each sample was transferred into a new microtube and stored at −20°C until the Human Insulin ELISA was performed. For quantifying insulin content, we used the Mercodia Human Insulin ELISA kit (catalogue no. 10-1113-01, Mercodia, Sweden) to evaluate insulin secretion in EndoC-βH1 cells. Full details are available in ESM Methods ‘Human insulin ELISA’. This experiment was performed four times, with the technical replicates with the mean of each experiment treated as biological replicates.
RNA isolation, reverse transcription and quantitative RT-PCR
Total RNA was extracted from EndoC-βH1 cells 72 h post transfection using Invitrogen TRIzol (Thermo Fisher Scientific, catalogue no. 15596018) and the Zymo Research Direct-zol RNA MiniPrep Plus kit (catalogue no. R2070), following the manufacturer’s instructions.
Anzeige
The quantity and purity of RNA extracted were assessed using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA).
First-strand cDNA synthesis was performed using the SuperScript VILO cDNA Synthesis Kit (Thermo Fisher Scientific, catalogue no. 11754050), following the manufacturer’s instructions. The quantity and purity of cDNA synthesised were assessed using a NanoDrop 2000 spectrophotometer.
Reverse transcriptase quantitative PCR (RT-qPCR) was performed using TaqMan Fast Advanced Master Mix (catalogue no. 4444556, Applied BioSystems, USA) and analysed with the Agilent Technologies AriaMX Real-Time System. Knockdown efficiency was assessed by RT-qPCR of the target genes using the following primers: FXR2 (Taqman FXR2, catalogue no. 4331182, assay ID Hs00191579_m1_FAM, Thermo Fisher), SMCO4 (Taqman SMCO4, catalogue no. 4351372, assay ID HS04980617_g1_FAM, Thermo Fisher). The endogenous control gene was beta-actin (Taqman ACTB, catalogue no. 4331182, assay ID HS01060665_g1_VIC, Thermo Fisher). For details see ESM Methods ‘Quantitative RT-PCR’. The endogenous control gene was used to normalise gene expression following the ΔΔCt method [23]. The non-template control did not show amplification.
Statistical analyses
Wilcoxon rank-sum tests were used to determine enrichment of expression of genes linked to promoter-interacting region OCRs (PIR-OCRs). Bar-plots with error bars represents mean and standard deviation, while bar-plots without error bars represent summary information as either a count or ratio from gene lists. In boxplots, the central line represents the median, edges represent 25th–75th percentiles, whiskers represent 1.5 times IQR and outliers are depicted as points. For experiments for glucose-stimulated insulin secretion assays, paired two-sided t tests were used to assess statistical significance. Randomisation was not carried out as it was not applicable to our study design and analysts were not blinded to study variables. No data were excluded from this study.
Anzeige
Ethics statement
Research was conducted in accordance with the Children’s Hospital of Philadelphia’s ethics and compliance policies. The researchers had access to deidentified biospecimens and datasets.
Results
Epigenetic landscape of cell line models of metabolism
To gain insights into key aspects of the epigenetic landscape of type 2 diabetes-relevant cell models, we analysed high-resolution promoter-focused Capture-C and ATAC-seq data in triplicate for the EndoC-BH1 and HepG2 cell lines, along with SGBS cells both at the pre-adipocyte stage and the in vitro differentiated adipocyte-like state (SGBS_undiff, SGBS_diff). We also leveraged the gene expression in each cell line using bulk RNA-seq [14].
For analysing the promoter-centric chromatin architecture, we identified Capture-C-defined contacts with gene promoters both at the level of individual restriction fragments (1frag) and by in silico binning four consecutive fragments (4frag) in order to increase the power to call distant interactions, as previously performed [14, 20, 24], yielding 53,689 1frag and 92,152 4frag promoter contacts for EndoC-BH1, respectively, and ranging from 80,449 to 229,880 1frag and 155,569 to 293,618 4frag promoter contacts for the other cell lines (Fig. 1a, ESM Table 1).
×
We leveraged ATAC-seq to identify OCRs, which potentially act as cREs to influence gene expression. We called OCRs as the set of reproducible peaks in at least two of three replicates. We identified 68,596 OCRs in EndoC-BH1 cells, 73,296 in HepG2, 74,296 in SGBS_undiff and 52,609 in SGBS_diff.
Anzeige
We then annotated OCRs to genes if they either overlapped with a promoter (−1500/+500 bp) or intersected the other end called by promoter-focused Capture-C, which we termed promoter interacting regions (PIRs). We found that 28% of OCRs contacted at least one gene promoter (Fig. 1b, ESM Table 2), which is in line with previous reports [14, 20]. We next compared the distribution of distances between the putative enhancer and promoter-connected region fragments and found that they were comparable (mean: EndoC-BH1, 133,320.9 bp; HepG2, 109,703.7 bp; SGBS_undiff, 155,686.6 bp; SGBS_diff, 207,733.6 bp) (Fig. 1c).
Next, we determined the degree of similarity across promoter–OCR connections between cell types by comparing the magnitude of overlap between the PIR-OCR-connected regions using the Jaccard index, i.e. the ratio of nucleotides located in both annotations to the union present in either set of annotations. We found that the four cell types displayed less than 30% overlap (Fig. 1d), suggesting that the promoter landscapes have distinct cell type-specific epigenetic and chromatin conformational features that contribute to transcriptional differences across these cell lines. Also consistent with previous reports, we observed bait-to-bait contacts ranging from 10% to 29% of called loops (ESM Table 3).
Comparison of promoter-focused defined maps with existing enhancer atlases
To further validate our OCR–gene contacts, we compared the expression of genes with a promoter contacting at least one OCR vs that of genes not in contact with any OCR. Consistent with our prior promoter-focused Capture-C datasets [16, 17, 19], genes with a promoter in contact with OCRs were generally expressed at significantly higher levels (unpaired Mann–Whitney U test, two-sided p<2.2 × 10−16) than those without OCR contacts across cell types (Fig. 2a). Similarly, we assessed the degree of similarity to GTEx expression profiles of equivalent tissue types and found that the genes contacted in our cell lines were expressed at higher levels in the corresponding tissue in that public domain dataset [25] (Fig. 2b). These results suggest that promoter contacts with OCRs are associated with increased gene expression.
×
We then compared the chromatin states identified by the epigenome roadmap with the open connected regions defined by ATAC-seq and promoter-focused Capture-C. As expected, we found at least fourfold enrichment for active and bivalent TSSs and enhancers, given that OCRs can represent either active or ‘poised’ regulatory elements (Fig. 2c). The shared TSS/promoter signature represents the number of bait-to-bait interactions present, suggesting either that some genes are co-regulated or that promoters of genes not expressed in a particular cell type can act as enhancers, as previously reported by others [26].
We subsequently examined the VISTA database for curated validated enhancers in different embryological tissues [27]. However, there were only six experimentally validated pancreatic enhancers within the database. Despite this limitation, we observed in our data that a known human enhancer, hs1977, was in contact with the ABCC8 promoter (Fig. 2d, e). ABCC8 encodes a component of an ATP-sensitive potassium channel expressed in beta cells and modulates glucose-dependent insulin secretion [28]. This enhancer region was accessible only in EndoC-BH1 cells, while expression was observed in the pancreas in the VISTA database, although several of the in situs display staining in neural tissues in addition to the pancreas. We also identified a liver enhancer, hs1752, expressed in the liver, heart and other abdominal tissues with contacts to PCYOX1L, which encodes prenylcysteine oxidase 1 like (Fig. 2f, g). Prenylcysteine oxidases are enzymes that scavenge free cysteines from a metabolic pathway involved in the degradation of prenylated proteins [29]. This cRE was also connected to the promoter of several non-coding RNAs: miR-143, miR-145 and AC131025.2. Taken together, these results support the use of chromatin features in these cell lines as a valid model to investigate cREs for type 2 diabetes-related traits.
Variant-to-gene mapping in metabolic-relevant cells and prioritisation of type 2 diabetes-associated GWAS signals
Next, we sought to investigate the degree that cellular models are enriched for heritability associated with metabolic traits. To this end, we performed partitioned LD score regression [30] to test for enrichment of putative cREs (promoter OCRs + PIR-OCRs) (Fig. 3, ESM Table 4). We observed significant enrichment of EndoC-BH1 cREs for type 2 diabetes, fasting glucose, BMI and fetal body weight. HepG2 cREs were enriched for coronary artery disease and fasting glucose levels. SGBS_diff cREs were enriched for WHR, fasting insulin, plasma triglyceride and HDL-cholesterol levels, while SGBS_undiff cREs were enriched for WHR. cREs across all cell types were enriched for height. In addition, we included recent GWAS studies from Alzheimer’s disease and systemic lupus erythematosus as negative controls, and as expected we did not observe enrichment for these neural and immune GWAS traits in these principally metabolic cell types [30‐32]. Therefore, these results supported the utility of these cell models in investigating the cis-regulatory architecture for the type 2 diabetes-related traits.
×
To implicate type 2 diabetes causal variants impacting cREs, we curated the set of lead sentinel SNPs from the two most recent European and trans-ancestral GWAS reports for the disease [33, 34]. We identified proxies in high LD with the reported lead variants for each signal (r2>0.8) and intersected their genomic coordinates with OCRs and PIRs (Table 1). The subsequent 810 implicated genes from this variant-to-gene (V2G) mapping strategy, corresponding to 370 type 2 diabetes sentinels, revealed that the majority of putative causal variants did not contact the gene nearest to the sentinel, and in most cases where the nearest gene was implicated other gene promoters were also additionally contacted (Fig. 4a, ESM Table 5). Of these 810 genes implicated by this V2G mapping approach, 79.9% were specific to one cell line (130 EndoC-BH1, 311 HepG2, 123 SGBS_diff, 83 SGBS_undiff) while 163 (20.1%) were shared in at least two cell types and 53 (6.5%) across all cell types (Fig. 4b). We compared the list of genes with phenotypes identified in mice, and found significant enrichment for nervous system, haematopoietic system, growth/size body region and mortality/ageing (ESM Fig. 2, ESM Table 6).
Table 1
Summary table of the input sentinels and proxies used in this study and how many in at least one cell type
AFR, African ancestry analysis; AMR, American ancestry analysis; EAS, East Asian ancestry analysis; EUR, European ancestry analysis; TRANSE, trans-ancestral analysis
×
One obvious mechanism by which non-coding variants can influence gene expression is through disruption of TF binding sites. To investigate this possibility, we predicted whether any of the V2G-implicated open proxies that contacted genes contained known TF binding motifs. This approach led to the identification of 288 proxies predicted to alter binding affinity (Fig. 4c, ESM Table 7). Most notably, we observed that several binding sites for zinc finger TF KLF and GLI families were disrupted by variants identified in EndoC-BH1. Seventeen of the TFs with motifs impacted by type 2 diabetes variants are predicted effector transcripts in at least one cell type (HES2, ZNF384, STAT6, NR2C1, ONECUT1, TCF12, SOX15, TP53, HNF1B, TCF4, TCF3, OSR1, PPARG, NFKB1, CREB3, ZBTB6, ZBTB26). We compared the motifs with publicly available chromatin immunoprecipitation (ChIP) data from ENCODE, to determine if there is evidence of TF binding at the genomic location in various cellular contexts (ESM Fig. 3, ESM Table 8). We also checked the expression of implicated TFs and observed 280 of 447 (62.6%) of the predicted TFs with disrupted motifs had transcripts per million mapped reads (TPM) >1 and 187 had TPM >10 in EndoC-BH1 cells, suggesting that a majority of predicted TFs are expressed (ESM Fig. 4).
We then examined the genes contacting proxies regardless of the status of predicted disruption of TF binding sites. While some signals were only predicted to have one proxy interacting with one gene, others yielded multiple proxies with contacts to multiple genes (Fig. 4d). While this could suggest a subset of variants acting through multiple genes, further work is necessary to functionally validate these predictions. We note that we did not observe a trend towards those multi-gene signals corresponding to more statistically significant type 2 diabetes GWAS loci.
Next, we compared our analysis with previous type 2 diabetes functional prioritisation approaches, yielding moderate agreement but highlighting a potentially higher confidence gene set for further investigation [3, 8, 35‐37] (Fig. 4e, ESM Table 9). Additionally, we compared our findings with a recent CRISPR interference (CRISPRi) screen identifying potential type 2 diabetes effectors, noting eight overlapping genes, seven associated with decreased insulin content and one with increased content (decreased: CHD4, PRPF18, GMEB1, CREB3, PITPNM2, SIN3A and ATP6 V1C1; increased: FADS1). However, this overlap did not reach strict statistical significance (Fisher’s test, one-sided, p=0.09). Moreover, we compared our list with one generated using chromatin conformation in pancreatic cells, finding 100 genes annotated as type 2 diabetes effectors in beta cell Hi-C also present in our dataset (ESM Table 10) [9].
Subsequently, we compared our results with eQTL associations of nearby genes [5, 25]. Among 221 eQTL-linked genes associated with type 2 diabetes risk loci in pancreatic/islet tissues, only 12 had open proxies in chromatin contact with their promoters in EndoC-BH1 cells: SMCO4, DOC2A, OPRL1, TUFM, YBEY, MTMR11, UBE2D3, PLEKHA1, GPSM1, RNF6, ABCD9 and AGFG2. Six genes were implicated by both liver eQTLs and our promoter-focused Capture-C in HepG2 cells, while 25 genes were linked to adipose tissue eQTLs and our Capture-C in at least one adipose model (SGBS_undiff, SGBS_diff) (ESM Table 11). Colocalisation analyses between GTEx v7 and DIAMANTE type 2 diabetes GWAS signals [33] revealed 22 genes colocalised (posterior inclusion probability (PIP) >0.85) in the pancreas, with only DOC2A also implicated by our Capture-C approach. We detected 11 genes with eQTL colocalisation in the liver, of which MAN2C1, AP3S2 and CEP68 were also implicated by promoter contacts in HepG2. Additionally, 54 genes were identified in subcutaneous or visceral adipose tissue, with AP3S2, CALR and NDUFAF6 also implicated by our V2G approach in SGBS_diff cells, while PABPC4, PLEKHA1, AP3S2 and DCAF16 were implicated in the SGBS_undiff dataset (ESM Table 12). Although there is limited overlap, the intersection of these methods can better prioritise genes for functional validation.
Inhibition of SMCO4 expression increases glucose-stimulated insulin secretion in EndoC-BH1
In our analysis pipeline, two genes, SMCO4 and FXR2, were among those identified as potential type 2 diabetes candidate genes (Fig. 5a, b) in both EndoC-BH1 and prior primary beta cell chromatin maps [9]. Additionally, an eQTL associated with SMCO4 colocalises with a type 2 diabetes signal marked by rs57235767 in islets [5]. To investigate the impact of these two genes on insulin secretion in EndoC-BH1 cells (Fig. 5c), we performed targeted knockdown experiments of SMCO4 and FXR2. Confirmation of SMCO4 and FXR2 knockdown was achieved using RT-qPCR (Fig. 5d). Our analyses revealed a significant increase in insulin secretion in SMCO4 knockdown cells relative to non-targeting siRNA controls (~18.0% increase, paired two-tailed t test p=0.0243) (Fig. 5e). Notably, there was no significant difference when IBMX was added to the stimulation medium (ESM Fig. 5). IBMX raises cellular levels of cAMP to stimulate secretion of insulin. These results suggest that that SMCO4, but not FXR2, inhibits glucose-stimulated insulin secretion in EndoC-BH1 cells, while having no significant impact on IBMX-stimulated secretion.
×
Discussion
Through the integration of GWAS data with the data generation of the trifecta of promoter-focused Capture-C, ATAC-seq and RNA-seq in type 2 diabetes-related relevant cellular models, we could reveal which settings showed enrichment for loci and V2G genes for subsequent informing of functional studies to pursue candidate causal variants and their corresponding effector genes. Given that prior studies have examined putative effectors of type 2 diabetes-associated SNPs in pancreatic islets using eQTLs and promoter Hi-C [3, 5, 6], along with similar work in adipose [38] and skeletal muscle cells [39], our work complements previous efforts with cellular models particularly amenable for functional experiments to address such complex, polygenic diseases operating across multiple tissue settings.
Partitioned LD score regression analyses revealed differences in cellular composition for metabolic-related traits. Type 2 diabetes and fasting glucose displayed enrichment in EndoC-BH1 cells, which supports evidence from prior studies highlighting the pancreas's role in glucose regulation [7]. BMI enrichment in EndoC-BH1 cells is consistent with the insulin–obesity association [40]. Height exhibited enrichment across all cell types, possibly due to the large sample size and its reflection of various physiological processes [41]. Fasting insulin levels and WHR were enriched in the adipose model SGBS_diff, indicating differing genetic impacts on metabolic physiology between type 2 diabetes and associated risk factors (Fig. 3). No enrichment was found for polycystic ovary syndrome, a disorder leading to infertility that often presents with insulin resistance [42].
V2G mapping identified SMCO4 as a candidate gene affecting insulin secretion. From public expression resources, SMCO4 is expressed in most tissues from RNA-seq and immunohistochemistry data [43]. Further work would be necessary to characterise its role in pancreatic islet cells and other tissues. In addition, we queried several prior studies and found that SMCO4 is not reported to be differentially expressed in islets/beta cells in type 2 diabetes. It is possible that SMCO4 functions early in type 2 diabetes development or in a limited subset of cells. FXR2 knockdown showed less efficiency and no significant effect on insulin secretion, possibly due to either reduced knockdown efficiency or compensation by its paralog FXR1.
Despite the valuable insights gained, our study has notable limitations. While commonly used for molecular validation of disease variants, the use of immortalised cell lines does not fully recapitulate the chromatin landscape of the in vivo state of primary cells/tissues. Moreover, these cells provide snapshots of cellular states under limited conditions and do not take into account possible temporal effects at loci over time. Incorporating multiple lines of evidence facilitates determination of causality of disease-relevant variants. Further comparative analyses under different physiological conditions may offer additional insights into gene expression programmes and expand the list of nominated genes. In our functional assays, while we successfully validated SMCO4 knockdown, we faced challenges in assaying SMCO4 protein knockdown via western blot due to the small size and negative charge of the protein. Further work is warranted to further clarify the function of SMCO4 in beta cells in the context of type 2 diabetes and related traits. In addition, we only assayed for functional consequences of putative effector genes in EndoC-BH1 cells. Future work could assay responsiveness to insulin via downstream molecular readouts and other physiological responses in SGBS and HepG2 cells.
The combined results from our V2G mapping analyses present potential targets for further validation and functional assessment in cellular models, especially when combined with other orthogonal datasets available in the public domain. Additional work is necessary to determine whether these observations represent true cases of pleiotropy. Our datasets were generated to capture the ‘normal context’ of disease-associated susceptibility variants before the disease induces thousands of non-causal changes after onset. However, it has been proposed that some SNPs may act by retethering regulatory elements to ectopic target genes to contribute to disease [44]. High-throughput functional validation methods, such as Perturb-seq, may offer insights into how eQTLs, chromatin conformation and other data contribute to effector gene nominations. Future studies focused on highlighting the dynamic nature of enhancers in a variety of cellular states and disease contexts are warranted, as well as comparing the 3D architecture of larger cohorts with diverse genetic backgrounds, in order to clarify how generalised this model is in shedding light on the genomic aetiology of common metabolic disease pathogenesis.
Acknowledgements
We thank G. Costecalde from Univercell-Biosolutions for providing the EndoC-βH1 cells. We acknowledge B. Johnson from the Children’s Hospital of Philadelphia for assistance with culturing EndoC-BH1 cells.
Data availability
Raw and processed next-generation sequencing data for EndoC-BH1, HepG2, SGBS_undiff and SGBS_diff cells are deposited in GEO under the Superseries accession GSE262484. Promoter-focused Capture-C data are deposited under accession GSE262496. Hi-C data are deposited under accession GSE262481. Bulk ATAC-seq data are deposited under accession GSE262479. Bulk RNA-seq data are deposited under accession GSE262480.
Funding
This work was supported by NIH Grants UM1 DK126194, R01HD056465 and the Center for Spatial and Functional Genomics at CHOP. SFAG is supported by the Daniel B. Burke Endowed Chair for Diabetes Research. We acknowledge that the funders had no input on the experimental design of this study.
Authors’ relationships and activities
The authors declare that there are no relationships or activities that might bias, or be perceived to bias, their work.
Contribution statement
This project was conceived by SFAG. MCP and SFAG directed experimental studies. MCP conducted bioinformatic analyses and analysed experimental results. NAW designed and performed targeted knockdown and glucose-stimulated insulin secretion experiments. JAP, KB, SL, MEL and UWP designed and performed next-generation sequencing experiments. EM and AC developed the bioinformatic pipeline infrastructure. MW provided SGBS cells and was involved in data acquisition. ADW provided supervision and was involved in data acquisition. SFAG and ADW acquired funding. MCP wrote the manuscript together with NAW and SFAG. All authors contributed to, read and approved the final version of the manuscript. SFAG and MCP are guarantors of this work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Mit e.Med Innere Medizin erhalten Sie Zugang zu CME-Fortbildungen des Fachgebietes Innere Medizin, den Premium-Inhalten der internistischen Fachzeitschriften, inklusive einer gedruckten internistischen Zeitschrift Ihrer Wahl.
Mit e.Med Allgemeinmedizin erhalten Sie Zugang zu allen CME-Fortbildungen und Premium-Inhalten der allgemeinmedizinischen Zeitschriften, inklusive einer gedruckten Allgemeinmedizin-Zeitschrift Ihrer Wahl.
Implicating type 2 diabetes effector genes in relevant metabolic cellular models using promoter-focused Capture-C
verfasst von
Nicholas A. Wachowski James A. Pippin Keith Boehm Sumei Lu Michelle E. Leonard Elisabetta Manduchi Ursula W. Parlin Martin Wabitsch Alessandra Chesi Andrew D. Wells Struan F. A. Grant Matthew C. Pahl
Die subkutane Applikation von Nivolumab ist nach Daten einer Phase-3-Studie ähnlich gut wirksam wie die intravenöse: Die Pharmakokinetik ist vergleichbar, die objektive Response war in der Studie sogar leicht besser als in der Gruppe mit Infusionen.
Für Männer mit metastasiertem hormonsensitivem Prostata-Ca. (mHSPC), die keine Hormonchemotherapie wollen oder vertragen, ist der Androgenentzug plus Darolutamid eine Alternative: Das Progressionsrisiko wird im Vergleich zum alleinigen Androgenentzug fast halbiert.
Verzögertes Wiederanspringen der Darmfunktion ist ein Hauptfaktor dafür, wenn Patientinnen und Patienten nach einer Kolonresektion länger als geplant im Krankenhaus bleiben müssen. Ob man diesem Problem mit Lidocain vorbeugen kann, war Thema einer Studie.
Zum Thema Koronare Herzkrankheit gab es 2024 wichtige neue Studien. Beleuchtet wurden darin unter anderem der Stellenwert von Betablockern nach Herzinfarkt, neue Optionen für eine Lipidsenkung sowie die Therapie bei infarktbedingtem kardiogenem Schock.
In diesem CME-Kurs können Sie Ihr Wissen zur EKG-Befundung anhand von zwölf Video-Tutorials auffrischen und 10 CME-Punkte sammeln. Praxisnah, relevant und mit vielen Tipps & Tricks vom Profi.