Background
Systemic lupus erythematosus (SLE [MIM: 152700]) is a common systemic autoimmune disease characterized by the production of autoantibodies and a complex genetic inheritance. The prevalence of the disease varies according to population ancestry, with estimates in European populations ranging between 30 and 90 cases per 100,000 individuals [1]. SLE afflicts women at a rate nine times higher than men, and most often appears during childbearing ages. Concordance rate studies in monozygotic and dizygotic twins and recurrence risk estimates in siblings of probands (λs) have clearly shown the importance of genetic factors in the development of the disease [2].
Despite the evidence for a strong genetic contribution, until recently, very few loci were convincingly associated with SLE risk [3]. With the concurrent identification of common genome variation and the development of genome-wide genotyping technologies, genome-wide association studies (GWAS) have dramatically changed the ability to identify risk variants. In SLE, GWAS have allowed the identification of more than 50 risk loci at a genome-wide significance level (p value < 5 × 10⁻⁸) [4-6]. These findings are of great relevance since they pinpoint specific biological mechanisms that are relevant for the disease and that otherwise would not have been prioritized for research [7]. In a severe disease like SLE that is lacking efficacious treatments, genetic studies provide a unique way to expand the number of molecular targets for drug discovery [8].
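The genome-wide significance level quoted above is conventionally read as a Bonferroni correction of a 0.05 family-wise error rate. The minimal sketch below illustrates that reading. It assumes roughly one million independent common-variant tests, a figure that is a common convention and is not stated in this article; the function and constant names are illustrative.

```python
# Conventional reading of the genome-wide significance threshold.
# ASSUMPTION (not stated in the article): about one million independent
# common-variant tests across the genome.
FAMILY_WISE_ALPHA = 0.05
ASSUMED_INDEPENDENT_TESTS = 1_000_000


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p value cutoff that keeps the family-wise error rate <= alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float) -> bool:
    """True when a single-marker p value falls below the Bonferroni cutoff."""
    cutoff = bonferroni_threshold(FAMILY_WISE_ALPHA, ASSUMED_INDEPENDENT_TESTS)
    return p_value < cutoff


threshold = bonferroni_threshold(FAMILY_WISE_ALPHA, ASSUMED_INDEPENDENT_TESTS)
```

Under this assumption the cutoff works out to 5 × 10⁻⁸, so a single-marker p value of 4.7 × 10⁻⁹ would pass and one of 1.17 × 10⁻⁶ would not.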
To date, the explanation for the inherited risk of SLE is largely unresolved. Including all known risk variants, less than 30% of disease heritability is currently accounted for [9]. To identify additional risk variants, GWAS meta-analyses from different countries have proven to be a highly successful approach [9]. Currently, most Southern European populations have been underrepresented in GWAS of SLE. In Spain, epidemiological studies have shown that there is an increased prevalence of the disease compared to other European regions [1]. Consequently, the analysis of the genetic variation in this population could be highly useful for identifying new risk variants for SLE.
Biological pathways integrate the function of multiple genes and, therefore, provide a higher level of detection of the relevant genetic risk [10, 11]. To date, different statistical methods have been developed that exploit the biological knowledge in order to leverage the power of GWAS. These analysis methods are designed to aggregate the genetic evidence from multiple risk loci into a single association statistic. The use of cumulative evidence can be a powerful way to detect genetic associations and biological mechanisms that otherwise would have been missed due to low effect size at the single-marker level. Using this complementary GWAS approach, relevant biological insights have been gained in different complex diseases, including autoimmune diseases [12].
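As a rough illustration of how evidence from several loci can be aggregated into a single association statistic, the sketch below uses Fisher's combined probability test. This is a textbook technique chosen for brevity and is not necessarily the pathway method used in this study. It assumes independent p values, which gene-level signals in linkage disequilibrium may violate, and the example p values are hypothetical.

```python
import math
from typing import Sequence


def chi2_sf_even_df(x: float, df: int) -> float:
    """Chi-square survival function for an even number of degrees of freedom.

    For df = 2k: sf(x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p values: -2 * sum(ln p) ~ chi-square with 2k df."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))


# HYPOTHETICAL gene-level p values for one pathway (not taken from the study).
example_gene_p = [0.01, 0.02, 0.03]
pathway_p = fisher_combined_p(example_gene_p)
```

In this toy case none of the inputs is striking on its own, yet the combined p value is smaller than any of them, which is the behavior the paragraph above describes.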
The aim of the current work was to identify new genetic risk loci for SLE by a GWAS meta-analysis using a case-control cohort from a previously untargeted population. After excluding known risk genes, a pathway meta-analysis was also performed to identify biological pathways associated with SLE risk through risk loci that are as yet unaccounted for.
Discussion
In the present study we have identified five new risk loci for systemic lupus erythematosus. Performing a meta-analysis on 4943 patients with SLE and 8483 controls from different European ancestries, we have identified variants at GRB2, SMYD3, ST8SIA4, LAT2, and ARHGAP27 loci associated with SLE susceptibility. At the pathway level, we have also found four biological pathways associated with SLE risk independently of previously known risk genes.
In the present meta-analysis, we found an association between an intronic SNP in the gene encoding the growth factor receptor-bound protein GRB2 and SLE (rs36023980, p = 4.7 × 10⁻⁹). Analysis of the tissue-specific epigenetic data from the NIH Roadmap Epigenomics Project [26] for the rs36023980 SNP showed a strong regulatory activity in different immune cells, including enhancer evidence in both T and B lymphocytes (Additional file 1: Table S2).
GRB2 encodes a receptor tyrosine-kinase (RTK) adaptor protein composed of a single SH2 domain and two SH3 domains [27]. SLE is a disease characterized by the activation of B cells that recognize self-antigens via their B cell receptors (BCR). In B cells, GRB2 functions as an expression adaptor molecule, attenuating the signals that are transduced by the BCR [28]. Together with Dok-3 and SHIP1, GRB2 forms a trimer protein complex that binds directly to the BCR and prevents downstream signaling by inhibiting PI3K signaling [29]. Gene expression at different stages of B cell differentiation shows that GRB2 expression increases in more mature forms, particularly on antigen-experienced memory B cells (Additional file 1: Figure S4). Inadequate control of memory B cell differentiation into plasma cells has been proposed as a trigger for autoimmunity in SLE [30]. Our results therefore are in line with the relevance of this causal disease mechanism.
In a close functional relationship with GRB2, we also found a significant association between the linker for activation of T cells family member 2 gene (LAT2) locus and SLE (rs150518861, p = 4.1 × 10⁻⁸). LAT2 encodes an adaptor molecule that binds GRB2 and, therefore, is also involved in BCR signaling [31]. The association at the genetic level between SLE and two directly interacting proteins strongly supports the implication of this biological mechanism in SLE risk. B cell dysfunction is a hallmark of SLE pathology [9], and our study supports downstream regulation after antigen binding as a crucial event in the disease etiology. In the evaluation of sex-specific effects, we found this locus to be differentially associated with SLE risk. The risk variant was associated with SLE in men (p = 0.0074, β (95% CI) = 1.3 (0.25 to 2.2)), and it was non-significant in women (p = 0.58, β (95% CI) = 0.13 (−0.33 to 0.62)). Previous studies have shown that men require a higher genetic load to develop the disease [32]. If replicated in an independent cohort, this result would be in line with these findings, confirming the importance of sex in mediating the effect of some genetic risk factors in SLE.
SMYD3 encodes for an H3-Hk histone methyltransferase that has been associated with increased cell proliferation in cancer [33]. Altered epigenetic patterns have been strongly associated with SLE, mostly at the DNA level [34]. More recently, however, methylated histones have also been identified as targets of autoantibodies expressed in patients with SLE [35]. Similar to other frequent nuclear autoantigens in SLE, like double-stranded DNA or ribonucleoproteins, methylated H3-Hk histones are able to trigger autoreactive B cells after antigenic-exposure processes like apoptosis. According to the Roadmap Epigenomics Project data, the associated SNP rs1780813 lies in a site that is DNase hypersensitive in more than 30 different tissues, supporting its role in gene regulation.
To date, little is known about the functional role of the SLE-associated genes ST8SIA4 and ARHGAP27. In order to infer the potential biological role of these two genes, we used the GeneNetwork approach, a functional-inference method based on the gene co-expression patterns extracted from microarray data from more than 80,000 samples [36]. With this approach, we found strong evidence that ST8SIA4 is involved in T cell activation (p value = 2.7 × 10⁻¹³, Additional file 2: Table S4), and that ARHGAP27 is implicated in mitogen-activated protein kinase (MAPKinase) signaling (p value = 3.33 × 10⁻⁸, Additional file 2: Table S5). Both biological processes have been previously associated with SLE etiology, and our results not only support their involvement in disease risk but also suggest new gene functions. Furthermore, expression quantitative trait locus (eQTL) evidence supports that both SNPs regulate expression of the corresponding genes in cis. Whole blood eQTL analysis [37] shows a strong association between variation at rs114038709 and ARHGAP27 expression (p = 4.1 × 10⁻¹³⁴), and the most significant eQTL evidence for rs55849330 is associated with ST8SIA4 expression in immortalized B cells [38] (p = 5.6 × 10⁻¹⁰).
Using a pathway-based analysis we have identified four biological pathways associated with SLE. Since the objective was to identify new genetic risk variation for SLE, our approach excluded all association signals from previously known SLE genes. We showed that by using biological pathway knowledge, it is still possible to capture genetic variation that is relevant for the disease. One limitation of this approach is that it relies on the specific knowledge of gene functions and pathway definitions, which is still relatively limited for a substantial fraction of the genome [39]. Another limitation is that pathway association is performed on variants within or close to genes, whereas variants acting through distant cis regulation or trans regulation are also plausible mechanisms of action [40]. With better knowledge of regulatory effects, particularly on isolated cell types, pathway-based analysis will become an even more powerful approach to detect the missing disease heritability. Despite these limitations, our results are robust since they are based on strongly supported biological knowledge. Also, we provide statistical evidence of pathway association from two independent GWAS cohorts which, to our knowledge, has not been previously performed in SLE.
The BCR signaling pathway had the strongest association with SLE. This result is in agreement with the results found at the single-marker level, where variants at BCR signaling genes GRB2 and LAT2 were found to be associated with disease susceptibility. Within the BCR signaling pathway, however, there are multiple other single-marker hits in other genes indicating nominally significant association with disease susceptibility in both cohorts. Given that they belong to an associated biological pathway, these signals are strongly suggestive risk variants for SLE (Table 3). Of relevance, several of the proteins encoded by the genes in this pathway, like BTK or CTLA4, are currently being evaluated as therapeutic targets for SLE [41, 42]. Finding an efficacious treatment in SLE has proven extremely difficult and our results support the importance of targeting this pathway. Genetic evidence, either direct or through associated gene networks, has been shown to improve drug efficacy prediction [43]. Based on the association signals found in the two cohorts, for example, the proteins encoded by LYN (p = 1.17 × 10⁻⁶) and NFATC1 (p = 5.26 × 10⁻⁶) could also be considered as new drug targets for SLE.
Table 3
Top single-marker hits in genes from the four genetic pathways associated with SLE
| Gene | SNP | Chr | Position (bp) | Allele | OR | P (cohort 1) | P (cohort 2) | P (meta-analysis) | Pathway |
| BCL10 | rs12084253 | 1 | 85,720,326 | T | 1.11 | 0.020 | 0.0015 | 0.00012 | BCR |
| FCER1G | rs1136224 | 1 | 161,184,097 | G | 0.91 | 0.023 | 0.022 | 0.0024 | VASC |
| FCGR2B | rs182968886 | 1 | 161,642,985 | A | 0.86 | 0.044 | 0.0018 | 0.00023 | BCR |
| CD247 | rs113305799 | 1 | 167,416,006 | A | 1.17 | 0.0035 | 0.044 | 0.0022 | CTLA4 |
| PROC | rs6740067 | 2 | 128,156,366 | T | 1.16 | 0.043 | 0.018 | 0.0026 | VASC |
| CTLA4 | rs733618 | 2 | 204,730,944 | C | 1.19 | 0.026 | 0.0018 | 0.00016 | CTLA4 |
| PPP3CA | rs13120190 | 4 | 102,056,663 | G | 0.93 | 0.025 | 0.047 | 0.0060 | BCR |
| IL2 | rs45522533 | 4 | 123,396,876 | T | 1.16 | 0.016 | 0.0058 | 0.00044 | CTLA4 |
| SLC7A11 | rs74843273 | 4 | 139,150,464 | T | 0.81 | 0.025 | 0.0066 | 0.00065 | VASC |
| ITK | rs60714766 | 5 | 156,602,589 | T | 1.07 | 0.015 | 0.043 | 0.0042 | CTLA4 |
| CARD11 | rs6461796 | 7 | 3,071,195 | C | 0.94 | 0.027 | 0.033 | 0.0042 | BCR |
| LYN | rs17812659 | 8 | 56,889,862 | G | 0.86 | 0.013 | 2.57E-05 | 1.17E-06 | BCR, VASC |
| ANGPT1 | rs79847080 | 8 | 108,293,443 | G | 0.84 | 0.032 | 0.0028 | 0.00031 | VASC |
| VAV2 | rs2810536 | 9 | 136,812,625 | G | 1.08 | 0.030 | 0.011 | 0.0013 | BCR |
| KRAS | rs17388587 | 12 | 25,389,220 | G | 1.13 | 0.036 | 0.048 | 0.0075 | BCR, VASC |
| PRKCB | rs11641223 | 16 | 24,020,316 | T | 1.11 | 0.041 | 0.0010 | 0.00012 | BCR |
| CD19 | 16:28955702:D | 16 | 28,955,702 | I | 1.06 | 0.0077 | 0.047 | 0.0034 | BCR |
| SLC7A6 | rs55856208 | 16 | 68,324,210 | T | 1.08 | 0.045 | 0.049 | 0.0086 | VASC |
| PLCG2 | rs11548656 | 16 | 81,916,912 | G | 1.3 | 0.014 | 0.00062 | 0.000035 | BCR |
| ATP1B2 | rs1794287 | 17 | 7,578,837 | A | 0.9 | 0.023 | 0.024 | 0.0026 | VASC |
| ITGB3 | rs75211989 | 17 | 45,366,261 | G | 1.11 | 0.00014 | 0.020 | 0.00020 | VASC |
| GRB2 | rs36023980 | 17 | 73,341,284 | T | 0.85 | 0.00039 | 1.51E-06 | 4.73E-09 | CTLA4, IL4, BCR, VASC |
| NFATC1 | rs111354805 | 18 | 77,238,078 | T | 1.21 | 0.027 | 6.58E-05 | 5.26E-06 | BCR |
| MAP2K2 | rs350913 | 19 | 4,096,779 | T | 0.94 | 0.029 | 0.030 | 0.0039 | BCR |
| CD79A | rs16975619 | 19 | 42,392,441 | C | 1.52 | 0.020 | 0.0099 | 0.00089 | BCR |
| SIRPG | rs11696739 | 20 | 1,600,925 | A | 0.92 | 0.044 | 0.0050 | 0.00069 | VASC |
| RAC2 | rs229566 | 22 | 37,602,131 | A | 1.06 | 0.041 | 0.03 | 0.0047 | BCR |
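To illustrate how two cohort-level p values such as those in Table 3 can yield a stronger combined signal, the sketch below applies an equal-weight Stouffer Z-score combination to the GRB2 row. This is an illustrative technique and is not necessarily the meta-analysis model used in the study. It assumes that both cohort p values are two-sided and that the effect points in the same direction in both cohorts; the equal weighting is a simplification, since the cohort sample sizes are not given in this section. For these reasons the result is not expected to reproduce the reported meta-analysis value exactly.

```python
import math
from statistics import NormalDist
from typing import Optional, Sequence

_STD_NORMAL = NormalDist()


def two_sided_p_to_z(p: float) -> float:
    """Absolute Z-score corresponding to a two-sided p value."""
    if not (0.0 < p < 1.0):
        raise ValueError("p must lie strictly between 0 and 1")
    return -_STD_NORMAL.inv_cdf(p / 2.0)


def stouffer_two_sided_p(
    p_values: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Stouffer combination, ASSUMING all effects share the same direction."""
    if weights is None:
        weights = [1.0] * len(p_values)  # equal weights: a simplification
    if len(weights) != len(p_values) or not p_values:
        raise ValueError("need one weight per p value")
    numerator = sum(w * two_sided_p_to_z(p) for w, p in zip(weights, p_values))
    z_combined = numerator / math.sqrt(sum(w * w for w in weights))
    # Two-sided p value of the combined Z: 2 * (1 - Phi(z)) = erfc(z / sqrt(2))
    return math.erfc(z_combined / math.sqrt(2.0))


# Cohort-level p values from the GRB2 row of Table 3.
grb2_cohort_p = [0.00039, 1.51e-06]
grb2_combined_p = stouffer_two_sided_p(grb2_cohort_p)
```

Under these assumptions the combined p value for GRB2 is smaller than either cohort-level value and falls below the 5 × 10⁻⁸ genome-wide threshold, consistent in direction with the meta-analysis column of the table.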
Two other associated pathways, the CTLA4 co-stimulatory signal and IL4 pathways, are strongly related to B cell activation. CTLA4 is a co-inhibitory molecule expressed on activated helper T cells (TH2 and follicular). Inhibition of CTLA4 increases B cell activation after antigen binding, resulting in the production of antibodies [44]. IL-4 is a cytokine that is also expressed in helper T cells and is essential in the activation of antigen-bound naïve B cells. Similar to the BCR signaling pathway, these two genetically associated biological processes, both deeply related to B cell activation, could be the source of new effective drug targets for the disease [45]. In this regard, a fusion protein including the extracellular domain of CTLA4 (abatacept) is currently being evaluated as a therapy for more severe forms of SLE [46].