Background
Apolipoprotein L1 (APOL1) is a 43 kDa protein belonging to the lipocalin family. The gene has four splice variants encoding three isoforms: variants 1 and 3 encode the same isoform a, while variants 2 and 4 encode isoform b and the isoform c precursor, respectively. APOL1 plays an important role in trypanosomal lysis [1‐4], autophagic cell death [5, 6] and lipid metabolism [7‐9], as well as in vascular and other biological activities. Four main features distinguish APOL1 from the other members of the APOL gene family: (a) the APOL1 gene lies in the opposite orientation to the other three (APOL2, APOL3, APOL4); (b) it encodes the only secreted protein in the family; (c) it plays an important role in trypanosomal lysis; and (d) it acts as a risk gene for many kinds of kidney disease.
Apolipoprotein L1 (APOL1) belongs to the apolipoprotein L gene family, located at chromosome 22q13. The APOL1 gene (Gene ID: 8542) spans 14,461 nucleotides, contains seven exons and six introns, and encodes an mRNA of 3039 nucleotides. In the full-length mRNA (NM_145343.2), 1245 nucleotides form the coding sequence (CDS) encoding 414 amino acids, 274 nucleotides form the 5′ untranslated region (5′UTR), and 1520 nucleotides form the 3′ untranslated region (3′UTR). The APOL1 CDS is assembled from segments of six exons. The APOL1 protein has a signal peptide (SP), a pore-forming domain (PFD), a membrane-addressing domain (MAD) and an SRA-interacting domain [4, 10]. Part of the signal peptide is encoded by exons 2, 3 and 4. Exon 6 encodes the PFD. Exon 7, with 2381 nucleotides, is 3.7 times as long as the other six exons combined and accounts for 78% of the whole mRNA sequence. Exon 7 therefore encodes three functional domains: part of the PFD, the full-length MAD and the SRA-interacting domain (Additional file 1: Table S1).
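The segment lengths quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers stated in the text; the function name `mrna_summary` is hypothetical, and the calculation assumes that the CDS length includes a single stop codon.

```python
# Segment lengths (nucleotides) quoted in the text for NM_145343.2.
UTR5_LEN, CDS_LEN, UTR3_LEN = 274, 1245, 1520
MRNA_LEN = 3039
EXON7_LEN = 2381


def mrna_summary():
    """Return (summed mRNA length, amino acid count, exon 7 share in percent)."""
    total = UTR5_LEN + CDS_LEN + UTR3_LEN
    codons = CDS_LEN // 3            # number of codons in the CDS
    residues = codons - 1            # assumption: one codon is the stop codon
    exon7_share = round(EXON7_LEN / MRNA_LEN * 100)
    return total, residues, exon7_share


print(mrna_summary())  # (3039, 414, 78)
```

Under these assumptions, the three segments sum to the stated mRNA length, the CDS corresponds to 414 residues, and exon 7 makes up about 78% of the transcript.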
The influence of APOL1 on innate immunity and susceptibility to kidney disease [11‐13] has been extensively studied since its discovery by Duchateau et al. in 2001 [14]. Numerous studies have revealed that functional mutations of APOL1 are associated with African sleeping sickness [1, 15], atherosclerosis [16, 17], schizophrenia [18, 19], cancer [20] and other diseases. Two variants of APOL1 (G1: rs73885319 A > G and rs60910145 T > G; G2: rs71785313 TTATAA/−) have been shown to be associated with increased susceptibility to multiple kinds of kidney disease, particularly in African Americans. These kidney diseases include focal segmental glomerulosclerosis (FSGS), hypertensive nephropathy (HTN) and human immunodeficiency virus-associated nephropathy (HIVAN), among others [21‐25]. The two risk variants can also increase the severity of these kidney diseases and the risk of progression to end-stage renal disease (ESRD) [21‐25]. Previous reports showed that APOL1 variants could increase the risk of CKD and ESRD in patients with HIV infection [24, 26, 27]. However, these associations have not been well validated in Caucasian and Asian populations.
In our previous study, we did not find either of the two risk variants (G1 or G2) of APOL1 in Chinese CKD patients [28]. This suggests that APOL1 variation differs significantly among races, and that variants of the APOL1 gene other than G1 or G2 might be associated with kidney disease in Caucasian or Asian populations. The differences in APOL1 variability among races need to be explored to inform future genetic studies of APOL1-related kidney disease.
The 1000 Genomes Project is a large survey that aims to sequence the entire genomes of thousands of individuals from several populations around the world [29, 30]. It can help researchers investigate the relationship between genotype and phenotype and understand the genetic contribution to disease [31]. In this study, we explored the characteristics of APOL1 gene variation in different races based on the 1000 Genomes Project database. The results should be helpful for future studies of the associations between APOL1 gene variations and kidney diseases.
Discussions
Most rare diseases exhibit monogenic pathogenesis and show obvious regional differences, and population-based genetic studies have identified many kidney diseases with an increased genetic risk of development and progression. The prevalence of chronic kidney disease (CKD) is increasing worldwide with apparent racial diversity, which indicates that genetic factors play an important role in the development of CKD; the search for CKD susceptibility genes has therefore become mainstream. Haplotype analysis can help researchers identify disease susceptibility genes and better understand diseases and patients' genotypes.
A previous study showed that several SNPs of APOL1 were more strongly associated with end-stage kidney disease (ESKD) than all previously reported SNPs in MYH9 [39]. In our analysis, we found that variation in the APOL1 gene was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of which (16.2%) had a minor allele frequency (MAF) > 1%. Using the data sources and processing steps described, we found that the distributions of these haplotypes differed significantly among populations, with most haplotype frequencies higher in Europeans than in Africans. This pattern supports the supposition that balancing selection has left a stronger signature on the APOL1 gene in Africans.
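The MAF > 1% criterion mentioned above can be illustrated with a short sketch. It assumes biallelic sites described by an alternate-allele count and a total allele count; the site names and counts are hypothetical, and the function names are ours rather than part of any 1000 Genomes tooling.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """MAF for a biallelic site: the frequency of the rarer of the two alleles."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    alt_freq = alt_count / total_alleles
    return min(alt_freq, 1.0 - alt_freq)


def common_variants(sites, threshold=0.01):
    """Return the names of sites whose MAF exceeds the threshold."""
    return [
        name
        for name, alt_count, total_alleles in sites
        if minor_allele_frequency(alt_count, total_alleles) > threshold
    ]


# Hypothetical (name, alternate-allele count, total alleles) records.
sites = [("site_a", 3, 5008), ("site_b", 260, 5008), ("site_c", 4900, 5008)]
print(common_variants(sites))        # ['site_b', 'site_c']

# Share of common SNPs reported in the text: 99 of 613.
print(round(99 / 613 * 100, 1))      # 16.2
```

Taking the minimum of the two allele frequencies keeps a site such as `site_c`, where the alternate allele is the major allele, in the common set.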
APOL1 Coding Region
The 11 selected SNPs in the coding region were either synonymous or missense variants, and most were located in exon 7 (only the first was located in exon 6). Together they contribute to the polymorphism of the four APOL1 transcripts and affect the structure of the three protein isoforms. The limited variants of the APOL1 coding region were mostly distributed in the PFD, MAD and SRA-interacting domain (Additional file 1: Figure S3), indicating that these SNPs may have a significant impact on the function of the APOL1 protein. A previous study indicated that the SP was dispensable for APOL1 toxicity, that the PFD was required but not sufficient for APOL1-mediated toxicity, and that the integrity of the MAD and SRA-interacting domain was critical for the cell injury activity of the APOL1 protein [10]. This suggests that those SNPs could significantly affect APOL1 protein function. For example, G1 (rs60910145) plays an important role in trypanosome lysis [40] and in susceptibility to kidney diseases [26, 27, 41, 42] or schizophrenia [43].
An earlier study reported that about 38% of Africans carried the G1 risk allele [40]. In the 1000 Genomes Project, the MAF of G1 was 26% in the African population, and G1 was absent in Asians and Europeans. The two G1 SNPs did not meet the Hardy-Weinberg criterion (HW p > 0.05), and G2 was absent from the initially released 1000 Genomes VCF files (Additional file 1: Table S6). Considering the importance of G1 and G2 in APOL1 function [44], we extended our haplotype analysis to include the two G1 SNPs.
The distributions of haplotypes with or without the G1 variants were significantly different among the four populations (Fig. 3). The two G1 SNPs, which were previously reported at a high frequency, actually lowered the haplotype frequencies when all populations were pooled together (global frequency). We suspect that the main reason is that the two variant sites did not conform to HW equilibrium. Taken together, the results under the two conditions suggest that the influence of the two G1 SNPs on coding-region haplotypes is not prominent, and that G1 may influence the progression of the relevant diseases independently.
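The difference between per-population and pooled (global) haplotype frequencies can be sketched as follows. The example assumes phased data supplied as one (population, haplotype) pair per chromosome; the function name, the population labels and the haplotype strings are hypothetical and are not taken from the study data.

```python
from collections import Counter, defaultdict


def haplotype_frequencies(records):
    """Compute haplotype frequencies per population and for all records pooled.

    records: iterable of (population, haplotype) pairs, one per chromosome.
    The pooled result is stored under the key "ALL".
    """
    counts = defaultdict(Counter)
    for population, haplotype in records:
        counts[population][haplotype] += 1
        counts["ALL"][haplotype] += 1
    return {
        population: {
            haplotype: count / sum(counter.values())
            for haplotype, count in counter.items()
        }
        for population, counter in counts.items()
    }


# Hypothetical chromosomes: a haplotype seen only in one population.
records = [("AFR", "AG")] * 3 + [("AFR", "GG")] + [("EUR", "AG")] * 4
freqs = haplotype_frequencies(records)
print(freqs["AFR"]["GG"])   # 0.25
print(freqs["ALL"]["GG"])   # 0.125
```

In this toy data set, a haplotype that is restricted to one population has a lower frequency once all populations are pooled, which is the kind of dilution that a global frequency can introduce.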
APOL1 Upstream Regulatory Region and 3′ Untranslated Region
We also analyzed variants in the upstream regulatory region (URR) and 3′ UTR of APOL1. A previous study indicated that there are many regulatory elements upstream of the APOL1 gene, for example an activating protein-1 (AP1) site at −1034, sterol regulatory element binding protein (SREBP) sites at −1185, and a large number of zinc finger (MZF1) binding sites distributed across the APOL1 promoter region [14]. In this study, we defined the URR as the nucleotides between −1200 and +892, to cover all upstream regulatory elements, and included 1497 nucleotides of the 3′ UTR for further analysis.
Haplotype URR-1 was present at similar frequencies in Admixed, African and Asian populations, which could indicate that it plays the same role in these three populations. The opposite pattern was observed for the other three haplotypes, whose frequencies were higher in European populations than in the other three populations. In addition, URR-2 was very rare in African populations. These results suggest that URR-1 and URR-2 could play different roles in African and European populations, respectively.
The four 3′ UTR haplotypes differed significantly in frequency among the four populations. UTR-1 had its highest frequency in African populations, which suggests a stronger association between UTR-1 and APOL1-related disease in African populations. Haplotypes UTR-2 and UTR-3 were absent or rare in populations of African ancestry, whereas haplotype UTR-5 was absent or rare in Asia and present at a relatively higher frequency in Africa. These results show that the haplotypes may have diverse effects on susceptibility to APOL1-associated diseases.
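One common way to compare haplotype distributions among populations is a chi-square test of homogeneity on a population-by-haplotype table of counts. The sketch below is an illustration under our own assumptions and is not necessarily the test used in this study; the function name and the counts are hypothetical. It returns the statistic and the degrees of freedom, which can then be compared against a chi-square table.

```python
def chi_square_homogeneity(table):
    """Chi-square statistic for a population-by-haplotype table of counts.

    table: dict mapping population -> dict mapping haplotype -> count.
    Returns (chi-square statistic, degrees of freedom).
    """
    haplotypes = sorted({h for row in table.values() for h in row})
    row_totals = {pop: sum(row.values()) for pop, row in table.items()}
    col_totals = {
        h: sum(row.get(h, 0) for row in table.values()) for h in haplotypes
    }
    grand_total = sum(row_totals.values())
    chi2 = 0.0
    for pop, row in table.items():
        for h in haplotypes:
            # Expected count if haplotype frequencies were equal across populations.
            expected = row_totals[pop] * col_totals[h] / grand_total
            if expected > 0:
                chi2 += (row.get(h, 0) - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(haplotypes) - 1)
    return chi2, dof


# Hypothetical chromosome counts for two haplotypes in two populations.
table = {
    "AFR": {"UTR-1": 30, "UTR-2": 10},
    "EUR": {"UTR-1": 10, "UTR-2": 30},
}
print(chi_square_homogeneity(table))   # (20.0, 1)
```

With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05, so these invented counts would indicate different haplotype distributions in the two populations.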
Because haplotypes must be constructed from accurate genotypes at a large number of polymorphic sites, HW equilibrium was always checked using Haploview software, and data that deviated strongly from equilibrium were retyped or discarded. However, this study still has some shortcomings: (1) the population composition of the Admixed group is relatively complex, which may affect the analysis results; data from Africa, Asia and Europe are more reliable. (2) Little is known about the effect of intronic mutations in the APOL1 gene, and this study did not analyze mutations in the introns of APOL1.
Conclusions
In this study, we compared the variants of the APOL1 gene among African, European, Asian and Admixed populations. The results indicated that the distributions of APOL1 gene variants and haplotypes differed significantly among populations, in both regulatory and coding regions. These findings could be helpful for future genetic studies of APOL1-related diseases in different populations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.