Introduction
Rheumatoid arthritis (RA) is a chronic, systemic inflammatory disease that initially involves mainly the peripheral joints of the hands and feet. RA is thought to arise from a combination of genetic and environmental factors. The largest genetic risk arises from a group of alleles of the HLA-DRB1 gene, collectively referred to as the shared epitope (SE). Candidate-gene studies and genome-wide association studies (GWASs) have identified more than 30 non-HLA genetic loci to date [1]. The majority of single-nucleotide polymorphisms (SNPs) associated with RA susceptibility have been identified and validated in patients of European ancestry who were seropositive for either anti-citrullinated protein antibodies (ACPAs) or rheumatoid factor.
Several studies have shown that some validated RA risk alleles contribute to risk in other ethnic groups, in particular those of Asian ancestry [2-4], and concluded that some RA susceptibility loci show population-specific associations, whereas others overlap between Asian and European populations.
Ethnic differences in allele frequencies of autoimmune disease-associated SNPs have been clearly shown [5], and a lack of association of a particular SNP in a population may be more likely to represent a lack of power due to low allele frequency than a true absence of effect at the individual level. The PTPN22 polymorphism rs2476601, a cardinal example, has an allele frequency above 10% in healthy individuals from Northern Europe, 3% in Southern European countries, and 2% in Northern African and African-American populations but is absent in Asian populations [6]. However, most common SNPs associated with RA in persons of European ancestry confer risk of RA in African-Americans, whereas discordant genetic loci appear to be the minority [7,8]. Interestingly, the SE is associated with susceptibility to RA in African-Americans through European genetic admixture [9], and this is likely to be the case for several SNPs outside the HLA region. African-Americans and Africans, therefore, have to be considered separately in epidemiologic and genetic studies.
Epidemiologic and genetic data on RA in African populations are scarce. RA genetics has been studied mainly in underpowered candidate-gene association studies in Northern and Southern Africa, and there have been few reports in Tunisia (PTPN22 [10,11], TNFAIP3 [12], STAT4 [13], IRF5 [14], DNASE1 [15], SUMO4 [16], and SLAMF1 [17]), Morocco (HLA [18]), Egypt (PADI4 [19], TRAF1/C5, and STAT4 [20]), and South Africa (HLA [21], IL-10 [22], and p53 [23]). RA has rarely been reported in Black Africans in West and Middle Africa, and its prevalence there is still unknown [24,25]. A few studies on the frequency of the disease, performed from the 1960s to the 1990s, suggested that RA is rare in rural West Africa [26,27]. Recent reports from West and Middle Africa studied epidemiological, clinical, and serological profiles of RA patients in small series [25,28-31]. To the best of our knowledge, only two small genetic studies [29,32] have been performed so far in West/Middle Africa. A study performed in Senegal showed that the risk of developing RA was associated with HLA-DR10 but not HLA-DR4 [32]. We recently reported an important difference in the proportion of SE-positive patients in Cameroon compared with European patients, despite a similar proportion of anti-cyclic citrullinated peptide antibody (anti-CCP) positivity, suggesting that the contribution of non-HLA genetic factors to disease susceptibility could also differ between Caucasians and Black Africans [29]. However, no genetic study of non-HLA markers of RA susceptibility has ever been performed in patients originating from ethnic groups of West or Middle Africa, although this region of Black Africa comprises over 350 million inhabitants.
Using a previously characterized cohort of Cameroonian patients with RA [29], we aimed to determine whether Caucasian non-HLA RA susceptibility loci are shared with Black African patients with RA, as has been shown for Black Americans [7]. Because the small size of the cohort prevented us from drawing any conclusions at the individual SNP level, we computed an aggregated genetic risk score (GRS) [33] to cumulatively test the association of Caucasian RA susceptibility SNPs as a whole with disease susceptibility in Cameroon.
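As an illustration of what an aggregated score of this kind computes, the sketch below builds a weighted GRS for one individual as the sum, over SNPs, of the risk-allele count multiplied by the natural logarithm of the reference odds ratio. The log-odds-ratio weighting is a common construction and is an assumption here, not a restatement of the cited method [33]; the function name and the example genotypes and odds ratios are hypothetical.

```python
import math


def genetic_risk_score(risk_allele_counts, odds_ratios):
    """Weighted GRS for one individual (assumed log-odds-ratio weighting).

    risk_allele_counts: copies of the risk allele at each SNP (0, 1 or 2).
    odds_ratios: per-allele odds ratio at each SNP, taken from a reference
        population (hypothetical values in the example below).
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per SNP")
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("risk-allele counts must be 0, 1 or 2")
    return sum(
        count * math.log(odds_ratio)
        for count, odds_ratio in zip(risk_allele_counts, odds_ratios)
    )


# Hypothetical individual genotyped at three SNPs (illustrative values only).
example_score = genetic_risk_score([2, 1, 0], [1.5, 1.2, 1.1])
```

Because the weights come from the reference population, a score built this way inherits that population's effect sizes, which matters when it is applied to a different population.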
Discussion
The prevalence of HLA-DRB1 SE alleles is highly variable across populations. Across Europe, for example, the prevalence of HLA-DRB1*0401 has been shown to vary from 4.3% in Spain to 24% in Sweden [45]. A higher prevalence of SE alleles in a population seems to correlate with a higher prevalence of RA [46]: the frequency of the SE is 59% in the Cree and Ojibway, a North American Native people in central Canada who have a higher prevalence of RA than Caucasians living in the same area. Ethnic variations in both the frequency and types of SE-carrying HLA-DRB1 alleles have been reported across different populations [47], but very few studies have been performed in Africa. Previously, we reported a lower prevalence of SE alleles in Cameroon and reviewed the scarce literature on studies performed in Black Africans [29]. In the present study, we investigated the contribution of non-HLA genetic markers to RA susceptibility in Cameroon. In a first step, we compared the minor allele frequencies (MAFs) of those polymorphisms between different populations. In a second step, because the number of Cameroonian patients with RA was very low, we tested the association of confirmed Caucasian non-HLA RA susceptibility loci 'in aggregate' with disease susceptibility. No study of non-HLA genetic markers of RA susceptibility has previously been conducted in Black Africans from Central/West Africa.
We pooled YRI individuals and Cameroonian controls on the basis of the observation that MAFs are very similar between those two datasets. Although samples have been collected from two neighboring countries, population stratification issues are possible. The low number of SNPs and samples tested here prevented us from addressing this possibility directly. There are more than 2,000 distinct language groups in Africa; ethnic origin, language, geographical location, and genetic differences correlate, so that population structure and genetic diversity are much larger on the African continent than anywhere else in the world [48,49]. Tishkoff and colleagues [49] studied 113 geographically diverse African populations, including 37 Cameroonian (including Bamoun, Lemande, and Bulu) and 9 Nigerian (including Yoruba) populations. Phylogenetic and population structure analyses showed that most of the Cameroonian populations are closely related genetically and that the genetic distance between them and Yoruba is small. Most Nigerian and Cameroonian populations belong to the West/Central Africa population cluster. However, the East Africa cluster, which contains, for example, the Luhya population in Webuye, Kenya (LWK), is genetically more distant. As an indirect way to estimate the putative impact of within-population structure on our results, we compared the MAFs of non-HLA RA susceptibility loci between Cameroonian controls and LWK individuals (MAFs of 24 SNPs out of 28 studied here were publicly available for LWK; see Materials and methods). MAFs of Cameroonian controls and LWK individuals were as tightly correlated (correlation coefficient = 95.1%, P = 1.2 × 10⁻¹²) as were MAFs between Cameroonian controls and YRI individuals (correlation coefficient = 93.8%, P = 1.7 × 10⁻¹³). However, there was no correlation between MAFs of LWK and UK controls (correlation coefficient = 22.6%, P = 0.29). These findings are compatible with the out-of-Africa hypothesis of human origins and the associated population bottlenecks, which apparently resulted in the alteration of the MAFs studied here. Given the similarity in MAFs across African populations that are more distantly related (West and East Africa), within-population admixture in our case-control study of Yoruba/Cameroonian individuals is unlikely to confound the findings. Moreover, population stratification issues are less likely to affect conclusions based on an aggregate GRS than conclusions at the individual SNP level.
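The MAF comparison described above can be sketched as a correlation across SNPs between two populations. The text does not state which correlation coefficient or test was used, so the Pearson coefficient and the t statistic on n - 2 degrees of freedom below are assumptions; the function names and the MAF values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of MAFs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 SNPs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null of no correlation (n - 2 degrees of freedom)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical MAFs of the same five SNPs in two populations.
maf_pop_a = [0.05, 0.12, 0.30, 0.41, 0.48]
maf_pop_b = [0.07, 0.10, 0.33, 0.38, 0.50]
r = pearson_r(maf_pop_a, maf_pop_b)
t = t_statistic(r, len(maf_pop_a))
```

Under these assumptions, a coefficient of 0.226 over 24 SNPs gives a t statistic of about 1.09 on 22 degrees of freedom, which is far from conventional significance.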
Although the sample size is small, there is no doubt that the risk allele frequency of Caucasian RA susceptibility loci is very different in the African samples available. The vast majority of Caucasian susceptibility loci are common genetic variants for reasons related to study design and power issues. Polymorphisms with an allele frequency close to 50% should be detected even in small samples; however, several were barely detectable in the African control samples.
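To make the detectability argument concrete, the sketch below computes the probability that an allele goes completely unobserved in a sample, assuming that each of the 2N chromosomes is an independent draw at the population allele frequency. That independence assumption, the function name, and the example frequencies are illustrative and are not taken from the study.

```python
def prob_allele_unobserved(allele_frequency, n_individuals):
    """Probability of seeing zero copies of an allele in n diploid individuals.

    Assumes 2 * n_individuals independent chromosomes, each carrying the
    allele with probability allele_frequency.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must lie between 0 and 1")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    return (1.0 - allele_frequency) ** (2 * n_individuals)


# Hypothetical frequencies evaluated for a sample of 163 individuals.
p_common = prob_allele_unobserved(0.50, 163)
p_rare = prob_allele_unobserved(0.02, 163)
```

Under this model, a variant near 50% frequency is essentially never missed at this sample size, so a near-absent allele points to a genuinely low population frequency.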
As expected, the Caucasian GRS showed a strong and significant association in the large UK cohort studied here. Even when 1,000 random samples of as few as 43 UK RA cases and 163 UK controls (the size of the African sample) were tested for the association of the GRS with RA, the vast majority of them showed an OR larger than one (997 of 1,000) or a P value smaller than 0.05 (763 of 1,000). The OR in the African sample was much smaller than the smallest OR in the UK, a situation that is extremely unlikely to be explained by small sample size (P < 0.001). The effect size ratio (that is, OR_UK/OR_Africa) is 4.04 (95% CI 1.6 to 10.0). Such a striking difference using a small number of Cameroonian RA samples highlights an important difference in the genetic architecture of RA susceptibility between the UK (or Caucasians) and Cameroon (West/Central Africa).
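The resampling argument can be sketched as follows: draw many subsamples of the smaller study's size from the large cohort, compute the association statistic in each, and ask how often a value as small as the observed one occurs. The statistic used in the text is the OR of the GRS, which would require fitting a logistic regression; the difference in mean GRS below is a simple stand-in. The add-one empirical P value and the Wald interval for a ratio of two independent odds ratios are likewise assumptions, since the text does not state how its P value and CI were derived. All names and data are hypothetical.

```python
import math
import random


def subsample_statistics(cases, controls, n_cases, n_controls, statistic,
                         n_draws=1000, seed=0):
    """Apply `statistic` to repeated random subsamples drawn without replacement."""
    rng = random.Random(seed)
    return [
        statistic(rng.sample(cases, n_cases), rng.sample(controls, n_controls))
        for _ in range(n_draws)
    ]


def empirical_p(observed, resampled_values):
    """Share of resampled values at or below the observed one (add-one corrected)."""
    hits = sum(1 for value in resampled_values if value <= observed)
    return (hits + 1) / (len(resampled_values) + 1)


def ratio_of_odds_ratios(or_a, se_log_or_a, or_b, se_log_or_b, z=1.96):
    """Ratio or_a / or_b with a Wald interval on the log scale.

    Assumes the two odds ratios come from independent samples.
    Returns (ratio, lower bound, upper bound).
    """
    log_ratio = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_log_or_a ** 2 + se_log_or_b ** 2)
    return (math.exp(log_ratio),
            math.exp(log_ratio - z * se),
            math.exp(log_ratio + z * se))


def mean_difference(case_scores, control_scores):
    """Stand-in statistic: mean GRS in cases minus mean GRS in controls."""
    return (sum(case_scores) / len(case_scores)
            - sum(control_scores) / len(control_scores))


# Synthetic GRS values for a large cohort (not real data).
data_rng = random.Random(1)
large_cases = [data_rng.gauss(0.5, 1.0) for _ in range(2000)]
large_controls = [data_rng.gauss(0.0, 1.0) for _ in range(2000)]

draws = subsample_statistics(large_cases, large_controls, 43, 163, mean_difference)
p_value = empirical_p(-5.0, draws)  # -5.0 is an arbitrary, very small observed value
```

With 1,000 draws, an observed value below every resampled value gives an add-one P value of 1/1001, which is the smallest value this procedure can report.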
Several hypotheses can be postulated to explain the effect size ratio. First, the currently known Caucasian RA susceptibility loci might constitute a minority of the total RA susceptibility loci and might not be shared with African patients. Indeed, many more susceptibility variants, possibly thousands of SNPs, could contribute to the susceptibility to RA [50]. Second, even if those markers are shared, their effect size might be different between Caucasians and Africans. This would explain on its own the difference observed, since the Caucasian GRS has been computed by using the Caucasian effect sizes. Third, the confirmed RA susceptibility loci from the meta-analysis by Stahl and colleagues [1] might not be the causal SNPs in Caucasians but SNPs in linkage disequilibrium (LD) with the causal SNP. Fine-mapping experiments for most of those loci have not been performed yet. Therefore, the causal SNP, even if shared between the two populations, might be in tight LD with the Stahl SNPs in Caucasians but not in Africans. Under this hypothesis, an effect size ratio of 4.04 indicates that many of the Stahl SNPs might not be causal. This difference in the genetic architecture of the human genome between Africans and Caucasians can be further illustrated at the MMEL1/TNFRSF14 locus; rs10910099 was used as a proxy for the Stahl SNP rs3890745. The LD (r²) between these two markers is 0.926 when the 1000 Genomes CEU (Caucasian) panel is used as a reference but is 0.662 when the 1000 Genomes YRI panel is used. It is not known which of the two SNPs, if either, is the causal variant.
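For readers less familiar with the LD measure quoted above, the sketch below computes r² between two biallelic loci from one haplotype frequency and the two allele frequencies, using the standard definition r² = D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB. The function name and the input frequencies are hypothetical and are not derived from the reference panels.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: the same allele pair in two populations.
r2_tight = ld_r_squared(0.29, 0.30, 0.30)
r2_loose = ld_r_squared(0.20, 0.30, 0.40)
```

The same pair of SNPs can therefore tag each other well in one population and poorly in another if the haplotype frequencies differ, even when both SNPs are common in both.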
Including the HLA-DRB1*0401 allele in the computation of the GRS further increased the gap between the association pattern in the UK versus Africa, consistent with previously published observations on the low allele frequency of the HLA-DRB1*0401 allele in African patients with RA [29,32]. Therefore, the genetics of RA differs significantly between the UK and Cameroon in terms of allele frequencies of currently known HLA and non-HLA markers and their aggregate effect on disease susceptibility. Further studies are required in larger samples of African RA patients of different origins to investigate whether this statement can be generalized to Caucasians and Black Africans. Owing to the difference in the LD structure between Caucasian and African populations, well-powered GWASs of patients with RA will greatly increase our understanding of the contribution of genetic factors to disease susceptibility and allow deeper fine-mapping experiments.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
SV performed the statistical analysis and drafted the manuscript. EF performed the genotyping and helped to process collected clinical samples. ML participated in the statistical analysis. JB and SB helped to process collected clinical samples. CG is one of the founders and principal investigators on the Cameroonian cohort and helped to conceive the study and participated in its design and coordination. MS-N is one of the founders and principal investigators on the Cameroonian cohort and collected phenotypic information and clinical samples. AB helped to conceive the study and participated in its design and coordination. All authors read and approved the final manuscript.