Background
Despite some progress over the last decade, malaria continues to be a significant global health burden with a vaccine deemed essential to effectively control the disease in high malaria transmission zones [
1]. The RTS,S vaccine has been rolled out in three African countries [
2] but is < 50% protective [
3], suggesting further iterations are required. There are other candidates in the pipeline that show promise for incorporation into second-generation vaccines. One leading candidate antigen is
Plasmodium falciparum Reticulocyte Binding homologue 5 (Rh5, PF3D7_0424100), which is currently advancing through clinical trials [
4].
Rh5 is the smallest in the Reticulocyte Binding Protein homolog (Rh) family that includes Rh1, Rh2a, Rh2b and Rh4 [
5,
6]. Furthermore, it is the only member of the Rh family without a transmembrane domain. Rh5 has been shown to be refractory to gene knockout experiments, suggesting it plays an essential role in the invasion of erythrocytes [
5,
6] via interactions with the erythrocyte receptor basigin (BSG) [
7]. Both monoclonal and polyclonal anti-Rh5 antibodies inhibit erythrocyte invasion of multiple parasite strains by blocking the Rh5-BSG interaction in vitro [
8‐
11]. Rh5 vaccination field trials in non-human primates,
Aotus monkeys, demonstrated protection from heterologous
P. falciparum challenge [
12], while non-exposed vaccinated human volunteers in a phase 1a clinical trial generated anti-Rh5 antibodies that blocked merozoite invasion in vitro [
4]. Furthermore, although individuals from malaria endemic regions who are naturally exposed to
P. falciparum infections develop anti-Rh5 antibodies at a relatively low prevalence, the presence of these antibodies has been associated with protection from symptomatic malaria in Papua New Guinea and Mali [
13‐
15]. Based on these findings, Rh5 has been considered as a next generation blood-stage malaria vaccine candidate even though it has low immunogenicity in natural infections.
Rh5 does not function in isolation during erythrocyte invasion, but acts as part of a multi-protein complex with Rh5 interacting protein (Ripr, PF3D7_0323400) [
16], cysteine-rich protective antigen (CyRPA, PF3D7_0423800) [17] and P113 (PF3D7_1420700) [
18]. The Rh5-CyRPA-Ripr complex binds better to the erythrocyte cell surface than Rh5 alone [
19], and interaction of Rh5 with its erythrocyte surface protein receptor, basigin, triggers a transient increase in Ca²⁺ concentration and alters the erythrocyte cytoskeleton [
20]. Rh5 undergoes proteolytic cleavage, resulting in fragments of approximately 18 kDa and 45 kDa. Rh5 binds directly to P113 via the smaller Rh5 fragment [
18] and to CyRPA [
17], while Ripr is associated with Rh5 through its interaction with CyRPA [
21]. Therefore, CyRPA forms the contact sites for Rh5 and Ripr. It has been suggested that CyRPA dissociates from the complex and is excluded from the membrane during binding to basigin. The Rh5-CyRPA-Ripr complex can bind to BSG without interacting with P113. However, P113 anchors Rh5 onto the merozoite membrane, while CyRPA and Ripr do not bind to erythrocytes on their own [
16‐
18,
21].
Similar to Rh5, the genes encoding CyRPA and Ripr cannot be knocked out, suggesting that they are essential for parasite growth [
16,
18], and conditional deletion of either Ripr or CyRPA results in non-invasive merozoites [
19]. Antibodies to all three proteins (Rh5, CyRPA and Ripr) of the complex can inhibit erythrocyte invasion by multiple
P. falciparum strains [
16,
17,
22]. Furthermore, antibodies to CyRPA have been reported to block its interaction with the Rh5/Ripr complex and the formation of the multi-protein complex, leading to invasion inhibition [
17]. In African and Papua New Guinean populations, P113 antibodies have been associated with protection against clinical malaria [
13,
23]. All members of the Rh5 protein complex can, therefore, be considered potential blood-stage vaccine targets.
Polymorphisms are a particular barrier for the development of blood-stage vaccines, as proteins that are exposed to the immune system during invasion are often very diverse, presumably the result of pressure from the immune system [
24]. This problem of diversity has impeded the development of blood-stage vaccines in the past, with AMA1 being a prime example. Like the Rh5 complex, AMA1 is essential for invasion, but it is highly polymorphic, resulting in immune responses that are allele-specific, a fact that may have limited the efficacy of previous Phase IIb trials [
25]. However, Rh5, Ripr and CyRPA have been shown to be highly conserved [
5,
22,
26], although polymorphisms in these genes, including p113, have not been intensively investigated. In addition, exploring genetic diversity in all members of the complex in the same infections would identify whether polymorphisms are associated, which would need to be taken into consideration during vaccine design. To explore these questions, we examined all four Rh5 complex genes by capillary and whole genome sequencing of a cross-sectional sample of parasites from Kilifi.
Discussion
The Rh5 complex is a relatively conserved set of proteins with few polymorphisms. They are not highly immunogenic, as previously shown [
15,
23]. The negative population genetics summary statistics do not indicate balancing selection and show an excess of rare variants. This is consistent with an analysis of genomes from
P. falciparum populations in Africa, which revealed that the majority of genes were associated with a negative Tajima’s D value, suggesting a historical parasite population expansion in Africa [
39‐
41]. The significantly negative summary statistics indicate that these genes have a limited potential to retain mutations, in particular
p113 and
Ripr, which may be due to the parasite’s need to preserve their function. These proteins are involved in a critical step during the invasion of erythrocytes, and these polymorphism data support the view that they are likely to make good vaccine candidates to inhibit invasion and prevent disease [
42].
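As an illustration of the summary statistic discussed above, the following is a minimal Python sketch of Tajima's D computed from a set of aligned haploid sequences. It is not the analysis pipeline used in this study; the function name `tajimas_d` and the two toy alignments are hypothetical, and the sketch assumes gap-free sequences of equal length.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for aligned, equal-length sequences (illustrative helper)."""
    n = len(seqs)
    if n < 4 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least 4 aligned sequences of equal length")

    # S: number of segregating (polymorphic) sites in the alignment
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences between sequences
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    ) / n_pairs

    # Normalising constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares pi with the estimate of diversity based on S
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): every variant is a singleton, so D is negative
rare = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]
# Toy data (hypothetical): every variant is at 50% frequency, so D is positive
balanced = ["TTTT", "TTTT", "TTTT", "AAAA", "AAAA", "AAAA"]
print(tajimas_d(rare), tajimas_d(balanced))
```

An excess of rare variants lowers the mean pairwise difference relative to the number of segregating sites, which is why the first alignment yields a negative value, whereas intermediate-frequency variants of the kind maintained by balancing selection push the statistic above zero.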
Sequence data was obtained using two different methods and resulted in the identification of more SNPs using whole genome sequencing (WGS) analysis than Capillary Sequencing (CS), but there are pros and cons to both approaches. In CS, each read is accompanied by a long (on average 500 bp) chromatogram, which makes it easy to assemble and align to a reference genome in order to manually identify variants, but the process as a whole is low-throughput. In WGS, millions of short reads are produced with each read being accompanied by a quality score. It is thus not feasible to manually check the quality of each nucleotide and quality score cut-offs are set in the bioinformatic pipelines to confidently call a nucleotide. This presents a challenge in identifying indels within repeat regions—because the assembly and alignment of these regions to reference genomes is based on short reads, confidence is often low in these regions, making it difficult to unambiguously determine the numbers of repeat nucleotides [
43]. However, the ability of WGS to generate large numbers of reads and identify SNPs in mixed infections allows more robust identification of SNPs, and it is therefore more reliable than CS in detecting low frequency variants. The Global MalariaGEN dataset was used to confirm the SNPs identified by the two methods. A large majority (> 65%) of the SNPs described in these samples have also been described in other locations within the Global MalariaGEN data, providing confidence in both the high frequency and rare SNPs detected. Furthermore, most SNPs that were identified by only one method were rare variants, so it is not surprising that they were missed by the other method, as the two methods were applied to different sample sets. If a rare variant is present in only a few infections, the chance of such infections being present in the samples used for both methods is substantially reduced. It is also important to note that the samples used for WGS and CS were obtained at different time points, 2005–2007 and 2013, respectively. In addition, the parasites used to obtain the whole genome sequence data underwent culture adaptation prior to sequencing; the quality of DNA is therefore expected to be higher in culture-adapted parasites owing to less contamination by host DNA. Cultured
P. falciparum parasites have been known to differ significantly from source populations due to adaptation to environments that exclude the host immune responses [
44]. There are therefore multiple reasons that could explain why different SNPs were identified in the two different approaches.
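The sampling argument above can be made concrete with a short Python sketch. It assumes, purely for illustration, that infections are sampled independently, that each infection carries the variant with probability equal to its population frequency, and that a carried variant is always detected; the function names, sample sizes and frequencies below are hypothetical and are not taken from this study.

```python
def prob_detected(freq, n_samples):
    """Probability that a variant at frequency `freq` is present in at
    least one of `n_samples` independently sampled infections."""
    if not 0 <= freq <= 1 or n_samples < 0:
        raise ValueError("freq must be in [0, 1] and n_samples non-negative")
    return 1 - (1 - freq) ** n_samples


def prob_detected_by_both(freq, n_first, n_second):
    """Probability that the variant is present in both of two independent
    sample sets (for example, one per sequencing method)."""
    return prob_detected(freq, n_first) * prob_detected(freq, n_second)


# Hypothetical sample sizes for the two sample sets
n_cs, n_wgs = 50, 100
for freq in (0.01, 0.05, 0.30):
    print(freq, round(prob_detected_by_both(freq, n_cs, n_wgs), 3))
```

Under these assumptions, a common variant is almost certain to appear in both sample sets, whereas a variant at 1% frequency will often be absent from at least one of them, even before differences in sampling period or culture adaptation are considered.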
The majority of the polymorphisms in this complex of merozoite invasion antigens were rare, which is in contrast to previous findings from surface exposed and abundant merozoite antigens such as apical membrane antigen 1 (AMA1) [
45], merozoite surface protein 1 (MSP1) [
45], MSP3 [
46] and erythrocyte binding antigen-175 (EBA175) [
47], which are under balancing selection and exhibit allele-specific immunity in vaccine trials. In a recent study of samples from Nigeria, only 5 non-synonymous SNPs were identified in Rh5: K62R, T81Q, P197S, C203Y and H240R [
48], of which only the C203Y mutation was identified in our study. Codon 197 was described in the global MalariaGEN dataset, whereas codons 62, 81 and 240 are potentially rare variant sites. Of note, the high frequency sites at codons 147 and 148 in this study were not identified in the Nigerian study. These sites were nevertheless described alongside S197Y, C203Y and I410M as common variants occurring at a frequency above 10% globally [
9]. In our population, however, the I410M mutation was a rare variant (< 5%). It appears that, apart from a few high frequency sites that have been consistently identified in previous studies and in our study, most mutations in Rh5 are rare variants. Rh5 antibodies primarily inhibit parasite invasion by disrupting the Rh5-basigin interaction [
38].
This study identified only one Rh5 mutation, C203Y, at the Rh5-basigin interface. It has been shown that the Rh5 protein variant with the 203Y mutation binds to recombinant basigin with the same affinity as the Rh5 C203 wild type [
49]. It is therefore likely that other rare Rh5 mutations that cluster around the basigin interface will prevent binding of monoclonal antibodies. Based on monoclonal antibody data [
50], these SNPs fall within the region (codons 26–352) targeted by a large number of mouse and human antibodies that have shown neutralising activity, suggesting that the rare variants identified in this study could affect antibody-binding epitopes [
9,
11,
17]. A similar scenario is observed with CyRPA, where only 1 SNP (R339S) was identified from a sample of 12 geographically distinct laboratory isolates and 6 field isolates [
22] and again this SNP was not identified in the Kilifi samples. An analysis of 80 Ripr sequences from Uganda identified 16 SNPs, of which two codons (190 and 259) were > 5% in frequency. This study found only 9 of the 16 Ugandan SNPs, and the SNPs unique to the Ugandan population were all singletons [
26]. Moreover, Ntege et al. [
26] also showed, like this study, a negative and significant Tajima’s D index. These studies further indicate that these genes tend to contain rare variants. The common variants identified across all the study sites should be considered in future studies to determine if they influence the functionality of the multi-protein complex.
The low immunogenicity of Rh5 complex members in field studies [
12,
15,
22] would suggest limited immune pressure on these antigens and thus a limited need for the parasite to acquire mutations to escape host immune responses. This could explain the limited high frequency polymorphisms and the excess of rare variants observed. Slightly higher responses have been observed for p113 in individuals in Kilifi, when compared to Rh5 [
23]. Besides the role of p113 in invasion by binding to the Rh5 N-terminal region [
18], p113 is also thought to be involved in translocation through association with the
Plasmodium translocon of exported proteins (PTEX), which is known to be a mechanism of immune evasion [
51]. Further investigation is required to understand the effect of P113 polymorphisms on translocation. While there is limited literature on natural immune responses to Ripr, we anticipate findings similar to those seen with CyRPA and Rh5, given that Ripr is part of the same Rh5 protein complex. The Rh5 protein complex is hidden within the merozoite apical end during tight junction formation. It is, therefore, likely that these proteins are rarely exposed to the immune system and thus their immunogenicity in individuals living in malaria endemic regions is low. Their role in tight junction formation indicates an important function in merozoite invasion, which is supported by the inability to genetically disrupt any of the four genes and by the protective immune responses generated by antigens like Rh5 and p113 [
50].
Most of the observed SNPs were not in statistically significant LD, with the exception of codons 147 and 148 in Rh5 and 985 and 1003 in CyRPA, which are 3 bp and 54 bp apart, respectively. The limited LD is likely due to a combination of the low frequency of most SNPs, which are rare variants, and the limited sample size in this study. Rh5 codons 147 and 148 are included in the protein structure [
38] upstream of the alpha helix, while the structure of Ripr has not been fully resolved. Since they are high frequency SNPs, they may be involved in processes other than protein-protein interactions, but these are yet to be determined. Only one high frequency SNP at Rh5 codon 203, identified by both CS and WGS, has been shown to be localized in the Rh5-basigin interface [
38].
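To illustrate why rare variants yield little measurable LD, the following minimal Python sketch computes the r² statistic between two biallelic sites. It assumes phased haploid genotypes from single-clone infections, coded 0 for the reference allele and 1 for the alternate allele; the function name `ld_r2` and the toy genotype vectors are hypothetical, and the LD method actually used in this study may differ.

```python
def ld_r2(site_a, site_b):
    """r^2 between two biallelic sites; each list holds one 0/1 allele
    per haplotype, in the same haplotype order (illustrative helper)."""
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n  # frequency of allele 1 at site A
    p_b = sum(site_b) / n  # frequency of allele 1 at site B
    # Frequency of haplotypes carrying allele 1 at both sites
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Two common variants that always occur together: r^2 is 1
linked = ld_r2([1, 1, 0, 0], [1, 1, 0, 0])
# A singleton paired with a common variant: r^2 stays low even though
# the singleton only ever occurs with allele 1 at the second site
singleton = ld_r2([1, 0, 0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0])
print(linked, singleton)
```

Because r² is bounded by the difference in allele frequencies between the two sites, a singleton can never reach high r² with a common variant, and in a small sample such weak associations are unlikely to reach statistical significance.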
The development of new tools and adaptation of existing tools for use in malaria elimination and eradication remains a priority, and a deeper understanding of polymorphisms in vaccine candidate genes is particularly important. This study highlights the pros and cons of both CS and WGS approaches to identifying vaccine-relevant polymorphisms. The ideal molecular tool should provide high-quality, high-throughput sequence reads capable of detecting low frequency variants, including indels. One such approach would be amplicon deep sequencing, where longer fragment amplicons can be generated and sequenced using an NGS platform, focussing analysis on the regions of interest rather than the whole genome, but producing deeper and higher quality data than CS. Low frequency mutations should be assessed by functional assays to ascertain their biological and immunological relevance. One of the main obstacles in the development of effective vaccines for malaria is the occurrence of polymorphisms in candidate vaccine targets that result in strain-specific immunity. Among the members of the Rh5 complex, Rh5 is the most advanced in vaccine development. The identification of a limited number of high frequency polymorphisms in Rh5 is promising for Rh5-based vaccines in this region, but it is still possible that low frequency variants may lead to immune evasion, and this needs to be systematically investigated.