Background
Malaria, caused by Plasmodium spp. infections, is one of the most significant life-threatening infectious diseases of humans worldwide. It accounted for more than 216 million cases and approximately 450,000 deaths across the globe in 2017 [1]. The Greater Mekong Subregion (GMS), including Myanmar, has long been one of the most malarious regions in the world [2]. Among the countries in the GMS, Myanmar has the highest malaria burden, accounting for an estimated 77% of malaria cases and approximately 79% of malaria deaths in the GMS [3]. In spite of recent decreases in malaria cases and deaths, malaria is still a major public health concern in Myanmar [4].
To date, there is no licensed vaccine against malaria, although considerable effort has gone into developing effective vaccines. Various vaccine constructs based on diverse antigens from sexual and asexual stages of Plasmodium falciparum have been investigated. Among these, RTS,S, currently the most advanced malaria vaccine candidate [5, 6], is based on the circumsporozoite protein of P. falciparum (PfCSP). RTS,S comprises a liposome-based adjuvant (AS01) and hepatitis B virus surface antigen (HBsAg) virus-like particles incorporating a portion of the PfCSP genetically fused to HBsAg. PfCSP is a dominant surface protein of sporozoites, and it plays a critical role in the invasion of hepatocytes by sporozoites [7–9]. PfCSP is divided into three distinct regions: a highly variable central repeat region flanked by a conserved N-terminal region and a C-terminal non-repeat region. The central repeat region, which has been recognized as a major target for antibody-mediated neutralization, is rich in Asn-Ala-Asn-Pro (NANP) tandem repeats and also contains a small number of Asn-Val-Asp-Pro (NVDP) motifs [10–12]. The C-terminal non-repeat region includes two polymorphic sub-regions, Th2R and Th3R, where T cell epitopes were identified. These regions show moderate polymorphisms which might have resulted from natural selection by the host immune system [13–15].
Recent genome sequencing studies have demonstrated that P. falciparum populations from different geographic regions differ in genetic makeup [16, 17], which emphasizes the importance of comprehensive analysis of parasite genetic diversity and population structure in the global P. falciparum population. Indeed, most P. falciparum vaccine candidate antigens, including PfCSP, have been found to show various genetic and antigenic polymorphisms in global isolates [18], which can obstruct or reduce the efficacy of vaccines based on PfCSP. Therefore, understanding the genetic nature of vaccine candidate antigens in global P. falciparum isolates is critical for designing an effective vaccine. In this study, genetic polymorphism and natural selection of PfCSP in P. falciparum Myanmar isolates were analysed. A comparative analysis of global PfCSP was also performed in order to gain an in-depth understanding of the genetic makeup of PfCSP in the global P. falciparum population.
Discussion
Diverse P. falciparum antigens have been extensively studied as candidate antigens for a malaria vaccine. However, genetic and antigenic variation in vaccine candidates across the global P. falciparum population has been an immense challenge both in developing an effective malaria vaccine and in establishing its efficacy. Therefore, understanding the genetic nature and antigenic variation of vaccine candidate antigens among global P. falciparum populations is important, since this can provide insight into the effects of genetic diversity in the global population on vaccine efficacy and valuable information for designing optimal vaccine formulations [27]. PfCSP is a leading candidate for a malaria vaccine, and recent Phase III RTS,S vaccine trials showed significant reductions in clinical malaria [5, 6]. However, the PfCSP antigen formulated in RTS,S is a single variant, and therefore the impact of natural genetic variation in the global PfCSP population on vaccine efficacy remains unclear. In this study, the genetic polymorphism and natural selection in the Myanmar PfCSP and global PfCSP populations were comprehensively analysed.
Myanmar PfCSP had a largely well-conserved N-terminal region, consistent with PfCSP populations from other geographical areas [18, 27–30]. A few amino acid polymorphisms were identified in global PfCSP populations, but A98G was the only amino acid change commonly identified in the N-terminal region of global PfCSP, although its frequency differed by country. The most important polymorphic characteristic identified in Myanmar and global PfCSP was a 19-amino-acid insertion (NNGDNGREGKDEDKRDGNN) in the middle of this region. This insertion was identified in all global PfCSP populations included in this study except Indian PfCSP, but its frequency varied among PfCSP populations from different geographical regions. The N-terminal region of PfCSP is known to play an essential role in the invasion of hepatic cells by sporozoites by mediating or facilitating the interaction between sporozoites and host cells [30–32]. A monoclonal antibody that binds to a linear epitope, ⁸¹EDNEKLRKPKH⁹¹, in the N-terminal region of PfCSP effectively neutralizes sporozoite infectivity in vivo, suggesting a critical role for this epitope in sporozoite infectivity to hepatocytes [33]. The functional significance of the 19-amino-acid insertion in the PfCSP N-terminal region is currently unclear. However, considering that this insertion is located directly in front of the ⁸¹EDNEKLRKPKH⁹¹ linear epitope, and that global PfCSP, except for the Indian population, had the insertion, a study aimed at understanding the role and evolutionary implications of this insertion is warranted. Most amino acid polymorphisms identified in the N-terminal region of global PfCSP were located in the predicted T-cell epitope region (⁸⁴EKLRKPKHKKLKQPADGNPDP¹⁰⁴), suggesting that this region is under host immune pressure. The N-terminal region of PfCSP has been largely neglected as a potential vaccine target in spite of being a target of inhibitory antibodies and protective T cell responses. The functional importance of the N-terminal region in protective immunity has been demonstrated. Polypeptides flanking the PfCSP N-terminal region evoked the production of antibodies that inhibit hepatocyte invasion by sporozoites, and these polypeptides are likely to confer partial protective immunity in people residing in malaria-endemic regions [34]. A recent study also suggested that most of the effective antibodies that potently inhibit malaria infection bind not only to the repeat region, but also to a portion of the N-terminal junction of PfCSP [35]. Collectively, these findings highlight the potential of the N-terminal region of PfCSP as a part of PfCSP-based vaccine constructs. The low level of genetic polymorphism in the N-terminal region of global PfCSP also supports the notion that the region can be an attractive component of a PfCSP-based vaccine.
The central repeat region of PfCSP has been recognized to play crucial roles in sporozoite formation and development [36]. It has been postulated that the genetic diversity of this region may be maintained by balancing selection, driven mainly by host immune responses [33]. Differing numbers of tetrapeptide repeats have been identified as an important source of genetic polymorphism in PfCSP. As expected, high levels of genetic polymorphism due to different numbers of repeats were identified in the central repeat region of Myanmar PfCSP, which resulted in 14 different haplotypes. Interestingly, two novel repeats, NANS and NTNP, were identified in two haplotypes of Myanmar PfCSP, although their frequencies were low. Numerous variant forms of repeats including NVVP, NAKP, NAHP, NAIP, NVNP, NANL, NVAD, NPNP, NADP, KANP, and SANP have been reported in global PfCSP [18], but the effect of these variations is still not clearly understood. The number of repeats in the central repeat region is known to affect PfCSP stability: the stability of the type-β turn structure increases with the number of repeats [27]. Myanmar PfCSP had a high number of tetrapeptide repeats in the central repeat region, with 86.3% of Myanmar PfCSP sequences carrying 40 to 43 repeats. Comparative analysis of the number of NANP repeats in Myanmar PfCSP and global PfCSP suggested that the distribution of repeats differs by geographical origin, with the highest numbers in Asian PfCSP (40–43) and the lowest in African and South American PfCSP (36–37). These results suggest that PfCSP may have evolved separately in P. falciparum populations of different geographical origins, probably under evolutionary forces acting to maintain or enhance protein stability or to evade host immune responses, resulting in the differing numbers of tetrapeptide repeats in the global PfCSP population. RTS,S, the current malaria vaccine, is composed of 19 NANP tetrapeptide repeats and C-terminal T cell epitopes that are linked to the hepatitis B surface antigen [37]. To date, there has been no direct evidence indicating that different numbers of repeats can affect the efficacy of RTS,S. However, considering that highly heterogeneous numbers of repeats are maintained in the natural PfCSP population, studies evaluating the effects of polymorphism in the central repeat region on RTS,S vaccine efficacy are necessary.
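The repeat counting discussed above can be illustrated with a minimal sketch. It assumes the central repeat region has already been excised from the translated sequence and is in frame, so that it can be read as consecutive four-residue units. The function names and the toy sequence are hypothetical and are not taken from the analysis pipeline used in this study.

```python
from collections import Counter

# Canonical motifs named in the text; every other unit is treated as a variant.
CANONICAL_UNITS = {"NANP", "NVDP"}


def count_tetrapeptide_repeats(repeat_region):
    """Split an in-frame repeat region into 4-residue units and tally them."""
    if len(repeat_region) % 4 != 0:
        raise ValueError("repeat region length must be a multiple of 4")
    units = [repeat_region[i:i + 4] for i in range(0, len(repeat_region), 4)]
    return Counter(units)


def variant_units(counts):
    """Return only the units that are not NANP or NVDP."""
    return {unit: n for unit, n in counts.items() if unit not in CANONICAL_UNITS}


# Hypothetical toy repeat region, far shorter than a real isolate.
toy_region = "NANPNVDP" + "NANP" * 3 + "NANS" + "NANP" * 2
counts = count_tetrapeptide_repeats(toy_region)
total_repeats = sum(counts.values())
print(counts, total_repeats, variant_units(counts))
```

Real sequences would first need the repeat region boundaries to be defined from an alignment; that step is outside the scope of this sketch.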
The C-terminal non-repeat region of Myanmar PfCSP displayed limited diversity, with only three haplotypes among 51 Myanmar PfCSP sequences, consistent with previous reports on PfCSP from different geographical origins [29]. Haplotype 3, which had KHIEQYLKKIQNSL and NKPKDELDYEND in the Th2R and Th3R regions, respectively, was the most prevalent haplotype found in Myanmar PfCSP. This allelic variant was also detected at a high frequency in Asian PfCSP populations [18, 28, 38, 39]. The overall values for haplotype diversity (H) and nucleotide diversity (π) for the PfCSP C-terminal region were higher in African PfCSP than in PfCSP from other continents, indicating that African PfCSP had a higher level of genetic diversity. Comparative sliding window plot analysis of π in the C-terminal region of global PfCSP revealed similar patterns of nucleotide diversity across the region. Asian PfCSP, African PfCSP, and South American PfCSP displayed relatively similar patterns of π with two peaks at the Th2R and Th3R regions, suggesting that the genetic variations were mainly concentrated in these regions. However, differences were also found among PfCSP from different geographical areas. A greater π value was identified at the Th2R region than at the Th3R region in Asian, African, and South American PfCSP. Meanwhile, Oceanian PfCSP showed only one major peak of π, at the Th3R region. Polymorphisms in the Th3R region have been shown to be associated with HLA binding and cytotoxic T cell reactivity [40, 41]; thus, these polymorphisms may help parasites escape host immune pressure. Natural selection analysis of the global PfCSP C-terminal region suggests that this region is likely to be under natural selection, which may maintain or generate genetic diversity in the global PfCSP population. The dN–dS values for Myanmar PfCSP and global PfCSP were positive, implying that balancing selection might act in this region. The values of Tajima’s D and Fu and Li’s D and F revealed complicated patterns that differed among global PfCSP populations. These results suggested that global PfCSP was under a complicated influence of natural selection, in which either positive selection or purifying selection might have occurred in the population, depending on the geographical origin. Possible recombination events in the global PfCSP C-terminal region were also predicted. More recombination events were predicted in African PfCSP than in PfCSP from other geographical areas, suggesting that African PfCSP might have more opportunity for inter- or intra-allelic recombination than PfCSP from other regions. This might be due to the high multiclonal infection rate of the parasite as well as subsequent cross-fertilization and active recombination in mosquitoes in Africa. Interestingly, non-negligible recombination parameters with a high haplotype diversity were predicted in Vietnamese PfCSP. Compared with the recombination parameters and haplotype diversity of other Asian PfCSP populations, those of Vietnamese PfCSP were extremely high.
Considering that Vietnam is a hypoendemic country with a low malaria transmission rate, the reason why Vietnamese PfCSP showed high levels of recombination and haplotype diversity is unclear and should be elucidated further. Collectively, the genetic diversity analysis suggested that global PfCSP has limited genetic diversity in the C-terminal region. However, the pattern of genetic diversity in the PfCSP C-terminal region differed slightly by geographical origin. Complicated natural selection acts on the global PfCSP C-terminal region and generates genetic diversity in this region. Recombination may also contribute to the genetic diversity of global PfCSP, although the recombination parameters differed by geographical origin. These genetic polymorphisms in the C-terminal region of global PfCSP suggest that greater care is required in designing PfCSP-based vaccine formulations.
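For readers less familiar with the summary statistics discussed above, the following is a minimal sketch of how haplotype diversity (H), nucleotide diversity (π), a sliding-window π profile, and Tajima's D can be computed from an ungapped nucleotide alignment using the standard textbook estimators. It is not the software used in this study; the function names and the toy alignment are hypothetical, and gaps, ambiguous bases, and multi-allelic sites receive no special handling.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def haplotype_diversity(seqs):
    """Nei's H = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all sequence pairs (k)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Pi: mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def sliding_window_pi(seqs, window, step):
    """Pi for each window, returned as (window start, pi) pairs."""
    last_start = len(seqs[0]) - window
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, last_start + 1, step)
    ]


def tajimas_d(seqs):
    """Tajima's D from mean pairwise differences and segregating sites."""
    n = len(seqs)
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: four sequences, eight sites, two variable sites.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
print(haplotype_diversity(toy_alignment))
print(nucleotide_diversity(toy_alignment))
print(sliding_window_pi(toy_alignment, window=4, step=4))
print(tajimas_d(toy_alignment))
```

Peaks in the sliding-window output correspond to the kind of diversity hot spots described for Th2R and Th3R. With only four toy sequences the values have no biological meaning and serve only to show the arithmetic.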
Haplotype network analysis of 817 global PfCSP sequences indicated that Asian and Oceanian PfCSP formed limited numbers of clusters. Meanwhile, African PfCSP showed highly branched and complicated patterns of haplotype diversity. No haplotype was identified that covers PfCSP from all of the geographic regions analysed in this study. Most singletons were African PfCSP, supporting the notion that African PfCSP had higher genetic diversity than PfCSP from other geographical regions. The current RTS,S recombinant vaccine was constructed with PfCSP of the P. falciparum NF54/3D7 strain [42]. Haplotype 7, which had a frequency of 1.71% and was shared by African PfCSP, was identical to 3D7 PfCSP. Many studies evaluating the effectiveness and safety of RTS,S have been performed in Africa [5, 43–46], and it has been suggested that RTS,S is likely to be effective, at least in Africa. However, achieving comparable efficacy worldwide may be challenging. As presented in this study, genetic heterogeneity of the PfCSP regions included in RTS,S, as well as the complicated haplotype diversity among global PfCSP, suggests that more attention is necessary in developing a PfCSP-based vaccine, and that a new approach for RTS,S that is effective in a variety of areas should be considered. If it is difficult to develop an effective vaccine that works against global malaria populations, the development of region-specific vaccines that work in particular malaria transmission areas by including genotypes prevalent in those regions can also be considered. For example, considering that H45 and H48 are the most prevalent haplotypes of PfCSP in the Asian and Oceanian PfCSP populations, these haplotypes could be considered in designing a PfCSP-based vaccine for Asian and Oceanian countries.
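The regional haplotype comparisons described above (haplotypes shared across regions, singletons, and the most prevalent haplotype per region) can be tallied with a short sketch such as the one below. It assumes each sequence is labelled with its region of origin and that identical strings define a haplotype. It does not build a haplotype network, and the record format, function names, and toy data are hypothetical.

```python
from collections import defaultdict


def haplotype_table(records):
    """Map each haplotype string to its per-region counts.

    records is an iterable of (region, sequence) pairs.
    """
    table = defaultdict(lambda: defaultdict(int))
    for region, sequence in records:
        table[sequence][region] += 1
    return table


def shared_by_all(table, regions):
    """Haplotypes observed in every one of the given regions."""
    return [h for h, by_region in table.items() if set(regions) <= set(by_region)]


def singletons(table):
    """Haplotypes observed exactly once across all regions."""
    return [h for h, by_region in table.items() if sum(by_region.values()) == 1]


def most_prevalent(table, region):
    """Haplotype with the highest count in one region."""
    return max(table, key=lambda h: table[h].get(region, 0))


# Hypothetical toy data; the strings are placeholders, not real PfCSP haplotypes.
toy_records = [
    ("Asia", "KHIEQ"), ("Asia", "KHIEQ"), ("Oceania", "KHIEQ"),
    ("Africa", "KQIEE"), ("Africa", "KHNEQ"),
    ("Africa", "KHIEK"), ("Asia", "KHIEK"),
]
table = haplotype_table(toy_records)
print(shared_by_all(table, ["Asia", "Oceania", "Africa"]))
print(sorted(singletons(table)))
print(most_prevalent(table, "Asia"))
```

In this toy set no haplotype is present in all three regions and both singletons come from the African records, which mirrors the pattern reported here only by construction.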
A limitation of this study is that the Myanmar PfCSP sequences analysed were from P. falciparum isolates that were collected in restricted areas of Myanmar. Therefore, nationwide analysis of PfCSP in P. falciparum isolates collected from different regions of Myanmar is needed to clearly understand the overall genetic diversity and population structure of Myanmar PfCSP. Further examination of PfCSP nucleotide and amino acid variations in diverse PfCSP populations, with a larger number of global PfCSP sequences, would also be necessary to better understand the polymorphic nature of PfCSP.
Authors’ contributions
HGL and JMK carried out the genetic analysis of PfCSP. MM, HJ, TTL, JL, MKM and KL contributed to blood sample collection. HGL, JMK, WMS, HJS, TSK and BKN analysed and interpreted the data. BKN designed and supervised the experiments. HGL and BKN wrote the draft of the manuscript. All authors read and approved the final manuscript.