Background
Plasmodium vivax is the most widely spread human malaria parasite, infecting 22 million people each year outside of sub-Saharan Africa.
Plasmodium vivax has a firm hold in South and Southeast Asia, where it accounts for 80 % of the total estimated cases from three countries: India, Indonesia and Pakistan [
1]. Although
P. vivax malaria is acute and excruciating, it is considered a benign form of tertian fever with fewer complications compared with
Plasmodium falciparum [
2]. However, recent evidence has indicated the occurrence of complicated vivax malaria cases, especially in Asia [
3,
4]. The problem is further compounded by the emergence of drug resistance in vivax malaria [
5‐
12]. The partial effectiveness of drug treatments and subsequent recurrent infections contribute to severe anaemia from vivax malaria [
13].
In India, high endemicity in conjunction with the dense population contributes to the burden of vivax malaria, leading to vast social and economic consequences. Recent reports indicated equal prevalence of both
P. vivax and
P. falciparum in India [
14]. Despite its prevalence, the lack of knowledge regarding global endemicity is a major hindrance to controlling vivax malaria [
15]. Investigating the parasite population structure for genetic polymorphisms could aid in understanding the role of genetic diversity in malaria transmission [
16] and is essential for the control and elimination of malaria [
17].
Despite the high prevalence of
P. vivax, few studies have investigated the genetic diversity of natural
P. vivax populations in India. The
P. vivax genome exhibits greater diversity compared with
P. falciparum [
18] and displays high levels of antigenic polymorphisms, which indicates the presence of sophisticated mechanisms to evade the human immune system [
19]. However, recurrent infections that arise from recrudescence, re-infection or relapses make it difficult to interpret the results of clinical
P. vivax studies that aim to determine the efficacy of treatment strategies.
Elucidating the genetic diversity of Indian
P. vivax isolates using antigenic markers could augment the understanding of the biology and epidemiology of
P. vivax. Additionally, understanding host genetics and environmental conditions will further aid in developing strategies for the effective control of malaria. To date, a number of single-copy antigenic genetic markers have been tested and reported for Indian isolates [
20‐
24].
The gene of the
P. vivax merozoite surface protein-3 alpha (
Pvmsp-
3α) is one of the most commonly characterized molecular markers in
P. vivax genotyping studies.
Pvmsp-
3α is a potentially important vaccine candidate expressed on the merozoite surface of the parasite. It encodes a protein that has three blocks of an alanine-rich domain containing heptad repeats and is predicted to form α-helical coiled-coil tertiary structures [
25]. Sequence polymorphism is concentrated in the central domain and can comprise numerous point mutations and large insertions and deletions [
26,
27]. These polymorphisms confer antigenic diversity in
Pvmsp-
3α; however, the alanine-rich heptad repeats, which are predicted to form an intramolecular coiled-coil, are conserved [
28]. The
msp3 paralogs from
P. vivax show weak similarity to
msp3 gene family members in
P. falciparum [
29]. Although both the PfMSP-3 and PvMSP-3 proteins contain blocks of alanine-rich heptad repeats followed by an acidic C-terminus [
30], there is no evidence that the msp3 family of
P. falciparum is homologous to that of
P. vivax [
28]. In India, this polymorphic marker has been characterized in Kolkata, which is a hypo-endemic region [
20], and in Chennai, which exhibits variable endemicity [
21], with extensive polymorphisms reported in the single-copy gene and its encoded protein. High genetic diversity in this locus has also been reported in other Asian countries, including Korea [
31], Nepal [
32], Pakistan [
33,
34], Bangladesh [
35], Sri Lanka [
36] and Thailand [
37].
Studying the pattern of variation and distribution of
Pvmsp-
3α polymorphisms in different blocks of MSP-3α among isolates of South Asia and around the world will aid in understanding the evolutionary mechanisms underlying variation patterns [
38]. Studies on genetic variation in different regions with malaria have revealed numerous alleles and specific variants in different MSP-3α blocks. Additionally, the allelic forms in different blocks were observed in diverse populations worldwide [
39‐
43]. PvMSP3α is composed of four regions: an N-terminal signal sequence, a polymorphic alanine-rich repeat region (block I), a less variable region (block II) and an acidic C-terminus. Deletion within the block I region is the basis of size polymorphism in
msp-
3α, whereas in block II the variations were clustered around two structural motifs (motif I: MSELEK/LSKLEE and motif II: TAANVVKD/KEATAAKL) [
26,
38,
44]. Among different populations, block II was found to be relatively conserved, with both synonymous and non-synonymous mutations. Synonymous mutations were seen at low frequency and were population specific; however, non-synonymous mutations were extensively shared among different parasite populations [
38]. Restriction fragment length polymorphism (RFLP) analysis was used to determine the diversity of this gene using sequenced representative isolates. Patterns and levels of genetic diversity provide insight into population structure and aid in testing for balancing or negative selection acting on this gene.
Analysing
P. vivax population structure is fundamental to understanding the role of genetic diversity in the transmission of malaria. Moreover, knowledge of the magnitude of the genetic polymorphisms within
P. vivax populations is an important element for the development of strategies to effectively control malaria [
45].
Pvmsp-
3α is a potential antigen for vaccine development, as indicated by a study of small children in a malaria-endemic region, Papua New Guinea (PNG) [
46]. Variation in the
msp-
3α allelic patterns of
P. vivax in India provides fundamental knowledge for inferring
P. vivax population structure and therefore information that can be used to help design MSP-3-based malaria vaccines.
In this paper the extent of allelic and sequence diversity in Pvmsp-3α in field isolates collected from different geographical regions of India with varied malaria transmission patterns was investigated. To accomplish this, isolates from different endemic areas were collected and amplified by nested polymerase chain reaction (PCR) and analysed using PCR-RFLP analysis.
Discussion
Malaria remains the most devastating global human parasitic infection, and in 2015, about 214 million malaria cases and an estimated 438,000 malaria deaths were reported [
52]. Additionally, the incidence of malaria in India accounted for 58 % of cases in the South Asian region [
1].
P. vivax is the most geographically widespread human malaria parasite, and it annually accounts for 70–80 million clinical cases throughout tropical and sub-tropical regions worldwide. In India,
P. vivax causes approximately 42–45 % of all malaria infections [
53].
To develop and evaluate suitable novel control strategies against this parasite, it is important to know the extent of the polymorphisms that exist in the population. Despite more than a century of efforts to eradicate or control malaria, parasite diversity and the genetic variability of the
Plasmodium parasites have made it difficult to eradicate this disease. The variability in the polymorphic markers impairs the effectiveness of antibody repertoires generated during previous infections and also hinders development and testing of new drugs and malaria vaccines [
54]. This study was carried out to further the understanding of the genetic structure and to estimate the extent of the diversity that exists in
Pvmsp-
3α among the isolates of the Indian subcontinent.
Observations in the present study revealed that
Pvmsp-
3α was a highly diverse and polymorphic marker among the field isolates in India. Analysis of 182 field isolates collected from eight different locations showed different epidemiological patterns, which indicated extensive and diverse variation. A high degree of allelic diversity previously reported by Kim et al. [
20] among the isolates of Kolkata (India), an area with low transmission of malaria, supports the highly diverse and polymorphic nature of
Pvmsp-
3α among Indian isolates. It is interesting to note that there was no significant difference in the observed level of diversity among the isolates of low and high endemic regions of the Indian sub-continent. Similar observations of high diversity among isolates of low-endemic areas in Thailand were made by Cui et al. [
45]. It is suspected that either highly virulent isolates are not present despite high diversity or the malaria control measures in such regions are more effective for restricting the incidence to a low level.
All three of the
Pvmsp-
3α PCR variants (Type A, B and C) observed worldwide were present among the field isolates, although with varied frequencies in the different geographical regions. The largest size variant (Type A) occurred at the highest frequency in all samples of natural
P. vivax populations in India, and the frequencies observed were similar to the previously reported global frequencies. The difference in the proportion of the Type A and C variants in the population might have been due to the deletions found in the smaller length variants, which might have caused a loss of fitness for the parasites that carried the variants [
26]. However, these genotypes may occur in the population to balance any fitness cost associated with large deletions within
Pvmsp-
3α [
55]. Block-I exhibits maximum sequence variability in this region, acts as a placeholder to mimic the larger length variant [
56] and also confers a selective advantage to the parasite, i.e., evasion of preexisting immune responses. The observation that block-II exhibits only slight variation in size and very little polymorphism that is limited to certain portions indicates that it plays an important role in the formation of the protein structure. The Type B and Type C variants exhibited minimal sequence diversity and differed from each other in the nucleotide sequences that form the beta helix turn.
Analysis of the isolates from eight regions of the Indian sub-continent indicated that diversity was not linked to geographical region and a high level of diversity was observed within the same region. New alleles identified in the Indian isolates demonstrated high variability among the field isolates. The high allelic diversity of
Pvmsp-
3α has also been reported in other regions of the globe, including South Asia, Southeast Asia, PNG [
46] and Latin America [
55], including Peru [
57] and Colombia [
58]. Some of the identified alleles are consistent with those from earlier reports, which indicate global distribution of certain parasite genotypes [
59]. Based on msp-3α data, the existence of a common allelic composition in different parts of the globe indicates the presence of a single random-mating population of
P. vivax across the globe with no geographical sub-structuring. However, this may not be true, as newer molecular approaches, such as microsatellite and mitochondrial studies, have been used to quantify
P. vivax
diversity and population structure [
16,
60‐
62].
HhaI digestion revealed 56 alleles; the H1 allele was the most prominent in the Indian sub-continent and was seen in the Type C variant among the Indian isolates. The H1 allele in these variants has been reported from other regions of the world, as well as from India [
21,
63,
64].
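Allelic diversity of the kind described above is often summarised as a single number. The source does not state which diversity statistic was used; as an assumption, the sketch below uses the unbiased expected heterozygosity, He = n/(n-1) × (1 − Σp²), where p is the frequency of each allele among n typed isolates. The function name and the allele calls are hypothetical and for illustration only.

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity from one allele call per isolate."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two typed isolates")
    counts = Counter(alleles)
    # Sum of squared allele frequencies (probability that two draws match).
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    # n / (n - 1) corrects for the finite sample size.
    return n / (n - 1) * (1 - sum_p2)

# Hypothetical HhaI allele calls, NOT data from this study.
calls = ["H1", "H1", "H2", "H3"]
print(round(expected_heterozygosity(calls), 3))
```

Values close to 1 indicate that almost every isolate carries a different allele, while a value of 0 means that all isolates share one allele.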
Simultaneous infection of a host by more than one strain of the same parasite is common in malaria and is partially correlated with transmission intensity levels. For
P. vivax malaria, the estimated proportion of mixed-strain infections in PNG is 65 % [
65], compared with 43 % reported in India [
66]. In recent studies in the Kolkata region of India, 10.6 % of recorded cases were multiple infections [
20]. In the present study, 8.2 % of cases were multiple infections based on PCR detection and 22 % based on PCR-RFLP of
Pvmsp-
3α sequences. RFLP analyses for detecting multiplicity of infection may not be completely reliable, as incomplete digestion can lead to spurious results. Restriction with multiple enzymes may be helpful in detecting multiple clones by RFLP. Further, microsatellite (MS) markers and SNPs are reliable and important tools for studying multiplicity of infection (MOI) of malaria parasite infections [
28,
67,
68].
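One commonly used rule for calling a multiple-clone infection from PCR-RFLP data is that the summed sizes of the restriction fragments exceed the size of the uncut PCR product. The source does not state the criterion applied in this study, so the sketch below is an assumption; the function name, the 10 % sizing tolerance and the band sizes are all hypothetical.

```python
def looks_mixed(uncut_bp, fragment_bp, tolerance=0.10):
    """Flag a sample whose summed RFLP bands exceed the uncut amplicon size.

    tolerance is a hypothetical allowance for gel sizing error (10 % here).
    """
    return sum(fragment_bp) > uncut_bp * (1 + tolerance)

# Hypothetical band sizes in bp, for illustration only.
single = looks_mixed(1900, [900, 600, 400])            # bands sum to 1900
mixed = looks_mixed(1900, [900, 600, 400, 750, 500])   # bands sum to 3150
print(single, mixed)
```

A partially digested single-clone sample also leaves extra bands and would be flagged by the same rule, which illustrates why incomplete digestion can produce spurious multiple-infection calls.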
Several studies reported extensive microsatellite diversity and high multiplicity of infection in
P. vivax in regions of moderate endemicity [
61]. The increased multiplicity of
P. vivax infection is attributable to the biological features of the
P. vivax parasite, such as early gametocytogenesis and relapse [
35]. Moreover, the multiplicity of infections is likely to facilitate genetic recombination of parasites, and the generation of novel strains [
45].
The observed high level of Pvmsp-3α diversity among the field isolates was further supported by sequencing data, which revealed that the different genotypes of the sequenced isolates were found throughout the different geographical areas. In particular, the most common variant (Type A) displayed the maximum diversity, which was expected because of the presence of an intact block-IB in this variant, which is the most variable. However, in the smaller length variants (Type B and C), block-IB was either completely or partially deleted, thereby reducing the diversity.
Twenty samples were sequenced, and the restriction sites were analysed. No two resulting
AluI or
HhaI restriction banding patterns were similar in the Type A haplotype, which indicated the extent of the variability. Most of the isolates sequenced revealed new alleles. Based on the restriction site analysis of the sequenced isolates, it can be concluded that each of the Type A haplotypes isolated was a different allele. When the restriction sites of the sequenced Type B haplotypes, which gave an identical pattern on the gel, were analysed, the restriction patterns differed slightly in band size. In contrast, restriction site analysis of the sequenced Type C fragments revealed a similar pattern, indicating reduced variability in the smaller types. The in silico restriction site analysis of the published
Pvmsp-
3α sequences from various geographical regions was carried out using restriction mapping software. The virtual restriction patterns obtained with either the
HhaI or
AluI enzymes were not analogous to isolates from a single region or close geographical regions, which indicated that higher resolution of the bands obtained by actual restriction mapping could result in numerous alleles, as was obtained from the Indian isolates. This finding signifies the importance of sequence data as a source for estimating the exact variability and its extent among isolates from other areas globally. As reported earlier, higher levels of sequence polymorphism were observed in the region closer to the central alanine-rich domain [
28,
40,
48].
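The in silico restriction analysis described above can be illustrated with a short sketch. The restriction mapping software used in the study is not named, so this is only a minimal stand-in under stated assumptions: a linear amplicon, complete digestion, and the standard recognition sites AluI (AG^CT) and HhaI (GCG^C). The function names, the `min_bp` parameter and the toy sequence are hypothetical.

```python
# Recognition site and cut offset (the cut falls after this many bases of the site).
# AluI cuts AG^CT and HhaI cuts GCG^C; both sites are palindromic.
ENZYMES = {"AluI": ("AGCT", 2), "HhaI": ("GCGC", 3)}

def digest(seq, enzyme):
    """Return fragment lengths (bp), in 5' to 3' order, from a complete digest."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)  # step by 1 so overlapping sites are found
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def rflp_pattern(seq, enzyme, min_bp=0):
    """Fragment sizes sorted largest first; min_bp drops bands too small to resolve."""
    return sorted((f for f in digest(seq, enzyme) if f >= min_bp), reverse=True)

# Toy 20 bp sequence, NOT a real Pvmsp-3α amplicon.
toy = "AAAAAGCTTTTTGCGCAAAA"
print(rflp_pattern(toy, "AluI"))
print(rflp_pattern(toy, "HhaI"))
```

Comparing the full fragment list with the list filtered by `min_bp` shows how alleles that differ only in small fragments can appear identical on a gel, which is the resolution problem raised in the text.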
Pvmsp-
3α is an essential polymorphic marker for studying the population structure of
P. vivax. Numerous studies from various regions have been carried out to identify the number of
Pvmsp-
3α alleles using the PCR-RFLP method [
20,
21,
32‐
37,
39‐
43]. Several investigators have advocated for this method as a means of identifying the variations available in this allele. However, based on the in silico digestion,
AluI results in complicated patterns and may not be a suitable enzyme for PCR-RFLP [
28]. Virtual digestion of
Pvmsp-
3α sequences revealed greater variability in terms of the fragment sizes obtained, which was not resolved easily on a gel and could result in reaching a biased conclusion regarding the number of alleles. Moreover, the results may vary depending on the electrophoresis conditions. As proposed by Rice et al. [
28], sequencing may be a better option to determine the sequence-level genetic diversity of the
Pvmsp-
3α gene. Therefore, sequencing analysis is advocated for analysing the extent of the variability generated by the parasite to evade the host immune system and impart a survival advantage to the parasite [
28,
69].
PvMSP-3α is a potentially important vaccine candidate expressed on the surface of the merozoite in the parasite [
70]. A dual role for PvMSP-3α in both immunity to and pathogenesis of malaria has been suggested [
71]. PvMSP-3α is known to elicit a pronounced antibody response against clinical malaria infections reported from PNG [
46]. Additionally, it has been established that naturally acquired antibodies against the C-terminal block-II of PvMSP3α are associated with protection from symptomatic vivax malaria [
46].
There is limited information available regarding PvMSP3α sequences from India. Sequence polymorphisms in Indian isolates were mainly clustered at the 5′ region of the marker, as was seen in other isolates worldwide. However, block-II (residues 434–687 of the Belem strain), the region of the alanine-rich core, displays less variability. In an earlier study, phylogenetic analysis based on sequence variation in 237 worldwide sequences of the PvMSP-3α block II region resulted in three robust clusters, suggesting extensive gene flow between populations. However, these clusters did not reveal any geographical structure. The phylogenetic grouping was influenced by sequence variations in two motifs (motif I: MSELEK/LSKLEE and motif II: TAANVVKD/KEATAAKL), which suggested selective pressure on these motifs [
38].
In the Belem strain, motifs I and II correspond to residues 529–534 and 576–583, respectively [
26]. Among the Indian isolates, motif I was represented by the MSELEK sequence in all twenty isolates, while motif II exhibited both sequences at different frequencies. However, one Indian isolate, from Goa, carried a recombinant type of motif II with the amino acid sequence TEANVAKL.
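The recombinant nature of a motif such as TEANVAKL can be checked residue by residue against the two reported forms of motif II. The following is a minimal sketch, not the method used in the study; the function name and the labels "recombinant-like" and "unclassified" are hypothetical.

```python
# The two reported forms of motif II (form 1 and form 2 in the output below).
MOTIF_II_FORMS = ("TAANVVKD", "KEATAAKL")

def classify_motif(observed, forms=MOTIF_II_FORMS):
    """Return (label, origin), where origin lists which form(s) match each residue."""
    if len(observed) != len(forms[0]):
        raise ValueError("motif length does not match the reference forms")
    origin = []
    for i, aa in enumerate(observed):
        hits = [str(k + 1) for k, form in enumerate(forms) if form[i] == aa]
        origin.append("/".join(hits) if hits else "-")
    if observed in forms:
        label = observed
    elif "-" not in origin:
        # Every residue is explained by at least one reference form.
        label = "recombinant-like"
    else:
        label = "unclassified"
    return label, origin

label, origin = classify_motif("TEANVAKL")
print(label, origin)
```

For TEANVAKL, each position matches form 1, form 2 or both, so the motif can be assembled entirely from residues of the two known forms, which is consistent with its description as a recombinant type.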
Comparison with published nucleotide and protein sequences in NCBI and PlasmoDB using phylogenetic trees of Pvmsp-3α revealed that most subtypes were new alleles. Two distinct clusters, one including the Belem type and the other the Salvador I type, were formed when the Indian isolates were compared with other isolates worldwide. Genotypes from different geographical areas were distributed in both clusters, which revealed no convincing evidence of geographical grouping based on this marker.