PfTRAP is one of the major sporozoite antigens and has been reported to generate a protective immune response in adults up to 7 days post-immunization, with a high T cell response, in a clinical trial in Senegal [
31]. However, high polymorphism in crucial domains plays a decisive role in the efficacy of a vaccine candidate in the field. No such study has been done on its ortholog. Thus, the objective of the present study was to genetically characterize the
pktrap gene and to study the level of genetic diversity and natural selection acting on full-length PkTRAP and on its domains in clinical isolates from Malaysia. Alignment of 40 full-length amino acid sequences of
pktrap genes from Malaysia showed that they share approximately 72.3% sequence identity with the ortholog
pvtrap. Nucleotide diversity was higher than that of the ortholog from its closest relative,
P. vivax. This higher diversity is probably due to admixture of
P. knowlesi sub-populations infecting humans in Malaysian Borneo [
13,
20]. Domain-wise analysis of PkTRAP indicated that the density of non-synonymous SNPs was higher within the proline/asparagine-rich region (SNPs = 22) than within the von Willebrand factor domain (SNPs = 14). This finding is similar to that reported for PfTRAP from Thailand [
32]. However, the ratio of non-synonymous to synonymous SNPs was highest within the von Willebrand factor domain, indicating that this region is under strong natural selection pressure. Tests of natural selection, both inter-species and intra-species (MK, Tajima’s D, and Fu and Li’s D* and F*), indicated that the von Willebrand factor domain is probably under balancing selection and might be under the influence of host immune pressure. Similar evidence of diversifying selection on PfTRAP and PvTRAP has been reported in clinical isolates from various geographical locations [
32,
33]. Based on the MK test results, the full-length PkTRAP gene also appeared to be under the influence of natural selection; however, intra-specific neutrality tests did not yield significant and reliable results (Tajima’s D = −0.38). Sliding window analysis of Tajima’s D and diversity across the von Willebrand factor domain of
pktrap showed that regions with more non-synonymous SNPs also had higher positive Tajima’s D values, indicating that these might be epitope regions under high selection pressure. Similarly, positive peaks of Tajima’s D for TRAP have been reported within the CTL epitope regions of
P. falciparum [
40]. Interestingly, the MK test did not give a significant result when
P. coatneyi was used as the outgroup (NI = 1.68, P = 0.06), probably because of the dimorphism among the
P. knowlesi sub-populations. This suggests that a larger number of samples would probably yield a significant MK test result. However, codon-based site-by-site analysis did identify five sites that could potentially be under positive/balancing selection. Since these sites lie in the region where Tajima’s D had high peaks, they could be epitope regions within the von Willebrand factor domain A; confirming this would require a larger number of sequences. Pairwise population differentiation index
FST values showed high genetic differentiation between the parasite populations originating from Peninsular Malaysia and Malaysian Borneo. These results are similar to previous findings at the genomic level as well as for specific invasion genes [
14,
20].
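The two statistics central to the discussion above, Tajima’s D and the neutrality index (NI) of the MK test, can be illustrated with a minimal Python sketch. This is not the software used in the present study; it assumes a gap-free alignment of equal-length sequences, and the function names (`tajimas_d`, `sliding_tajimas_d`, `neutrality_index`) and the toy input are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for aligned, gap-free sequences, or None if undefined."""
    n = len(seqs)
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) alignment columns.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        return None  # no polymorphism, so D is undefined

    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Normalising constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (pi - s_sites / a1) / sqrt(variance)


def sliding_tajimas_d(seqs, window, step):
    """Return (start, D) for each window of `window` columns, moved by `step`."""
    length = len(seqs[0])
    return [
        (start, tajimas_d([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds) from the four MK test counts."""
    return (pn / ps) / (dn / ds)


# Toy alignment (hypothetical data, not sequences from this study).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
print(tajimas_d(toy))
print(sliding_tajimas_d(toy, window=2, step=2))
```

In this sketch, positive D values reflect an excess of intermediate-frequency variants, as expected under balancing selection or population structure, whereas negative values reflect an excess of rare variants. An NI above 1 indicates an excess of non-synonymous polymorphism relative to fixed differences; its significance is usually assessed with Fisher's exact test on the same four counts.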
The neighbour-joining (NJ) phylogenetic tree separated the
P. knowlesi TRAP genes from Malaysian Borneo into two clusters, whereas the three laboratory lines (H-strain, Malayan strain and MR4 strain) from Peninsular Malaysia formed a third cluster. Earlier studies on
P. knowlesi blood-stage vaccine candidates such as DBPαII (PkDBPαII) [
41], PkNBPXa [
20], PkAMA1 [
42] and also a genomic study [
14] from Borneo have also reported a similar bifurcation of trees. A population genetic study based on microsatellite markers of
P. knowlesi in humans and macaques indicated that this deep dimorphism was linked to infections from the two natural hosts, the long-tailed (
Macaca fascicularis) and the pig-tailed (
Macaca nemestrina) macaques [
13], and that humans are susceptible to infection from both natural hosts. Interestingly, among the
P. knowlesi pre-erythrocytic vaccine candidates studied to date, balancing selection has been observed only in the TRAP gene, suggesting that this molecule might be under effective immune selection and could be studied as a candidate for vaccine design. Thus, further studies assessing its diversity, as well as functional studies of the immune response in patient samples, are necessary. However, a cautious approach is needed, as the extensive diversity observed in antigens [
43] could be the reason for vaccine failure in the field.