Background
Malaria is one of the most common infectious diseases and an important public health problem worldwide. In 2016, an estimated 216 million malaria cases and 445,000 deaths occurred worldwide, and nearly half of the world’s population, living in 91 countries and territories, was at risk of malaria transmission [1]. The majority of malaria cases and deaths (~90%) occur in sub-Saharan Africa. Plasmodium falciparum is the most prevalent malaria parasite in sub-Saharan Africa, while Plasmodium vivax is the most widespread human malaria parasite, with approximately 2.5 billion people at risk of infection worldwide [2]. Plasmodium vivax is a major cause of anaemia in areas where P. falciparum and P. vivax co-exist [3]. Relapses play an important role in the transmission of P. vivax in malaria-endemic areas [4]. With the scaling up of interventions since 2006, primarily mass distribution of insecticide-treated nets (ITNs), indoor residual spraying (IRS), and artemisinin-based combination therapy (ACT), malaria transmission has declined markedly in the past decade [5].
Knowledge of the extent of genetic diversity and multiplicity of infection (MOI) is essential for understanding malaria epidemiological patterns, transmission intensity, the host immune system, and parasite virulence, both for the development of anti-malarial vaccines and for evaluating the impact of malaria control interventions. For example, MOI has been used to infer disease epidemiology, such as measuring parasite clearance rates after anti-malarial treatment [6], and to examine the level of anti-malarial drug resistance [7, 8], the impact of transmission intensity on infection complexity [9], parasite virulence related to anti-malarial vaccine development [10, 11], and the in-host ecology of malaria infections [12]. Traditional PCR-based methods for estimating MOI, such as microsatellite [13, 14] and merozoite surface protein (msp) genotyping [14-17], can lack both sensitivity and specificity, leading to underestimation of infection complexity [18-20]. Compared to genotyping methods, amplicon deep sequencing provides a rapid, robust, high-throughput approach to detect sequence variants and estimate allele frequencies by sequencing a genomic region many times, sometimes hundreds or even thousands of times [21]. For example, ultra-deep sequencing of amplicons from ribosomal, mitochondrial, and apicoplast-encoded genes revealed highly complex coinfections with an unexpectedly high MOI in Plasmodium ovale and Plasmodium malariae infections in endemic areas of Gabon [22]. The use of length-polymorphic genes such as msp2 in amplicon deep sequencing has been shown to provide greater sensitivity in detecting minority clones [23].
Using pvmsp1 short amplicon deep sequencing, Lin et al. [15] identified 67 unique haplotypes from 78 Cambodian P. vivax samples, with an average MOI of 3.6 per individual. Over half of the recurrent infections were identified as relapses. Compared to standard PCR-based methods, next-generation sequencing revealed up to sixfold higher MOI in Plasmodium infections [12, 21]. This technology has unquestionably advanced our understanding of the genetics and evolution of multiclonal infections. However, in previous studies, most amplicon deep sequencing was performed on two platforms, 454/Roche or Ion Torrent, which have high error rates and short reads owing to technological limitations. By contrast, the Illumina MiSeq/HiSeq platforms can generate reads of up to 600 bp in length with a lower sequencing error rate.
Plasmodium merozoite surface protein 1 (msp1) is a highly abundant antigen and the most polymorphic one, and it has been extensively studied in parasite populations [24-26]. In Plasmodium falciparum, msp1 (pfmsp1) has seven variable blocks that are separated by either conserved or semi-conserved regions. Variable block 2 of pfmsp1 is the most polymorphic region of the antigen [27]. In Plasmodium vivax, msp1 (pvmsp1) has nine variable regions that are separated by 10 interspecies-conserved or intraspecies-conserved blocks [28]. Variable block 18, located in the 42 kDa region of pvmsp1, has been identified as the most polymorphic part of the antigen [11]. These polymorphic regions could be good candidate markers for detecting multiclonal Plasmodium infections.
The present study was designed to address the following questions: (1) How useful is amplicon ultra-deep sequencing for determining the multiplicity of Plasmodium infection and identifying P. vivax relapse? (2) Does the multiplicity of Plasmodium infection differ among patients with different symptoms, ages, and genders, or across time and transmission settings? (3) Has intensified intervention since 2006 affected MOI? To address the first question, P. vivax amplicons of different lengths and microsatellites were compared for MOI and relapse estimation. For the second and third questions, different groups of P. falciparum- and P. vivax-infected patients were compared.
Discussion
Multiplicity of infection (MOI), also termed complexity of infection (COI), is defined as the number of different parasite strains co-infecting a single host. MOI can be a useful indicator of immune status and transmission level. Traditionally, MOI was assessed by PCR genotyping of antigen genes (msp1, msp2, and glurp) and microsatellite markers, which were regarded as the gold standard because of their high polymorphism [22]. However, these methods cannot distinguish sequence variation among parasite strains or detect minority clones within a host. With next-generation amplicon deep sequencing, minority clones can be detected at within-host frequencies as low as 0.5% [6, 15]. In the present study, the Illumina HiSeq platform combined with the Rapid SBS Kit v2 generated reads of up to 500 bp with high coverage depth (~35,000× for P. vivax and ~100,000× for P. falciparum). Compared to a previous study by Lin et al. that sequenced a 117-bp fragment of pvmsp1 by short amplicon deep sequencing [15], longer amplicon sequencing captured a greater number of polymorphisms, revealed a higher MOI, and improved the power to detect multiclonal infections. Interestingly, when microsatellite markers were used on the same parasite population, multiclonal infections were detected in only 5.2% of the samples, with an average MOI of 1.07 [39], significantly lower than that estimated by longer amplicon sequencing (mean MOI of 2.16). One possible reason is that microsatellite analysis missed polyclonal infections in some of the tested samples. This contrast suggests that transmission intensity may not be low. Together with the high level of relapse identified in the present study, it points to a potentially much larger P. vivax reservoir that sustains continual transmission and makes elimination challenging.
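The way MOI follows from haplotype read counts can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the sample data, and the choice to apply the 0.5% detection limit quoted above as a minimum within-host frequency are all assumptions made for illustration.

```python
# Illustrative sketch only: names, data, and the threshold rule are hypothetical.

def estimate_moi(haplotype_reads, min_freq=0.005):
    """Count haplotypes whose within-host read frequency is at least min_freq.

    haplotype_reads maps a haplotype label to its read count in one sample.
    Returns (MOI, {haplotype: frequency}) for the retained haplotypes.
    """
    total = sum(haplotype_reads.values())
    if total == 0:
        return 0, {}
    kept = {
        hap: count / total
        for hap, count in haplotype_reads.items()
        if count / total >= min_freq
    }
    return len(kept), kept


def summarize(samples, min_freq=0.005):
    """Return (mean MOI, proportion of multiclonal samples) across samples."""
    mois = [estimate_moi(reads, min_freq)[0] for reads in samples.values()]
    mean_moi = sum(mois) / len(mois)
    multiclonal = sum(moi > 1 for moi in mois) / len(mois)
    return mean_moi, multiclonal


# Made-up read counts: H3 in S1 sits at 0.4% and falls below the 0.5% cut-off.
samples = {
    "S1": {"H1": 9000, "H2": 960, "H3": 40},
    "S2": {"H1": 5000},
}
mean_moi, multiclonal = summarize(samples)
print(mean_moi, multiclonal)
```

In this toy example S1 retains two haplotypes and S2 one, so the summary depends directly on where the frequency cut-off is placed, which is why the threshold has to be justified for each platform and study.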
The complexity of infection has been suggested to be associated with age and symptoms in Plasmodium infections [55-58]. However, in this study, no significant difference in P. vivax MOI was found between symptomatic and asymptomatic infections, between adults and children, or between males and females. Similar patterns have been reported in other studies [59, 60]. In western Kenya, no notable difference was detected in the multiplicity of P. falciparum infection among asymptomatic school children in low transmission areas (highland) and in high transmission areas (lowland) over 10 years. However, within the high transmission areas (lowland), a significant difference in MOI was detected between locations (Kombewa vs Kendu Bay). Temporal changes in the complexity of P. falciparum infections may vary with transmission intensity, and our findings indicate that the level of multiclonal infection may have remained steady over time in high transmission areas. Several studies have reported correlations between clinical symptoms and higher MOI [60-68], while others found no association [69-71]. Some studies reported that multiclonal infections were associated with a reduced risk of clinical malaria [72-74], while others reported that infections with a single clone or with very common genotypes are more likely to progress to severe malaria than multiclonal infections [70, 75]. A positive association between the proportion of polyclonal infections and parasite prevalence has been observed in parasite populations from Indonesia [76] and Papua New Guinea [77], while other studies found no association or a negative correlation between the rate of polyclonal infections and parasite prevalence [77, 78]. In Ethiopia, reported malaria cases numbered 2.6 million in 2011 and 2.2 million in 2015; however, the proportion of P. falciparum increased by 5% from 2011 to 2015 (Zhou, unpublished data), indicating a relatively weak reduction in transmission. In our study areas in Kenya, malaria parasite prevalence in school children increased from 40 to 45% in the lowlands and decreased from 16 to 13% in the highlands between 2011 and 2015, which also indicates only minor changes in transmission in these areas [37].
Long amplicon deep sequencing of msp1 offers a sensitive tool to detect relapse, define multiclonal-infected samples, and elucidate within-host genetic diversity and parasite relationships among infections [12, 79-82]. In the present study, a close genetic relationship was found among P. vivax clones within hosts: within-host variation explained less than 30% of the total variance, with the remainder attributable to differences between hosts (Table 3). This result suggests that pvmsp1 haplotypes are more genetically similar within than between hosts. A similar pattern was also observed in P. falciparum [83]. The close relatedness among parasite strains within a host could be a result of frequent recombination and/or selection for drug-resistant strains. Further investigation is needed to understand the mechanisms generating within-host diversity.
Table 3
Analysis of molecular variance (AMOVA) of P. vivax infections using pvmsp1 deep sequencing
Source of variation | d.f. | Sum of squares | Mean squares | Estimated variance | Variation (%) | P value |
Among individuals | 83 | 2276.02 | 27.42 | 8.36 | 70.7 | < 0.001 |
Within individuals | 157 | 543.45 | 3.46 | 3.46 | 29.3 | |
Total | 240 | 2819.47 | | 11.83 | 100 | |
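The arithmetic behind Table 3 can be checked with a short sketch. It assumes the standard AMOVA layout, in which each mean square is the sum of squares divided by its degrees of freedom and each percentage is a variance component divided by the sum of the components; the function names are hypothetical, and only the numbers come from the table.

```python
# Illustrative check of Table 3; function names are hypothetical.

def mean_square(sum_of_squares, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df


def percent_of_variation(components):
    """Express each variance component as a percentage of their sum."""
    total = sum(components.values())
    return {name: round(100 * value / total, 1) for name, value in components.items()}


ms_among = round(mean_square(2276.02, 83), 2)    # among individuals
ms_within = round(mean_square(543.45, 157), 2)   # within individuals

# Variance components as published (rounded to two decimals in the table).
components = {"among": 8.36, "within": 3.46}
shares = percent_of_variation(components)

print(ms_among, ms_within, shares)
```

The published components sum to 11.82 rather than the tabulated total of 11.83, a difference that is consistent with rounding of the individual components to two decimals.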
In this study, long amplicon deep sequencing of the highly polymorphic markers pvmsp1 and pfmsp1 allowed minority clones to be detected in multiclonal infections. However, the study has several limitations: (1) only a small polymorphic genomic region was amplified, so genome-wide variants were not covered; (2) the threshold for haplotype cluster calls needs to be determined empirically in each study, because sequencing error rates vary among sequencing platforms and computational strategies; (3) PCR slippage might occur in early PCR cycles at the microsatellite repeat unit of the length-polymorphic pfmsp1 marker, which would inflate the frequency of minority clones; (4) only a subset of samples had replicate PCR, whereas excluding PCR or sequencing errors is better served by running all samples in duplicate and including appropriate controls in each study to help confirm that no false calls are made; and (5) a lower percentage of reads was clustered in our clinical samples than in the laboratory strain (> 99%), suggesting that DNA template quality is important. The last point can be improved by removing host DNA with an enzyme-based DNA degradation method that selectively digests and depletes human DNA contamination from malaria clinical samples [84]. Another limitation is the lack of a mixed-infection positive control, especially for pfmsp1 with its different product fragment lengths.
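One way replicate PCR could be used to guard against false haplotype calls is sketched below. This is a hypothetical rule, not the procedure applied in this study: a haplotype is retained only if it reaches an assumed minimum frequency in both replicates, and the function name, the 0.5% default, and the read counts are illustrative.

```python
# Hypothetical replicate-concordance filter; not the authors' procedure.

def concordant_haplotypes(rep1, rep2, min_freq=0.005):
    """Return haplotypes reaching min_freq in BOTH PCR replicates.

    rep1 and rep2 map haplotype labels to read counts for the same sample.
    """
    def frequencies(counts):
        total = sum(counts.values())
        if total == 0:
            return {}
        return {hap: n / total for hap, n in counts.items()}

    f1, f2 = frequencies(rep1), frequencies(rep2)
    return sorted(
        hap for hap in f1
        if f1[hap] >= min_freq and f2.get(hap, 0.0) >= min_freq
    )


# Made-up counts: H3 appears in only one replicate and is therefore dropped.
replicate_a = {"H1": 900, "H2": 90, "H3": 10}
replicate_b = {"H1": 950, "H2": 50}
print(concordant_haplotypes(replicate_a, replicate_b))
```

Such a filter trades sensitivity for specificity: a genuine minority clone that amplifies in only one replicate would be discarded, so the rule would need to be validated against mixed-infection controls.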
Authors’ contributions
DZ and GY conceived and designed the study. DZ, EL, and EH performed the experiments. DY, EL, YA, HA, and AG conducted sample collection and field supervision. DZ, EL, GZ, ML and XW analysed the data. DZ, EL, GZ, YA, DY, XW, EH and GY contributed to writing and refining the manuscript. All authors read and approved the final manuscript.