Background
Despite nearly two decades of progress in control, malaria remains a major public health challenge with an estimated 219 million cases and 435,000 deaths in 2017 globally [1]. The mainland of Tanzania has heterogeneous transmission of mainly Plasmodium falciparum, but overall levels of malaria remain high, accounting for approximately 3% of global malaria cases [1]. However, through a combination of robust vector control and access to efficacious anti-malarial treatment, the archipelago of Zanzibar has been deemed a pre-elimination setting, having only low and mainly seasonal transmission [2]. Despite significant efforts, however, elimination has been difficult to achieve in Zanzibar. The reasons for Zanzibar’s failure to achieve elimination are complex and likely driven by several key factors: (1) as transmission decreases, the distribution of cases changes and residual transmission is more focal and mainly outdoors [3]; (2) a significant number of malaria infections are asymptomatic and thus untreated and remain a source for local transmission [4–7]; and (3) the archipelago has a high level of connectivity with the mainland, thus imported malaria through human travel may play an increasing relative role in transmission.
Genomic epidemiology can supplement traditional epidemiological measures in studies of malaria transmission and biology, thereby helping to direct malaria elimination strategies [8]. Whole-genome sequencing (WGS) can be particularly useful for understanding the history of parasite populations and movement of closely related parasites over geographical distances [9, 10]. Identity by descent (IBD), the sharing of discrete genomic segments inherited from a common genealogical ancestor, has been found to be a particularly good metric for studying the interconnectivity of parasite populations [11–13]. A major obstacle to studying IBD in microorganisms, and in particular malaria, is the presence of multiple clones in a single infection. In order to address this obstacle, recent algorithms have been developed to deconvolve multiple infections into their respective strains from Illumina sequence data [14, 15]. These advances now make it tractable to conduct population genetic analysis of malaria in regions of higher transmission, where infections are often polyclonal.
Decreases in malaria prevalence are hypothesized to be associated with increasing inbreeding in the parasite population, decreased overall parasite genetic diversity and a reduced complexity of infection (COI), defined as the number of infecting clones [8]. This has been shown in pre-elimination settings in Asia as well as in lower transmission regions of Africa [16–18]. It has not been determined if a similar reduction in diversity has occurred in Zanzibar with the significant reduction of malaria in the archipelago. In this study, WGS data were used to: (1) characterize the ancestry of parasites in the two regions, (2) determine the levels of genetic diversity and differentiation between archipelago and mainland, (3) determine patterns of relatedness and inbreeding and (4) search for signatures of adaptation and natural selection. Inferred genetic relationships were then examined for evidence of importation of parasites from the higher transmission regions of mainland Tanzania to the lower transmission regions of the Zanzibar archipelago. These findings improve understanding of how importation may affect malaria elimination efforts in Zanzibar.
Discussion
Zanzibar has been the target of intensive malaria control interventions for nearly two decades following the early implementation of artemisinin-based combination therapies (ACTs) in 2003 [2]. Despite sustained vector control practices and broad access to rapid testing and effective treatment, malaria has not been eliminated from the archipelago [2]. Here WGS of P. falciparum isolates from Zanzibar and nearby sites on the mainland was used to investigate ancestry, population structure and transmission in local parasite populations. These data place Tanzanian parasites in a group of east African populations with broadly similar ancestry and level of sequence diversity. There was minimal genome-wide signal of differentiation between mainland and Zanzibar isolates.
The most parsimonious explanation for these findings is a source-sink scenario, similar to a previous report in Namibia [47], in which importation of malaria from a region of high but heterogeneous transmission (the mainland) is inhibiting malaria elimination in a pre-elimination area (Zanzibar). The WGS data show that the parasite population on the islands remains genetically almost indistinguishable from regions on the mainland of Tanzania. Numerous long haplotypes, on the order of 5 cM, are shared between the populations, suggesting that genetic exchange between the populations has occurred within the last 10–20 sexual generations. In addition, a Zanzibar isolate was identified that is related at the half-sibling level to a group of mutually related mainland isolates. This likely represents an imported case and provides direct evidence for recent, and likely ongoing, genetic exchange between the archipelago and the mainland. These observations suggest that parasite movement from the mainland to the archipelago is appreciable and may be a significant hurdle to reaching elimination.
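The link between a shared haplotype of about 5 cM and a time scale of 10–20 generations can be illustrated with a standard approximation. This is an assumption made here for illustration and is not necessarily the calculation used in this study: two haplotypes whose common ancestor lived g generations ago are separated by 2g meioses, so the segments they share have an expected length of roughly 100/(2g) cM. The function names below are hypothetical.

```python
def expected_ibd_length_cm(generations: float) -> float:
    """Approximate mean IBD segment length (cM) for two haplotypes whose
    common ancestor lived `generations` ago (2 * generations meioses apart)."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return 100.0 / (2.0 * generations)


def generations_from_ibd_length(length_cm: float) -> float:
    """Invert the approximation: generations implied by a mean segment length."""
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return 100.0 / (2.0 * length_cm)


if __name__ == "__main__":
    for g in (10, 20):
        print(f"g = {g}: expected segment length {expected_ibd_length_cm(g)} cM")
    print("5 cM implies about", generations_from_ibd_length(5.0), "generations")
```

Under this approximation, segments of 5 cM correspond to about 10 generations and segments of 2.5 cM to about 20, which is the same order of magnitude as the range quoted above.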
Human migration is critical in the spread of malaria [48], thus the most likely source for importation of parasites into Zanzibar is through human travel to high-risk malaria regions. Multiple studies have been conducted on travel patterns of Zanzibari residents as they relate to importation of malaria [49–51], one of which estimated that there are 1.6 incoming infections per 1000 inhabitants per year. This is also in accordance with the estimate of about 1.5 imported new infections out of a total of 8 per 1000 inhabitants in a recent epidemiological study [2]. None of these studies have leveraged parasite population genetics to understand importation patterns. Though this study is small, the findings are proof of principle for using genetics to identify specific importation events. These data provide a platform for future genetic surveillance efforts by, for example, design of targeted assays for sequence variants that discriminate mainland from Zanzibari parasites. Such surveillance, including of asymptomatic individuals, would clarify the role of importation versus endemic transmission and potentially identify specific travel corridors to target for interventions. Larger sample sizes would also likely begin to reveal subtle population structure that is not obvious when examining a few dozen isolates.
Malarial infections in Africa are highly polyclonal. This within-host diversity poses technical challenges but also provides information on transmission dynamics. Approximately half of isolates from both the mainland and Zanzibar represent mixed infections (COI > 1), similar to estimates in Malawian parasites with similar ancestry [15]. It is clear that a widely-used heuristic index (Fws) is qualitatively consistent with COI estimated by haplotype deconvolution [52], but has limited discriminatory power in the presence of related lineages in the same host. Furthermore, median within-host relatedness (FIBD) is ~ 0.25, the expected level for half-siblings, in both mainland and Zanzibar populations. This strongly suggests frequent co-transmission of related parasites in both populations [40]. Estimates of FIBD are within the range of estimates from other African populations and add to growing evidence that mixed infections may be predominantly due to co-transmission rather than superinfection even in high-transmission settings [53, 54]. An important caveat of this work is its dependence on statistical haplotype deconvolution. Direct comparison of statistical deconvolution to direct sequencing of single clones has shown that methods like ‘dEploid’ have limited accuracy for phasing the minority haplotype(s) in a mixed infection. Phasing errors tend to limit power to detect IBD between infections, and may cause underestimation of between-host relatedness.
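For readers unfamiliar with the Fws index, it is commonly described as 1 - Hw/Hs, where Hw is the heterozygosity observed within a single infection and Hs is the heterozygosity of the local parasite population. The sketch below is a simplified version that takes a ratio of summed per-site heterozygosities. Published implementations typically work across allele-frequency bins, so this is an illustrative assumption and not the procedure used in this study. The function names and the input frequencies are hypothetical.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fws(within_host_freqs: Sequence[float],
        population_freqs: Sequence[float]) -> float:
    """Simplified Fws = 1 - Hw/Hs, summing heterozygosity over sites.

    within_host_freqs: per-site allele frequencies within one infection
                       (e.g. from read counts).
    population_freqs:  per-site allele frequencies in the population.
    """
    if len(within_host_freqs) != len(population_freqs):
        raise ValueError("inputs must cover the same sites")
    hw = sum(heterozygosity(p) for p in within_host_freqs)
    hs = sum(heterozygosity(p) for p in population_freqs)
    if hs == 0:
        raise ValueError("population heterozygosity is zero")
    return 1.0 - hw / hs


if __name__ == "__main__":
    population = [0.5, 0.2, 0.3, 0.4]
    clonal = [0.0, 1.0, 1.0, 0.0]   # every site fixed within the host
    mixed = [0.5, 0.2, 0.3, 0.4]    # as diverse as the population
    print("clonal infection:", fws(clonal, population))
    print("mixed infection: ", fws(mixed, population))
```

A monoclonal infection has no within-host heterozygosity and gives Fws = 1, while an infection as diverse as the whole population gives Fws = 0. Infections carrying closely related lineages fall between these extremes, which is one way to see why the index has limited power to separate them.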
Intensive malaria surveillance over the past several decades provides an opportunity to compare observed epidemiological trends to parasite demographic histories estimated from contemporary genetic data. Estimates of historical effective population size (Ne) support an ancestral population of approximately 10^5 individuals that grew rapidly around 10^4 generations ago, then underwent a sharp contraction within the past 100 generations to a nadir around 10–20 generations before the present. Stable estimates of the split time between the mainland and Zanzibar populations could not be obtained, either with a coalescent-based method (Fig. 5b) or with a method based on the diffusion approximation to the Wright-Fisher process [55]. This is not surprising given that the shape of the joint site frequency spectrum (Additional file 1: Fig. S3), summarized in low Fst genome-wide, is consistent with near-panmixia. The timing and strength of the recent bottleneck appear similar in mainland Tanzania and Zanzibar isolates and coincide with a decline in the prevalence of parasitemia. However, it should be remembered that the relationship between genetic and census population size (for which prevalence is a proxy) is complex, and other explanations may exist for the observed trends.
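To make the notion of low genome-wide Fst concrete, the sketch below computes Fst between two populations from per-site allele frequencies. The estimator behind the value reported here is not stated in this section, so the code uses Hudson's estimator in its ratio-of-averages form as an assumption. The function name, sample sizes and frequencies are hypothetical.

```python
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float],
               n1: int, n2: int) -> float:
    """Hudson's Fst across sites, as a ratio of summed numerators and
    summed denominators.

    p1, p2: per-site allele frequencies in populations 1 and 2.
    n1, n2: number of sampled haplotypes in each population (must be > 1).
    """
    if len(p1) != len(p2):
        raise ValueError("inputs must cover the same sites")
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two haplotypes per population")
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for sampling noise.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        den += a * (1 - b) + b * (1 - a)
    if den == 0:
        raise ValueError("no between-population diversity at these sites")
    return num / den


if __name__ == "__main__":
    # Near-identical frequencies: Fst close to zero (can be slightly negative).
    print(hudson_fst([0.5, 0.3, 0.1], [0.5, 0.3, 0.1], n1=40, n2=40))
    # Fixed difference at a single site: Fst equals one.
    print(hudson_fst([1.0], [0.0], n1=40, n2=40))
```

Values near zero, including small negative values caused by the sampling correction, indicate that the two samples behave like draws from a single population, which is the pattern described above.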
Finally, this paper makes the first estimates of the distribution of fitness effects (DFE) in P. falciparum. Although the impact of selection on genetic diversity in this species has long been of interest in the field, previous work has tended to focus on positive selection associated with resistance to disease-control interventions. The DFE is a more fundamental construct that has wide-ranging consequences for the evolutionary trajectory of a population and the genetic architecture of phenotypic variation [56]. Purifying selection is pervasive, but most new alleles (~ 75%) are expected to have sufficiently small selection coefficients that their fate will be governed by drift. The proportion of new mutations expected to be beneficial (the “target size” for adaptation) is small, on the order of 1–2%. Together these observations imply that even in the presence of ongoing human interventions, patterns of genetic variation in the Tanzanian parasite population are largely the result of drift and purifying selection rather than positive selection. It should be noted that these conclusions are based on the core genome and may not hold for hypervariable loci thought to be under strong selection such as erythrocyte surface antigens. Furthermore, the complex lifecycle of Plasmodium species departs in important ways from the assumptions of classical population-genetic models [57]. The qualitative impact of these departures on the conclusions is hard to determine.