Background
Malaria parasite blood-stage infections commonly contain a mixture of different haploid parasite genotypes, particularly in areas of high endemicity where superinfection frequently occurs [1]. Cross-mating and recombination between different genomes of parasites occurring together in a vector mosquito blood meal is, therefore, most frequent in highly endemic areas, whereas in areas of low endemicity inbreeding may be common as most infections contain single or highly related genotypes [1–6]. In an experimental model of malaria using Plasmodium chabaudi in mice, multiple genotype infections have been associated with apparent short-term evolution of virulence [7], alterations to parasite sex ratio and production of gametocytes [8, 9], and effects on the immune clearance rate [10]. If such processes occur in human malaria, they might also impact on drug resistance evolution [3, 11].
Previous analyses of within-host diversity of P. falciparum, the causative agent of most human malaria cases globally [12], have typically involved genotyping a small number of highly polymorphic gene loci [13], or multiple putatively neutral microsatellite marker loci [1]. These have demonstrated wide variation in the genotypic complexity of infections among geographical populations of P. falciparum, which inversely correlates with local levels of multilocus linkage disequilibrium [1, 14–17]. Further dissection of the within-host diversity of P. falciparum infections has been performed using genome-wide single nucleotide polymorphism (SNP) data, showing that a high degree of relatedness is seen among some distinct parasite clones within infections, in comparison to those sampled from separate infections [4, 18]. This illustrates that a multiple genotype P. falciparum infection may comprise a mixture of closely related, non-identical parasites, or multiple genetically unrelated parasites, or it may be a complex mixture of both.
The relative proportions of SNP alleles in whole-genome sequence data from an infection can be estimated from the proportions of reads mapping to a reference sequence, and this allows computation of a within-isolate fixation index, Fws [19, 20]. This index compares within-host diversity (‘w’) to that which exists in the overall local parasite population (or sub-population, ‘s’). It has a possible range from zero (when the sample from an infection contains all possible diversity) through to 1.0 (when the sample from an infection contains no sequence diversity), and in a mixed infection its value is strongly influenced by the relative proportions of the genotypes present.
Here, the within-host diversity of P. falciparum within a highly endemic population in West Africa has been characterized using two distinct methods. Firstly, microsatellite PCR-based genotyping was performed with a panel of ten loci widely distributed in the parasite genome, and then whole-genome sequence data from the same samples were used to analyse SNPs in order to compute the Fws indices. The relationship between these different types of estimates is examined, and the use of small numbers of SNPs to derive Fws indices is also illustrated, so that the potential value of SNP genotyping may be considered when whole-genome data are not obtainable.
Discussion
A combination of microsatellite locus typing and genome-wide estimation of SNP-based allele frequencies has been used here to characterize P. falciparum diversity within clinical infections. The results indicate high transmission intensity in the Guinean population studied, and are consistent with previous microsatellite data from other samples taken locally [16]. Interestingly, the genome-wide SNP data here indicate that more than half of all infections are each composed predominantly of a single genotype, whereas microsatellite genotyping detected additional genotypes within infections. Microsatellite typing allows the sensitive detection of distinct parasite genotypes present at low proportions within an infection, although cloning or single-cell analysis of isolates would be needed to estimate the degree of relatedness among the different parasites [4, 18, 24].
Random sampling of the genome-wide SNP data shows that the within-isolate Fws fixation indices may be estimated from modest numbers of SNPs, and that these estimates correlate with the indices derived from genome-wide data. Therefore, to estimate genotypic mixedness of isolates without whole genome sequencing, it may be feasible to quantify alleles of between ten and 20 SNPs with other genotyping tools, particularly focusing on SNPs with high overall minor allele frequencies. It is preferable that these SNPs should be neutral, so that estimates are not biased by selection acting on the parasite.
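The subsampling idea can be sketched as follows. This is an illustrative outline only, again assuming Fws = 1 − Hw/Hs with heterozygosity 2p(1 − p); the function names, the minor allele frequency threshold of 0.1, and the fixed random seed are hypothetical choices for the example and are not taken from the analysis reported here.

```python
import random
from typing import List, Sequence


def fws_from_freqs(within_freqs: Sequence[float],
                   pop_freqs: Sequence[float]) -> float:
    """Simplified Fws = 1 - Hw/Hs from per-SNP allele frequencies (assumed form)."""
    h_within = sum(2.0 * p * (1.0 - p) for p in within_freqs)
    h_population = sum(2.0 * p * (1.0 - p) for p in pop_freqs)
    if h_population == 0.0:
        raise ValueError("selected SNPs are not polymorphic in the population")
    return 1.0 - h_within / h_population


def subsampled_fws(within_freqs: Sequence[float],
                   pop_freqs: Sequence[float],
                   n_snps: int,
                   min_maf: float = 0.1,  # hypothetical threshold for "high" MAF
                   seed: int = 0) -> float:
    """Estimate Fws from a random subset of SNPs with high population MAF."""
    eligible: List[int] = [
        i for i, p in enumerate(pop_freqs) if min(p, 1.0 - p) >= min_maf
    ]
    if len(eligible) < n_snps:
        raise ValueError("not enough SNPs pass the minor allele frequency filter")
    rng = random.Random(seed)  # seeded so that the draw is reproducible
    chosen = rng.sample(eligible, n_snps)
    return fws_from_freqs([within_freqs[i] for i in chosen],
                          [pop_freqs[i] for i in chosen])


# Toy population frequencies for 30 SNPs (made-up values for illustration).
population = [0.1 + 0.01 * i for i in range(30)]

# A single-genotype infection: every SNP is fixed for one allele.
unmixed = subsampled_fws([0.0] * 30, population, n_snps=15)

# An infection whose allele proportions equal the population frequencies.
fully_mixed = subsampled_fws(population, population, n_snps=15)
```

Repeating the draw with different seeds and comparing each subsampled value with the genome-wide index for the same isolate would give a sense of how much precision is lost at a given number of SNPs.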
Understanding processes affecting different parasite genotypes within an infection could offer insight into mechanisms that are clinically relevant. Genome sequencing allows broad or deep sampling of diversity within infections [20, 25], but resolution of individual parasite clone genotypes is currently achievable only through either extensive limiting dilution cloning [4] or single cell genome analysis [18]. Previous studies have shown that the proportions of clones in the peripheral blood of infected humans vary over time, and can show marked differences between successive days [26]. It is possible that some clones within an infection exist at low proportions due to competitive suppression by other P. falciparum genotypes, or specific selection by the host due to immunity or receptor polymorphisms. Further dissection of the patterns of parasite genotypic diversity in clinical isolates, and possible interactions between genotypes, may lead to novel understanding of malaria parasites which will be relevant for disease control and potential future elimination [6, 27].
Conclusions
This study shows that estimates of genotypic complexity of malaria parasite infections using very different methods give correlated and complementary information. The within-infection fixation index Fws yields a standardized inverse measure (within-infection diversity being 1 − Fws) which may be derived from genome-wide short read sequence data if these are available, or alternatively can be estimated from a modest number of randomly sampled SNPs which could be genotyped by other methods. Multilocus microsatellite PCR-based genotyping gives estimates of infection complexity that correlate strongly with those from the SNP analyses, while also being more sensitive in detecting additional genotypes in some infections that appear to have unmixed sequences. With a wide range of methods now available, studies can choose genotyping and analytical approaches to suit investigational goals, recognizing the relative advantages of each in relation to the costs and available resources.
Authors’ contributions
VAM and DJC conceived and designed the study. VAM, EL and KML collected the samples and performed laboratory assays. DPK supported data analysis training for VAM and genome sequence data management. LM, VAM, CWD, SAA, and DJC performed data analysis and interpretation. LM, VAM and DJC wrote the manuscript. All authors read and approved the final manuscript.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.