Experimental procedures for DNA sequencing of eba-175
Genomic DNA was extracted from pretreated peripheral blood samples from patients infected with P. falciparum using a QIAamp Blood Kit (Qiagen, Hilden, Germany). DNA fragments covering part of the eba-175 coding sequence (regions II and III) of P. falciparum were amplified by PCR using the following primer sets: EBA175-fragment1, 5′-ggaagaaatacttcatctaataacg-3′ (forward) and 5′-catcctttacttctggacacatcg-3′ (reverse), and EBA175-fragment2, 5′-gagactctgaaggttgaatgcaa-3′ (forward) and 5′-aggtgtattagacatatcttggtc-3′ (reverse). These primers were designed based on the eba-175 reference sequence from P. falciparum (GenBank accession no. X52524). PCR amplification was performed in a 13.0-µL reaction mixture containing 0.125 µL (0.125 µM) each of forward and reverse primers, 0.125 µL TAKARA LA Taq™ (5 units/µL), 1.25 µL 10× LA PCR™ Buffer II (Mg2+ free), 1.25 µL 25 mM MgCl2, 1.25 µL 2.5 mM dNTP mixture, 0.5 µL (5 ng) of genomic DNA template, and 8.375 µL dH2O using a GeneAmp® PCR System 9700 (Applied Biosystems, Foster City, CA, USA). The PCR cycling conditions for each primer pair were an initial denaturation of 60 s at 94°C, followed by 40 cycles of 30 s denaturation at 94°C, 30 s annealing at 56°C, and 150 s extension at 72°C, and a final extension of 5 min at 72°C. The PCR products were subsequently sequenced using an ABI Prism 3100 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). Primer sequences used for direct sequencing are available upon request. Isolates that showed multiple superimposed electropherogram peaks at a single site after direct sequencing of the PCR product, with a secondary peak greater than 30% of the predominant peak, were considered mixed infections and excluded from further analyses. Low-quality sequences (i.e., high background noise or weak signal) were also excluded. As a result, 131 sequences, each representing the single or most abundant sequence in a DNA sample, were included in the analyses.
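The mixed-infection criterion can be illustrated with a minimal Python sketch. It assumes that peak heights for the four bases have already been extracted per site from the electropherogram; the function name, the input layout, and the example values are hypothetical and are not part of the original workflow.

```python
def is_mixed_infection(peak_heights_by_site, threshold=0.30):
    """Flag an isolate as a mixed infection.

    peak_heights_by_site: iterable of per-site peak heights, e.g. one
    (A, C, G, T) tuple per site (hypothetical input layout).
    threshold: secondary/primary peak ratio above which a site counts as
    mixed (30% in the text).
    """
    for heights in peak_heights_by_site:
        ranked = sorted(heights, reverse=True)
        # Sites without a usable primary peak cannot be evaluated here.
        if len(ranked) < 2 or ranked[0] <= 0:
            continue
        # Secondary peak strictly greater than 30% of the predominant peak.
        if ranked[1] / ranked[0] > threshold:
            return True
    return False
```

For example, a site with heights (1000, 400, 0, 0) would flag the isolate, whereas (1000, 50, 10, 5) would not.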
Data analyses
The nucleotide sequences obtained were aligned and translated into putative amino acid sequences using MEGA v.5.2 [13]. To examine the phylogenetic relations among 32 distinct eba-175 Thai P. falciparum alleles and two eba-175 P. reichenowi alleles (CBXM000000000 and AJ251848), a maximum likelihood (ML) tree was constructed based on the Hasegawa–Kishino–Yano model [14]. To obtain the ML tree, a nearest-neighbor-interchange (NNI) search was applied. In addition, a neighbor-joining (NJ) [15] tree was generated using 194 eba-175 partial region II sequences from P. falciparum worldwide and two P. reichenowi sequences (CBXM000000000 and AJ251848), based on the Nei–Gojobori (NG) model [16] and the Jukes–Cantor (JC) correction [17]. The construction of phylogenetic trees and estimation of best-fit substitution model for the ML tree were implemented in MEGA v.5.2 [13]. All the positions containing insertions/deletions were eliminated from the analyses (complete deletion option). Branch support values were computed by bootstrap analyses with 1,000 replications. A network of 194 P. falciparum and two P. reichenowi eba-175 alleles was also constructed based on synonymous substitutions using the neighbor-net method [18] in SplitsTree4 ver. 4.13.1 [19].
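Two steps in this paragraph, complete deletion of gapped positions and the JC correction, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: gaps are taken to be encoded as "-", the helper names are hypothetical, and the correction is shown for a generic proportion of differences p (in the NG framework the same formula is applied to the synonymous and nonsynonymous proportions). The analyses themselves were run in MEGA.

```python
import math

def complete_deletion(alignment, gap="-"):
    """Drop every alignment column in which any sequence has a gap."""
    kept = [col for col in zip(*alignment) if gap not in col]
    # Rebuild one string per sequence from the retained columns.
    return ["".join(chars) for chars in zip(*kept)] if kept else ["" for _ in alignment]

def jukes_cantor(p):
    """JC-corrected distance d = -(3/4) * ln(1 - 4p/3) for a proportion p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

With p = 0.1, the corrected distance is about 0.1073, slightly larger than the observed proportion because multiple hits at the same site are accounted for.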
The time to the most recent common ancestor (tMRCA) of the eba-175 alleles from Thai P. falciparum was estimated from the linearized tree based on synonymous substitutions among the 32 distinct eba-175 alleles using MEGA v.5.2 [13]. The neutral substitution rate was calculated assuming that P. falciparum and P. reichenowi diverged 6 million years ago (MYA) [5, 20]. In addition, the MCMCTree program in the PAML 4.8 package [21] was used to estimate tMRCA based on the amino acid sequences. The minimum and maximum age constraints on the root age (the divergence time between P. falciparum and P. reichenowi) were set to 5 and 7 MYA, respectively. The tMRCA estimation was based on the WAG model [22] for amino acid substitutions. In the MCMC process, sampling occurred every 100 generations for 10,000 generations and the first 50,000 generations were discarded as burn-in.
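The calibration arithmetic behind the linearized-tree estimate can be sketched as follows. This is a simplified illustration, not the MEGA procedure itself: it assumes a strict clock, a between-species synonymous distance, and a node height read from the linearized tree, and both the function names and the numbers in the usage note are hypothetical.

```python
def neutral_rate(ds_between_species, divergence_time_years):
    """Substitutions per synonymous site per year per lineage.

    The factor 2 reflects that divergence accumulates along both
    lineages since the species split.
    """
    return ds_between_species / (2.0 * divergence_time_years)

def tmrca_years(node_height, rate):
    """Age of a node whose height (substitutions per site) is read from
    a linearized tree, given a per-lineage rate."""
    return node_height / rate
```

With an illustrative between-species distance of 0.12 and a divergence time of 6 million years, the rate would be 1e-8 per site per year, and a node height of 0.002 would correspond to a tMRCA of about 200,000 years.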
To detect signatures of natural selection, the number of nonsynonymous substitutions per nonsynonymous site (dN) and synonymous substitutions per synonymous site (dS) were estimated for all pairs formed by the 32 distinct Thai P. falciparum alleles, 16 sequences from chimpanzee Plasmodium spp., 11 from gorilla Plasmodium spp., and four from P. reichenowi, using the NG model with the JC correction in MEGA v.5.2 [13]. The significance of the difference between dN and dS was assessed with the Wilcoxon signed-rank test. For all the 131 Thai P. falciparum isolates, the numbers of nonsynonymous substitutions per nonsynonymous site (πN) and synonymous substitutions per synonymous site (πS) were also calculated in the same manner as dN and dS. In addition, the McDonald–Kreitman (MK) test [23] was performed to detect signatures of natural selection using DnaSP v5 software [24]. Tajima’s D test [25] was performed for the 131 Thai P. falciparum eba-175 sequences using DnaSP v5 software [24], where the test statistic was analytically calculated. A two-sided P value of less than 0.05 was considered statistically significant.
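As an illustration of one of these statistics, Tajima's D can be computed from an alignment with the standard published formula. The sketch below is a plain-Python rendering under simplifying assumptions (equal-length sequences, no gaps or ambiguous bases, every polymorphic column counted as one segregating site); it is not the DnaSP implementation, and the function name is hypothetical.

```python
import math
from itertools import combinations

def tajimas_d(seqs):
    """Tajima's D for a list of aligned, gap-free sequences.

    Returns None when the statistic is undefined (fewer than four
    sequences or no segregating sites).
    """
    n = len(seqs)
    # S: number of segregating (polymorphic) sites.
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if n < 4 or s == 0:
        return None
    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima's variance derivation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # D contrasts pi with Watterson's estimator S / a1.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants (for example, only singletons) yields a negative D, whereas an excess of intermediate-frequency variants yields a positive D.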
A Wu–Kabat plot was used to estimate the level of amino acid variability for the 32 distinct Thai P. falciparum eba-175 alleles [26]. The Wu–Kabat plot estimates the level of variability at each amino acid position in the sequence alignment, measured as the number of different amino acids observed at that site divided by the frequency of the most common amino acid at that site.
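The per-site calculation can be sketched in Python under the standard Wu–Kabat definition (number of different residues at a site divided by the relative frequency of the most common residue at that site). The function name is hypothetical, and the sketch assumes a gap-free amino acid alignment of equal-length sequences.

```python
from collections import Counter

def wu_kabat_variability(alignment):
    """Return the Wu-Kabat variability for each column of an alignment."""
    n = len(alignment)
    scores = []
    for column in zip(*alignment):
        counts = Counter(column)
        k = len(counts)  # number of different amino acids at this site
        # Relative frequency of the most common amino acid at this site.
        top_freq = counts.most_common(1)[0][1] / n
        scores.append(k / top_freq)
    return scores
```

A fully conserved site scores 1, the minimum; for the toy alignment ["AK", "AR", "AK", "AE"], the second site has three residue types and a top frequency of 0.5, giving a variability of 6.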