Background
Malaria is caused by infection with the protozoan parasite
Plasmodium spp. and is responsible for approximately half a million deaths annually. Most of the mortality occurs among children under 5 years of age [
1], and progress in control has recently stalled [
2]. Malaria pathogenesis is characterised by a complex interplay between an antigenically diverse parasite and a constantly evolving immune response in the host. Initial exposure often leads to disease, but subsequent repeated exposures lead to the development of partially protective, non-sterile immunity [
3‐
5]. There is mounting evidence that repeated clinical episodes of malaria result in substantial modification of the host immune system.
P. falciparum (
Pf) infection has been shown to stimulate T regulatory cells [
6,
7] and to significantly alter the phenotype and function of a number of other immune cell populations including dendritic cells [
8], conventional B [
9,
10] and T lymphocytes [
11,
12] and γδ T cells [
13]. In line with this, some
Pf proteins bind the inhibitory receptor LILRB1 found on NK and B cells [
14].
The consequences of such immune modification have not been studied extensively; however, it is interesting to note that a number of vaccine candidates have demonstrated much-reduced efficacy when tested in malaria-endemic populations as compared to malaria-naïve populations [
15,
16]. Although the precise mechanism of this is not fully understood, it suggests that complex interactions between malaria and the immune system affect the ability to elicit appropriate immune responses upon challenge. Whether such immune modification persists in the absence of parasitaemia (steady state) is also not known.
Here, we examined healthy uninfected children living in an endemic area who had been under active surveillance for clinical malaria for 8 years and had experienced either high or low numbers of clinical episodes (relative to the population average). We took a multi-dimensional approach, combining whole blood transcriptomic, cellular and plasma cytokine analyses, to provide a comprehensive description of the effect of repeated episodes of clinical malaria on the steady-state immune system of children living in an endemic area. This design cannot establish a causal relationship between malaria episodes and any immune modification, since differences could reflect inherent immunological differences that predispose certain individuals to more episodes. Nevertheless, this study represents a necessary first step in furthering our understanding of the complexity of malaria immune responses.
Materials and methods
Study population
The participants for this study were drawn from two previously described cohorts of children who had been under active weekly surveillance for 8 years [
17,
18]. The Junju cohort is in an area of moderate malaria transmission with a
Pf prevalence of approximately 30% [
15,
17] during the rainy season, while the Ngerenya cohort is in an area where malaria transmission has fallen and remained at almost zero since 2004 [
18]. As described elsewhere [
19,
20], children were visited every week by field workers (themselves living within the local community) to detect malaria-associated fevers; the field workers were also available to assess any fevers occurring between weekly visits. Any child with an axillary body temperature of greater than 37.5 °C was tested for
Pf parasitaemia by rapid diagnostic test and confirmed by microscopic examination of thin and thick blood smears stained with 10% Giemsa. A clinical episode of malaria was defined as body temperature above 37.5 °C with
> 2500 parasites per microlitre of blood.
For our analysis, 42 children of similar age (7–10.5 years) were selected from Junju (under active surveillance since 2007) and assigned to one of two categories, “low” or “high”, according to their number of past clinical episodes. An additional 27 age-matched children who had never had clinical malaria (naïve) were selected from Ngerenya (under active surveillance since 1989), where malaria transmission has remained very low since 2004. The low group consisted of children who had fewer than 5 recorded episodes of malaria, while the high group had between 8 and 18 recorded episodes. A single blood sample was taken from each child and processed as described below. All 69 children were genotyped to confirm that none carried the sickle cell trait (haemoglobin AS genotype), a well-characterised polymorphism associated with resistance to malaria infection [
21]. All 69 children were also determined to be negative for
Pf (microscopy and PCR) and had not had a clinical episode within the 110 days prior to sampling.
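The case definition and grouping rules above can be summarised in a short sketch. The function names, the cohort labels as strings and the `None` return value for children who fall into none of the three groups are illustrative assumptions, not part of the study protocol.

```python
from typing import Optional

FEVER_THRESHOLD_C = 37.5        # axillary temperature cut-off from the text
PARASITAEMIA_THRESHOLD = 2500   # parasites per microlitre of blood


def is_clinical_episode(temperature_c: float, parasites_per_ul: float) -> bool:
    """Apply the case definition: fever above 37.5 C with > 2500 parasites/uL."""
    return (temperature_c > FEVER_THRESHOLD_C
            and parasites_per_ul > PARASITAEMIA_THRESHOLD)


def episode_group(n_episodes: int, cohort: str) -> Optional[str]:
    """Assign a child to 'naive', 'low' or 'high' following the selection rules.

    Returns None for children who match none of the three groups
    (for example, Junju children with 5 to 7 recorded episodes).
    """
    if cohort == "Ngerenya" and n_episodes == 0:
        return "naive"
    if cohort == "Junju":
        if n_episodes < 5:
            return "low"
        if 8 <= n_episodes <= 18:
            return "high"
    return None
```

For example, `episode_group(12, "Junju")` returns `"high"`, whereas a Junju child with 6 episodes is not assigned to either study group.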
Sample collection
Five millilitres of blood was drawn from each child by venesection in March 2015 prior to the start of the major malaria transmission season. One millilitre was immediately placed in a Tempus tube (Thermo Fisher Scientific) and stored for downstream transcriptomic analysis. The remaining blood was transported within 2 h of collection to the laboratory where 200 μL was aliquoted for flow cytometry and 100 μL aliquoted for real-time PCR (to assess Pf status), and the remaining sample was centrifuged to separate the plasma which was stored at − 20 °C.
PCR analysis
For PCR analysis, DNA was first extracted from 30 μL of whole blood using a QIAxtractor machine (QIAGEN, Hilden, Germany). The DNA was eluted in 100 μL, from which 5 μL of DNA were amplified by quantitative PCR. This was done using a TaqMan assay for the
Pf multicopy 18S ribosomal RNA genes, as described elsewhere [
22], except that we used a modified probe (5′-FAM-AACAATTGGAGGGCAAG-NFQ-MGB-3′). We used an Applied Biosystems 7500 Real-Time PCR System with quantification by Applied Biosystems 7500 software v2.0.6. Samples were analysed in singlet wells. Three negative control wells and 7 serial dilutions of DNA extracted from in vitro parasite cultures were included as standards on each plate in triplicate. Plates failing quality control standards were repeated. The lower limit of accurate quantification of this method is 10 parasites/mL within the PCR eluate. By assessing 1/20 of 30 μL of blood with a gene target present on 3 chromosomes, the method has a theoretical limit of detection of 4.5 parasites/μL of whole blood, compared with a sensitivity of 50 parasites/μL for thick blood films. PCR standards were monitored through internal quality assurance and use of external quality control standards.
The formol-ether concentration method was used to prepare samples for the detection of helminths or their eggs by microscopy.
Flow cytometry
Two hundred microlitres of whole blood was mixed with a cocktail of monoclonal antibodies specific for human immune cell surface markers. The cocktail consisted of antibodies against CD3, CD4, CD8, CD14, CD16, HLA-DR, CD11c, CD45RO, CD45RA, TCR γδ, CD56, CD19 and CD303 as well as a live/dead stain (see Additional file
1: Table S1 for antibody conjugation information). After staining for 30 min at 4 °C, erythrocytes were lysed using BD FACS Lysing Solution (BD Biosciences, San Jose, CA). Cells were washed and re-suspended in 200 μL of 1× PBS and analysed on a BD Fortessa flow cytometer (BD Biosciences, San Jose, CA) acquiring at least 200,000 leukocyte events per sample. Given the size of the study and the need to limit time between sample collection and FACS analysis, sample collection and FACS were performed in batches over a number of days, with appropriate single-colour controls acquired on each day. All FACS data were however analysed together once all the samples had been collected. Initial compensation and manual gating analysis were performed using FlowJo (FlowJo LLC, Ashland, OR).
Unsupervised FACS analysis
Flow cytometry data was analysed using the integrated analysis pipeline Cytofkit, available as an open-source R/Bioconductor package [
23]. Briefly, fcs files containing all live gated, singlet events from each participant were imported, the expression values of each marker extracted from each fcs file and the extracted data transformed using “automatic logicle transformation”. Expression matrices from all fcs files were then combined into a single matrix, by sampling up to 10,000 events from each fcs file. Dimensionality reduction was performed using the Barnes-Hut variant of the t-SNE algorithm [
24], and cellular subsets were identified using the clustering method proposed by Rodriguez and Laio [
25]. Individual clusters were then manually annotated using a heatmap displaying the median intensity values per cluster for every marker. This heatmap was used to identify each cluster’s defining markers and designate each cluster as a previously described population or unknown population. For each cellular population, we performed a Kruskal-Wallis test between the three groups of children. For populations with a significant difference, we performed a post-hoc Dunn’s test between each pair of groups.
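The two-stage testing procedure (an overall Kruskal-Wallis test followed by pairwise Dunn’s tests) can be sketched with the Python standard library alone. This is an illustrative re-implementation, not the code used in the study: the function names are hypothetical, the overall p value uses the closed-form chi-square tail that holds only for exactly three groups, and the pairwise p values are returned without any multiple-comparison adjustment, since the text does not state which adjustment (if any) was applied.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_with_dunn(groups):
    """groups maps a group name to a list of measurements (three groups)."""
    names = list(groups)
    if len(names) != 3:
        raise ValueError("this sketch assumes exactly three groups")
    pooled = [v for name in names for v in groups[name]]
    n = len(pooled)
    ranks = average_ranks(pooled)
    tie_sum = sum(c ** 3 - c for c in Counter(pooled).values())

    mean_rank, size, start, rank_term = {}, {}, 0, 0.0
    for name in names:
        size[name] = len(groups[name])
        group_ranks = ranks[start:start + size[name]]
        mean_rank[name] = sum(group_ranks) / size[name]
        rank_term += sum(group_ranks) ** 2 / size[name]
        start += size[name]

    # Kruskal-Wallis H statistic with the standard tie correction.
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    h /= 1 - tie_sum / (n ** 3 - n)
    p_overall = math.exp(-h / 2)  # chi-square tail for 2 degrees of freedom

    # Dunn's test: z statistic on the difference in mean ranks per pair.
    pairwise = {}
    base_var = n * (n + 1) / 12 - tie_sum / (12 * (n - 1))
    for a, b in combinations(names, 2):
        se = math.sqrt(base_var * (1 / size[a] + 1 / size[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        pairwise[(a, b)] = 2 * (1 - NormalDist().cdf(abs(z)))
    return h, p_overall, pairwise
```

In practice a maintained statistics library would normally be preferred; the sketch is only meant to make the order of operations explicit.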
Plasma cytokine analysis
One hundred microlitres of plasma from each participant was submitted to Eve Technologies (Calgary, Canada) for analysis using the Human Cytokine/Chemokine 65-plex Discovery Assay. This multiplex assay is based on the Millipore MILLIPLEX cytokine array and is designed to detect and quantify the levels of the following cytokines: EGF, eotaxin, FGF-2, Flt-3 ligand, fractalkine, G-CSF, GM-CSF, GRO, IFN-α2, IFN-γ, IL-10, IL-12 (p40), IL-12 (p70), IL-13, IL-15, IL-17A, IL-1ra, IL-1α, IL-1β, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IP-10, MCP-1, MCP-3, MDC (CCL22), MIP-1α, MIP-1β, PDGF-AA, PDGF-AB/BB, RANTES, TGFα, TNF-α, TNF-β, VEGF, sCD40L, Eotaxin-2, MCP-2, BCA-1, MCP-4, I-309, IL-16, TARC, 6CKine, eotaxin-3, LIF, TPO, SCF, TSLP, IL-33, IL-20, IL-21, IL-23, TRAIL, CTACK, SDF-1α+β, ENA-78, MIP-1d and IL-28A. Cytokine levels were parameterised as log fluorescence and compared across the three groups (naïve, low-episode and high-episode) using a Kruskal-Wallis test. Post-hoc Dunn’s tests were performed on cytokines with significant differences.
RNA isolation and library preparation
Tempus/blood mix (1 mL blood with 6 mL Tempus solution) was thawed on ice for 1 h and transferred into a 50-mL Falcon tube. Next, 2 mL of ice cold 1×PBS was added to the samples followed by the addition of 3 mL chilled 100% ethanol. Samples were immediately vortexed for 30 s and then spun down at 15,000 rcf for 60 min at 0 °C. After centrifugation, the supernatant was removed and the emptied tubes blotted on clean absorbent paper to remove the remaining foam. No cell debris pellet was visible within the tube. Next, the cells were lysed by adding 200 μL of freshly prepared lysis/TCEP solution (Perfect Pure kit, 5’PRIME) to the pellet and vortexed immediately for 1.5 min. RNA isolation was performed using the Perfect Pure kit, following the manufacturer’s instructions, and eluted in 40 μL of nuclease-free water. Globin mRNA was depleted from the total RNA using the GLOBINclear kit (Ambion). Indexed libraries were then generated using the KAPA Stranded mRNA-Seq Kit (Roche) on an automated platform with 10 cycles of PCR amplification.
RNA sequencing
Seventy-five samples, comprising 6 replicates of a single European sample (batch controls), 27 samples from naive children and 42 samples from exposed children, were sequenced in a single multiplexed pool using 5 lanes (75 bp PE) of a HiSeq 2500 (Illumina). The reads were combined across lanes for each sample but not across runs and mapped using Kallisto v0.42.3 [
26]. As a reference, we used all cDNA sequences from the GRCh38 human genome. Read counts per gene were calculated by summing over their transcripts. Genes with fewer than 10 read counts in at least 2 samples were removed. Sequence data has been deposited in the European Genome-phenome Archive (EGA)—accession number EGAS00001003167.
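The gene-level summarisation and filtering steps can be sketched as follows. The filtering sentence above is ambiguous, so this sketch assumes one reading: a gene is kept when it reaches at least 10 counts in at least 2 samples. The function names and the dictionary-based data layout are hypothetical.

```python
from collections import defaultdict


def gene_counts(transcript_counts, transcript_to_gene):
    """Sum estimated transcript counts into per-gene counts for one sample."""
    totals = defaultdict(float)
    for transcript, count in transcript_counts.items():
        totals[transcript_to_gene[transcript]] += count
    return dict(totals)


def filter_genes(counts_by_sample, min_count=10, min_samples=2):
    """Keep genes with >= min_count reads in >= min_samples samples.

    counts_by_sample maps a sample name to a {gene: count} dictionary.
    This threshold interpretation is an assumption (see the lead-in).
    """
    genes = set()
    for counts in counts_by_sample.values():
        genes.update(counts)
    kept = []
    for gene in sorted(genes):
        n_passing = sum(
            1 for counts in counts_by_sample.values()
            if counts.get(gene, 0) >= min_count
        )
        if n_passing >= min_samples:
            kept.append(gene)
    return kept
```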
Differential expression analysis
Differential expression analysis was performed using DESeq2 [
27] version 1.16.1. The raw RNA-seq counts are modelled as a negative binomial distribution while explicitly normalising for library size.
p values were adjusted for multiple comparisons using the Benjamini-Hochberg correction (false discovery rate (FDR)).
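For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be written in a few lines. This is a generic sketch of the standard step-up procedure, not the DESeq2 implementation; the function name is illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values
    # never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A gene is then called differentially expressed at a given FDR when its adjusted p value falls below that threshold.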
Modular analysis
We applied modular analysis [
28,
29] to our RNA-seq data to ask whether any patterns would distinguish the low- and high-episode groups of children (see study population above). We used previously described clusters (modules) of genes that were co-regulated across nine different transcriptomic data sets obtained from patients with a variety of immune conditions [
28‐
32]. Analogous to previously described methods [
33], we calculated modular over/underexpression (\( s_M \)) as:
$$ {s}_M=100\frac{1}{\mid M\mid}\sum \limits_{i\in M}D\left({g}_{i,a},{g}_{i,b}\right) $$
where
$$ D\left({g}_{i,a},{g}_{i,b}\right)=\left\{\begin{array}{ll}\operatorname{sign}\left({\mu}_{i,a}-{\mu}_{i,b}\right)& \mathrm{if}\ p\left({g}_{i,a},{g}_{i,b}\right)<0.05\\ {}0& \mathrm{otherwise}\end{array}\right. $$
For each gene \( i \) within a module \( M \), we performed a Mann-Whitney test and calculated the p value (\( p \)) between child groups \( a \) and \( b \). Here, \( M \) is the set of genes in a module, and \( \mid M\mid \) is the number of genes in that module. Child categories include naïve, low number of episodes and high number of episodes. If the test yielded a p value < 0.05, then the sign of the difference in median rlog values (\( \mu \)) was added to \( s_M \) (sign is a function that returns −1 for negative numbers, 0 for 0, and +1 for positive numbers). The list of genes in the modules was obtained from a previously published report [29].
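The module score can be computed directly from per-gene group medians and Mann-Whitney p values. The sketch below assumes those quantities have already been calculated and are supplied as dictionaries keyed by gene; the function and argument names are hypothetical.

```python
def sign(x):
    """Return -1 for negative numbers, 0 for 0 and +1 for positive numbers."""
    return (x > 0) - (x < 0)


def modular_expression(median_a, median_b, p_values, alpha=0.05):
    """Compute s_M for one module.

    median_a, median_b: {gene: median rlog expression} in groups a and b.
    p_values: {gene: Mann-Whitney p value between groups a and b};
              its keys define the genes in the module.
    """
    genes = list(p_values)
    if not genes:
        raise ValueError("module contains no genes")
    total = sum(
        sign(median_a[g] - median_b[g])
        for g in genes
        if p_values[g] < alpha
    )
    return 100.0 * total / len(genes)
```

A score of +100 would mean every gene in the module is significantly higher in group a, and −100 that every gene is significantly lower.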
In a recent study, a modular transcriptional repertoire analysis was used to find markers for malarial immunity following an RTS,S study [
34]. In contrast to modular expression, which describes changes over entire categories of children, we also defined modular response (\( r \)) for individuals as:
$$ {r}_{c,M}=100\frac{1}{\mid M\mid}\sum \limits_{i\in M}\operatorname{sign}\left({g}_{i,c}-{\mu}_i\right) $$
where \( r_{c,M} \) is the response of child \( c \) in module \( M \), \( \mid M\mid \) is the number of genes in module \( M \), \( g_{i,c} \) is the rlog gene expression of gene \( i \) in child \( c \), and \( \mu_i \) is the median gene expression of gene \( i \) in high and low malaria episode children. We then performed a Mann-Whitney test of response rates for each module between high and low malaria episode children.
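The per-child modular response follows the same pattern, with each child compared against the pooled median of the high- and low-episode children. In this sketch the data layout (dictionaries keyed by gene) and the function names are assumptions made for illustration.

```python
from statistics import median


def sign(x):
    """Return -1 for negative numbers, 0 for 0 and +1 for positive numbers."""
    return (x > 0) - (x < 0)


def modular_response(child_expression, reference_expression):
    """Compute r_{c,M} for one child and one module.

    child_expression: {gene: rlog expression in child c} for genes in M.
    reference_expression: {gene: list of rlog values across all high- and
                           low-episode children}, used to obtain mu_i.
    """
    genes = list(child_expression)
    if not genes:
        raise ValueError("module contains no genes")
    total = sum(
        sign(child_expression[g] - median(reference_expression[g]))
        for g in genes
    )
    return 100.0 * total / len(genes)
```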
Cellular deconvolution
We performed cellular deconvolution to identify cell-specific gene expression profiles. We learned the gene expression profiles from the LM22 set of genes previously used to deconvolve cell populations from microarray data [
35]. To prepare the data for deconvolution, we manually gated cell populations to mirror those used to generate the LM22 gene set. Deconvolution was performed on gene expression measured in transcripts per million (TPM), as has been previously advocated for RNA-seq measurements [
36]. For each gene, we performed deconvolution over seven cell types determined manually as illustrated in Additional file
2: Figure S1 (NK, neutrophil, B cell, CD8+ T cell, CD4+ T cell, γδ T cell, monocytes) and three child categories (universal (all samples), not naïve (high+low), high). Since deconvolving small populations could be more error-prone, we limited our analysis to the seven cell categories that were present in a significant proportion of the children.
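As a reminder of what the TPM unit represents, the definition can be written out explicitly. Kallisto reports TPM values itself, so this sketch is purely illustrative; the function name and the use of per-gene effective lengths as input are assumptions.

```python
def transcripts_per_million(counts, effective_lengths):
    """Convert read counts to TPM.

    counts: {feature: estimated read count}
    effective_lengths: {feature: effective length in bases}
    """
    # Length-normalised rate for each feature.
    rates = {f: counts[f] / effective_lengths[f] for f in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all counts are zero")
    # Rescale so that the values sum to one million within the sample.
    return {f: rate / total * 1e6 for f, rate in rates.items()}
```

Because TPM values sum to the same total in every sample, they are comparable across samples as relative abundances, which is the property a linear mixture model relies on.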
For RNA expression of each gene as measured by TPM, \( \mathbf{y} \), we fit a profile (\( \mathbf{t} \)) to the fraction of sub-cell types (\( \mathbf{F} \)) measured in children. The sub-cell types are separated into three distinct categories: universal (U), not naïve (N) and high episodes (H), and arranged into a matrix as \( \mathbf{F}=\left[\mathbf{F}(U),\mathbf{F}(N),\mathbf{F}(H)\right] \). The universal fraction, \( \mathbf{F}(U) \), is the fraction of cells measured for each child. The sum of the cell fractions for a child was less than 1, since not all cell events were categorised as a recognisable immune cell. The subsequent terms \( \mathbf{F}(N) \) and \( \mathbf{F}(H) \) are variations of the universal fraction, defined as:
$$ {F}_{c,i}(P)=\left\{\begin{array}{c}{F}_{c,i}(U)\kern0.5em \mathrm{if}\ c\in P\ \\ {}\begin{array}{cc}0& \mathrm{otherwise}\end{array}\end{array}\right. $$
where \( P \) is a set of children in a category, \( c \) is an individual child and \( i \) indexes the cell type. We modelled the gene expression as the linear set of equations \( \mathbf{Ft}=\widehat{\mathbf{y}} \). For each gene, we fit a profile with lasso penalty as:
$$ {\mathrm{argmin}}_{\mathbf{t}}{\left(\mathbf{Ft}-\mathbf{y}\right)}^T\left(\mathbf{Ft}-\mathbf{y}\right)+\lambda {\left\Vert \mathbf{t}\right\Vert}_1 $$
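To make the structure of the fit concrete, the sketch below builds one row of the block design matrix \( \mathbf{F}=\left[\mathbf{F}(U),\mathbf{F}(N),\mathbf{F}(H)\right] \) and solves the standard lasso problem (squared error plus \( \lambda \) times the L1 norm of \( \mathbf{t} \)) by cyclic coordinate descent. The study itself used scikit-learn; this dependency-free version is an illustrative substitute with hypothetical function names, not the authors' code. Note that scikit-learn's `Lasso` scales the squared-error term by 1/(2n), so its `alpha` is not numerically identical to \( \lambda \) here.

```python
import math


def design_row(fractions, not_naive, high):
    """Build [F(U), F(N), F(H)] for one child from its cell fractions."""
    zeros = [0.0] * len(fractions)
    universal = list(fractions)
    return (universal
            + (universal if not_naive else zeros)
            + (universal if high else zeros))


def soft_threshold(x, threshold):
    """Shrink x towards zero by threshold, clipping at zero."""
    return math.copysign(max(abs(x) - threshold, 0.0), x)


def lasso_coordinate_descent(F, y, lam, n_iter=500):
    """Minimise (Ft - y)^T (Ft - y) + lam * ||t||_1 over t."""
    n, p = len(F), len(F[0])
    t = [0.0] * p
    residual = list(y)  # equals y - F t while t is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [F[i][j] for i in range(n)]
            norm_sq = sum(v * v for v in col)
            if norm_sq == 0.0:
                continue  # empty column: coefficient stays at zero
            # Correlation of column j with the residual excluding feature j.
            rho = sum(col[i] * (residual[i] + col[i] * t[j]) for i in range(n))
            new_tj = soft_threshold(rho, lam / 2.0) / norm_sq
            delta = new_tj - t[j]
            if delta:
                for i in range(n):
                    residual[i] -= col[i] * delta
                t[j] = new_tj
    return t
```

Under this layout, a non-zero coefficient in the \( \mathbf{F}(H) \) block for a given cell type indicates expression in that cell type that is specific to high-episode children, over and above the universal and not-naïve contributions.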
We chose the lasso penalty (\( \lambda \)) that maximised the tenfold cross-validated coefficient of determination (i.e. \( R^2 \)) to find non-zero cell-specific profiles. This was implemented in Python using scikit-learn [37]. For this lasso penalty, we then performed a Bayesian lasso fit to obtain z-scores for the non-zero cell-specific profiles. The model’s parameters were inferred using MCMC [38]. As further controls, we performed this deconvolution on simulated data. In one set, child RNA-seq measurements were scrambled. The resulting number of positive results was used to estimate false discovery rates.
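The scrambling control can be illustrated with a minimal sketch. The text does not give the exact estimator, so the ratio used here (hits in scrambled data divided by hits in the observed data) is an assumption, as are the function names and the fixed random seed.

```python
import random


def scramble_children(expression, seed=0):
    """Shuffle one gene's expression values across children.

    This breaks any link between expression and the children's cell
    fractions while preserving the distribution of expression values.
    """
    rng = random.Random(seed)
    shuffled = list(expression)
    rng.shuffle(shuffled)
    return shuffled


def empirical_fdr(n_hits_scrambled, n_hits_observed):
    """Estimate the FDR as scrambled hits over observed hits (assumed form)."""
    if n_hits_observed == 0:
        return 0.0
    return min(1.0, n_hits_scrambled / n_hits_observed)
```

Running the full deconvolution on the scrambled values and counting the non-zero profiles gives `n_hits_scrambled`; the same count on the real data gives `n_hits_observed`.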
Gene set enrichment analysis
Gene set enrichment analysis [
39] was performed on the list of genes identified as altered in cell-specific signatures following deconvolution using the Molecular Signatures Database (MSigDB) [
40], querying Gene Ontology terms, Reactome [
41,
42] and KEGG [
43].
Discussion
In this multi-dimensional assessment of the association between repeated malaria infections and immune phenotype, we combined data from whole blood transcriptomic analysis, multi-parameter flow cytometry, multiplex plasma cytokine analysis and active malaria surveillance to identify the immunological features associated with clinical malaria experience. We observed subtle but detectable differences in gene expression between children who had experienced a high number of episodes and those who had experienced fewer. High-episode children showed increased expression of genes involved in immune activation and regulation, with modular analysis revealing enrichment of genes involved in responses to type I and II interferons. The transcriptomic signature of enhanced immune activation in high-episode children is supported by our findings that levels of IL-10 and numbers of a subset of γδ T cells are significantly higher in these children than in low-episode children. Through cellular deconvolution of the transcriptomic data, we found that high-episode children may have transcriptionally altered CD8+ T cells, B cells and neutrophils.
Notably, we observed a modular transcriptional signature that differs between high- and low-episode children. High-episode children were characterised by higher expression of three modules containing interferon-inducible genes. These three modules (M1.2, M3.4 and M5.12) are part of the transcriptional signature associated with protection of malaria-naïve adults following the administration of the RTS,S malaria vaccine [
34]. They have also been shown to become sequentially activated in systemic lupus erythematosus (SLE) patients [
30] and form part of the transcriptional signature associated with the trivalent influenza vaccine [
29]. While module M1.2 is enriched for genes induced by IFN-α, modules M3.4 and M5.12 can also be driven by IFN-β and IFN-γ [30]. This suggests a role for both type I and type II interferons in shaping the immune system within high-episode individuals. Cellular immunity to malaria is typically thought to involve IFN-γ produced by Th1 CD4+ T cells; however, both type I and II interferons have been implicated in the immune response to malaria. Type I interferons, produced by a number of cell types following malaria infection [
31,
32,
45‐
47], have been implicated in regulating CD4+ T cell responses and promoting the differentiation of IL-10-producing Tr1 cells [
48], which are known to be significantly expanded in highly exposed children [
49]. This immunoregulation is thought to reflect an attempt by the immune system to limit inflammation-induced immunopathology but comes at the cost of limiting anti-parasite immunity and may interfere with the induction of robust vaccine-induced immunity.
Inflammatory innate and adaptive immune responses are crucial for parasite clearance; however, these effector functions can result in significant immunopathology without appropriate regulation [
50,
51]. IL-10 plays a crucial role in modulating the inflammatory response during malaria [
51], and it is notable that even in non-parasitaemic children, of the 65 cytokines measured in plasma, IL-10 was the only cytokine observed at significantly different levels between high- and low-episode individuals. While a number of different cell types produce IL-10, a major source in malaria infections is CD4+ T cells that co-produce IFN-γ and IL-10 (Tr1 cells) [
49]. These cells are prevalent in children living in endemic areas, and the IL-10 they produce has been shown to inhibit malaria-specific pro-inflammatory cytokine production [
52]. Though we do not address the cellular source of IL-10 in this study, we found that increased plasma levels of IL-10 were significantly associated with increased expression of the “interferon-inducible” signature in high-episode children, in keeping with the known role of interferons in inducing the development of Tr1 cells [
48].
At the cellular level, a subset of γδ T cells (population 24/10) was significantly expanded in high- relative to low-episode children. γδ T cells are activated during malaria, but their function in anti-malarial immunity remains unclear. These cells have been shown to expand during acute malaria infection in previously naïve individuals [
53,
54] and can produce inflammatory cytokines including TNF-α and IFN-γ [
55] in addition to being able to directly kill merozoites in vitro [
56,
57]. More recently, a subset of γδ T cells has been shown to be associated with protection following irradiated sporozoite vaccination [58]. In this study, we found that a specific subset of γδ T cells, expressing CD11c, accumulates in high-episode children. While not previously reported in the context of malaria, CD11c+ γδ T cells have been described as a highly activated subset with enhanced effector function and high migratory potential [
59].
Deconvolution analysis integrating cellular proportions (as determined by flow cytometry) with transcriptomic data allowed us to infer altered gene expression profiles in neutrophils, CD8+ T cells and B cells in high-episode children. Very little is known about the role of neutrophils in malaria, although neutrophils isolated from
Pf-infected children in the Gambia were shown to temporarily exhibit reduced effector function until about 8 weeks after infection [
60]. Our results suggest that repeated episodes of malaria result in the development of an activated neutrophil phenotype that persists even in the absence of detectable infection.
Our finding of high levels of B cell expression of genes including TNF receptor superfamily member 13B (
TNFRSF13B), a receptor found on the surface of B cells, responsible for regulating humoral responses and survival of plasma cells [
61], is in line with studies demonstrating that repeated exposure to malaria is necessary for the development of appropriate humoral responses [
3‐
5]. Furthermore, our finding that B cells from high-episode children also express high levels of IgE supports previous studies showing increased plasma IgE levels in individuals living in high transmission settings [
62,
63]. We also revealed a clear expansion of CD11c+ B cells in malaria-experienced children. Atypical B cells (a population that includes CD11c+ B cells) have previously been identified at high frequencies among individuals living in malaria-endemic regions [9]. While these cells were completely missing in naïve children, we did not observe significant differences in CD11c+ B cell numbers between the high- and low-episode groups. This suggests that although malaria most certainly leads to the initial expansion of these cells, they may not accumulate with subsequent episodes.
More unexpected was our finding that increased malaria experience results in more activated CD8+ T cells. CD8+ T cells have clear roles in the immune response to pre-erythrocytic stages of infection [64, 65] and have been implicated in mediating pathology in a murine model of cerebral malaria [66, 67]. There is evidence that CD8+ T cells specific to blood-stage antigens are activated via cross-presentation by dendritic cells [68] and may indirectly promote immunity through secretion of IFN-γ [11, 69]. It is interesting to note that a recent study in Western Kenya has described the expansion of an unconventional innate-like CD8+ T cell population in children living in an area of high parasite burden [70]. While future studies will be needed to confirm the presence of this cellular subset among our cohort, this study provides further evidence of transcriptional alteration of CD8+ T cells in the context of malaria exposure.
In our study, participants were selected on the basis of the number of preceding episodes of malaria within the past 8 years; however, as would be expected from an epidemiological standpoint, these children also differ in two other important respects. Although none of the participants in either group had experienced an episode of malaria in the 110 days before sampling, there was a significant difference in the calculated exposure indices and in the time since the last episode between the low- and high-episode groups. While the modular transcriptional signature we observed does not appear to correlate with time since the last episode (Spearman correlations: M1.1 = 0.027, p value = 0.87; M1.2 = −0.05, p value = 0.76; M3.4 = −0.08, p value = 0.63; M5.12 = −0.13, p value = 0.43), we cannot discount the possibility that the effects we observe are due, at least in part, to the more recent immunological stimulation in the high-episode group rather than the number of previous episodes per se. Carefully designed longitudinal studies would be required to disentangle the contributions of these and other parameters to the development of a malaria immune response. These studies could prospectively relate individual pre-existing immunological status to subsequent risk of clinical infection and thus determine which immune responses are directly related to clinical protection.