Background
Vector management, involving a wide array of interventions, is the primary means of malaria prevention and control in Africa [1, 2]. Malaria modelling, both mathematical and agent-based, can play an important role in quantifying the effects of malaria-control interventions and in answering other research questions. Models can play key roles in selecting appropriate combinations of interventions to interrupt transmission and in setting response timelines and expectations of impact. Mathematical modelling of malaria transmission dates back to the early models of Ross and Macdonald [3, 4]. Recent mathematical models include a dynamic model of Smith and McKenzie [5], a weather-driven parasite dynamics transmission model of Hoshen and Morse [6], an individual-based model of Depinay et al. [7], the OpenMalaria epidemiology model [8, 9], an intervention-based model of Yakob and Yan [2] and others.
Agent-based models (ABMs) of malaria have also been used to model the basic behaviour of individual mosquitoes, including interactions among agents and with their environment. These interactions, involving a large number of agents, provide opportunities to explore interesting emergent phenomena, such as population-level characteristics. Recent malaria ABMs include models of Gu and Novak [10, 11], a transmission-directed model of Eckhoff [12] and an individual-based simulation model of Griffin et al. [13]. A summary comparing model features from some recent malaria models is given in Table 1.
Table 1
Summary of feature comparisons from several malaria models, including new features modelled by this study
Model type | agent-based | mathematical | individual-based | mathematical | agent-based |
Spatial representation | landscape-based | N/A | space can be represented as a lattice of points | N/A | landscape-based |
Automation of landscape generation (e.g., using separate tools) * | no | N/A | no | N/A | yes (VectorLand) |
Boundary type of landscape * | absorbing | N/A | N/A | N/A | non-absorbing |
Average of multiple simulations * | no | no | no | no | yes |
Time-step resolution | daily | daily | hourly | daily | hourly |
Age-specific mortality | no | no | N/A | no | |
Daily mortality rate (immature stages) | fixed, 0.2 | fixed, 0.15 | temperature-dependent | N/A | age-specific (for larvae) |
Daily mortality rate (adult stages) | fixed, 0.2 | fixed, 0.15 | adult life expectancy of 10 days | N/A | age-specific |
Fecundity (eggs/oviposition) | fixed, 80 | N/A | fixed, 100 | N/A | N (170, 30) |
Variability in daily temperature | no | no | yes | yes | yes (25°C) |
Length of individual simulation run | 200 days for LSM, 300 days for ITNs | N/A | > 6 years | N/A | 1 year |
Interventions modelled | LSM, ITNs | LSM, ITNs | IRS, ITNs, larvicides, space spraying | ITNs, IRS | LSM, ITNs |
Time-step of intervention application | day 100 for LSM, day 150 for ITNs | N/A | N/A | N/A | day 100 |
Explores combined interventions | no | yes | yes | yes | yes |
Variability in human population | no | yes | no | yes | yes |
Coverage scheme used for ITNs * | proportion of households with bed nets | proportion of populations sleeping under bed nets | proportion of populations sleeping under bed nets | proportion of populations sleeping under bed nets | partial and complete coverage (see Methods) |
Comparison of coverage schemes for ITNs * | no | no | no | no | yes |
Anopheles mosquitoes need access to blood meals and aquatic oviposition sites to complete their life cycle. Availability of these ecological resources, i.e., human houses and aquatic habitats, has long been recognized as a crucial determinant of malaria transmission [3]. Reduced availability of either type of spatial resource would prolong the gonotrophic cycle of the female mosquito and potentially affect malaria transmission. These resources also define landscape features such as spatial heterogeneity and host availability, the importance of which for vector control has been demonstrated by several studies. For example, using an availability-based model, Killeen et al. showed the influence of host availability on malaria vectors in African communities [14]. Menach et al. showed how the heterogeneity in human biting reflects the underlying spatial heterogeneity in the attractiveness, distribution and suitability of human houses and aquatic habitats [15]. To demonstrate the spatial characteristics of transmission by the Anopheles gambiae complex in sub-Saharan Africa, Carter et al. identified some breeding sites as foci of transmission closely associated with particular locations, and described the non-random distribution (clustering) of malaria case incidence among households [16]. Conclusions from the above studies naturally lead to habitat-based interventions, which necessitate a landscape approach that incorporates the spatial processes of mosquito foraging for oviposition and host-seeking [17]. Spatially-explicit models, which permit the refined characterization of resource seeking to predict the impact of habitat-based interventions, can prove valuable to this end [10, 11, 17]. Earlier, an ABM of malaria, derived from a conceptual entomological model of the An. gambiae life cycle, was developed [18]. The model was later extended to have explicit spatial representation [19, 20]. The ABM is presented here as a runnable program (JAR file as Additional file 1), with a sample input file (as Additional file 2).
Larval source management (LSM), insecticide-treated nets (ITNs) and indoor residual spraying (IRS) have been extensively used as intervention tactics to reduce and control malaria in sub-Saharan Africa. The impacts of these interventions have been investigated by early and recent studies [2, 8, 10-12, 21-23]. LSM (also known as source reduction), one of the oldest tools in the fight against malaria, refers to the management of aquatic habitats in order to restrict the completion of immature stages of mosquito development. In a recent study, Fillinger and Lindsay suggest that LSM can be successfully used for malaria control in African transmission settings by highlighting historical and recent successes, and discuss its potential in an integrated vector management (IVM) approach working towards malaria elimination [24, 25]. In areas with moderate and focal malaria transmission where larval habitats are accessible and well-defined, LSM is also cost-effective when compared with IRS and long-lasting insecticidal nets (LLINs) [26]. For this study, LSM refers to the permanent elimination of targeted aquatic habitats, which may be achieved by various methods, including landscaping, drainage of surface water, land reclamation and filling, and covering large water storage containers, wells and other potential breeding sites [25].
ITNs, particularly long-lasting insecticidal nets (LLINs), are considered among the most effective vector control strategies currently in use [2, 27-29]. To combat the major malaria vectors in Africa (including An. gambiae), scaled-up application of ITNs, which can offer direct personal protection to users as well as indirect community protection to non-users (through insecticidal and/or repellent effects), is advocated [11, 28]. Primarily for mathematical convenience, earlier models that studied the impact of ITNs on malaria transmission assumed a uniform contact structure between mosquitoes and hosts across the landscape [30, 31]. However, empirical data indicating the limited flight range and sensory perception of mosquitoes suggest that proximity between mosquitoes and their hosts can play a crucial role in mosquito biting behaviour [32-36]. Hence, spatially-explicit models are needed to analyse the local host-seeking process of mosquitoes and to study their responses to ITNs. Such models can also provide evidence of the need for entomological surveillance when evaluating scaled-up ITN programmes [11].
Replicability of the in-silico experiments and simulations performed by various malaria models is of special importance. Although computational science has led to exciting new developments, the nature of the work has also exposed shortcomings in the general ability of the research community to evaluate published findings [37]. Replication, which is treated as the scientific gold standard for judging scientific claims, allows independent researchers to address a scientific hypothesis and produce evidence for or against it [37, 38]. Replication confirms reproducibility, which refers to the independent verification of prior findings, and is at the core of the spirit of science [39, 40]. In agent-based modelling and simulation (ABMS), replication is also known as model-to-model comparison, alignment, or cross-model validation. It falls under the broader subject of verification and validation (V&V). One of its goals is to align multiple models in order to investigate whether they produce similar results [41, 42]. When the original models (e.g., the source codes) are available, a stricter form of model verification, known as docking, may also be performed. The process of achieving a complete dock between separate implementations of the malaria ABMs was shown previously [19, 20].
One of the goals of this study is to replicate the results and extend some assumptions of two published studies by the same authors. These studies explore the impact of applying LSM and ITNs as stand-alone interventions using an ABM [10, 11] (for brevity, the studies are hereafter referred to as GN-LSM and GN-ITN, and the ABM used as GN-ABM, where GN refers to the initials of the authors’ last names). Critical examination of these studies reveals that although they provide reasonably plausible results, two major assumptions may be extended: (1) the number of replicated simulation runs, and (2) the boundary type of the landscapes.
Any simulation model that involves substantial stochasticity should be run for a sufficient number of replicates (with identical parameter settings but different random seeds), and the average and/or aggregate results of these replicated runs should be reported, as opposed to results from a single run. A sufficient number of replications is required so that, given the same input, the average response can be treated as a deterministic number rather than as random variation in the results. This makes it possible to obtain a more complete statistical description of the model variables. The same principle also applies to stochastic (Monte Carlo) simulation models in other domains (e.g., traffic flow, financial problems, risk analysis and supply chain forecasting), where, in most cases, the standard practice is to report the averages and standard deviations of the measures of interest (known as the Measures of Effectiveness, or MOEs) [43, 44].
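The replication principle described above can be illustrated with a minimal Python sketch. The function `run_once`, its toy growth dynamics, the starting abundance and the choice of 50 runs are all hypothetical placeholders, not part of the ABM; the only point shown is that each replicate uses identical parameters with a different seed, and that the mean and standard deviation of the measure of interest are reported.

```python
import random
import statistics


def run_once(seed, days=365):
    """One replicate of a toy stochastic model (hypothetical, not the ABM).

    Returns a final 'abundance' value that depends only on the seed.
    """
    rng = random.Random(seed)  # independent generator per replicate
    abundance = 1000.0
    for _ in range(days):
        # Hypothetical daily growth factor with random noise.
        abundance *= rng.uniform(0.98, 1.02)
    return abundance


def replicate(n_runs, base_seed=0):
    """Run n_runs replicates with distinct seeds; return (mean, std) of the MOE."""
    results = [run_once(base_seed + i) for i in range(n_runs)]
    return statistics.mean(results), statistics.stdev(results)


mean_abundance, sd_abundance = replicate(n_runs=50)
print(f"mean={mean_abundance:.1f}, sd={sd_abundance:.1f}")
```

Seeding each replicate explicitly also makes any single run repeatable, which helps when an unusual replicate needs to be examined later.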
Since most epidemiology models (including ABMs) involve substantial stochasticity in the form of probability distributions and equations, performing a sufficient number of replicated runs is also important for validating the results. In malaria ABMs, decisions are often simulated using random draws from certain distributions. These sources of randomness represent the diversity of model characteristics and the uncertainty in the agents’ actions, states, etc., with the goal of mimicking reality as closely as desired. For example, in the ABM, when a host-seeking mosquito searches for a blood meal in an ITN-covered house, a 50% ITN mortality means that it dies with a probability of 0.5, which can be simulated using random draws from a uniform distribution. As another example, the number of eggs in each egg batch of a Gravid mosquito is simulated using random draws from a normal distribution with mean = 170 and standard deviation = 30. This randomness has a significant impact on the results of the simulation: different runs can produce significantly different results because they draw different sequences of pseudo-random numbers from the distributions. Therefore, replicated runs are performed for all simulations reported in this study, as opposed to the single runs performed in GN-LSM and GN-ITN [10, 11].
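The two random draws mentioned above can be sketched as follows. This is an illustration only, not the ABM's code: the function names are hypothetical, and rounding the egg count to an integer and flooring it at zero are assumptions not stated in the text.

```python
import random


def dies_from_itn(rng, itn_mortality=0.5):
    """Bernoulli trial via a uniform draw on [0, 1): True means the mosquito dies."""
    return rng.random() < itn_mortality


def egg_batch_size(rng, mean=170.0, sd=30.0):
    """Eggs in one batch, drawn from a normal distribution N(mean, sd).

    Rounding to an integer and the floor at 0 are assumptions of this sketch.
    """
    return max(0, round(rng.gauss(mean, sd)))


rng = random.Random(42)  # fixed seed so the same sequence can be reproduced
n = 10_000
deaths = sum(dies_from_itn(rng) for _ in range(n))
batches = [egg_batch_size(rng) for _ in range(n)]
print(deaths / n, sum(batches) / len(batches))
```

With a different seed, the individual outcomes change while the long-run frequencies stay close to 0.5 and 170, which is why single runs can differ noticeably even though averages over many replicates are stable.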
The second issue, the choice of boundary type, may greatly affect the mosquito movement process. In general, three boundary types are commonly used in ABMS: absorbing, non-absorbing and reflecting. With an absorbing boundary, mosquitoes are permanently removed (effectively killed) when they hit an edge of the landscape. With a non-absorbing boundary, mosquitoes that hit an edge re-enter the landscape from the directly opposite edge (and thus are not killed by hitting the edge). Unless the underlying landscape represents a completely isolated geographic location (e.g., an island far from the mainland), the logical approaches when a mosquito hits an edge are either to reflect it back from the same edge (reflecting boundary) or to make it re-enter from the opposite edge (non-absorbing boundary). Of these, a non-absorbing boundary may more realistically capture the mosquito population dynamics, especially when resource densities are high and the resources are more evenly distributed across the landscape. The GN-ABM uses an absorbing boundary for all landscapes. In this study, all landscapes are modelled topologically as 2D torus spaces and use a non-absorbing boundary (a torus is a surface of revolution generated by revolving a circle in three-dimensional space about an axis coplanar with the circle; in an ABM, a toroidal space resembles a donut topology, allowing an agent that moves off one edge to re-enter the space from the opposite edge). However, to compare with GN-LSM [10], results that use an absorbing boundary are reported first.
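The three boundary types can be sketched for a rectangular landscape as follows. The function names, the coordinate convention (positions inside the landscape satisfy 0 <= x < width and 0 <= y < height) and the use of `None` for a removed mosquito are assumptions of this sketch rather than details of the ABM or the GN-ABM.

```python
def wrap(v, size):
    """Toroidal (non-absorbing) wrap: leave one edge, re-enter at the opposite edge."""
    return v % size


def reflect(v, size):
    """Mirror a coordinate back into [0, size] (reflecting boundary)."""
    period = 2 * size
    v = v % period
    return period - v if v > size else v


def apply_boundary(x, y, width, height, kind):
    """Return the corrected (x, y), or None if the mosquito is absorbed."""
    if 0 <= x < width and 0 <= y < height:
        return (x, y)  # still inside the landscape, nothing to do
    if kind == "absorbing":
        return None  # permanently removed (effectively killed)
    if kind == "non-absorbing":
        return (wrap(x, width), wrap(y, height))
    if kind == "reflecting":
        return (reflect(x, width), reflect(y, height))
    raise ValueError(f"unknown boundary type: {kind}")


# A mosquito that steps 5 units past the right edge of a 100 x 100 landscape:
for kind in ("absorbing", "non-absorbing", "reflecting"):
    print(kind, apply_boundary(105, 50, 100, 100, kind))
```

Under these assumptions the same off-landscape step removes the mosquito, places it near the left edge, or bounces it back near the right edge, which is the behavioural difference the comparison of boundary types depends on.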
In the malaria literature, multiple definitions of the term ITN coverage can be found. The Roll Back Malaria (RBM) Partnership defines ITN coverage as the proportion of households owning a bed net or sleeping under a bed net [45] (this definition is also used by GN-ITN [11]). On the other hand, the World Health Organization (WHO) reports ITN coverage as the number of bed nets distributed per person at risk [46]. In some studies, ITN coverage is defined as the proportion of the population sleeping under treated bed nets [30]; this definition is used more widely in recent models [2, 8, 12, 30]. However, the distinction between these definitions, which primarily concerns coverage of households versus individuals, has not been addressed (within a single study) by most recent models. The WHO emphasizes the importance of scaling up ITN coverage beyond vulnerable populations (children under five years of age and pregnant women) as a priority for combating malaria in tropical Africa [47]. Also, several studies have shown that the patterns of coverage and effective coverage are important determinants of ITN/LLIN success [13], and that simple ITN/LLIN models in which the coverage scheme is not carefully designed can lead to overly optimistic results [31, 48, 49]. Thus, simulating different definitions of ITN coverage and assessing their relative impacts is important, especially when replicating and validating results of an earlier model that used one of these definitions (e.g., [11]). Hence, as an extension to GN-ITN [11], three different definitions/schemes of ITN coverage, which differ in the number of persons actually covered by bed nets in an ITN-covered house, are simulated and compared: 1) household-level partial coverage with a single chance for host-seeking, 2) household-level partial coverage with multiple chances for host-seeking and 3) household-level complete coverage. All schemes are described in detail in Methods (for the purposes of this study, coverage means access to an ITN; however, as described in Methods, household-level coverage and population-level coverage are defined as the proportion of houses with coverage and the proportion of people sleeping under ITNs, respectively).
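The difference between household-level and population-level coverage can be made concrete with a small sketch. The house sizes below are invented, and the parameter `persons_per_net` (how many occupants one net protects under a partial scheme) is a hypothetical simplification; the actual schemes are those defined in Methods.

```python
def household_coverage(covered):
    """Proportion of houses that have ITN coverage."""
    return sum(covered) / len(covered)


def population_coverage(house_sizes, covered, persons_per_net=None):
    """Proportion of people protected by a net.

    persons_per_net=None models complete coverage (every occupant of a
    covered house is protected); an integer models partial coverage with
    one net per covered house (assumption of this sketch).
    """
    protected = 0
    for size, has_itn in zip(house_sizes, covered):
        if not has_itn:
            continue
        protected += size if persons_per_net is None else min(size, persons_per_net)
    return protected / sum(house_sizes)


house_sizes = [1, 2, 3, 4]   # hypothetical occupants per house
covered = [True] * 4         # 100% household-level coverage

print(household_coverage(covered))                                   # all houses covered
print(population_coverage(house_sizes, covered, persons_per_net=1))  # partial scheme
print(population_coverage(house_sizes, covered))                     # complete scheme
```

In this toy landscape, full household-level coverage protects only 4 of 10 people under the partial assumption but all 10 under complete coverage, which is the kind of gap between the two definitions that the comparison of schemes is meant to expose.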
A landscape generator tool, VectorLand, is also developed to aid in generating landscapes with varying spatial heterogeneity of both types of resources. An earlier version of VectorLand appeared in [19]. Here, a runnable program (in a JAR file) is presented as Additional file 3. It is emphasized that VectorLand is a tool to generate landscapes, which are then used as spatial input to the ABM; it is not a model in itself. A screenshot of VectorLand is given in Additional file 4.
There is now a consensus that malaria elimination with current tools is far more likely if the best available tools are used in combination [27]. The IVM approach, promoted by the WHO, is a rational decision-making process for the optimal use of resources and efficient management for vector control. It actively considers whether multiple interventions can be combined to control vector-borne diseases [25]. Because of improved efficacy, cost-effectiveness, ecological soundness and sustainability, IVM is increasingly being recommended as an option for sustainable malaria control [50]. The rationale for using combined interventions is that multiple interventions can offer synergistic effects on top of the individual impacts of each intervention (when applied alone), thus producing a result that is greater than the sum of their individual effects. Such synergistic effects have been demonstrated by several model-based and field-based studies (where such synergy exists, it would be useful to understand and verify it in the field, and this study may prove helpful in this regard). Using a mathematical model, Yakob and Yan theoretically examined the application of LSM with ITNs in reducing malaria transmission [2]. The combined impact of ITNs (or LLINs) and IRS was examined by Chitnis et al. using the OpenMalaria model [8] and by Okumu et al. in a recent field-based study in south-eastern Tanzania [51]. Using an ecological model, White et al. explored the impact of LLINs, IRS, larvicide and pupacide [52]. Eckhoff used a cohort-based vector simulation model to demonstrate the effects of increasing coverage with perfect IRS, combining IRS and ITNs, and combining larval control (using larvicides) and space spraying [12]. Using an individual-based simulation model with different combinations of LLINs, IRS, artemisinin-combination therapy (ACT), mass screening and treatment (MSAT) and vaccines, Griffin et al. showed that combined interventions can result in substantial declines in malaria prevalence across a wide range of transmission settings [13]. Kleinschmidt et al. presented a summary of studies comparing the effect of IRS combined with ITNs [53]. Some of these studies suggest that when combined interventions are applied, it may be more beneficial to target different stages of the mosquito’s life cycle, rather than applying interventions that may interfere with each other (e.g., LLINs and IRS) [52].
Two important notions emerge from the conclusions of these studies: (1) when combined interventions are applied, the individual efficacy of each intervention needs to be ensured and (2) attacking different behaviours or life cycle stages of the mosquito may be more synergistic. Based on these, LSM and ITNs are selected, and their combined impacts are explored with the ABM. To ensure (1), the impacts of both are first examined as stand-alone interventions. In doing so, the two GN studies [10, 11] are replicated, and some of the original assumptions are extended. It is interesting to note that no ABM has previously explored the combined impact of LSM and ITNs (although some other combinations have been explored using ABMs). Since LSM and ITNs primarily affect two different life cycle stages (i.e., larval and adult stages, respectively) and involve two different types of ecological resources (i.e., aquatic habitats and human houses, respectively), this combination is potentially important.
In this study, using the spatial ABM, the effects of LSM and ITNs are first investigated separately (in isolation) and compared with the results reported by the original studies [10, 11] (the goal of replication is to achieve a qualitative (not absolute) match between the results of the ABM and those reported in GN-LSM [10] and GN-ITN [11]). Then, using different population profiles to explore the human density effect, the combined impact of LSM and ITNs is investigated, and similar results reported by Yakob and Yan [2] are also discussed. Lastly, some guidelines for future ABM modellers, summarizing the insights and experience gained from replicating the original models, are recommended. A systematic comparison of some features and assumptions of several recent malaria models, including those that are extended or modelled for the first time by this study, is given in Table 1.
Discussion
In general, with LSM applied in isolation, the replicated results agree with the major findings of GN-LSM [10]: LSM coverage of 300 m surrounding all houses can lead to significant reductions in abundance, and, when targeting aquatic habitats for LSM, distance to the nearest houses can be an important measure. However, as shown by the model, some of the underlying assumptions in the GN-LSM model could have seriously affected its predicted outcomes. Specifically, reporting results from a single simulation run and using an absorbing boundary could lead to substantially different results, invalidating the findings and thereby diminishing the predictive power of the models. Also, without a more sophisticated spatial metric that can capture the interrelations of different resources in different landscapes, simplistic features such as the general arrangement pattern of houses (e.g., diagonal, horizontal and vertical) are insufficient to capture a landscape’s potential to transmit the disease. For example, comparing the most restrictive cases (T3) of LSM application, the reduction in abundance is more prominent with a non-absorbing boundary (from ≈ 10,000 to ≈ 1,800, as shown in subfigure of Figure 6) than with an absorbing boundary (from ≈ 3,000 to ≈ 500, as shown in subfigure 6 of Figure 5). Due to the random distributions of houses and aquatic habitats in the three selected patterns, the reduction effects remain unpredictable, depending on factors such as the proximity of the resources to the boundaries of the landscapes. When applied to different (e.g., more general or specific) conditions, these assumptions may produce misleading results. The modified assumptions, as implemented in this study, provide new insights and potentially more accurate results under certain conditions.
It is implausible to expect 100% reductions in abundance even with the most restrictive application of LSM (T3 in Figures 5, 6 and Table 5). This is because even with an absorbing boundary, some mosquitoes would always survive by roaming around in different parts of the landscape instead of hitting the edges of the boundary (and hence dying). This is observed in the results: the highest PR value obtained is 91.79% with scenario T3 using an absorbing boundary, as opposed to the 100% observed in several cases in the GN-LSM study [10].
In a few cases, negative PR values are obtained (see Table 5), suggesting that abundance actually increases after applying LSM. A closer look at the landscapes (see Additional file 5) reveals that these cases are associated with the removal of a small fraction of all aquatic habitats (4 out of 90 for C1 and T1) by LSM. Recall that in the ABM, abundance is governed by the CC of aquatic habitats and the density-dependent oviposition mechanism. Removing only a few nearby habitats may actually save a mosquito from wasting time searching for, locating, and competing to lay eggs in already-crowded habitats, and instead allow it to be more productive by finding comparatively less-crowded habitats within close vicinity.
This points to an important insight: if the mosquito population is bounded by the environment’s overall capacity (as in the ABM), and some stages of the mosquito biology are governed by special mechanisms (e.g., density-dependent oviposition), then removing an insufficient number of aquatic habitats may, in some cases, increase abundance. Thus, before actually applying LSM, it may be crucial to estimate its impact (to achieve the desired level of success) by simulating varying levels of coverage.
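The sign convention of the PR values discussed above can be illustrated with a short sketch. It assumes that PR denotes the percent reduction in abundance relative to the pre-intervention baseline; this definition is inferred from the way the values are used here, and the function name and the numbers in the example are hypothetical.

```python
def percent_reduction(baseline, post):
    """Percent reduction in abundance relative to baseline (assumed definition of PR).

    Positive values mean abundance fell after the intervention;
    negative values mean abundance rose.
    """
    if baseline <= 0:
        raise ValueError("baseline abundance must be positive")
    return 100.0 * (baseline - post) / baseline


# Hypothetical abundances before and after applying LSM:
print(percent_reduction(1000, 250))   # abundance fell
print(percent_reduction(1000, 1100))  # abundance rose, so PR is negative
```

Under this assumed definition, a PR of 100% would require abundance to fall to zero, and any post-intervention abundance above the baseline yields a negative PR.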
As expected, with ITNs, different definitions of ITN coverage can lead to significantly different results. The household-level partial coverage schemes can provide only ≈ 50% reduction in abundance with 100% coverage and 100% mortality. This means that even when every house is equipped with one bed net (which, overall, covers only ≈ 54% of the human population), these schemes come nowhere close to suppressing abundance. On the other hand, the household-level complete coverage scheme can provide as much as 70% reduction in abundance with ≥ 85% coverage and mortality as low as 25%. With this scheme, when the coverage is 100%, abundance can be completely suppressed even when no mortality is in action (i.e., M = 0.0), as shown in subfigures 3-4 in Figure 9. This is expected: since every person in every house is protected by bed nets, host-seeking mosquitoes cannot find unprotected hosts from which to obtain blood meals. When modelling the impact of ITNs, these distinctions should be stated clearly, and the ITN coverage scheme should be chosen carefully.
In general, repellence, which drives the host-seeking mosquito away from a house, can have a detrimental effect on vector control when the risk (additional delay in search, etc.) of finding an unprotected host in another house is less than that in the same house. With the complete coverage scheme, since every person in the house (with ITN coverage) is protected by a bed net, this condition holds. However, as coverage C increases, more houses fall within the range of coverage, and the probability of finding an unprotected host (in another house) during the next search decreases. Thus, with increasing coverage, the negative impact of excessive repellence is reduced, as evident in the first three rows (subfigures 1-9) of Additional file 11.
On the other hand, with household-level partial coverage schemes (with either single or multiple chances), this effect is almost absent (see Figure 8, subfigures 1-2 of Figure 10, and Additional files 9, 10 and 12). Recall that with partial coverage schemes, not every person in a house with ITN coverage is necessarily protected by a bed net. Thus, the mosquito may find an unprotected host in the same house. If it is repelled too often (due to high repellence), it is deprived of its current positional advantage, and the risk of searching for an unprotected host in another house may not be well-justified.
Interestingly, the choice of boundary type does not have a significant impact for this particular landscape (see Additional file 6). Using absorbing and non-absorbing boundaries, the three schemes of ITN coverage are simulated and compared (see Methods for the schemes). No significant difference is found if age-dependent DMRs are used with both boundary types (as mentioned before, using fixed DMRs is not practical for the density-regulated ABM).
While applying LSM and ITNs in combination, some synergistic effects are observed in the results. However, as shown in Figure 11, the combined impact is additive (and not multiplicative), and is more effective with high density_houses, confirming similar findings in [2].
With higher density_houses, the impact of ITN mortality (M) becomes increasingly important. As shown in Figure 11, increasing ITN mortality affects the shape of the low-to-medium range (10-40%) PR isolines. With no insecticidal effect of ITNs (i.e., M = 0.0; see row 1 of Additional file 13), as density_houses increases, more host-seeking events occur, causing more mosquitoes to seek aquatic habitats in order to lay eggs. But with increasing LSM coverage, they are denied more opportunities to lay eggs (as more aquatic habitats are eliminated), causing the lower range (10-40%) PR isolines to shrink vertically (down the y-axis). However, as both density_houses and ITN coverage increase (but mortality remains 0), more host-seeking events encounter ITNs, but with no mortality in effect, ITNs cannot have a significant impact, thus extending the lower range (10-40%) PR isolines horizontally (across the x-axis). As ITN mortality increases (in Figure 11 and rows 2-4 of Additional file 13), this extension effect is gradually reduced, and more impact is seen with higher density_houses.
Replication of earlier ABMs (that examined the impact of LSM and ITNs in isolation) poses some unique challenges. The unavailability of the source code of the original models prevents direct model-to-model comparison (docking). The structural characteristics of ABMs, which are fundamentally different from those of, for example, equation-based mathematical models, also rule out systematic verification of model features and raise some important V&V issues. The following major sources are identified from which model differences may arise, and/or which may make the process of replication more time-consuming and challenging:
- Conceptual image of the model: the intended logical view of the ABM may be perceived differently by different modellers, thus creating different conceptual, mental images of the logical view.
- Choice of tools: the selection of programming languages and tools (e.g., C++ vs. Java) from the numerous options available may be another potential source. The availability and limitations of a particular programming language, the use of specific data structures and other language constructs, and even the coding style of individual modellers can compound the differences.
- Availability of additional resources: in some cases, additional resources used by the model (in the form of artificial maps, object-based landscapes, etc.), if not defined or made explicitly available, pose subtle challenges. Although the importance of these resources may seem somewhat arbitrary in the broader context, goals and output of the original models, their precise specification remains important for replication. For example, as shown before, in replicating the landscapes, the absence of a listing of the spatial coordinates of the objects (which may be provided as supplementary material) not only forces future modellers to spend a significant amount of time reproducing the landscapes (some part of which inevitably relies on best guesses, due to the lack of additional information), but also increases the possibility of judgement errors being introduced in this phase.
A clear, detailed description of the parameter space for all model parameters used by the ABM, including their initial and other time-varying conditions, may substantially help in minimizing the conceptual image gaps. However, as past experience shows [19, 20], merely stating model parameters, logical flowcharts, initial conditions, etc. cannot entirely solve the above problems, primarily because: (1) the possibility of different logical workflow paths in the programmed code still remains open and (2) many implementation details still cannot be covered. Based on this modelling exercise, the following guidelines are recommended for future ABM modellers of malaria:
-
Code and data sharing: The source code and executable programs of the ABMs should be shared with the research community. The trends of open-access research have become increasingly important and popular in recent years. To ensure a minimum standard of reproducibility in computational sciences, enough information about methods and code should be available for independent researchers to reach consistent conclusions [
37]. Many reputable journals across multiple disciplines have also implemented code-sharing policies. For example, the journal
Biostatistics[
54] has implemented a policy to encourage authors of accepted papers to make their work reproducible by others. In this journal, based on three criteria termed “code”, “data” and “reproducible”, the associate editor for reproducibility (AER) marks accepted papers as C, D and/or R, respectively, on their title pages [
37,
55]. As reproducibility is critical to tracking down bugs in computational science, code sharing may be especially important for malaria ABMs. Having multiple research groups examine the same model and generate new data or output may lead to more robust conclusions [
39]. Some recent malaria models have partially followed this path by providing controlled access to their models. For example, the OpenMalaria epidemiology model [
56] provides a general open-access platform for comparing, fitting, and evaluating different model structures. The EMOD vector ecology model, from Intellectual Ventures Lab [
57], is available within controlled execution environments. However, for certain reasons (e.g., during preliminary design and development phases, exploratory feature testing phases, etc.), sharing the source code may not always be ideal. In these cases, it is recommended that for ABM-based studies which are accepted for publication, at least the associated executable programs and/or other tools be made available as supplementary materials (for this study, the ABM, a sample input file and the landscape generator tool are shared as Additional files
1,
2 and
3, respectively, with detailed instructions on how to run).
-
Relevant documentation: Modellers who share the source code and/or executable programs of their ABMs should also provide well-written documentation. Documentation is an important part of software engineering. The journal
PLOS Computational Biology, which publishes articles describing outstanding open source software, emphasizes that the source code must be accompanied by documentation on building and installing the software from the source, including instructions on how to use and test the software on supplied test data [
58]. The documentation of an ABM may include statements describing the attributes, features and characteristics of the agents and environments of the ABM, the overall architecture or design principles of the code, algorithms and application programming interfaces (APIs), manuals for end-users, interpretation of additional materials (e.g., object-based landscapes), etc. Free and commercial software tools are available which can help automate the process of code annotation, code analysis and software documentation [
59‐
62].
-
Standardized models: The general workflow of the ABM, including the input/output requirements, program logic, etc. should follow a standardized approach. The need for standardization becomes more important when the broader utility of the model is considered within an integrated modelling platform. For example, both OpenMalaria [
56] and EMOD [
57] are currently being integrated within the open-access execution environment of the Vector Ecology and Control Network (VECNet) [
63]. The proposed VECNet cyberinfrastructure (VECNet CI), within a shared execution environment, establishes three modes of access sharing for model developers: (1) shared data: model developers run their models on their own compute resources and upload the output data to the VECNet CI for public consumption; (2) shared execution: model developers share their software with VECNet CI developers only, allowing the CI and its operators to incorporate their model into the CI execution environment; and (3) shared software: model developers share their software at large with the public. Once integrated, these models can utilize other components of the VECNet CI, including the VECNet Digital Library, web-based user interface (UI), and tools for visualization, job management, query and search, etc. They can, for example, import and use malaria-specific data to run specific scenarios or campaigns of interest, and display their output using the visualization and/or the UI tools of the VECNet CI. It is envisaged that most malaria ABMs will, in the future, be accommodated within the integrated modelling frameworks of similar cyberinfrastructure platforms. Hence, to expedite the integration process, future malaria ABMs should plan and follow a well-defined integration path from the early phases of model development.
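One lightweight way to support the code and data sharing guideline above is to publish, alongside each set of results, a small machine-readable record of how a simulation run was produced. The sketch below is only an illustration under stated assumptions: the field names, the parameter names (`itn_coverage`, `lsm_coverage`) and the version string are hypothetical and are not part of the ABM, VectorLand or the VECNet CI.

```python
import hashlib
import json


def file_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw bytes of an input file."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(model_version, seed, parameters, input_bytes):
    """Collect the information needed to repeat one simulation run."""
    return {
        "model_version": model_version,  # hypothetical version label
        "seed": seed,                    # random seed used for the run
        "parameters": parameters,        # hypothetical parameter names and values
        "input_sha256": file_digest(input_bytes),
    }


def manifest_to_json(manifest) -> str:
    """Serialize with sorted keys so identical runs give identical text."""
    return json.dumps(manifest, sort_keys=True, indent=2)
```

For example, `build_manifest("1.0", 7, {"itn_coverage": 0.6, "lsm_coverage": 0.3}, open("input.txt", "rb").read())` would describe one run, where `input.txt` stands for whatever sample input file is shared with the model. Recording the digest of the input file lets independent researchers confirm that they are running the same scenario.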
Conclusions
In this study, the individual and combined efficacy of LSM and ITNs is explored using a spatial ABM of malaria that precisely defines the movement rules of adult female mosquitoes in their resource-foraging process in grid-based landscapes. Results of two earlier studies that explored similar research questions [
10,
11] are replicated, and a systematic comparison of the results is presented. By extending some of the original assumptions (e.g., reporting results from single simulation runs, use of an absorbing boundary, etc.), it is shown that the use of these assumptions may lead to less reliable results. With the combined application of LSM and ITNs, the results indicate that varying densities of the human population can affect the degree of synergistic benefits that may be obtained from such efforts, as was previously shown by a mathematical model [
2]. To the best of our knowledge, this is the first ABM-based study to explore this particular combination of LSM and ITNs (acknowledging that some other combinations were explored by other ABMs, e.g., [
12]). Some challenges faced while replicating earlier models are also discussed, and several guidelines (code and data sharing, relevant documentation, and standardized models) obtained from this exercise are recommended for future ABM modellers of malaria.
As the results indicate, the replicability of experiments and simulations performed with previously published malaria models bears special importance. Due to several factors (including new tools and technologies, massive amounts of data, interdisciplinary research, etc.), the task of replication may become complicated. By sharing the ABM and the landscape generator tool, the importance of open source software for reproducibility and replicability is emphasized.
In the future, seasonality and other weather parameters (e.g., humidity), alternative hosts for blood-feeding (e.g., cattle), aquatic habitats with varying carrying capacities to reflect the variability of habitat attractiveness and productivity, and temporal variability for certain intervention parameters (e.g., repellence and insecticidal effect of ITNs) are planned to be included in the model. Calibrating the assumptions and parameters of the model against data from field-based studies, and exploring the impact of other existing interventions (e.g., IRS, space spraying, etc.), or new interventions (e.g., spatial repellents and/or insecticides, oviposition traps, etc.), both in isolation and in combination, are also planned. Lastly, VectorLand is planned to be improved to aid in generating operational guidelines for targeting of aquatic habitats and houses, and thus to perform a systematic study of the effect of spatial distribution of habitats.