Background
Accurately estimating site-specific mosquito larval abundance is central to vector control strategies for identifying breeding habitat characteristics [1, 2], determining the need for control and/or evaluating the efficacy of mosquito control programmes [3–5]. Habitat association studies have linked Anopheles mosquito larval presence or abundance to a suite of environmental factors related to the water body, such as depth [6–8], temperature [7, 9], algae [8, 10, 11], and riparian vegetation and shading [1, 5, 10]. Despite this, there is still uncertainty regarding the importance of some factors because of the between-study variation in these patterns. Although these differences may be partly explained by selective, cross-sectional sampling [6] or species-specific preferences [2], one possibility that has received relatively little attention is the behaviour of the larvae under different environmental conditions. There is increasing evidence that mosquito larvae adjust their behaviour in response to surface disturbances or predation risk [12, 13], temperature [14] and water nutrient levels [14, 15]. Yet, how these and other factors translate into the probability of larvae being sampled, and the subsequent impact on the results of habitat-association studies, has never been explored.
There is a need to incorporate more relevant biological detail into our modelling of malarial mosquito ecology [16]. One relatively simple way of doing this with regard to larval-habitat associations is to use a framework that indirectly includes environmental effects on larval behaviour: i.e., one that allows detection probability to vary with environmental variables. This can be done by extending the generalized linear model to include a detection parameter calculated from sampling each site multiple times [17, 18]. Mosquito larval studies are unusual in that they use a data collection method that can be used to calculate the probability of detection without additional sampling effort. Larvae are typically collected using dip sampling, in which multiple samples are collected from each site using a dipper; these are usually combined to give site-level estimates of presence or density [19]. A simple modification to this method is to separately record the results for each dip sample rather than pooling them [e.g., 3]; this repeated sampling at each site enables detection probability to be estimated. Thus, instead of directly relating all environmental factors of interest to larval presence, environmental variables can be modelled as influencing presence and/or detection probability. Such models would therefore not only improve the accuracy of presence estimates by accounting for imperfect detection, they would also make more biological sense in that environmental factors that influence larval detectability would not be erroneously linked to predicting mosquito presence.
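As an illustration of the idea, a standard site-occupancy formulation can be written as follows. This is a sketch under stated assumptions rather than the exact model fitted in this study: the symbols are introduced here for illustration, with z_i the unobserved presence state at site i, \psi_i the probability of presence, p_i the per-dip detection probability given presence (assumed constant across dips at a site), K_i the number of dips, y_i the number of dips containing larvae, and x_i and w_i the environmental covariates assigned to presence and detection, respectively.

```latex
\begin{aligned}
z_i &\sim \mathrm{Bernoulli}(\psi_i), &
\operatorname{logit}(\psi_i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta}, \\
y_i \mid z_i &\sim \mathrm{Binomial}(K_i,\; z_i\,p_i), &
\operatorname{logit}(p_i) &= \mathbf{w}_i^{\top}\boldsymbol{\alpha}.
\end{aligned}
```

Under this formulation a site with no positive dips is either unoccupied (z_i = 0) or occupied but missed on every dip, and recording each dip separately is what supplies the information needed to tell these two cases apart.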
In this study, data were used from a dip sampling survey in Ethiopia to examine whether observations were confounded by imperfect detection, and to determine whether different environmental variables influenced larval presence compared to detection probability. First, as a comparison to previous studies, the data were analysed using the most common approach [19]: i.e., aggregating the site data response variable into a single presence/absence value. Second, a less common approach was used that accounts for how many times each site was sampled and how many of these samples contained larvae (binomial distribution: successes per number of trials). There was an expectation of differences between the results of the first and second analyses because the first models only the larval presence and ignores detectability, while the second incorporates some measure of detectability, although this is confounded with presence. Third, a mixture model was developed that separately estimates presence and detection to allow detectability to be explicitly disentangled from presence/absence. This allowed an examination of whether a more complex modelling approach had greater support and predictive capability over simpler methods. In addition, the mixture model also allowed a combination of different environmental variables within the model’s separate presence and detection components to see if there was support for some variables being more important for detection, and some more important for larval presence. Finally, the effects of environmental variables on presence and detection were modelled to estimate how variation in these factors influenced presence and how many dip samples were required to confidently state whether a site contained larvae.
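The difference between the first two response variables can be sketched in a few lines of Python. The site names and dip records below are hypothetical and are not data from the survey; the function names are likewise introduced only for illustration.

```python
from typing import Dict, List, Tuple


def presence_absence(dips: List[int]) -> int:
    """First approach: pool all dips at a site into a single 0/1 value."""
    return int(any(dips))


def successes_trials(dips: List[int]) -> Tuple[int, int]:
    """Second approach: keep the number of positive dips and the number of dips."""
    return sum(1 for d in dips if d), len(dips)


# Hypothetical per-dip records (1 = larvae found in that dip), one list per site.
sites: Dict[str, List[int]] = {
    "site_a": [0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
    "site_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "site_c": [1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
}

pooled = {name: presence_absence(d) for name, d in sites.items()}
counts = {name: successes_trials(d) for name, d in sites.items()}
```

In this toy example, site_a and site_c are indistinguishable after pooling, whereas the success-trial response keeps the information that larvae were found in far fewer dips at site_a.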
Discussion
Mosquito habitat-association studies aim to identify factors linked to larval presence or abundance as the basis for control programmes or distribution models [2, 10]. However, to ensure these studies are relevant, sampling protocols need to be designed and/or analysed in a way that the relationships between environmental factors and the probability of mosquito larval presence are not systematically biased. In this study, it was shown how the structure of the modelling framework can strongly influence relationships from data based on a common sampling protocol (i.e., fixed effort dip sampling [19]), particularly when detection probability is less than perfect. Accounting for imperfect detection is a major issue in ecological sampling studies (e.g., [29, 30]), yet detectability has not been accounted for in habitat-association studies of malarial mosquitoes (or indeed any mosquitoes; see [19]). The results show that a failure to consider detection probability and the factors that influence it has the potential to affect results from presence-absence habitat-association models by: (1) underestimating the true occupancy of sites, (2) erroneously linking factors related to detection probability with those of larval presence, and (3) under- or overestimating the importance of factors related to larval presence.
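The first of these effects follows from simple arithmetic, sketched below under the assumption that every dip at an occupied site detects larvae independently with the same probability. The function name, parameter names and numerical values are hypothetical and chosen only for illustration.

```python
def apparent_occupancy(psi: float, p: float, k: int) -> float:
    """Expected fraction of sites recorded as occupied after k pooled dips.

    psi: true probability that a site contains larvae.
    p:   probability that a single dip detects larvae at an occupied site.
    k:   number of dips pooled per site.
    """
    # An occupied site is recorded as occupied only if at least one dip succeeds.
    prob_at_least_one_detection = 1.0 - (1.0 - p) ** k
    return psi * prob_at_least_one_detection


# Hypothetical values: 60 % of sites occupied, 20 % detection per dip, 10 dips.
naive = apparent_occupancy(psi=0.6, p=0.2, k=10)
```

Whenever p is below 1, the apparent value is lower than psi, and the shortfall grows as p or k decreases.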
There were some clear differences in the results from the three analytical approaches. The first approach, and the one most commonly used (presence-absence logistic regression [19]), assumes that detection at the site level in pooled samples is ~100% and therefore any relationships between explanatory variables and larval presence are unbiased. This may be true if enough samples are taken at each site. However, the number of samples required to achieve this will depend on how easy it is to catch one larva: this will be related to the overall density and distribution of larvae in the water body [3, 31, 32] and larval behaviour (this study). When density is very low, the effort required to find larvae when present can be immense (e.g., >17,000 dips [3]). Thus, while rules of thumb on how many samples to collect at a site based on initial numbers of larvae caught may help reduce bias, there is no guarantee that bias will be eliminated unless detection probability is explicitly modelled. Despite this, comparing the results from the presence-absence model to the other models in this study shows that it was able to clearly identify the effect of riparian vegetation (important in all modelling frameworks), with effects of water depth and pH being much less certain.
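How quickly the required effort grows as per-dip detection falls can be sketched with a short calculation. This is a minimal illustration, assuming independent dips with a constant per-dip detection probability at an occupied site; it is not the procedure used in this study, and the function and parameter names are hypothetical.

```python
import math


def dips_required(p: float, confidence: float = 0.95) -> int:
    """Number of dips n needed so that 1 - (1 - p)**n reaches `confidence`.

    p is the assumed probability that one dip detects larvae at an occupied site.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - p)**n <= 1 - confidence for n and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# Hypothetical per-dip detection probabilities.
effort = {p: dips_required(p) for p in (0.5, 0.1, 0.01)}
```

Note that this gives the effort needed to detect larvae at a site that is occupied; it does not by itself give the probability that a site with no detections is truly unoccupied, which also depends on the probability of presence.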
The second approach (success-trial binomial) used the same data, but retained the success versus trials sampling information within the response variable. Here, the explanatory variables explain not only larval presence at the site, but also the proportion of dips at a site that contain larvae. It therefore makes sense that the results retain the main effect identified in the first analysis for larval presence (riparian vegetation) and include additional effects that are likely to explain the proportion of dips at a site that contain larvae (i.e., sunlight influences detection probability). This illustrates the importance of understanding the analytical method used when comparing habitat-association studies, as the results from these two methods contain different information despite using the same sample data. Interestingly, between the first and second analyses, water depth went from being a factor with some support to a factor with strong support. This suggested that water depth operates on both presence and detection, with the success-trial analysis combining these effects. This interpretation is somewhat supported by the negative effects of depth on both presence and detection in the mixture model (Table 2). The variable of sunshine on the water at the time of collection was assumed a priori to be related only to detection; a comparison of the results of the presence-absence analysis (no effect) to the success-trial analysis (strong effect) shows this assumption is well supported.
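One way to see why the success-trial analysis combines presence and detection effects is to write the expected proportion of positive dips under a simple occupancy formulation. The symbols are assumptions introduced for illustration and are not taken from the study: \psi_i is the probability that larvae are present at site i, z_i the corresponding unobserved presence state, p_i the per-dip detection probability given presence, K_i the number of dips and y_i the number of dips containing larvae.

```latex
\mathbb{E}\!\left[\frac{y_i}{K_i}\right]
  = \frac{1}{K_i}\,\mathbb{E}\bigl[\mathbb{E}[\,y_i \mid z_i\,]\bigr]
  = \frac{1}{K_i}\,\mathbb{E}[\,K_i\, z_i\, p_i\,]
  = \psi_i\, p_i
```

A success-trial regression models this product directly, so a covariate that lowers both \psi_i and p_i would be expected to show a stronger apparent effect than it does in either component alone, while a covariate acting only on p_i can still appear as a strong predictor.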
The final modelling approach explicitly modelled presence and detection separately, in a way that allowed explanatory variables to influence these estimates. Here some of the patterns from previous analyses are repeated: i.e., riparian vegetation explaining larval presence and sunshine explaining detection probability. Water depth and pH were again highlighted as being of likely importance to larval site presence. Interestingly, water temperature was a very strong predictor of larval detectability, while in the previous analyses temperature showed no indication of being important. Table 2 suggests the reason for these seemingly contradictory results: temperature has a negative effect on larval presence but a positive effect on detection. Thus, unless these processes of presence and detection are separately accounted for, the influences of different explanatory variables may be diluted or exaggerated.
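The per-site likelihood of such a mixture can be sketched as a zero-inflated binomial. This is a minimal illustration of the general technique, not the code used for the analysis: the function names are hypothetical, the coefficients are invented solely to show a covariate with opposite-signed effects on presence and detection, and probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def inv_logit(x: float) -> float:
    """Map a linear predictor onto the (0, 1) probability scale."""
    return 1.0 / (1.0 + math.exp(-x))


def site_log_likelihood(y: int, k: int, psi: float, p: float) -> float:
    """Log-likelihood of y positive dips out of k at one site.

    psi: probability that larvae are present at the site.
    p:   probability that a single dip detects larvae, given presence.
    """
    binom = math.comb(k, y) * p ** y * (1.0 - p) ** (k - y)
    if y > 0:
        # Larvae were found, so the site must be occupied.
        return math.log(psi * binom)
    # No detections: occupied but missed on every dip, or truly unoccupied.
    return math.log(psi * binom + (1.0 - psi))


# Hypothetical coefficients: temperature lowers presence but raises detection.
def psi_at(temp_c: float) -> float:
    return inv_logit(2.0 - 0.1 * temp_c)


def p_at(temp_c: float) -> float:
    return inv_logit(-4.0 + 0.15 * temp_c)


# Log-likelihood of an all-negative site (0 of 10 dips) at two temperatures.
ll_cool = site_log_likelihood(0, 10, psi_at(18.0), p_at(18.0))
ll_warm = site_log_likelihood(0, 10, psi_at(28.0), p_at(28.0))
```

Summing this quantity over sites and maximising it with respect to the coefficients would give estimates for the presence and detection components separately, which is what allows opposite-signed effects of a single covariate to be recovered rather than cancelling out.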
Riparian vegetation had a clear negative relationship with larval presence (see also [5, 10, 33]). The mechanism driving this relationship is uncertain [2], but vegetation may negatively impact larvae directly or reduce egg laying through shading effects [5] or vegetative decay impacting on larval health [34]. Because riparian vegetation also includes farm crops (see also [1]), this effect could relate to pesticide and fertilizer use. Water depth beyond a few centimetres is known to reduce anopheline larval survival [21] because larvae will bottom-feed and deeper dives are energetically costly [35]. In this study, larval presence declined in a similar pattern; however, it is unknown whether lower larval presence/abundance in deeper water is a consequence of reduced egg laying or lower larval survival. Water pH is negatively correlated with larval survival and development [36]; however, in the range found in this study (pH 7–9) it would be unexpected for there to be strong effects. This suggests that, if the pH effect is real, it is most likely because pH was correlated with another measure not used in the analysis.
Many Anopheles species prefer sun-exposed and shallow water bodies, although whether this is because of direct effects on larval development or indirectly through habitat quality is unknown [2]. The results suggest that this relationship between sunlight and larval abundance may be more complex than previously acknowledged because sunlight and temperature appear to have a large effect on detectability. Because detection in water on warm sunny days versus cool cloudy days was compared, rather than sunny versus shaded sites [5], this provides confidence that differences in measured occupancy between these sites resulted from differences in detection rather than presence or abundance relating to the site itself. This expectation was verified in the analyses, with sunlight being an important component of the detection function. The same issue also relates to water temperature. Although there are food and temperature-related limits and constraints on mosquito larval development [37] that might be expected to influence egg laying, there are also temperature-related effects on larval behaviour [14, 15] that likely influence the probability of being sampled.
Factors highlighted as influencing detection almost certainly operate through their impact on larval behaviour. Conventional dipping methods sample near the water surface [19] and, thus, anything that changes the vertical distribution or aggregation of larvae can influence the probability of collection [32, 38]. For example, larvae forage more actively in cooler water, and hence are more likely to dive [15] and can stay down longer because of reduced metabolic oxygen consumption. This would increase their mixing in the water column and make them more difficult to sample when surface dipping. Likewise, sunshine warms the thin surface water layer (2 mm) where Anopheles larvae live and often feed [2], meaning they would tend to remain in this narrow zone when it is sunny and retreat from it when the air temperature cools. Surface algae could influence behaviour by being an important larval food source [11, 39], since A. gambiae larvae are more likely to dive to the bottom for food under conditions of lower surface food availability [15]. In addition, the presence of algae likely supports a higher density of larvae, which increases the probability of sampling at least one larva per dip. Water depth will influence detection simply by providing a larger volume for larvae to distribute in, reducing the probability of being sampled.
Conclusions
These results clearly show that detectability needs to be accounted for when undertaking mosquito larval surveys, especially if the density of larvae is low. Although the analyses focus on a presence–absence survey, detectability issues will also influence abundance estimates. Because environmental factors influence detection in different ways, and these factors are not uniform within the environment, systematic biases may emerge if they are not included when modelling habitat associations and species distributions. This has implications not only for studies of A. gambiae, but also for between-species and between-life-stage comparisons [3, 4, 8] where factors are likely to influence focal species and life stages in different ways.
Attempts to deal with incomplete detection are partly incorporated into current sampling protocols, with multiple samples collected at each site. This approach will be reasonably effective if enough sampling at each site is undertaken. However, when mosquito abundance falls to low levels, the amount of effort to confirm site occupancy increases exponentially [3]. Thus, accounting for detection uncertainty becomes vital in situations where larval abundance drops below a certain threshold. Identifying these thresholds and how they might vary under different environmental conditions is an important avenue of future studies. This is critical if particular combinations of environmental variables lead to expected low detection rates; in such cases the detection uncertainty and how it relates to environmental variables needs to be incorporated into the modelling framework. At the very least, researchers using dip sampling to analyse mosquito site occupancy must utilize the information from each individual sample to estimate detection probability in order to minimize biases in larval presence estimates.
Authors’ contributions
Study conception: ML and RH. Study design: ML, ATT, RI, SH, RE, VF and RH. Data collection: ATT, RE and VF. Analysis: ML. Writing and editing: ML, ATT, RI, SH, RE, VF and RH. All authors read and approved the final manuscript.