Systematic literature review
To support the claim that malaria epidemiologists generally do not modify their logistic regressions to account for imperfect diagnostic test outcomes, a targeted literature review was conducted. PubMed was searched using different combinations of the search terms ‘malaria’, ‘logistic’, ‘models’, ‘regression’, ‘diagnosis’, and ‘diagnostic’. The search was restricted to studies published between January 2005 and April 2015. Of the 209 search results, 173 articles were excluded because they shared authors with this article, were unrelated to malaria, did not report malarial status or did not use it as the outcome variable in the logistic regression, and/or relied solely on microscopy. Studies that relied only on microscopy were excluded because this diagnostic method is considered the gold standard in much of the world, with the important exception of locations with relatively low transmission (e.g., Latin America), where PCR is typically considered the gold standard method. Detailed information regarding the literature review (e.g., the list of articles with the associated reasons for exclusion) is available upon request.
Statistical models and auxiliary data to address misclassification error
To avoid the problem associated with imperfect detection when using logistic regression, one obvious solution is to use a highly sensitive and specific diagnostic test (e.g., the gold standard method) to determine disease status for all individuals. Unfortunately, this is often infeasible and/or not scalable because of cost or other method requirements (e.g., electricity, laboratory equipment, expertise availability, or time required). Alternatively, statistical methods that specifically address the problem of imperfect detection (i.e., misclassification) can be adopted. However, these statistical models contain parameters that cannot be estimated from data collected in regular cross-sectional surveys or cohort studies based on a single diagnostic test. Therefore, these statistical methods are described in detail along with the additional data that are required to fit them.
For all models, JAGS code is provided for readers interested in implementing and potentially modifying these models (see Additional Files 1, 2, 3, and 4 for details). Readers should have no problem adapting the same code to WinBUGS/OpenBUGS, if desired. The benefit of using Bayesian models is that they can be readily extended to account for additional complexities (e.g., random effects to account for sampling design). As a result, the code provided here is useful not only for users interested in this paper’s Bayesian models but also as a stepping stone for more advanced models.
Bayesian model 1
One option is to use results from an external study on the sensitivity and specificity of the diagnostic method employed. Suppose that this external study employed the same diagnostic method, together with the gold standard method, and reported the estimated sensitivity \(\widehat{SN}\) and specificity \(\widehat{SP}\). This information can be used to properly account for imperfect detection. More specifically, Bayesian model 1 assumes that
$$I_{i} \sim Bernoulli\left( {\frac{{{ \exp }\left( {\beta_{0} + \beta_{1} x_{i1} + \beta_{2} x_{i2} + \cdots } \right)}}{{1 + { \exp }\left( {\beta_{0} + \beta_{1} x_{i1} + \beta_{2} x_{i2} + \cdots } \right)}}} \right)$$
where \(I_{i}\) is the infection status of the ith individual, \(\beta_{0} ,\beta_{1} ,\beta_{2} , \ldots\) are regression parameters, and \(x_{i1} ,x_{i2} , \ldots\) are covariates. It further assumes that:
$$D_{i} \sim Bernoulli(\widehat{SN}) \quad {\text{if}}\;\;I_{i} = 1$$
$$D_{i} \sim Bernoulli\left( {1 - \widehat{SP}} \right) \quad {\text{if}}\;I_{i} = 0$$
where \(D_{i}\) is the regular diagnostic test result for the ith individual. Finally, different priors can be assigned for the disease regression parameters. A fairly standard uninformative prior is adopted for these parameters, given by:
$$\beta_{j} \sim N(0,10).$$
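Although the full model is fitted in JAGS, the likelihood contribution of each observation can be written in closed form by integrating out the latent infection status \(I_{i}\): \(P\left( {D_{i} = 1} \right) = \widehat{SN}\,p_{i} + \left( {1 - \widehat{SP}} \right)\left( {1 - p_{i} } \right)\), where \(p_{i}\) is the logistic infection probability. A minimal Python sketch of this quantity (the function name is illustrative, not from the paper's code):

```python
import math

def detection_prob(beta, x, sn_hat, sp_hat):
    """Marginal probability of a positive test under Bayesian model 1:
    P(D=1) = SN*p + (1-SP)*(1-p), with the latent infection status
    integrated out; beta[0] is the intercept, beta[1:] the slopes."""
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    p = 1.0 / (1.0 + math.exp(-eta))        # logistic infection probability
    return sn_hat * p + (1.0 - sp_hat) * (1.0 - p)
```

For example, with an intercept of zero (infection probability 0.5), \(\widehat{SN} = 0.9\), and \(\widehat{SP} = 0.98\), the apparent prevalence is 0.46 rather than 0.5, illustrating how misclassification shifts the observed outcome.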
One problem with this approach, however, is that it assumes that these diagnostic test parameters are exactly equal to their estimates \(\widehat{SN}\) and \(\widehat{SP}\). A better approach would account for uncertainty around these estimates of sensitivity and specificity, as described in Bayesian model 2.
Bayesian model 2
This model is very similar to Bayesian model 1, except that it employs informative priors for sensitivity SN and specificity SP. One way to create these priors is to use the following information from the external study:
- \(N_{ + }\): the number of infected individuals, as assessed using the gold standard method;
- \(T_{ + }\): the number of individuals detected to be infected by the regular diagnostic method among all \(N_{ + }\) individuals;
- \(N_{ - }\): the number of healthy individuals, as assessed using the gold standard method; and
- \(T_{ - }\): the number of individuals not detected to be infected by the regular diagnostic method among all \(N_{ - }\) individuals.
Following the ideas in [19, 20], these ‘data’ can be used to devise informative priors of the form:
$$SN\sim Beta\left( {T_{ + } + 1,N_{ + } - T_{ + } + 1} \right)$$
$$SP\sim Beta\left( {T_{ - } + 1,N_{ - } - T_{ - } + 1} \right).$$
There are other ways of creating informative priors for SN and SP that do not rely on these four numbers (i.e., \(T_{ - } ,T_{ + } ,N_{ - } ,N_{ + }\)) (e.g., based on estimates of SN and SP with confidence intervals from a meta-analysis), but the method proposed above is likely to be broadly applicable given the abundance of studies that report these four numbers.
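The mapping from the four external-study counts to the Beta prior parameters, along with the implied prior mean and variance, can be sketched as follows (the function name is illustrative):

```python
def beta_prior(t, n):
    """Informative Beta(t+1, n-t+1) prior for SN or SP under Bayesian
    model 2, built from t 'successes' out of n gold-standard results
    in the external study. Returns (a, b, prior mean, prior variance)."""
    a, b = t + 1, n - t + 1
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, var
```

For instance, an external study with \(N_{ + } = 100\) and \(T_{ + } = 90\) yields a Beta(91, 11) prior for SN, with prior mean approximately 0.89, so the uncertainty around the sensitivity estimate propagates into the posterior rather than being ignored as in Bayesian model 1.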
Two potential problems arise when using external data to estimate SN and SP. First, results from the external study are assumed to apply to the study in question (i.e., the ‘transportability’ assumption), which may not be the case if diagnostic procedures and storage conditions of diagnostic tests differ substantially. Second, the performance of the diagnostic test may depend on covariates (i.e., differential misclassification) [16]. For instance, microscopy performance for malaria strongly depends on parasite density [21]. If age is an important determinant of parasite density in malaria (i.e., older individuals are more likely to display lower parasitaemia), then microscopy sensitivity might be higher for younger children than for older children or adults. Another example involves diagnostic methods that rely on the detection of antibodies. For these methods, sensitivity might be lower for people with compromised immune systems (e.g., malnourished children). In these cases, adopting a single value of SN and SP in Bayesian model 1 or 2 might be overly simplistic and may lead to even greater biases in parameter estimates. Bayesian model 3 solves these two problems associated with using external data.
Bayesian model 3
Instead of relying on external sources of information, another alternative is to collect additional information on the study participants themselves (also known as an internal validation sample [16]). More specifically, because of the higher cost of the gold standard method, one might choose to apply it to only a small sub-set of individuals. This sample enables the estimation of SN and SP of the regular diagnostic test (and potentially reveals how these test performance characteristics are affected by covariates) without requiring the ‘transportability’ assumption associated with using external data.
In Bayesian model 3, the gold standard method is assumed to be employed concurrently with the regular diagnostic method for a randomly chosen sub-set of individuals. Its structure closely follows that of Bayesian models 1 and 2, except that now sensitivity and specificity are allowed to vary according to covariates:
$$D_{i} \sim Bernoulli\left( {SN_{i} = \frac{{{ \exp }\left( {\alpha_{0} + \alpha_{1} x_{i1} + \alpha_{2} x_{i2} + \cdots } \right)}}{{1 + { \exp }\left( {\alpha_{0} + \alpha_{1} x_{i1} + \alpha_{2} x_{i2} + \cdots } \right)}}} \right)\;{\text{if}}\;I_{i} = 1$$
$$D_{i} \sim Bernoulli\left( {1 - SP_{i} = \frac{1}{{1 + { \exp }\left( {\omega_{0} + \omega_{1} x_{i1} + \omega_{2} x_{i2} + \cdots } \right)}}} \right)\;{\text{if}}\;I_{i} = 0$$
where additional regression parameters (\(\alpha_{0} ,\alpha_{1} ,\alpha_{2} , \ldots\) and \(\omega_{0} ,\omega_{1} ,\omega_{2} , \ldots\)) determine how sensitivity and specificity, respectively, vary from individual to individual as a function of the observed covariates. Notice that the covariates in these sensitivity and specificity sub-models need not be the same as those used to model infection status \(I_{i}\). Also notice that it is only feasible to estimate all these regression parameters because infection status \(I_{i}\) is assumed known for the sub-set of individuals tested with the gold standard method. More specifically, it is assumed that \(I_{i} = G_{i}\) for these individuals, where \(G_{i}\) is the result from the gold standard method. A summary of the different types of data discussed above and the corresponding statistical models is provided in Table 1.
Table 1
Summary of the proposed statistical models, their assumptions regarding the diagnostic method, and the additional data required to fit these models
Model | Additional data required | Assumptions regarding the diagnostic method
Standard logistic regression | None | Perfect detection (i.e., sensitivity and specificity equal to 100 %)
Bayesian model 1 | Estimate of sensitivity \(\widehat{SN}\) and specificity \(\widehat{SP}\) based on external study | Sensitivity and specificity are perfectly known constants, equal to the estimates from the external study
Bayesian model 2 | Data on sensitivity and specificity (i.e., \(N_{ + } ,T_{ + } ,N_{ - } ,T_{ - }\)) from external study | Sensitivity and specificity are constants and the external study provides reasonable prior information on sensitivity and specificity for the target study
Bayesian model 3 | Subset of individuals diagnosed with both the regular and the gold standard method | Sensitivity and specificity can vary as a function of covariates. This model does not rely on data from an external study (i.e., does not rely on the transportability assumption)
Simulations
The effectiveness of the proposed Bayesian models in estimating the regression parameters was assessed using simulations. One hundred datasets were created for each combination of sensitivity (SN = 0.6 or SN = 0.9) and specificity (SP = 0.9 or SP = 0.98). These values were chosen to encompass a wide spectrum of performance characteristics of diagnostic methods. Furthermore, sensitivity and specificity were assumed not to change as a function of covariates. Each dataset consisted of diagnostic test results for 2000 individuals, with four covariates standardized to have mean zero and standard deviation one. In these simulations, infection prevalence when covariates were zero (i.e., \(\frac{{\exp \left( {\beta_{0} } \right)}}{{1 + \exp \left( {\beta_{0} } \right)}}\)) was randomly drawn between 0.2 and 0.6, and slope parameters were randomly drawn from a uniform distribution between −2 and 2.
For each simulated dataset, the true slope parameters were estimated by fitting a standard logistic regression (‘Std.Log.’) and the Bayesian models described above. For the methods that relied on external study results, it was assumed that \(N_{ - } = N_{ + } = 100\) and that \(T_{ + } \sim Binomial\left( {N_{ + } ,SN} \right)\) and \(T_{ - } \sim Binomial\left( {N_{ - } ,SP} \right)\). Therefore, the assumption for Bayesian model 1 (‘Bayes 1’) was that sensitivity and specificity were equal to \(\widehat{SN} = \frac{{T_{ + } }}{{N_{ + } }}\) and \(\widehat{SP} = \frac{{T_{ - } }}{{N_{ - } }}\). For Bayesian model 2 (‘Bayes 2’), the set of numbers \(\left\{ {T_{ + } ,T_{ - } ,N_{ + } ,N_{ - } } \right\}\) was used to create informative priors for sensitivity and specificity. Finally, Bayesian model 3 (‘Bayes 3’) assumed that results from the gold standard diagnostic method were available for an internal validation sample consisting of a randomly chosen sample of 200 individuals (10 % of the total number of individuals).
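The data-generating process described above can be sketched as a minimal pure-Python function (function name and seed handling are illustrative):

```python
import math
import random

def simulate_dataset(n=2000, k=4, sn=0.9, sp=0.98, seed=1):
    """One simulated dataset following the simulation design:
    k standardized covariates, baseline prevalence drawn from U(0.2, 0.6),
    slopes from U(-2, 2), latent infection I, observed test result D."""
    rng = random.Random(seed)
    prev0 = rng.uniform(0.2, 0.6)
    beta0 = math.log(prev0 / (1.0 - prev0))   # intercept implied by baseline prevalence
    betas = [rng.uniform(-2.0, 2.0) for _ in range(k)]
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(k)]
        eta = beta0 + sum(b * xi for b, xi in zip(betas, x))
        p = 1.0 / (1.0 + math.exp(-eta))
        i = 1 if rng.random() < p else 0                       # latent infection status
        d = (1 if rng.random() < sn else 0) if i == 1 else (   # misclassified test result
            1 if rng.random() < 1.0 - sp else 0)
        data.append((x, i, d))
    return beta0, betas, data
```

Fitting the models to such datasets then allows the recovered slopes to be compared against the known `betas`.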
Two criteria were used to compare the performance of these methods. The first criterion assessed how often these methods captured the true parameter values within their 95 % confidence intervals (CI). Thus, this criterion consisted of the 95 % CI coverage for dataset d and method m, given by \(C_{d,m} = \frac{{\mathop \sum \nolimits_{j = 1}^{4} I\left( {\hat{\beta }_{{{\text{j}},{\text{d}},{\text{m}}}}^{\text{lo}} \; < \beta_{j,d} < \hat{\beta }_{{{\text{j}},{\text{d}},{\text{m}}}}^{\text{hi}} } \right)}}{4}\). In this equation, \(\beta_{j,d}\) is the jth true parameter value for simulated data d, and \(\hat{\beta }_{{{\text{j}},{\text{d}},{\text{m}}}}^{\text{lo}}\) and \(\hat{\beta }_{{{\text{j}},{\text{d}},{\text{m}}}}^{\text{hi}}\) are the jth estimated lower and upper bounds of the 95 % CI. The function I() is the indicator function, which takes on the value of one if the condition inside the parentheses is true and zero otherwise. Given that statistical significance of parameters is typically judged based on these CIs, it is critical that these intervals retain their nominal coverage. Thus, \(C_{d,m}\) values close to 0.95 indicate better models.
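Given true parameters and estimated interval bounds, the coverage criterion \(C_{d,m}\) reduces to a simple fraction (a sketch; the function name is illustrative):

```python
def ci_coverage(true_betas, ci_lo, ci_hi):
    """95% CI coverage C_{d,m}: the fraction of true slope parameters
    that fall strictly inside their estimated interval."""
    hits = sum(1 for b, lo, hi in zip(true_betas, ci_lo, ci_hi) if lo < b < hi)
    return hits / len(true_betas)
```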
One problem with the 95 % CI coverage criterion, however, is that a model might have good coverage as a result of exceedingly wide intervals, which is undesirable. Thus, the second criterion consisted of a summary measure that combines both bias and variance: the mean-squared error (MSE). This statistic was calculated for dataset d and method m as \(MSE_{d,m} = \frac{{\mathop \sum \nolimits_{j = 1}^{4} E\left[ {\left( {\beta_{j,d} - \hat{\beta }_{j,d,m} } \right)^{2} } \right]}}{4}\), where \(\hat{\beta }_{j,d,m}\) and \(\beta_{j,d}\) are the jth slope estimate and true parameter, respectively. Smaller values of \(MSE_{d,m}\) indicate better model performance.
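Treating \(\hat{\beta }_{j,d,m}\) as point estimates (e.g., posterior means), \(MSE_{d,m}\) can be sketched as (function name illustrative):

```python
def mse(true_betas, est_betas):
    """Mean-squared error MSE_{d,m}: average squared deviation between
    point estimates and true slope parameters for one dataset/method."""
    return sum((b - bh) ** 2 for b, bh in zip(true_betas, est_betas)) / len(true_betas)
```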
Case study
Case study data came from a rural settlement area in the western Brazilian Amazon state of Acre, in a location called Ramal Granada. These data were collected in four cross-sectional surveys between 2004 and 2006, encompassing 465 individuals. Individuals were tested for malaria using both microscopy and PCR, regardless of symptoms. Additional details regarding this dataset can be found in [22, 23].
Microscopy test results were first analysed using a standard logistic regression model, where the potential risk factors were age, time living in the study region (‘Time’), gender, participation in forest extractivism (‘Extract’), and hunting or fishing (‘Hunt/Fish’). Taking advantage of the concurrent microscopy and PCR results, the outcomes from this standard logistic regression model were then contrasted with those of Bayesian model 3.
Microscopy sensitivity is known to be strongly influenced by parasitaemia. Furthermore, it has been suggested that people in the Amazon region can develop partial clinical immunity (probably associated with lower parasitaemia) based on past cumulative exposure to low-intensity malaria transmission [23–25]. Because rural settlers often come from non-malarious regions, time living in the region might be a better proxy for past exposure than age [23]. For these reasons, microscopy sensitivity was modelled as a function of age and time living in the region.