Background
The primary endpoint in efficacy studies for antimalarials in uncomplicated Plasmodium falciparum malaria is the risk of recrudescence, defined as the recurrence of peripheral parasitaemia genetically identical to the parasites present before treatment. Molecular analysis of the parasite samples collected at pre-treatment and on the day of recurrence is used to discriminate homologous (recrudescent) from heterologous (new) infections [1]. When the paired analysis of the pre- and post-treatment parasites cannot reliably make this distinction, the treatment outcome is defined as indeterminate (Additional file 1, Section A).
The current WHO guideline for dealing with indeterminate outcomes in antimalarial efficacy trials is to exclude them from the analysis, that is, to carry out a complete case (CC) analysis [2]. This implicitly assumes that the indeterminate cases are a representative random sample of the study population, ignoring the fact that these indeterminate recurrences must each be either a recrudescence or a new infection, and may depend on other measured and unmeasured patient and parasite characteristics. The CC analysis is usually supplemented with two extreme sensitivity analyses representing the worst and best scenarios, in which all indeterminate recurrences are assumed to be recrudescences or new infections, respectively. As well as being biased, such ad hoc single imputation approaches treat the imputed datum as a known, observed value, so the uncertainty about the true reason for parasite recurrence is not fully accounted for. This yields inferences that are over-precise, i.e. standard errors that are too small, rendering the associated hypothesis tests invalid [3-5].
Under the multinomial assumption, the maximum likelihood estimate of the proportion of patients with parasitic recrudescence can be obtained as outlined by Little and Rubin (2002) [6]. Let \( n \) be the total number of patients who received the antimalarial drug, of whom \( n_0 \) were cured, \( m_1 \) developed a new infection, \( m_2 \) were recrudescent, and \( r \) had recurrences that were indeterminate at the end of the planned follow-up. The maximum likelihood estimate of the proportion who failed is then obtained as:
$$ {\hat{\rho}}_{ML}=\left(\frac{m_2}{m_1+{m}_2}\right).\left(\frac{n-{n}_0}{n}\right) $$
(1)
The complement of equation (1) provides an estimate of the cured proportion:
$$ 1-{\hat{\rho}}_{ML}=1-\left\{\left(\frac{m_2}{m_1+{m}_2}\right).\left(\frac{n-{n}_0}{n}\right)\right\} $$
(2)
In the absence of censoring, equation 1 provides a consistent estimate of the failure proportion, and equation 2 of the cured proportion, in contrast to the CC approach (Additional file 1, Section B). When there are censored observations (e.g. due to loss to follow-up or new infection), the Kaplan-Meier (K-M) method can be used. The K-M approach is currently the WHO-recommended approach for measuring antimalarial failure, whereby individuals with indeterminate parasite recurrence are excluded and individuals with new infections or loss to follow-up are censored [2].
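As an illustration of equations (1) and (2), the following minimal Python sketch computes the maximum likelihood estimate alongside a complete case estimate. The counts are hypothetical and are not taken from any study. The complete case estimator shown here (dropping the \( r \) indeterminate recurrences from the denominator) is an assumption made for illustration; the definition used in Additional file 1 may differ.

```python
def ml_failure_proportion(n, n0, m1, m2):
    """Equation (1): ML estimate of the proportion with recrudescence."""
    if m1 + m2 == 0:
        raise ValueError("at least one genotyped recurrence is required")
    # Share of genotyped recurrences that were recrudescent,
    # multiplied by the overall proportion of patients with any recurrence.
    return (m2 / (m1 + m2)) * ((n - n0) / n)


def cc_failure_proportion(n, n0, m1, m2):
    """Assumed complete case estimate: indeterminate recurrences are dropped."""
    r = n - n0 - m1 - m2  # number of indeterminate recurrences
    return m2 / (n - r)


# Hypothetical counts: 200 treated, 160 cured, 20 new infections,
# 10 recrudescences, hence r = 10 indeterminate recurrences.
n, n0, m1, m2 = 200, 160, 20, 10

rho_ml = ml_failure_proportion(n, n0, m1, m2)  # equation (1)
cured_ml = 1 - rho_ml                          # equation (2)
rho_cc = cc_failure_proportion(n, n0, m1, m2)

print(f"ML failure proportion: {rho_ml:.4f}")
print(f"ML cured proportion:   {cured_ml:.4f}")
print(f"CC failure proportion: {rho_cc:.4f}")
```

With these hypothetical counts, the ML estimate of failure is larger than the complete case estimate, because the indeterminate recurrences are implicitly apportioned between recrudescence and new infection rather than discarded.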
Alternative approaches for dealing with an indeterminate parasite recurrence outcome are multiple imputation (MI) and inverse probability weighting (IPW), which are statistically principled approaches for handling missing data [7-11] under the assumption that the missingness depends only on observed variables. In antimalarial clinical efficacy studies, variables that are commonly recorded and may affect whether or not a recurrence is indeterminate are transmission intensity, the number of molecular markers used, the density of the parasites on the day of recurrence and the antimalarial treatment administered. The background allelic diversity of the parasite strain is rarely known or reported, and it is therefore not possible to test whether this variable influences the determination of homologous (recrudescence) or heterologous (new infection) parasite recurrences. As such, MI and IPW assume that the occurrence of indeterminate recurrences does not depend on the allelic diversity of the parasite strain or on any other unmeasured variables.
The basic principle of MI is to impute the missing values based on the distribution of the observed data and to repeat this \( m \) times in order to account for the uncertainty in the missing values [12, 13]. This is essentially a two-step procedure. In the first (imputation) step, the missing values are filled in multiple times using a suitable imputation model, with values drawn from the posterior predictive distribution [14]. In the second (analysis) step, the substantive model (target analysis) of interest is fitted to each of the completed datasets (observed plus imputed data). The final estimates and standard errors are then derived by combining estimates across the multiply imputed datasets using Rubin’s combination rules, which incorporate the uncertainty within and between imputations [13]. For IPW, complete cases are weighted by the inverse of their probability of being a complete case, i.e. data from participants who have a low probability of being observed are up-weighted, thus creating a pseudo-population [9]. The final analysis is then carried out using only the complete observations (i.e. for this example, indeterminate recurrences are not included), but these are now weighted to rebalance the set of complete cases so that it is representative of the whole sample. Like MI, the IPW approach is a two-step estimator. In the first step, a missingness model is constructed to estimate the probability of an observation being a complete case; in the second step, the inverses of these probabilities are used as weights in the analysis of the complete cases.
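The pooling step of MI can be sketched as follows. This is a minimal illustration of Rubin's combination rules for a scalar estimate, not the implementation used in this study. The per-imputation estimates and variances are hypothetical, and pooling is done on the untransformed scale for simplicity; in practice survival probabilities are often pooled after a transformation, which is not shown here.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their variances (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical day 28 cure estimates from m = 3 imputed datasets,
# each with a squared standard error of 0.0004.
estimates = [0.94, 0.95, 0.96]
variances = [0.0004, 0.0004, 0.0004]

pooled_estimate, pooled_se = pool_rubin(estimates, variances)
print(f"pooled estimate = {pooled_estimate:.3f}, pooled SE = {pooled_se:.4f}")
```

The pooled standard error exceeds the within-imputation standard error (0.02 in this hypothetical example) because the between-imputation term carries the uncertainty about the imputed outcomes, which single imputation ignores.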
Multiple imputation and inverse probability weighting have been increasingly used in the medical and statistical literature in the past decade [9, 10, 15]. Yet, to our knowledge, only three studies have considered these missing data methods when dealing with indeterminate outcomes in the derivation of antimalarial efficacy in uncomplicated P. falciparum malaria [16-18]. Machekano et al. (2008) used data from a randomised study in Uganda to compare the performance of MI and IPW approaches against the CC analysis when estimating drug efficacy using proportions [16]. Mukaka et al. (2016) compared MI against CC when estimating the risk difference between two antimalarial regimens [17]. In the PREGACT study (2017), MI was used as a sensitivity analysis to assess the robustness of the derived estimate of the cured proportion [18]. None of the studies to date have compared the utility of MI and IPW approaches in handling indeterminate outcomes when deriving Kaplan-Meier (K-M) (\( {\hat{S}}_{KM} \)) estimates of drug efficacy for antimalarial regimens.
The aim of this simulation study was to assess the performance of MI and IPW approaches for handling indeterminate recurrences when estimating the day 28 proportion of parasitic recrudescence following antimalarial treatment using K-M survival analysis, and to compare these estimates with those derived using the widely used CC approach. Specifically, the evaluation is based on a large multi-centre trial of four artemisinin-based combination therapies (4ABC trial) [19], in which we redraw a set proportion (10, 30 and 45%) of the known recurrences (recrudescences and new infections) and reassign them as indeterminate (i.e. missing).
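The reassignment step can be sketched as follows. This is a minimal sketch that assumes recurrences are set to indeterminate completely at random; the missingness mechanisms actually used in the study are not described in this section and may differ. The outcome labels, the function name and the toy data are hypothetical.

```python
import random


def make_indeterminate(outcomes, proportion, seed=0):
    """Relabel a fixed proportion of the known recurrences as indeterminate."""
    rng = random.Random(seed)  # seeded so that a replicate can be reproduced
    known = [i for i, o in enumerate(outcomes)
             if o in ("recrudescence", "new_infection")]
    k = round(proportion * len(known))
    chosen = set(rng.sample(known, k))
    return ["indeterminate" if i in chosen else o
            for i, o in enumerate(outcomes)]


# Hypothetical full data: 80 cured, 10 recrudescences, 10 new infections.
full = ["cured"] * 80 + ["recrudescence"] * 10 + ["new_infection"] * 10

for proportion in (0.10, 0.30, 0.45):
    masked = make_indeterminate(full, proportion)
    print(proportion, masked.count("indeterminate"))
```

Only recurrences are eligible for relabelling, which mirrors the fact that an indeterminate outcome can arise only once a patient has recurrent parasitaemia.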
Discussion
Missing data in clinical trials can pose analytical challenges, including undermining the validity and interpretation of the results. In antimalarial studies, indeterminate recurrences resulting from genotyping failure are frequently encountered, especially in areas of high transmission intensity, where multiple infections are common. Principled approaches for handling missing data have proliferated in the medical and statistical literature in recent years [9, 10, 38], yet the most common approach used by malaria researchers, and the one recommended by the WHO, for handling indeterminate cases is simply to exclude them from the analysis. In this article, the performance of MI and IPW was evaluated for handling indeterminate outcomes in the estimation of antimalarial efficacy using one of the largest antimalarial studies (the 4ABC study) [19]. The use of a real dataset to represent the complete (full) data avoided the arbitrary choices usually made in simulating covariates and survival data, and provided a rich dataset from multiple endemic settings, with auxiliary covariates for the implementation of the IPW and MI approaches, thus making it more plausible that the results generalise to other antimalarial trials.
Two different missingness mechanisms were investigated and differences in estimates compared for scenarios in which 10, 30 and 45% of the known recurrences were reclassified as missing. In all these scenarios, the current recommendation of excluding indeterminate cases resulted in an upwardly biased estimate of the day 28 probability of cure (K-M method), by up to 1.7% (see Additional file 1, Section E), the magnitude of which was correlated with the proportion of recurrent outcomes classified as indeterminate. Similar findings were reported by Machekano et al. (2008), who found that the CC approach overestimated efficacy, relative to the IPW and MI methods, by an absolute 3.2% for the regimen of chloroquine (CQ) + sulphadoxine-pyrimethamine (SP) and by up to 1.7% for the regimen of amodiaquine (AQ) + SP, when the observed proportion of missing recurrences was 33% in the CQ + SP arm and 17% in the AQ + SP arm [16]. As with the K-M estimate of the probability of cure, the CC analysis overestimated the proportion cured at day 28. The analytical solution outlined in equation 2 provided a more consistent estimate of the proportion cured than the CC estimator.
For the derivation of the K-M estimate of the day 28 probability of cure, the MI and IPW approaches were generally consistent under all missingness scenarios and resulted in increased precision. The IPW approach consistently provided the least biased estimate of the K-M probability of cure of all the approaches for all proportions of missingness; however, this came at the price of marginally inflated standard errors compared to the MI approaches, which corroborates the observations of Machekano and colleagues [16]. The current study nevertheless differed from theirs in two important respects. First, the Kaplan-Meier method, which is currently the preferred approach for estimating drug efficacy, was used (as opposed to the proportion cured reported by Machekano and colleagues). Second, when constructing the missingness model for the IPW implementation, recurrence status was included as a predictor. In antimalarial studies, a missing outcome is only possible once a patient experiences recurrent parasitaemia, so recurrence status is a predictor of a missing outcome. The IPW approach in which the missingness model excluded recurrence status was associated with increased bias and an inflated standard error. This suggests that recurrence status should always be included in the missingness model to obtain valid inferences from the IPW estimate.
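The role of recurrence status in the missingness model can be illustrated with a small weighted Kaplan-Meier sketch. It is a simplified illustration under stated assumptions, not the implementation used in this study: the ten patient records are hypothetical, the missingness model is reduced to the observed proportion of complete cases within each recurrence stratum (a real analysis would typically use a regression model with further covariates), and no standard error is computed, so the uncertainty in the estimated weights is ignored.

```python
def weighted_km(times, events, weights, horizon):
    """Weighted K-M survival at `horizon` (event = 1 for recrudescence, 0 if censored)."""
    surv = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= horizon})
    for t in event_times:
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        failed = sum(w for ti, e, w in zip(times, events, weights) if ti == t and e == 1)
        surv *= 1.0 - failed / at_risk
    return surv


# Hypothetical records: (day of last observation, outcome).
records = [(28, "cured")] * 6 + [
    (14, "recrudescence"), (21, "new_infection"),
    (14, "indeterminate"), (21, "indeterminate"),
]


def is_recurrence(outcome):
    return outcome != "cured"


# Step 1: missingness model stratified by recurrence status.
# Patients without recurrence are always complete cases (probability 1).
n_recurrent = sum(is_recurrence(o) for _, o in records)
n_recurrent_known = sum(o in ("recrudescence", "new_infection") for _, o in records)
p_complete_given_recurrence = n_recurrent_known / n_recurrent

# Step 2: analyse complete cases only, weighted by the inverse probability.
complete = [(t, o) for t, o in records if o != "indeterminate"]
times = [t for t, _ in complete]
events = [1 if o == "recrudescence" else 0 for _, o in complete]  # new infections censored
ipw_weights = [1 / p_complete_given_recurrence if is_recurrence(o) else 1.0
               for _, o in complete]

km_cc = weighted_km(times, events, [1.0] * len(complete), horizon=28)
km_ipw = weighted_km(times, events, ipw_weights, horizon=28)
print(f"CC  K-M cure at day 28: {km_cc:.3f}")
print(f"IPW K-M cure at day 28: {km_ipw:.3f}")
```

In this toy example the unweighted complete case estimate of cure is higher than the weighted one, because the genotyped recurrences are up-weighted to stand in for the indeterminate ones. A model that ignored recurrence status would assign the same weight to cured and recurrent patients and would leave the complete case estimate unchanged.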
As with IPW, the validity of the inferences derived from MI relies on the correct specification of the imputation model; hence the model should use the correct functional form and include any relevant interactions. Failure to do so could lead to invalid inferences being drawn, especially when the fraction of missing information is large [11, 31, 39]. In practice, all imputation models are likely to be mis-specified to some extent. Arguably, specifying the missingness model correctly is an easier task than specifying a correct imputation model [9, 40], thus making the IPW approach a feasible alternative for handling indeterminate outcomes in the estimation of efficacy in antimalarial studies. However, it is important to account for the uncertainty associated with the estimation of the weights in IPW, as the naïve estimate of the standard error ignores this uncertainty, leading to the IPW approach paradoxically appearing far more efficient than MI (see Additional file 1, Section D) [34]. In addition to being biased and inefficient, the CC estimates also suffered from poor coverage, both for the estimation of the K-M probability compared to the MI and IPW methods and for the estimation of the cured proportion compared to the analytical solution (equation 2). When the missingness was greater than 30%, the coverage for the CC approach deteriorated rapidly and fell below 90% for all the missingness mechanisms (regardless of the choice of estimand), whereas for MI, IPW and the analytical approach, the coverage remained near the nominal 95% level.
The current WHO guidelines require that a new regimen should demonstrate at least 95% efficacy to be included in the antimalarial treatment policy, and further investigations are warranted when treatment failure exceeds 10% to examine the possibility of drug resistance [2]. The results of this study, taken together with the findings of Machekano et al. [16], highlight that the CC approach provides an optimistic view of treatment efficacy, and this can have potentially deleterious consequences when the estimate is at the cusp of these WHO thresholds (in a study where a large proportion of outcomes are indeterminate). From a public health perspective, the false sense of confidence generated from these studies regarding the current status of antimalarial regimens can have important ramifications for the evolution of antimalarial drug resistance. The prolonged usage of a suboptimal regimen exerts constant drug selection pressure on the parasites, a scenario highly conducive to the emergence of de novo drug resistance. Given the paucity of alternative regimens currently available and the spread of artemisinin resistance across South East Asia [41], it is important that researchers and policy makers alike are aware of the pitfalls associated with the CC estimate of efficacy when drawing conclusions from routine surveillance studies. The analytical solution outlined in equation 2 provided the most consistent estimate of failure and could be a useful alternative in scenarios where there is minimal or no censoring. However, when there is censoring (due to loss to follow-up or when new infection is considered as censored), the K-M approach combined with the principled approaches of MI or IPW would be the most appropriate method for estimating the day 28 proportion of recrudescences.
This simulation study has a number of limitations. First, it was assumed that the genotyping outcome reflects the true treatment outcome. The genotyping procedure is prone to misclassification error, particularly in areas of intense transmission where polyclonal infections present a formidable challenge [42-45]. A thorough consideration of genotyping-adjusted efficacy should incorporate the population allele diversity, which is often unmeasured or not presented; however, the potential confounding from this is beyond the scope of the current analysis. Second, IPW and MI are not the only available approaches for handling missing data. Likelihood-based approaches, which use expectation-maximisation (EM) algorithms, are an alternative, but at present are not implemented in standard software [46]. The pseudo-value method is increasingly being used, and its utility in the context of antimalarial research is yet to be evaluated [47-50]. Third, this simulation study evaluated the performance of the MI and IPW approaches in the derivation of K-M estimates; the application of these principled methods to other statistical approaches for estimating efficacy (e.g. a competing risk survival analysis approach) was not considered [51-53]. Finally, this study does not represent every missing-data problem that can be encountered in practice, and no single method can be universally recommended; rather, the choice of method should be guided by the research question and the context of the study.
In the presence of missing data, no statistical method, simple or sophisticated, can supersede the result that could have been derived had the data been fully observed. Thus, the best possible efforts should be made to minimise missingness through careful design, study management, and adherence to standardised protocols [54-57]. Diligence in sample collection in the field is needed, and better genotyping methods (e.g. capillary-based), including appropriate quality control measures through a regular proficiency testing program, should be deployed [58]. Missing data should be anticipated in advance, and researchers should strive to collect data on variables that might be related to the variables expected to exhibit missing data, such as background allelic frequency. When using MI and IPW, researchers should clearly report the details of the modelling approaches, including the construction of the imputation and missingness models [8, 59].
The definition of recrudescence and new infection depends on how bands of different sizes are binned and classified as being the same or different alleles. For example, Cattamanchi et al. (2003) [60] considered alleles to be the same if their fragment sizes were within 10 base pairs of each other for the merozoite surface protein (msp)-2 gene, whereas Rouse et al. (2008) reported that identical msp-2 alleles could differ by up to 18 base pairs [61]. The definition adopted for recrudescence or new infection is critical, and researchers should always endeavour to publish the fragment lengths of the alleles in the pre- and post-treatment samples, as done by Plucinski et al. (2017) (see Additional file 1: Table 1 of [62]).
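The sensitivity of the classification to the binning window can be shown with a minimal sketch. The fragment lengths below are hypothetical, the functions are illustrative names only, and the rule used (a recurrence is called a recrudescence if any post-treatment allele falls within the window of any pre-treatment allele at a single marker, with the boundary treated as inclusive) is a simplifying assumption rather than the algorithm of any of the cited studies.

```python
def same_allele(length_a, length_b, window_bp):
    """Treat two fragments as the same allele if their lengths differ by at most window_bp."""
    return abs(length_a - length_b) <= window_bp


def classify_recurrence(pre_alleles, post_alleles, window_bp):
    """Single-marker classification based on shared alleles (simplified rule)."""
    shared = any(same_allele(a, b, window_bp)
                 for a in pre_alleles for b in post_alleles)
    return "recrudescence" if shared else "new_infection"


# Hypothetical msp-2 fragment lengths (base pairs) before treatment
# and on the day of recurrence; they differ by 14 bp.
pre, post = [512], [526]

result_10 = classify_recurrence(pre, post, window_bp=10)
result_18 = classify_recurrence(pre, post, window_bp=18)
print(result_10, result_18)
```

The same pair of samples is classified as a new infection with a 10 base pair window and as a recrudescence with an 18 base pair window, which is why reporting the raw fragment lengths allows readers to re-derive outcomes under alternative definitions.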