Background
In meta-analyses of diagnostic test studies that compare an index test with a reference test, non-evaluable test outcomes are an important issue that can lead to biased estimates of index test accuracy. Many papers have discussed missing reference test outcomes (missing disease status) and how to correct the resulting bias, known as partial verification bias or work-up bias [1-4]. However, index test outcomes can be non-evaluable as well, especially for tests yielding dichotomous results. Index test results can be non-evaluable in several ways: uninterpretable, intermediate and indeterminate [5, 6].
For a single study, several ways of dealing with non-evaluable index test outcomes have been discussed, such as excluding them [7], grouping them with positive or negative outcomes [5, 7], or reporting them in a 3×2 table as an extension of the standard 2×2 table [7]. In meta-analysis, on the other hand, there is little discussion of how to deal with missing index test outcomes [6]. The “classic” 2×2 table models such as the bivariate linear mixed models [8-13], the bivariate generalized linear mixed model (GLMM) [14-16] and the trivariate generalized linear mixed model (TGLMM) [17] ignore missing index test outcomes. Recently, Schuetz et al. [6] addressed this issue by studying different approaches to handling subjects with non-evaluable index test results. They conducted a meta-analysis of coronary CT angiography studies and presented an intent-to-diagnose approach together with three commonly applied alternatives. The intent-to-diagnose approach counts non-evaluable diseased subjects as false negatives and non-evaluable non-diseased subjects as false positives, so that sensitivity and specificity are not over-estimated. We refer to the three alternative approaches in Schuetz et al. [6] as Model 1 (non-evaluable subjects are excluded from the study), Model 2 (non-evaluable diseased subjects are taken as true positives and non-diseased subjects as false positives) and Model 3 (non-evaluable diseased subjects are taken as false negatives and non-diseased subjects as true negatives), and we use Models 1-3 to denote them throughout the rest of this paper. The authors concluded that excluding non-evaluable subjects (Model 1) leads to overestimation of sensitivity and specificity, and recommended the conservative intent-to-diagnose approach. However, no simulation studies have been conducted to evaluate the performance of these approaches. Moreover, the above conclusions can be misleading.
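To make the four ways of handling non-evaluable results concrete, the following minimal sketch computes study-level sensitivity and specificity under each rule. The function name, argument names and the counts are hypothetical and chosen only for illustration; they are not taken from Schuetz et al. [6].

```python
def accuracy_under_rule(tp, fn, fp, tn, m_dis, m_non, rule):
    """Return (sensitivity, specificity) for one study.

    tp, fn, fp, tn : counts of subjects with evaluable index test results
    m_dis, m_non   : non-evaluable counts among diseased / non-diseased subjects
    rule           : 'model1', 'model2', 'model3' or 'itd' (intent-to-diagnose)
    """
    if rule == "model1":      # exclude non-evaluable subjects
        pass
    elif rule == "model2":    # count all non-evaluable results as test positive
        tp, fp = tp + m_dis, fp + m_non
    elif rule == "model3":    # count all non-evaluable results as test negative
        fn, tn = fn + m_dis, tn + m_non
    elif rule == "itd":       # diseased -> false negative, non-diseased -> false positive
        fn, fp = fn + m_dis, fp + m_non
    else:
        raise ValueError(f"unknown rule: {rule}")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical study: 110 diseased (10 non-evaluable), 220 non-diseased (20 non-evaluable).
counts = dict(tp=80, fn=20, fp=30, tn=170, m_dis=10, m_non=20)
results = {
    rule: accuracy_under_rule(rule=rule, **counts)
    for rule in ("model1", "model2", "model3", "itd")
}
```

In this illustration the intent-to-diagnose rule gives the lowest sensitivity and specificity of the four, which is what makes it conservative.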
We can treat subjects with non-evaluable index test results as missing data. Schuetz et al. [6] concluded that sensitivity and specificity could be over-estimated by excluding non-evaluable subjects. In fact, under a reasonable and general assumption, missing at random (MAR), excluding non-evaluable subjects provides unbiased estimates of sensitivity (Se) and specificity (Sp). Under the MAR assumption, the probability of missingness depends only on observed information, such as patient characteristics and known true disease status [18, 19]. For example, when diagnosing extrahepatic cholestasis using percutaneous transhepatic cholangiography, non-diseased subjects can have uninterpretable results more often than diseased patients [5]. A special case of MAR is missing completely at random (MCAR), where missingness is independent of both observed and unobserved variables [18]; an example is accidental contamination of a urine sample such that the test result is discarded. Under MAR, $T$ and $M$ are independent given disease status $D$, where $M=1,0$ indicates whether the index test outcome is missing or observed, $D=1,0$ indicates diseased or non-diseased and $T=1,0$ represents index test positive or negative. Hence, excluding non-evaluable subjects yields unbiased estimates of Se and Sp: $\Pr(T=1 \mid D=1, M=0)=\Pr(T=1 \mid D=1)=Se$ and $\Pr(T=0 \mid D=0, M=0)=\Pr(T=0 \mid D=0)=Sp$. Similarly, positive and negative likelihood ratios (LR+ and LR−) and area under the curve (AUC) are unbiased too. Under MCAR, $\Pr(M=1 \mid D=1)=\Pr(M=1 \mid D=0)$, and hence the disease prevalence ($\pi$) estimate is also unbiased if non-evaluable subjects are excluded. However, when the missing probabilities differ between diseased and non-diseased participants, the disease prevalence estimate can be biased if non-evaluable subjects are excluded, which in turn biases the overall estimates of positive predictive value (PPV) and negative predictive value (NPV). PPV and NPV are generally preferred by clinicians as measures of how well a test predicts true disease status because their interpretations are more intuitive: PPV is the probability that a subject with a positive index test result is truly diseased and NPV is the probability that a subject with a negative index test result is truly non-diseased [19]. However, none of the approaches discussed in Schuetz et al. [6] can correct the bias in these estimates.
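The argument above can be checked numerically. The sketch below computes the large-sample quantities that a complete-case analysis (excluding non-evaluable subjects) targets when missingness depends only on disease status. It is a population-level calculation rather than a simulation, and the function names and parameter values (`w1` and `w0` for the missing probabilities among diseased and non-diseased subjects) are hypothetical.

```python
def complete_case_targets(pi, se, sp, w1, w0):
    """Quantities estimated after excluding non-evaluable subjects, under MAR.

    pi, se, sp : true prevalence, sensitivity and specificity
    w1, w0     : Pr(M=1 | D=1) and Pr(M=1 | D=0)
    """
    # Joint probabilities of the four evaluable (M=0) cells.
    p11 = (1 - w1) * pi * se              # T=1, D=1
    p01 = (1 - w1) * pi * (1 - se)        # T=0, D=1
    p10 = (1 - w0) * (1 - pi) * (1 - sp)  # T=1, D=0
    p00 = (1 - w0) * (1 - pi) * sp        # T=0, D=0
    return {
        "se": p11 / (p11 + p01),
        "sp": p00 / (p00 + p10),
        "prevalence": (p11 + p01) / (p11 + p01 + p10 + p00),
        "ppv": p11 / (p11 + p10),
        "npv": p00 / (p00 + p01),
    }


def true_values(pi, se, sp):
    """True accuracy indices implied by prevalence, sensitivity and specificity."""
    return {
        "se": se,
        "sp": sp,
        "prevalence": pi,
        "ppv": pi * se / (pi * se + (1 - pi) * (1 - sp)),
        "npv": (1 - pi) * sp / ((1 - pi) * sp + pi * (1 - se)),
    }


truth = true_values(0.3, 0.9, 0.8)
# MAR with unequal missing probabilities: Se and Sp are preserved, prevalence/PPV/NPV are not.
mar = complete_case_targets(0.3, 0.9, 0.8, w1=0.05, w0=0.25)
# MCAR (equal missing probabilities): every index is preserved.
mcar = complete_case_targets(0.3, 0.9, 0.8, w1=0.15, w0=0.15)
```

With these illustrative values, the complete-case sensitivity and specificity equal the truth in both scenarios, whereas the complete-case prevalence, PPV and NPV differ from the truth only in the unequal-missingness scenario.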
In this article, we propose to extend the TGLMM approach [17] by treating non-evaluable subjects as missing data to adjust for potential bias. The TGLMM was proposed by Chu et al. [17] as an extension of the bivariate GLMM [9, 10, 14]. Sensitivities and specificities have been found to be potentially dependent on disease prevalence [20-22]. The TGLMM models disease prevalence together with sensitivity and specificity to account for potential correlations among them. Moreover, once overall disease prevalence is estimated, other test accuracy indices such as PPV and NPV can be calculated. By extending the TGLMM to account for missing data, potential bias in the disease prevalence estimate can be adjusted for and thus bias in the PPV and NPV estimates can be avoided.
In the rest of this paper, we first present the extended TGLMM approach in the “Methods” section. Next, in section “Results”, simulation studies are carried out to systematically evaluate the performance of the extended TGLMM, Model 1-3 and the intent-to-diagnose approach when there are non-evaluable index test subjects. The meta-analysis of coronary CT angiography studies is re-evaluated by the extended TGLMM approach. The SAS code for the extended TGLMM is available in the Appendix: SAS code of the extended TGLMM approach: meta-analysis of coronary CT angiography studies. Finally, we conclude the paper with some discussions in section “Conclusions”.
Methods
Assume there are $i=1,\ldots,N$ studies in one meta-analysis data set. We generalize the TGLMM approach to account for missing index test outcomes by extending the “classic” 2×2 table to Table 1. Each cell in Table 1 reports the cell count and cell probability corresponding to a combination of index test and disease outcomes in study $i$. Let $n_{itd}$ denote the cell count in study $i$ with index test outcome $T=t$ and reference test outcome $D=d$, where $t=1,0,m$ stands for positive, negative and missing, and $d=1,0$ denotes positive and negative. $Se_i$, $Sp_i$ and $\pi_i$ are the sensitivity, specificity and prevalence of study $i$, respectively. Let $\omega_{imd}$ denote the probability that the index test result is missing given disease status $d$ in study $i$: $\omega_{imd}=\Pr(T=m \mid D=d)$. The missing probabilities and disease prevalence are incorporated in the cell probabilities in Table 1. Assuming a multinomial distribution, the likelihood for $\theta_i=(Se_i,Sp_i,\pi_i)$ and $\omega_i=(\omega_{im1},\omega_{im0})$ given the data (cell counts) is:

$$
\begin{aligned}
L(\theta_i,\omega_i \mid \text{Data}) \propto{}& \left[(1-\omega_{im1})\pi_i Se_i\right]^{n_{i11}} \left[(1-\omega_{im1})\pi_i(1-Se_i)\right]^{n_{i01}} \left[\omega_{im1}\pi_i\right]^{n_{im1}} \\
&\times \left[(1-\omega_{im0})(1-\pi_i)(1-Sp_i)\right]^{n_{i10}} \left[(1-\omega_{im0})(1-\pi_i)Sp_i\right]^{n_{i00}} \left[\omega_{im0}(1-\pi_i)\right]^{n_{im0}}
\end{aligned}
\tag{1}
$$
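Table 1
3 × 2 table accounting for prevalence and missing index test results

| Index test | | Reference test positive ($D=1$) | Reference test negative ($D=0$) | Total |
|---|---|---|---|---|
| + | Count | $n_{i11}$ | $n_{i10}$ | $n_{i1+}$ |
| | Probability | $(1-\omega_{im1})\pi_i Se_i$ | $(1-\omega_{im0})(1-\pi_i)(1-Sp_i)$ | $(1-\omega_{im1})\pi_i Se_i+(1-\omega_{im0})(1-\pi_i)(1-Sp_i)$ |
| − | Count | $n_{i01}$ | $n_{i00}$ | $n_{i0+}$ |
| | Probability | $(1-\omega_{im1})\pi_i(1-Se_i)$ | $(1-\omega_{im0})(1-\pi_i)Sp_i$ | $(1-\omega_{im1})\pi_i(1-Se_i)+(1-\omega_{im0})(1-\pi_i)Sp_i$ |
| Missing | Count | $n_{im1}$ | $n_{im0}$ | $n_{im+}$ |
| | Probability | $\omega_{im1}\pi_i$ | $\omega_{im0}(1-\pi_i)$ | $\omega_{im1}\pi_i+\omega_{im0}(1-\pi_i)$ |
| Total | Count | $n_{i+1}$ | $n_{i+0}$ | $n_{i++}$ |
| | Probability | $\pi_i$ | $1-\pi_i$ | 1 |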
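As a quick consistency check on Table 1, the sketch below builds the six cell probabilities for one study and verifies the margins. The function names and the numerical values are hypothetical; `w1` and `w0` stand for the missing probabilities among diseased and non-diseased subjects.

```python
def table1_cell_probabilities(pi, se, sp, w1, w0):
    """Cell probabilities of the 3x2 table for one study.

    Keys are (t, d) with t in {'+', '-', 'm'} (index test) and d in {1, 0} (disease).
    """
    return {
        ("+", 1): (1 - w1) * pi * se,
        ("-", 1): (1 - w1) * pi * (1 - se),
        ("m", 1): w1 * pi,
        ("+", 0): (1 - w0) * (1 - pi) * (1 - sp),
        ("-", 0): (1 - w0) * (1 - pi) * sp,
        ("m", 0): w0 * (1 - pi),
    }


def column_total(cells, d):
    """Sum over index test outcomes for disease status d."""
    return sum(p for (t, dd), p in cells.items() if dd == d)


def row_total(cells, t):
    """Sum over disease status for index test outcome t."""
    return sum(p for (tt, d), p in cells.items() if tt == t)


cells = table1_cell_probabilities(pi=0.3, se=0.9, sp=0.8, w1=0.05, w0=0.25)
```

The column totals recover the prevalence and its complement regardless of the missing probabilities, which is why keeping the "Missing" row allows prevalence to be estimated from all subjects.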
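It is straightforward to see from (1) that $L(\theta_i,\omega_i \mid \text{Data}) \propto L(\theta_i \mid \text{Data}) \times L(\omega_i \mid \text{Data})$, where the log-likelihood of $\theta_i$ is, up to an additive constant:

$$
\log L(\theta_i \mid \text{Data}) = n_{i11}\log Se_i + n_{i01}\log(1-Se_i) + n_{i00}\log Sp_i + n_{i10}\log(1-Sp_i) + n_{i+1}\log \pi_i + n_{i+0}\log(1-\pi_i),
$$

with $n_{i+1}=n_{i11}+n_{i01}+n_{im1}$ and $n_{i+0}=n_{i10}+n_{i00}+n_{im0}$ the column totals in Table 1. Let $\theta=\{\theta_i\}$. Assuming independence among studies conditional on $\theta_i$, the total log-likelihood of $\theta$ is:

$$
\log L(\theta \mid \text{Data}) = \sum_{i=1}^{N} \log L(\theta_i \mid \text{Data}). \tag{2}
$$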
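For a single study, the part of the multinomial log-likelihood that involves $\theta_i$ can be evaluated directly from the Table 1 counts. The sketch below assumes the form implied by the Table 1 cell probabilities (terms in the missing probabilities dropped as constants with respect to $\theta_i$); the function names, dictionary keys and counts are hypothetical.

```python
import math


def loglik_theta(n, se, sp, pi):
    """Log-likelihood of (Se, Sp, pi) for one study, up to an additive constant.

    n maps 'td' strings to counts: first character is the index test result
    ('1', '0' or 'm'), second is the disease status ('1' or '0').
    """
    n_dis = n["11"] + n["01"] + n["m1"]   # all diseased subjects, evaluable or not
    n_non = n["10"] + n["00"] + n["m0"]   # all non-diseased subjects
    return (
        n["11"] * math.log(se) + n["01"] * math.log(1 - se)
        + n["00"] * math.log(sp) + n["10"] * math.log(1 - sp)
        + n_dis * math.log(pi) + n_non * math.log(1 - pi)
    )


def closed_form_mle(n):
    """Single-study maximizers of loglik_theta (no random effects)."""
    se = n["11"] / (n["11"] + n["01"])
    sp = n["00"] / (n["00"] + n["10"])
    n_dis = n["11"] + n["01"] + n["m1"]
    n_non = n["10"] + n["00"] + n["m0"]
    return se, sp, n_dis / (n_dis + n_non)


# Hypothetical study in which non-diseased subjects are non-evaluable more often.
study = {"11": 80, "01": 20, "10": 30, "00": 170, "m1": 10, "m0": 60}
se_hat, sp_hat, pi_hat = closed_form_mle(study)
complete_case_prevalence = (study["11"] + study["01"]) / (
    study["11"] + study["01"] + study["10"] + study["00"]
)
```

Here the sensitivity and specificity estimates use only evaluable subjects, while the prevalence estimate uses the disease status of all subjects; the complete-case prevalence differs from it because the missing proportions are unequal in the two disease groups.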
Let $\operatorname{logit}(\pi_i)=\eta+\varepsilon_i$, $\operatorname{logit}(Se_i)=\alpha+\mu_i$ and $\operatorname{logit}(Sp_i)=\beta+\nu_i$, where $\operatorname{logit}(\cdot)$ is the logit link function such that $\operatorname{logit}(p)=\log(p/(1-p))$ for $0<p<1$. $(\eta,\alpha,\beta)$ are the fixed effect parameters such that the median $\pi$, Se and Sp can be approximated as $\operatorname{logit}^{-1}(\eta)$, $\operatorname{logit}^{-1}(\alpha)$ and $\operatorname{logit}^{-1}(\beta)$, respectively, where $\operatorname{logit}^{-1}(\cdot)$ is the inverse logit function such that $\operatorname{logit}^{-1}(x)=1/(1+\exp(-x))$. The random effect vector $(\varepsilon_i,\mu_i,\nu_i)$ is assumed to be trivariate normally distributed:

$$
(\varepsilon_i,\mu_i,\nu_i)^{\top} \sim N(\mathbf{0},\Sigma),
$$

where $\Sigma$ is a 3×3 covariance matrix whose diagonal elements account for between-study variations of $\pi$, Se and Sp and whose off-diagonal elements capture potential correlations among the three parameters.
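Median PPV, NPV, LR+ and LR− and the median area under the curve ($AUC_M$) can be approximated from the median $\pi$, Se and Sp [16]. With $\pi=\operatorname{logit}^{-1}(\eta)$, $Se=\operatorname{logit}^{-1}(\alpha)$ and $Sp=\operatorname{logit}^{-1}(\beta)$, the first four follow from their definitions:

$$
PPV=\frac{\pi\,Se}{\pi\,Se+(1-\pi)(1-Sp)},\qquad
NPV=\frac{(1-\pi)\,Sp}{(1-\pi)\,Sp+\pi(1-Se)},\qquad
LR^{+}=\frac{Se}{1-Sp},\qquad
LR^{-}=\frac{1-Se}{Sp}.
$$

The expression for $AUC_M$ is given in [16].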
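As an illustration of how the derived indices follow from the fixed effects, the sketch below maps $(\eta,\alpha,\beta)$ to median prevalence, Se and Sp through the inverse logit and then applies the standard definitions of PPV, NPV, LR+ and LR−. The function names and input values are hypothetical, and $AUC_M$ is deliberately left out because its expression is not reproduced here.

```python
import math


def expit(x):
    """Inverse logit: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def median_indices(eta, alpha, beta):
    """Approximate median accuracy indices from the fixed effects (eta, alpha, beta)."""
    pi, se, sp = expit(eta), expit(alpha), expit(beta)
    return {
        "prevalence": pi,
        "se": se,
        "sp": sp,
        "ppv": pi * se / (pi * se + (1 - pi) * (1 - sp)),
        "npv": (1 - pi) * sp / ((1 - pi) * sp + pi * (1 - se)),
        "lr_pos": se / (1 - sp),
        "lr_neg": (1 - se) / sp,
    }


# Hypothetical fixed effects: prevalence 0.5, Se 0.9, Sp 0.8 on the probability scale.
indices = median_indices(eta=0.0, alpha=math.log(9), beta=math.log(4))
```

In a fitted model, confidence intervals for these indices would additionally require the covariance of the fixed effects estimates, which this sketch does not attempt.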
The extended TGLMM can be fitted with standard software such as the SAS NLMIXED procedure, which uses adaptive Gaussian quadrature to approximate the log-likelihood in (2) integrated over the random effects, together with dual quasi-Newton optimization. The NLMIXED procedure directly outputs the fixed effects estimates $\hat{\eta}$, $\hat{\alpha}$ and $\hat{\beta}$, and can provide median prevalence, Se, Sp, PPV, NPV, LR+ and LR− estimates and their confidence intervals through “estimate” statements. Sample SAS code is available in the Appendix: SAS code of the extended TGLMM approach: meta-analysis of coronary CT angiography studies.
Discussions
Adequate reporting of missing outcomes in study reports is essential for applying the models discussed here. As shown in the simulation studies, different missing scenarios affect the bias of the estimates differently and, more importantly, the missing mechanism indicates whether the MAR assumption holds. When the MAR assumption is violated, i.e., the probability of non-evaluation depends on unobserved index test outcomes, the direction and magnitude of bias are hard to predict. A few sensitivity analysis methods using pattern mixture models and selection models are available for this scenario [23, 24]. These approaches can be explored in further research. In addition, the number of non-evaluable results needs to be known in order to apply the proposed methods. However, a recent study shows that they are not consistently or adequately reported in published studies [25].
A reviewer has pointed out that as long as the number of non-evaluable subjects is known, disease prevalence can be estimated without bias through a univariate meta-analysis. Consequently, together with unbiased sensitivity and specificity estimates, PPV and NPV estimates are unbiased too. This approach is simpler than the proposed extended TGLMM for estimating prevalence; however, it can be less efficient because it ignores the potential correlation between prevalence, sensitivity and specificity, which may result in wider confidence intervals.
For an individual patient, different ways of treating a missing result can have different impacts. For example, if index test results are missing for the same reason that would produce a negative result (and thus are missing not at random, MNAR), then treating such patients as disease negatives can yield an unbiased estimate of prevalence for a study, and also will not affect the patients’ diagnosis. In contrast, if patients with missing index test results are treated as positives for reasons such as suspicion of a serious disease like cancer [26], it may result in over-estimation of disease prevalence and unnecessary medical cost for the patient. As another example, if the index test is repeatable and is repeated for subjects with non-evaluable results, then it is appropriate to ignore the missing results.
Conclusions
In this paper, we propose an extended TGLMM approach to handle subjects with non-evaluable index test results in meta-analysis of diagnostic tests. The extended TGLMM is compared with the intent-to-diagnose approach and the three alternative approaches presented by Schuetz et al. [6] through simulation studies and a re-evaluation of the meta-analysis of coronary CT angiography studies.
In summary, the simulation studies showed that under the MAR assumption, excluding index test non-evaluable subjects (Model 1) does not lead to biased estimates of sensitivity, specificity, LR+, LR− and AUC. Thus in practice, researchers can confidently apply Model 1 when the MAR assumption is believed to hold. However, when disease prevalence or PPV and NPV are of interest, excluding non-evaluable subjects could lead to biased estimates of these parameters. In this situation, the extended TGLMM accounting for missingness should be preferred. Even though the extended TGLMM is theoretically more complex than the widely used bivariate random effects model, it is easy to program using the SAS NLMIXED procedure. Sample SAS code with an application to the meta-analysis of coronary CT angiography studies is provided in the Appendix: SAS code of the extended TGLMM approach: meta-analysis of coronary CT angiography studies. Model 2, Model 3 and the intent-to-diagnose approach all substantially under- or over-estimate sensitivity and specificity, and therefore are not recommended when the MAR assumption is not seriously violated.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All the authors contributed substantively to the study and approved the manuscript submitted for review. XM and HC conceived the idea of the study; XM was responsible for data analysis. FS, XM and HC all contributed to drafting and revising the manuscript. All authors read and approved the final manuscript.