Category 1. Methods to derive SD/SE/variance algebraically
Walter and Yao [18] present a readily applicable improvement to a method based on the minimum and maximum observed values of the outcome. This “range” method, whereby the difference between minimum and maximum values is divided by 4 to estimate the SD, was originally presented by Mendenhall and colleagues [19] in the survey sampling context. In this update, a lookup table of conversion factors from range to SD, based on the distributional results of Tippett [20] for the range, is presented for a variety of sample sizes. They illustrate the method in two example studies of interventions to improve adherence to randomised treatment in rheumatoid arthritis and human immunodeficiency virus. They caution that non-Normality of outcomes, whether through skewness or kurtosis, may invalidate their tabulated conversion factors, but note that the very presence of skewness might be the cause of the minimum and maximum being reported instead of the SD. They observe that although in theory certain interior order statistics may perform better on skewed outcomes than their proposed method, in practice such statistics would never be available in trial reports; they conclude that their method offers an acceptable compromise when the original data cannot be obtained from the trial publication authors.
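As an illustrative sketch (not Walter and Yao's published lookup table), the conversion factor, namely the expected range of n standard Normal observations following Tippett's distributional results, can be computed by numerical integration; the helper names below are our own:

```python
import math

def expected_range(n, lo=-8.0, hi=8.0, steps=4000):
    # Expected range of n iid standard Normal draws, via the identity
    # E(W_n) = integral of [1 - Phi(x)^n - (1 - Phi(x))^n] dx
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        p = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # Phi(x)
        f = 1.0 - p ** n - (1.0 - p) ** n
        total += f if 0 < i < steps else 0.5 * f        # trapezoid rule
    return total * h

def sd_from_range(minimum, maximum, n):
    # Range method: divide the observed range by the expected
    # standardised range for sample size n (cf. Tippett's tables)
    return (maximum - minimum) / expected_range(n)
```

For n = 10 the factor is about 3.08, so dividing the range by 4 (the Mendenhall rule) would understate the SD at that sample size.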
Hozo and colleagues [6] present a formula (see Table 1) for estimating the variance where values for the minimum, median, maximum and sample size are available. In simulations of outcomes from a range of parametric distributions, they find that their approximation performs best on Normally distributed data when the sample size is very small, but methods based on dividing the range by 4 and 6 are superior for sample sizes from 16 to 70 and over 70 respectively. Similar patterns are observed on skewed outcomes simulated from log-Normal, beta, exponential and Weibull distributions. In simulated meta-analyses, they conclude that their variance estimation formula may miss the true value by a margin of between 10% and 20%.
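A minimal sketch of this piecewise approach, assuming the Hozo et al. small-sample variance formula as commonly cited (with a = minimum, m = median, b = maximum) and the sample size thresholds from their simulations:

```python
import math

def hozo_sd(a, m, b, n):
    """Approximate the SD from minimum a, median m, maximum b, sample size n."""
    if n <= 15:
        # Hozo et al. variance approximation, best for very small samples
        return math.sqrt(((a - 2 * m + b) ** 2 / 4 + (b - a) ** 2) / 12)
    if n <= 70:
        return (b - a) / 4   # range/4 rule
    return (b - a) / 6       # range/6 rule
```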
Bland [21] presents a formula for the variance (Table 1) which makes use of the lower quartile and upper quartile in addition to the minimum, median and maximum. In simulations, Bland demonstrates that his formula overestimates the standard deviation at larger sample sizes where the underlying distribution is Normal; the issue is exacerbated for skewed outcomes. Nevertheless, in both situations the formula provides a less biased estimate than that of Hozo et al. [6]. The overestimation is attributed to the greater chance of extreme outliers occurring in large sample sizes, thus inflating the estimation of the variance through the minimum and maximum values. He considers that the method will still be useful, since studies in meta-analyses with a small sample size are most likely to be the ones with unreported SD values and source data that cannot be obtained from the trial report authors.
Continuing the theme of estimation based on summary statistics, Wan et al. [22], due to concerns over the restrictive non-negative data assumption and the arbitrary sample size thresholds guiding choice of formula in the method of Hozo and colleagues [6], propose an improvement using the same summary statistics:
$$ SD\approx \frac{b-a}{2{\Phi}^{-1}\left(\frac{n-0.375}{n+0.25}\right)} $$
and an enhancement to the approach of Bland [21] that additionally takes account of sample size with the aim of reducing overestimation at larger sample sizes:
$$ SD\approx \frac{b-a}{4{\Phi}^{-1}\left(\frac{n-0.375}{n+0.25}\right)}+\frac{q3-q1}{4{\Phi}^{-1}\left(\frac{0.75n-0.125}{n+0.25}\right)} $$
Finally, using only the lower quartile, upper quartile and sample size, they propose the following estimate:
$$ SD\approx \frac{q3-q1}{2{\Phi}^{-1}\left(\frac{0.75n-0.125}{n+0.25}\right)} $$
and note its similarity to the Cochrane Handbook [4] estimator
$$ SD\approx \frac{q3-q1}{1.35} $$
Through simulations, they demonstrate superior estimation properties, for both Normal and skewed data, of their respective extensions to the methods of Hozo and Bland. They also illustrate that a valid estimate of the SD may also be made when the minimum and maximum are unavailable but the upper and lower quartiles are reported.
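The three Wan et al. estimators translate directly into code using the inverse Normal cumulative distribution function from the Python standard library; a sketch with a = minimum, b = maximum, q1 and q3 the quartiles, and n the sample size:

```python
from statistics import NormalDist

def wan_sd_range(a, b, n):
    # Range-based estimator (minimum, maximum, sample size)
    return (b - a) / (2 * NormalDist().inv_cdf((n - 0.375) / (n + 0.25)))

def wan_sd_iqr(q1, q3, n):
    # Quartile-based estimator (no minimum/maximum required)
    return (q3 - q1) / (2 * NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25)))

def wan_sd_combined(a, q1, q3, b, n):
    # Combination of range and interquartile range terms
    return ((b - a) / (4 * NormalDist().inv_cdf((n - 0.375) / (n + 0.25)))
            + (q3 - q1) / (4 * NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))))
```

For large n, `wan_sd_iqr` approaches the Cochrane Handbook estimator (q3 − q1)/1.35, since the inverse CDF argument tends to 0.75.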
Kwon and Reis [17] apply simulation-based approximate Bayesian computation (ABC) in estimating missing SD values based on other summary statistics available in the trial report. The likelihood function for Bayesian inference is unlikely to be evaluable, due to the unavailability of all data points from the source trials in a meta-analysis. They therefore propose ABC, which replaces the likelihood by using a distance measure, in this example the Euclidean distance, to compare summary statistics between the observed and simulated data. The prior distribution for the outcome must be specified: they propose that the underlying probability distribution (for example, Normal or log-Normal) may be determined based on background knowledge of the outcome, and a uniform prior should be placed on each parameter of this distribution, informed by the available summary statistics. Many sets of candidate parameter values are then generated from this prior, and from these, many pseudo-data sets. Each pseudo-data set is then compared to the observed summary statistics and accepted if these are sufficiently close (for example, in the top 0.1% smallest Euclidean distances). The accepted set of parameter values is then used to estimate the parameter of interest, in this case the SD. In simulations of outcomes from Normal and skewed distributions, they find the ABC method performs consistently better for skewed distributions than the formulae of Hozo, Bland and Wan. ABC does not perform as well when the sample size is less than about 40, and the method of Wan et al. [22] is superior for Normally distributed outcomes.
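A minimal sketch of the ABC scheme described above, assuming a Normal outcome distribution with uniform priors on its mean and SD, and using (minimum, median, maximum) as the compared summary statistics; all names and default settings are illustrative, not those of Kwon and Reis:

```python
import random
import statistics

def abc_estimate_sd(obs_stats, n, mu_range, sd_range,
                    n_sims=5000, accept_frac=0.002, seed=1):
    # obs_stats: observed (minimum, median, maximum) from the trial report
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_sims):
        mu = rng.uniform(*mu_range)   # uniform prior on the mean
        sd = rng.uniform(*sd_range)   # uniform prior on the SD
        sample = [rng.gauss(mu, sd) for _ in range(n)]
        sim_stats = (min(sample), statistics.median(sample), max(sample))
        # Euclidean distance between observed and simulated summaries
        dist = sum((o - s) ** 2 for o, s in zip(obs_stats, sim_stats)) ** 0.5
        candidates.append((dist, sd))
    candidates.sort()                 # keep the closest pseudo-data sets
    keep = max(1, int(n_sims * accept_frac))
    return statistics.mean(sd for _, sd in candidates[:keep])
```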
Abrams et al. [9] accommodate differences in methodology among included trials (for example, where the outcome is reported at a given follow-up point rather than as a change from baseline). This can lead to missing information on, for example, the SD of mean change from baseline. Their proposed solution (in contrast to single imputation methods or omission of studies not reporting change from baseline summary statistics) is to adopt a fully Bayesian approach in which external information is used to build a prior distribution for the within-patient correlation between baseline and follow-up measures, thus enabling appropriate estimation of the SD of the change from baseline where only the baseline and follow-up SD values have been reported. A Uniform(0,1) vague prior for the correlation ρ is used. Where external evidence is available, this prior is replaced by performing a Bayesian meta-analysis of the Fisher transformations \( S_j \) of the observed \( \rho_j \) from external studies \( j = 1,\dots,J \); a vague Normal prior with mean δ is placed on the \( S_j \) and the back-transformation \( \rho = \frac{e^{2\delta}-1}{e^{2\delta}+1} \) is used. They conclude that such an approach gives a substantial improvement to meta-analysis estimation of the pooled mean difference compared to applying a fixed value for ρ. They note the conclusions are sensitive to the choice of prior distribution, particularly when a limited number of studies is included in the meta-analysis, and that study-level covariates may be included to incorporate greater flexibility in the prior for ρ.
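The Fisher transformation and the back-transformation quoted above are simple to compute (the back-transformation is the hyperbolic tangent); a sketch:

```python
import math

def fisher_z(rho):
    # Fisher transformation S = 0.5 * ln((1 + rho) / (1 - rho))
    return 0.5 * math.log((1 + rho) / (1 - rho))

def inv_fisher_z(delta):
    # Back-transformation rho = (e^(2*delta) - 1) / (e^(2*delta) + 1) = tanh(delta)
    return (math.exp(2 * delta) - 1) / (math.exp(2 * delta) + 1)
```

The back-transformation guarantees that any real-valued posterior draw of δ maps to a valid correlation in (−1, 1).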
Sung et al. [10] incorporate continuous outcomes with missing variances in a Bayesian meta-analysis by estimating the distribution of reported variances and applying multiple imputation of missing variances, assuming that these arose from this “parent” log-Normal distribution. Specifically, the missing variance is assumed to be distributed as the true variance multiplied by a χ² distribution divided by its degrees of freedom. The degrees of freedom equal n − 1, where n is the sample size of the trial with unreported variance. This is a special case of a gamma distribution, with shape (n − 1)/2 and rate (n − 1)/(2 × true variance), equivalently scale 2 × true variance/(n − 1). Sung et al. contend that their approach offers advantages, in comparison to discarding information from studies with unreported variances, and consider it more straightforward to implement in a Bayesian framework using the WinBUGS software than it would be to employ a frequentist equivalent.
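A single imputation draw under this distributional assumption can be sketched as follows, where `parent_var` stands for a value of the true variance taken from the estimated parent distribution (the gamma scale parameter is 2 × variance/(n − 1)):

```python
import random

def impute_variance(parent_var, n, rng=random):
    # Missing variance drawn as sigma^2 * chi^2_(n-1) / (n - 1),
    # i.e. Gamma(shape = (n-1)/2, scale = 2*sigma^2/(n-1))
    shape = (n - 1) / 2
    scale = 2 * parent_var / (n - 1)
    return rng.gammavariate(shape, scale)
```

Averaged over many draws, the imputed values centre on `parent_var`, with spread shrinking as the trial sample size n grows.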
In a systematic review of valsartan in the treatment of hypertension, Nixon et al. [11] impute missing SD values within a Bayesian random effects meta-regression. The missing data imputation model assumes a trivariate Normal distribution for the log-transformed baseline SD, follow-up SD and change from baseline SD. The following relationship between the SD measures is exploited: \( S_{di}^2 = S_{1i}^2 + S_{2i}^2 - 2\rho_{12} S_{1i} S_{2i} \), where \( S_{di} \), \( S_{1i} \) and \( S_{2i} \) are the change from baseline SD, baseline SD and follow-up SD respectively, and \( \rho_{12} \) is the within-patient correlation between baseline and follow-up. The variance of the observed SDs is weighted by the inverse of the sample size. The meta-regression allows adjustment for study-level characteristics, such as mean baseline value, which may influence the treatment effect. As with other imputation approaches identified in this review, the missing at random assumption applies.
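The exploited identity yields the change from baseline SD directly once a value of the within-patient correlation is assumed; a sketch:

```python
import math

def change_from_baseline_sd(sd_base, sd_follow, rho):
    # S_d = sqrt(S_1^2 + S_2^2 - 2 * rho * S_1 * S_2)
    return math.sqrt(sd_base ** 2 + sd_follow ** 2
                     - 2 * rho * sd_base * sd_follow)
```

With equal baseline and follow-up SDs and ρ12 = 0.5, the change SD equals the common SD; with ρ12 = 1 and equal SDs it is zero.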
Dakin et al. [12], in the context of a mixed treatment comparison meta-analysis, perform Bayesian modelling of the SDs contained in trial reports. They estimate the gamma distribution that these follow and sample values from that distribution to impute the SD for studies in which it is missing, while still enabling the uncertainty around these imputed values to be taken into account in the meta-analysis. The unreported SD values are assumed to be missing completely at random.
Within the setting of a hierarchical Bayesian meta-analysis of the biogeographical relationship between coral reef loss and populations of fish that rely on coral, MacNeil and Graham [13] impute missing standard deviations from their posterior predictive distribution based on the observed SD data. Again, any missing SDs are assumed missing completely at random and the uncertainty in imputed values is retained in the subsequent hierarchical meta-analysis.
In the framework of a network meta-analysis, Stevens [14] generates the posterior predictive distribution of missing variances via Markov chain Monte Carlo using WinBUGS [24], assuming a gamma distribution for the observed variances. The log-transformed SD values are given a weak uniform prior. Using an example data set where the true study and treatment group specific SDs are known, he illustrates that the assumption of a common standard deviation (missing completely at random) may not be tenable and that violation of this leads to problems in pooled treatment effect estimation. He further highlights the importance of examining the role of study-specific covariates in predicting the observed SDs. Stevens et al. [15] implement the same technique in a network meta-analysis of treatments for intermittent claudication.
Boucher [16] imputes missing variances using a non-linear mixed effects Emax model of SDs over time in the specific scenario where longitudinal measurements of a pain outcome are available but not all SDs are reported. The SD for study i, treatment group j, time point k is modelled as
$$ {SD}_{ijk}={E}_0+\frac{\left({E}_{\mathit{\max}0}\ast \left(1-{I}_j\right)+{E}_{\mathit{\max}1}\ast {I}_j\right)\ast {t}_{ijk}}{et_{50}+{t}_{ijk}}+{\eta}_i+{\xi}_{ijk} $$
where \( E_0 \) is the estimated baseline SD, \( E_{\max 0} \) and \( E_{\max 1} \) are the maximum difference over baseline for treatment groups 0 and 1 respectively, \( I_j \) is an indicator variable for treatment group, \( et_{50} \) is the time post first dose when 50% of the maximal difference over baseline is reached, \( t_{ijk} \) is the time post first dose, \( \eta_i \sim \mathrm{N}\left(0,\sigma_{bsd}^2\right) \) is the between-study variability, and \( \xi_{ijk} \sim \mathrm{N}\left(0,\sigma_{sd}^2/n_{ijk}\right) \) is the residual. Maximum likelihood and Bayesian approaches to estimation are investigated. Weak priors are used so that only the observed SDs inform the missing data imputation. A joint model encompassing missing SD imputation and the final meta-analysis ensures that uncertainty in the imputed values is carried forward to the meta-analysis. He concludes that a Bayesian modelling approach holds advantages (in terms of appropriate propagation of uncertainty) over maximum likelihood techniques. Either approach would require unreported SDs to be missing completely at random.
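A sketch of the fixed effects part of this model (omitting the random effects η and ξ), assuming the usual Emax form in which the difference over baseline reaches half its maximum at et50:

```python
def emax_sd(t, group, e0, emax0, emax1, et50):
    # Fixed-effect SD prediction: baseline plus an Emax curve in time;
    # group is 0 or 1, selecting the treatment-specific maximum effect
    emax = emax1 if group == 1 else emax0
    return e0 + emax * t / (et50 + t)
```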
Chowdhry et al. [25] impute missing variances for meta-analyses of parallel group and cross-over trials using a gamma meta-regression generalised linear mixed model, additionally taking study covariates into account when modelling the variance. The study random effect reflects the reasonable assumption that the between-study variation in variance cannot entirely be explained by the available covariates. They perform inference on the mean treatment difference using multiple imputation. The method depends on the missing at random (MAR) assumption regarding unreported variances. They propose sensitivity analyses via a pattern mixture model if variances are missing not at random (MNAR). The approach may benefit from a large number of trials being included in the meta-analysis: their motivating example covers 84 parallel group trials. Their extensive simulation studies demonstrate the superior performance of the method with regard to Type I error and coverage, in comparison to single imputation approaches such as that of Ma et al. [23] or the complete case approach (found in 9% of meta-analyses by Wiebe et al. [5]), in which trials with missing variances are omitted from the meta-analysis. They conclude that the advantages are smaller than expected, primarily because the missing variances influence only the weighting applied in the meta-analysis.
Category 1. Methods to derive mean algebraically
Hozo and colleagues [6] present methods for deriving a missing mean value where data are available on the median, minimum, maximum and sample size:
$$ \overline{\mathrm{x}}\approx \frac{a+2m+b}{4}+\frac{a-2m+b}{4n} $$
They note that where n is large, the right-hand term in the equation becomes negligibly small and may be omitted. In simulations they confirm that for Normally-distributed data, the formula closely estimates the true mean (within 4% across all scenarios studied), although for larger sample sizes the median was a more accurate estimator. For skewed data, the counter-intuitive result is that the median is a better estimator of the mean for larger samples (about 25 or more), despite the above formula incorporating additional information on the minimum, maximum and sample size.
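The formula above is trivial to apply; a sketch (a = minimum, m = median, b = maximum):

```python
def hozo_mean(a, m, b, n):
    # mean ~ (a + 2m + b)/4 + (a - 2m + b)/(4n);
    # the second term vanishes as n grows large
    return (a + 2 * m + b) / 4 + (a - 2 * m + b) / (4 * n)
```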
Bland [21] takes account of the extended scenario where information on the lower (q1) and upper (q3) quartiles is also available:
$$ \overline{\mathrm{x}}\approx \frac{\left(n+3\right)a+2\left(n-1\right)\left(q1+m+q3\right)+\left(n+3\right)b}{8n} $$
Bland additionally notes that such quantile-based methods readily apply even if log-transformation of an outcome seems appropriate: the log-transformed quantiles may be used in the above formula to estimate the mean of the log-transformed outcome. This is of particular interest in the case of skewed data, where unreported mean values are more likely. Simulation studies based on Normally-distributed data show that the mean estimation formula of Bland shows minimal bias. For skewed data, the mean estimation approach shows somewhat less than half the bias found in the method of Hozo and colleagues [6]. Bland observes that in meta-analysis, the interest will often be in the difference between treatment group means and any bias in mean estimation would therefore cancel out as it would be present in both groups.
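A sketch of Bland's estimator, with the quantile sum entering linearly (a = minimum, b = maximum, m = median, q1/q3 the quartiles):

```python
def bland_mean(a, q1, m, q3, b, n):
    # mean ~ ((n+3)(a+b) + 2(n-1)(q1 + m + q3)) / (8n);
    # for large n the extremes get weight 1/8 each and the
    # quartiles and median weight 1/4 each
    return ((n + 3) * (a + b) + 2 * (n - 1) * (q1 + m + q3)) / (8 * n)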
Wan et al. [22] provide a further method which applies in situations where the lower (q1) and upper (q3) quartiles are available but the minimum and maximum are not:
$$ \overline{\mathrm{x}}\approx \frac{q1+m+q3}{3} $$
They also provide simplified versions of the equations of Hozo et al. [6] and Bland [21], as well as an Excel spreadsheet to aid practical implementation, which contains all of these formulae alongside their own. Simulations show the Wan et al. [22] formula estimates the mean unbiasedly in the case of Normally-distributed data; for skewed data following the log-Normal, Beta, Weibull or Exponential distributions, at larger sample sizes (greater than n = 100) it provides a smaller relative error in the mean estimation than the approach of Bland, even though it does not include the minimum and maximum summary statistics.
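The quartile-based mean formula above, as a one-line sketch:

```python
def wan_mean(q1, m, q3):
    # mean ~ (q1 + m + q3)/3, using only the quartiles and the median
    return (q1 + m + q3) / 3
```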
Kwon and Reis [17] also apply the simulation-based approximate Bayesian computation (ABC) approach described in the missing variance/SD/SE section to estimate missing mean values. In simulation studies they find that for a sample size above 40, ABC performs consistently better than the methods of Hozo et al. [6], Bland [21] and Wan et al. [22] across all scenarios; its benefit is greatest where the underlying continuous outcome distribution is skewed or heavy-tailed. The average relative error of ABC estimation of the mean is almost zero for sample sizes over 100. The ABC approach does, however, require the underlying probability distribution (for example, log-Normal) to be specified in advance; the performance of the method under model misspecification was not investigated.
Method evaluation using individual participant data
We implemented illustrative meta-analyses, selecting the subset of methods identified in the systematic reviews that we considered most readily applicable by systematic reviewers without the requirement for specialist software or programming skills. For handling missing SD/SE/variance summaries we selected the single imputation approach of Ma and colleagues [23], the look-up table method of Walter and Yao [18] and the Cochrane Handbook [4] formula (which takes a very similar form to that of Wan et al. [22]). For dealing with missing mean values, we applied the formula of Hozo and colleagues [6] as an important reference method; in addition, we implemented the algebraic recalculations presented by Bland [21] and Wan et al. [22].
Table 2 shows the results of the illustrative analyses for missing SD/SE/variance, for the selected methods versus the comparators of (1) complete data set analysis and (2) omitting trials with missing data. There was little difference across methods in terms of bias in the estimated mean difference, which was at most 0.23 days in magnitude and varied little across meta-analysis scenarios. In contrast, the imprecision in the estimate of the mean treatment effect increased substantially with the proportion of trials with missing SDs (for example, the confidence interval width increased by a factor of 2.19 when 15 of 30 trials had a missing SD and those trials were omitted from the meta-analysis). The method of Walter and Yao gave greatest protection against this increased imprecision, performing better than all alternative methods in every scenario. Its confidence interval width ratio was at most 1.17, in the scenario where 15 of 30 trials had a missing SD.
Table 2 GALA results: missing SD. Cells give the bias and the confidence interval width ratio (relative to the complete data analysis) for each method of handling missing SDs; the header row for each scenario gives the complete-data mean difference (95% CI).

| Scenario | Complete data: MD (95% CI) | Ma et al. [23]: bias | Width ratio | Walter & Yao [18]: bias | Width ratio | Cochrane [4]: bias | Width ratio | Omit trials: bias | Width ratio |
|---|---|---|---|---|---|---|---|---|---|
| 5 trials | −0.01 (−0.87, 0.85) | | | | | | | | |
| 2 missing SD | | 0.21 | 1.18 | 0.05 | 1.05 | −0.01 | 0.73 | 0.27 | 1.28 |
| 10 trials | −0.01 (−0.37, 0.35) | | | | | | | | |
| 2 missing SD | | 0.04 | 1.04 | −0.01 | 1.01 | −0.02 | 1.97 | 0.02 | 1.06 |
| 5 missing SD | | −0.03 | 1.56 | 0.01 | 1.10 | −0.12 | 2.40 | 0.00 | 1.64 |
| 20 trials | 0.00 (−0.31, 0.30) | | | | | | | | |
| 5 missing SD | | 0.07 | 1.26 | 0.02 | 1.02 | 0.06 | 2.74 | 0.02 | 1.25 |
| 10 missing SD | | 0.26 | 2.20 | 0.02 | 1.07 | 0.17 | 3.93 | 0.05 | 1.41 |
| 30 trials | −0.01 (−0.28, 0.25) | | | | | | | | |
| 5 missing SD | | 0.06 | 1.11 | 0.03 | 1.06 | −0.12 | 2.23 | 0.09 | 1.15 |
| 10 missing SD | | −0.09 | 1.49 | −0.01 | 1.13 | −0.21 | 2.45 | −0.02 | 1.62 |
| 15 missing SD | | −0.03 | 1.87 | 0.02 | 1.17 | −0.23 | 2.28 | 0.12 | 2.19 |
Table 3 gives the findings for the missing mean illustrative meta-analyses in a similar format. In general, bias was low, with the exception of the Hozo method, which showed notable bias in meta-analyses containing 20 trials. The Wan formula exhibited minimal imprecision across all scenarios, outperforming all other methods. The exception was for 5-trial meta-analyses with missing means for two trials, where the Hozo and Bland methods also demonstrated negligible imprecision; however, in this case the Wan approach showed lower bias in the estimated treatment effect.
Table 3 GALA results: missing mean. Cells give the bias and the confidence interval width ratio (relative to the complete data analysis) for each method of handling missing means; the header row for each scenario gives the complete-data mean difference (95% CI).

| Scenario | Complete data: MD (95% CI) | Hozo et al. [6]: bias | Width ratio | Bland [21]: bias | Width ratio | Wan et al. [22]: bias | Width ratio | Omit trials: bias | Width ratio |
|---|---|---|---|---|---|---|---|---|---|
| 5 trials | −0.01 (−0.87, 0.85) | | | | | | | | |
| 2 missing means | | −0.20 | 1.00 | −0.12 | 1.00 | 0.05 | 1.00 | 0.27 | 1.28 |
| 10 trials | −0.01 (−0.37, 0.35) | | | | | | | | |
| 2 missing means | | 0.64 | 3.47 | 0.15 | 2.15 | −0.03 | 1.00 | 0.02 | 1.06 |
| 5 missing means | | 0.12 | 4.35 | −0.06 | 1.58 | 0.04 | 1.00 | 0.00 | 1.64 |
| 20 trials | 0.00 (−0.31, 0.30) | | | | | | | | |
| 5 missing means | | 1.16 | 3.67 | 0.38 | 2.30 | −0.03 | 1.00 | 0.02 | 1.25 |
| 10 missing means | | 1.02 | 4.34 | 0.34 | 2.66 | 0.01 | 1.00 | 0.05 | 1.41 |
| 30 trials | −0.01 (−0.28, 0.25) | | | | | | | | |
| 5 missing means | | 0.02 | 1.43 | 0.01 | 1.15 | 0.01 | 1.00 | 0.09 | 1.15 |
| 10 missing means | | 0.01 | 2.92 | 0.06 | 1.89 | 0.04 | 1.00 | −0.02 | 1.62 |
| 15 missing means | | −0.19 | 3.26 | −0.05 | 2.13 | 0.03 | 1.00 | 0.12 | 2.19 |