Background
Method
Survey
Simulation study
Simulated data
Outcome | Model | intercept | bweight (g) | sex (male) | gest. age (w) | Apgar | smoking | Variance* | ICC |
---|---|---|---|---|---|---|---|---|---|
death | 4 var. | 10.70 | -0.004 | 0.410 | -0.039 | -0.212 | | 0.48 | 0.13 |
death | 2 var. | 3.037 | -0.006 | 0.598 | | | | 0.79 | 0.20 |
O2 dep | 4 var. | 14.7 | -0.0045 | 0.954 | -0.058 | | 0.654 | 2.80 | 0.47 |
O2 dep | 2 var. | 5.315 | -0.006 | 1.001 | | | | 2.95 | 0.48 |
- Obtain the number of singletons \(n_s\) and twin pairs \(n_t\) given the size of the dataset and the percentage of clusters of size two.
- Obtain \(n_s\) and \(n_t\) sets of levels for the categorical variables using a multinomial distribution (sex only for the outcome death; sex and smoking for the outcome O2 dep).
- Given a combination of levels, the corresponding vector of means and variance-covariance matrix are used to draw the continuous covariates from a multivariate normal distribution, for one singleton child or for the two siblings in the case of twins.
- For each cluster, \(u_{0i}\) is obtained from a normal distribution \(N\left(0,\sigma_{u_{0i}}^{2}\right)\).
- Eq. 1 provides the probability of death or O2 dep.
- Given this probability, the outcome is obtained using the corresponding binomial distribution.
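The data-generating steps listed above can be sketched in Python as follows. This is a minimal illustration, not the authors' code. It assumes that Eq. 1 is a logistic model with a cluster-level random intercept, it uses the coefficients of the 2-covariate model for death from the table above, and it simplifies the multivariate normal step to a single continuous covariate (birth weight) that is correlated within twin pairs. The covariate distribution (`SEX_LEVELS`, `BW_MEAN`, `BW_SD`, `BW_TWIN_CORR`) and the function name `simulate` are hypothetical.

```python
import math
import random

# Coefficients of the 2-covariate model for death (from the table above).
BETA = {"intercept": 3.037, "bweight": -0.006, "male": 0.598}
SIGMA2_U = 0.79  # random intercept variance from the same row

# Hypothetical covariate distribution (illustrative values, NOT from the paper).
# For each cluster size: possible combinations of sex (1 = male) and their probabilities.
SEX_LEVELS = {
    1: ([(0,), (1,)], [0.5, 0.5]),
    2: ([(0, 0), (0, 1), (1, 0), (1, 1)], [0.3, 0.2, 0.2, 0.3]),
}
BW_MEAN = {0: 1150.0, 1: 1200.0}  # mean birth weight (g) by sex
BW_SD = 300.0                     # standard deviation of birth weight (g)
BW_TWIN_CORR = 0.6                # within-pair correlation of birth weight


def simulate(n_clusters, pct_twins, seed=None):
    """Simulate one dataset; pct_twins is the share of clusters of size two."""
    rng = random.Random(seed)
    # Step 1: number of twin pairs and singletons.
    n_t = round(n_clusters * pct_twins)
    n_s = n_clusters - n_t
    rows = []
    for cluster_id in range(n_clusters):
        size = 1 if cluster_id < n_s else 2
        # Step 2: draw one combination of categorical levels for the cluster.
        levels, probs = SEX_LEVELS[size]
        sexes = rng.choices(levels, weights=probs)[0]
        # Step 3: continuous covariate; for twins, a bivariate normal draw
        # built from two independent standard normals (Cholesky factor).
        z = [rng.gauss(0.0, 1.0) for _ in range(size)]
        if size == 2:
            z[1] = BW_TWIN_CORR * z[0] + math.sqrt(1.0 - BW_TWIN_CORR**2) * z[1]
        weights = [BW_MEAN[s] + BW_SD * zi for s, zi in zip(sexes, z)]
        # Step 4: random intercept shared by all children of the cluster.
        u0 = rng.gauss(0.0, math.sqrt(SIGMA2_U))
        for sex, bw in zip(sexes, weights):
            # Step 5: outcome probability (assumed form of Eq. 1).
            eta = BETA["intercept"] + BETA["bweight"] * bw + BETA["male"] * sex + u0
            p = 1.0 / (1.0 + math.exp(-eta))
            # Step 6: Bernoulli draw of the outcome.
            rows.append({"cluster": cluster_id, "male": sex, "bweight": bw,
                         "p": p, "death": int(rng.random() < p)})
    return rows
```

For example, `simulate(150, 0.10, seed=1)` returns one row per child for 135 singletons and 15 twin pairs. Extending the sketch to the 4-covariate models would require a genuine multivariate normal draw for the continuous covariates of each cluster.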
Regression models and analysis of results
The R function glmmPQL (MASS package) and the packages lme4 and geepack were used.
Results
Survey
Simulation study
Convergence
Outcome | No. of clusters | Method of estimation | Random intercept variance (2 covariates) | Perc. of non-convergence* (2 covariates) | Random intercept variance (4 covariates) | Perc. of non-convergence* (4 covariates) |
---|---|---|---|---|---|---|
death | 150 | PQL | 0.5 to 2 | 3 to 0% | 0.5 to 2 | 53 to 0% |
death | 150 | GH | 0.5 to 2 | 0 to 1% | 0.5 to 2 | 0 to 1% |
death | 150 | GEE | 0.5 to 2 | 3 to 1% | 0.5 to 2 | 3 to 1% |
death | 500 | PQL | 0.5 to 2 | ≤ 1% | 0.5 to 2 | ≤ 1% |
death | 500 | GH | 0.5 to 2 | 0% | 0.5 to 2 | 0% |
death | 500 | GEE | 0.5 to 2 | ≤ 2% | 0.5 to 2 | ≤ 1% |
O2 dep | 150 | PQL | 0.5 to 2 | 0% | 0.5 to 2 | 18 to 0% |
O2 dep | 150 | GH | 0.5 | 20 to 27% | 0.5 | 0% |
O2 dep | 150 | GH | 1 | 19 to 28% | 1 | 0% |
O2 dep | 150 | GH | 2 | 18 to 28% | 2 | 0% |
O2 dep | 150 | GEE | 0.5 to 2 | 5 to 0% | 0.5 to 2 | 6 to 0% |
O2 dep | 500 | PQL | 0.5 to 2 | 0% | 0.5 to 2 | 0% |
O2 dep | 500 | GH | 0.5 | 37 to 48% | 0.5 | ≤ 1% |
O2 dep | 500 | GH | 1 | 35 to 46% | 1 | ≤ 1% |
O2 dep | 500 | GH | 2 | 34 to 46% | 2 | ≤ 1% |
O2 dep | 500 | GEE | 0.5 to 2 | 0% | 0.5 to 2 | ≤ 1% |
Estimation of regression parameters
Percent. twins | Random int. variance | Logistic regression: empirical bias | Logistic regression: 95% conf. interv. | GLMM PQL: empirical bias | GLMM PQL: 95% conf. interv. | GLMM GH: empirical bias | GLMM GH: 95% conf. interv. | GEE: empirical bias | GEE: 95% conf. interv. |
---|---|---|---|---|---|---|---|---|---|
Outcome Death 4 variables | |||||||||
0.02 | 2.00 | 9.60 | [9.30 ; 10.00] | -4.30 | [-7.30 ; -1.30] | 7.10 | [6.60 ; 7.60] | 0.00 | [-0.60 ; 0.60] |
0.05 | 2.00 | 9.90 | [9.60 ; 10.30] | -45.90 | [-50.10 ; -41.70] | 4.60 | [4.00 ; 5.20] | 0.30 | [-0.20 ; 0.80] |
0.10 | 2.00 | 9.90 | [9.60 ; 10.30] | -44.30 | [-48.40 ; -40.20] | 0.90 | [0.20 ; 1.70] | 0.50 | [0.20 ; 0.90] |
0.20 | 2.00 | 9.80 | [9.50 ; 10.20] | -33.20 | [-36.90 ; -29.50] | -1.20 | [-1.90 ; -0.40] | 0.40 | [0.10 ; 0.80] |
0.02 | 1.00 | 5.60 | [5.30 ; 6.00] | -11.90 | [-15.30 ; -8.50] | 2.20 | [1.60 ; 2.70] | 0.10 | [-0.20 ; 0.50] |
0.05 | 1.00 | 5.60 | [5.20 ; 6.00] | -60.80 | [-65.90 ; -55.80] | -0.70 | [-1.40 ; -0.00] | 0.10 | [-0.30 ; 0.50] |
0.10 | 1.00 | 5.40 | [5.00 ; 5.70] | -61.40 | [-66.40 ; -56.30] | -4.10 | [-4.80 ; -3.30] | -0.20 | [-0.50 ; 0.20] |
0.20 | 1.00 | 5.60 | [5.20 ; 6.00] | -47.20 | [-51.90 ; -42.40] | -5.30 | [-6.10 ; -4.50] | -0.00 | [-0.40 ; 0.30] |
0.02 | 0.50 | 2.80 | [2.50 ; 3.20] | -14.30 | [-17.90 ; -10.70] | -1.80 | [-2.40 ; -1.10] | 0.00 | [-0.50 ; 0.50] |
0.05 | 0.50 | 2.70 | [2.30 ; 3.10] | -65.50 | [-71.00 ; -60.10] | -4.80 | [-5.60 ; -4.10] | -0.20 | [-0.60 ; 0.10] |
0.10 | 0.50 | 2.60 | [2.20 ; 2.90] | -64.00 | [-69.30 ; -58.80] | -6.90 | [-7.80 ; -6.10] | -0.50 | [-0.90 ; -0.10] |
0.20 | 0.50 | 2.60 | [2.20 ; 2.90] | -48.90 | [-53.80 ; -43.90] | -7.60 | [-8.50 ; -6.80] | -0.50 | [-0.90 ; -0.20] |
Outcome Death 2 variables | |||||||||
0.02 | 2.00 | 9.60 | [9.30 ; 10.00] | -4.30 | [-7.30 ; -1.30] | 7.10 | [6.60 ; 7.60] | 0.00 | [-0.60 ; 0.60] |
0.05 | 2.00 | 9.90 | [9.60 ; 10.30] | -45.90 | [-50.10 ; -41.70] | 4.60 | [4.00 ; 5.20] | 0.30 | [-0.20 ; 0.80] |
0.10 | 2.00 | 9.90 | [9.60 ; 10.30] | -44.30 | [-48.40 ; -40.20] | 0.90 | [0.20 ; 1.70] | 0.50 | [0.20 ; 0.90] |
0.20 | 2.00 | 9.80 | [9.50 ; 10.20] | -33.20 | [-36.90 ; -29.50] | -1.20 | [-1.90 ; -0.40] | 0.40 | [0.10 ; 0.80] |
0.02 | 1.00 | 5.60 | [5.30 ; 6.00] | -11.90 | [-15.30 ; -8.50] | 2.20 | [1.60 ; 2.70] | 0.10 | [-0.20 ; 0.50] |
0.05 | 1.00 | 5.60 | [5.20 ; 6.00] | -60.80 | [-65.90 ; -55.80] | -0.70 | [-1.40 ; -0.00] | 0.10 | [-0.30 ; 0.50] |
0.10 | 1.00 | 5.40 | [5.00 ; 5.70] | -61.40 | [-66.40 ; -56.30] | -4.10 | [-4.80 ; -3.30] | -0.20 | [-0.50 ; 0.20] |
0.20 | 1.00 | 5.60 | [5.20 ; 6.00] | -47.20 | [-51.90 ; -42.40] | -5.30 | [-6.10 ; -4.50] | -0.00 | [-0.40 ; 0.30] |
0.02 | 0.50 | 2.80 | [2.50 ; 3.20] | -14.30 | [-17.90 ; -10.70] | -1.80 | [-2.40 ; -1.10] | 0.00 | [-0.50 ; 0.50] |
0.05 | 0.50 | 2.70 | [2.30 ; 3.10] | -65.50 | [-71.00 ; -60.10] | -4.80 | [-5.60 ; -4.10] | -0.20 | [-0.60 ; 0.10] |
0.10 | 0.50 | 2.60 | [2.20 ; 2.90] | -64.00 | [-69.30 ; -58.80] | -6.90 | [-7.80 ; -6.10] | -0.50 | [-0.90 ; -0.10] |
0.20 | 0.50 | 2.60 | [2.20 ; 2.90] | -48.90 | [-53.80 ; -43.90] | -7.60 | [-8.50 ; -6.80] | -0.50 | [-0.90 ; -0.20] |
Outcome O2 Dep. 4 variables | |||||||||
0.02 | 2.00 | 9.60 | [9.30 ; 10.00] | -4.30 | [-7.30 ; -1.30] | 7.10 | [6.60 ; 7.60] | 0.00 | [-0.60 ; 0.60] |
0.05 | 2.00 | 9.90 | [9.60 ; 10.30] | -45.90 | [-50.10 ; -41.70] | 4.60 | [4.00 ; 5.20] | 0.30 | [-0.20 ; 0.80] |
0.10 | 2.00 | 9.90 | [9.60 ; 10.30] | -44.30 | [-48.40 ; -40.20] | 0.90 | [0.20 ; 1.70] | 0.50 | [0.20 ; 0.90] |
0.20 | 2.00 | 9.80 | [9.50 ; 10.20] | -33.20 | [-36.90 ; -29.50] | -1.20 | [-1.90 ; -0.40] | 0.40 | [0.10 ; 0.80] |
0.02 | 1.00 | 5.60 | [5.30 ; 6.00] | -11.90 | [-15.30 ; -8.50] | 2.20 | [1.60 ; 2.70] | 0.10 | [-0.20 ; 0.50] |
0.05 | 1.00 | 5.60 | [5.20 ; 6.00] | -60.80 | [-65.90 ; -55.80] | -0.70 | [-1.40 ; -0.00] | 0.10 | [-0.30 ; 0.50] |
0.10 | 1.00 | 5.40 | [5.00 ; 5.70] | -61.40 | [-66.40 ; -56.30] | -4.10 | [-4.80 ; -3.30] | -0.20 | [-0.50 ; 0.20] |
0.20 | 1.00 | 5.60 | [5.20 ; 6.00] | -47.20 | [-51.90 ; -42.40] | -5.30 | [-6.10 ; -4.50] | -0.00 | [-0.40 ; 0.30] |
0.02 | 0.50 | 2.80 | [2.50 ; 3.20] | -14.30 | [-17.90 ; -10.70] | -1.80 | [-2.40 ; -1.10] | 0.00 | [-0.50 ; 0.50] |
0.05 | 0.50 | 2.70 | [2.30 ; 3.10] | -65.50 | [-71.00 ; -60.10] | -4.80 | [-5.60 ; -4.10] | -0.20 | [-0.60 ; 0.10] |
0.10 | 0.50 | 2.60 | [2.20 ; 2.90] | -64.00 | [-69.30 ; -58.80] | -6.90 | [-7.80 ; -6.10] | -0.50 | [-0.90 ; -0.10] |
0.20 | 0.50 | 2.60 | [2.20 ; 2.90] | -48.90 | [-53.80 ; -43.90] | -7.60 | [-8.50 ; -6.80] | -0.50 | [-0.90 ; -0.20] |
Outcome O2 Dep. 2 variables | |||||||||
0.02 | 2.00 | 9.60 | [9.30 ; 10.00] | -4.30 | [-7.30 ; -1.30] | 7.10 | [6.60 ; 7.60] | 0.00 | [-0.60 ; 0.60] |
0.05 | 2.00 | 9.90 | [9.60 ; 10.30] | -45.90 | [-50.10 ; -41.70] | 4.60 | [4.00 ; 5.20] | 0.30 | [-0.20 ; 0.80] |
0.10 | 2.00 | 9.90 | [9.60 ; 10.30] | -44.30 | [-48.40 ; -40.20] | 0.90 | [0.20 ; 1.70] | 0.50 | [0.20 ; 0.90] |
0.20 | 2.00 | 9.80 | [9.50 ; 10.20] | -33.20 | [-36.90 ; -29.50] | -1.20 | [-1.90 ; -0.40] | 0.40 | [0.10 ; 0.80] |
0.02 | 1.00 | 5.60 | [5.30 ; 6.00] | -11.90 | [-15.30 ; -8.50] | 2.20 | [1.60 ; 2.70] | 0.10 | [-0.20 ; 0.50] |
0.05 | 1.00 | 5.60 | [5.20 ; 6.00] | -60.80 | [-65.90 ; -55.80] | -0.70 | [-1.40 ; -0.00] | 0.10 | [-0.30 ; 0.50] |
0.10 | 1.00 | 5.40 | [5.00 ; 5.70] | -61.40 | [-66.40 ; -56.30] | -4.10 | [-4.80 ; -3.30] | -0.20 | [-0.50 ; 0.20] |
0.20 | 1.00 | 5.60 | [5.20 ; 6.00] | -47.20 | [-51.90 ; -42.40] | -5.30 | [-6.10 ; -4.50] | -0.00 | [-0.40 ; 0.30] |
0.02 | 0.50 | 2.80 | [2.50 ; 3.20] | -14.30 | [-17.90 ; -10.70] | -1.80 | [-2.40 ; -1.10] | 0.00 | [-0.50 ; 0.50] |
0.05 | 0.50 | 2.70 | [2.30 ; 3.10] | -65.50 | [-71.00 ; -60.10] | -4.80 | [-5.60 ; -4.10] | -0.20 | [-0.60 ; 0.10] |
0.10 | 0.50 | 2.60 | [2.20 ; 2.90] | -64.00 | [-69.30 ; -58.80] | -6.90 | [-7.80 ; -6.10] | -0.50 | [-0.90 ; -0.10] |
0.20 | 0.50 | 2.60 | [2.20 ; 2.90] | -48.90 | [-53.80 ; -43.90] | -7.60 | [-8.50 ; -6.80] | -0.50 | [-0.90 ; -0.20] |
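As an illustration of how an empirical bias and its 95% confidence interval can be derived from simulation replicates, the sketch below takes the bias as the mean of the estimates minus the true value and builds a normal-approximation interval from the Monte Carlo standard error. Whether the table reports an absolute or a relative (percentage) bias is not stated in this section, so the absolute form used here is an assumption, and the function name `empirical_bias` is hypothetical.

```python
import math
import statistics


def empirical_bias(estimates, true_value, z=1.96):
    """Return (bias, (lower, upper)) over simulation replicates.

    bias  = mean of the estimates minus the true parameter value
    limits = bias -/+ z * (standard deviation of the estimates / sqrt(R))
    """
    n_rep = len(estimates)
    bias = statistics.fmean(estimates) - true_value
    mc_se = statistics.stdev(estimates) / math.sqrt(n_rep)  # Monte Carlo standard error
    return bias, (bias - z * mc_se, bias + z * mc_se)


# Toy example with four replicates and a true coefficient of 1.0.
bias, (lower, upper) = empirical_bias([0.9, 1.1, 1.0, 1.2], true_value=1.0)
```

An interval that excludes zero, as for most PQL rows in the table, indicates a bias that is unlikely to be due to simulation noise alone.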
Coverage of the 95% confidence interval
Perc. twins | Death 4 covar., var. 0.5 | Death 4 covar., var. 1 | Death 4 covar., var. 2 | Death 2 covar., var. 0.5 | Death 2 covar., var. 1 | Death 2 covar., var. 2 | O2 dep. 4 covar., var. 0.5 | O2 dep. 4 covar., var. 1 | O2 dep. 4 covar., var. 2 | O2 dep. 2 covar., var. 0.5 | O2 dep. 2 covar., var. 1 | O2 dep. 2 covar., var. 2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Logistic regression | ||||||||||||
2% | 0.99 | 1.00 | 1.00 | 0.93 | 0.92 | 0.91 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 |
5% | 0.99 | 1.00 | 1.00 | 0.95 | 0.94 | 0.94 | 1.00 | 1.00 | 1.00 | 0.99 | 0.99 | 0.99 |
10% | 0.99 | 1.00 | 1.00 | 0.98 | 0.98 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
20% | 0.99 | 1.00 | 1.00 | 0.99 | 0.98 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Mixed logistic regression, PQL | ||||||||||||
2% | 0.67 | 0.66 | 0.68 | 0.79 | 0.79 | 0.82 | 0.69 | 0.71 | 0.76 | 0.95 | 0.96 | 0.97 |
5% | 0.59 | 0.58 | 0.58 | 0.91 | 0.91 | 0.92 | 0.76 | 0.79 | 0.85 | 1.00 | 1.00 | 0.99 |
10% | 0.57 | 0.56 | 0.58 | 0.96 | 0.96 | 0.97 | 0.87 | 0.89 | 0.93 | 1.00 | 1.00 | 1.00 |
20% | 0.56 | 0.55 | 0.60 | 0.99 | 0.99 | 0.99 | 0.95 | 0.96 | 0.97 | 1.00 | 1.00 | 1.00 |
Mixed logistic regression, Gauss-Hermite | ||||||||||||
2% | 0.99 | 1.00 | 1.00 | 0.94 | 0.96 | 0.97 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 |
5% | 0.98 | 0.99 | 0.99 | 0.94 | 0.96 | 0.98 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 |
10% | 0.98 | 0.98 | 0.99 | 0.94 | 0.97 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 |
20% | 0.97 | 0.97 | 0.98 | 0.95 | 0.98 | 0.99 | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 |
GEE | ||||||||||||
2% | 0.98 | 0.98 | 0.98 | 0.50 | 0.56 | 0.60 | 0.99 | 0.99 | 0.99 | 0.59 | 0.67 | 0.77 |
5% | 0.98 | 0.98 | 0.98 | 0.54 | 0.63 | 0.68 | 1.00 | 1.00 | 1.00 | 0.66 | 0.76 | 0.86 |
10% | 0.98 | 0.98 | 0.98 | 0.58 | 0.71 | 0.84 | 1.00 | 1.00 | 1.00 | 0.71 | 0.83 | 0.93 |
20% | 0.98 | 0.98 | 0.98 | 0.65 | 0.80 | 0.92 | 1.00 | 1.00 | 1.00 | 0.78 | 0.90 | 0.97 |
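Coverage in a table of this kind is usually computed as the proportion of simulation replicates whose 95% confidence interval contains the true parameter value. A minimal sketch, assuming Wald-type intervals (estimate ± 1.96 × standard error); the function name `coverage` and the toy numbers are hypothetical:

```python
def coverage(estimates, std_errors, true_value, z=1.96):
    """Proportion of replicates whose Wald interval contains the true value."""
    hits = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return hits / len(estimates)


# Toy example: only the first of three intervals contains the true value 1.0.
cov = coverage([1.0, 1.5, 0.2], [0.1, 0.2, 0.3], true_value=1.0)
```

Values well below the nominal 0.95 point to biased estimates, standard errors that are too small, or both.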
Mean squared error (MSE)
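For reference, the usual simulation-based definition of the MSE and its decomposition into squared bias and variance is given below. The notation is assumed here, not taken from the paper: \(\hat{\beta}_r\) is the estimate in replicate \(r\), \(R\) is the number of replicates, \(\bar{\hat{\beta}}\) is the mean of the estimates and \(\beta\) is the true value.

\[
\mathrm{MSE}(\hat{\beta}) = \frac{1}{R}\sum_{r=1}^{R}\left(\hat{\beta}_r-\beta\right)^{2}
= \left(\bar{\hat{\beta}}-\beta\right)^{2} + \frac{1}{R}\sum_{r=1}^{R}\left(\hat{\beta}_r-\bar{\hat{\beta}}\right)^{2}
\]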
Estimation of the random variance
Outcome | Perc. twins | 4 covar., var. 0.5 | 4 covar., var. 1 | 4 covar., var. 2 | 2 covar., var. 0.5 | 2 covar., var. 1 | 2 covar., var. 2 |
---|---|---|---|---|---|---|---|
Death | 2% | 0.60 (2.54) | 0.45 (2.05) | 0.32 (1.39) | 0.69 (2.85) | 0.60 (0.74) | 0.36 (1.92) |
Death | 5% | 0.96 (3.21) | 0.84 (2.78) | 0.79 (2.46) | 1.01 (3.45) | 0.73 (1.27) | 0.76 (2.90) |
Death | 10% | 1.25 (3.73) | 1.33 (3.75) | 1.39 (3.47) | 1.24 (3.85) | 1.24 (3.65) | 1.20 (3.27) |
Death | 20% | 1.37 (3.55) | 1.58 (3.58) | 1.86 (3.69) | 1.19 (3.33) | 1.50 (3.53) | 1.84 (3.67) |
O2 dep | 2% | 0.08 (0.23) | 0.08 (0.10) | 0.10 (0.10) | 0.06 (0.08) | 0.07 (0.08) | 0.08 (0.09) |
O2 dep | 5% | 0.15 (0.44) | 0.16 (0.31) | 0.21 (0.31) | 0.10 (0.16) | 0.13 (0.19) | 0.17 (0.20) |
O2 dep | 10% | 0.19 (0.43) | 0.26 (0.46) | 0.36 (0.46) | 0.14 (0.24) | 0.21 (0.35) | 0.30 (0.34) |
O2 dep | 20% | 0.23 (0.34) | 0.37 (0.45) | 0.59 (0.54) | 0.20 (0.25) | 0.32 (0.32) | 0.53 (0.43) |
Discussion
The analysis of the dataset of preterm infants
Reliability of estimates and coverage of 95% confidence interval
Strengths, limitations and further work
R. The bias of the GEE estimates was obtained using an approximate formula rather than the GEE estimates from the real data; doing so may have given slightly different results for the bias estimates. The results are based on a single dataset with two different outcomes and on models with a limited number of covariates. This was due to the restricted availability of statistically significant covariates in the dataset. However, there was no evidence of negative effects on the reliability of any of the models when increasing the number of covariates from two to four. The effect of fitting a less parsimonious model remains unknown. There are indications that the probability of the outcome may affect the accuracy of the estimates for the logistic regression and logistic random intercept methods. This was not explicitly tested in this work, and further research is needed to assess it.
Conclusion
- Overall, the GEE method may be a reliable choice, but it provides population-average effects;
- If the percentage of twins is large (above 10%), the random intercept model with the Gauss-Hermite method of estimation will be more reliable;
- If the logistic random intercept model does not converge even with a large percentage of twins, one could modify the starting values of the estimating algorithm, use logistic regression (which will provide underestimated effects with small standard errors), or use GEE.