A series of simulations of cluster cmRCTs is performed using Base SAS 9.4 Software. To make the simulations more realistic, they are based on an example of a cohort of patients at high risk of developing cardiovascular disease (CVD) and eligible for lipid lowering drugs according to the relevant criteria in the principal UK guidelines [22]. A novel intervention is tested against treatment as usual with a primary outcome of the time until a CVD event. This outcome is of direct importance to patients and can be identified from routinely collected data. Three patient characteristics are simulated: the probability of refusing the intervention treatment, the risk of having a CVD event, and the time to death or censoring. Different scenarios are created by changing the average refusal probability of the population and by changing the correlation between individuals’ risk of having an event and their probability of refusing treatment. The probability of a clinician refusing to offer the treatment to each patient is also simulated, and correlated to varying extents with patient risk. Once the patient characteristics have been generated, trial data are simulated through the same process as a cmRCT: treatment randomisation, refusal of treatment, application of the intervention to those who accept, and generation of times until an event. Weibull distributions are used to generate survival times. Each of the analysis methods explained in the Analysis methods section is then applied to the simulated trial data to estimate the intervention effect. The exact simulation process is detailed in the Simulation procedure section.
Analysis methods
Four different methods for the analysis of a cluster cmRCT are tested: ITT, per protocol (PP) and two IV methods. ITT, the recommended method of analysis in pragmatic trials [5, 23, 24], analyses the groups based on the random treatment allocation. PP defines the treatment groups on the basis of the actual treatment received, with only those who follow the allocated treatment included in the analysis. The two IV methods tested are two stage predictor substitution (2SPS) and two stage residual inclusion (2SRI), as outlined practically by Terza et al. [25]. Both are two stage modelling techniques and start by fitting a first stage model with treatment allocation as the explanatory variable and treatment received as the dependent variable (here treatment allocation acts as the IV). This model is then used to calculate the predicted values for treatment received and the residuals. In 2SPS, a second stage model is fitted to the outcome data using the predicted values for treatment received as the explanatory variable. In 2SRI, the second stage model is fitted to the outcome data using both the residuals and the actual treatment received as explanatory variables. The standard errors of parameter estimates in two stage modelling procedures are too small, so non-parametric bootstrapping [26] should be used to calculate them. IV estimators were chosen to estimate the causal effect because IV methods are believed to perform well in RCTs with non-compliance, with assumptions more easily argued to hold [27]. There is a wealth of literature on the theoretical properties of causal effect estimates and IVs [18–21, 28] which is not repeated in this paper. Instead, the performance of the four analysis methods in a variety of scenarios is evaluated with respect to bias, standard error and statistical power. We define bias as the error in the estimation of the treatment effect as defined in section 1 (the effect of accepting treatment).
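The two stage procedures described above can be sketched as follows. This is an illustrative Python sketch with made-up variable names, using a linear second stage in place of the paper's Cox model; refusal here is generated independently of the outcome, so the sketch shows only the estimation mechanics, not the correction for confounded non-compliance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (illustrative, not the paper's SAS code): Z = random allocation
# (the instrument), X = treatment received, Y = a continuous outcome.
n = 20000
true_beta = -0.32
Z = rng.integers(0, 2, n)
refuse = rng.random(n) < 0.2            # 20 % of allocated patients refuse
X = (Z * ~refuse).astype(float)         # control patients cannot receive it
Y = true_beta * X + rng.normal(0.0, 1.0, n)

def ols(columns, y):
    """Least-squares coefficients [intercept, slopes...] for the given columns."""
    A = np.column_stack([np.ones(len(y))] + list(columns))
    return np.linalg.lstsq(A, y, rcond=None)[0]

# First stage: treatment received regressed on allocation (the IV)
g0, g1 = ols([Z], X)
x_hat = g0 + g1 * Z                     # predicted treatment received
resid = X - x_hat                       # first stage residuals

beta_2sps = ols([x_hat], Y)[1]          # 2SPS: predictions replace X
beta_2sri = ols([X, resid], Y)[1]       # 2SRI: X plus the residuals
```

In the linear case both estimators recover the same effect; the distinction matters in non-linear second stage models such as the Cox models used here, which is why both are compared in the simulations.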
Simulation procedure
Table 1 contains details of all variables used in the simulations. The cluster size chosen is J = 620, to match the average number of eligible patients per UK practice. This is calculated using published figures on GP practice size from the Health and Social Care Information Centre (HSCIC) [29] and statistics on the prevalence of CVD from the National Institute for Health and Care Excellence (NICE) [30]. The cluster size J is constant, as it has been shown that variable cluster size has no effect on the results in terms of bias [31, 32]. The variances of the individual and cluster level random effects, \( \sigma_{\varepsilon}^2 \) and \( \sigma_u^2 \), and the shape and scale of the Weibull distribution for time to CVD event, \( \gamma_c \) and \( \lambda_c \), are chosen to match the mean 10-year CVD risk to published figures of 21.1 % (standard deviation 8.6 %) [22]. The mortality (censoring distribution) shape and scale, \( \gamma_m \) and \( \lambda_m \), and the variances \( \sigma_{\varepsilon}^2 \) and \( \sigma_u^2 \) give censoring of 5 % of all events and a correlation of 0.25 between \( T_{ik}^c \) and \( T_{ik}^m \), to represent informative censoring.
Table 1
Description of all variables used in simulation
Number of patients in cohort, control arm and intervention arm | \( N, N_{con}, N_{int} \) |
Number of clusters in trial | \( K \) |
Size of each cluster | \( J = 620 \) |
Treatment allocated to kth cluster | \( Z_k = 0/1 \) for control/intervention |
Treatment received by ith individual from kth cluster | \( X_{ik} = 0/1 \) for control/intervention |
Time until CVD event for ith individual from the kth cluster | \( T_{ik}^c \sim \mathrm{Weibull}\left(\gamma_c,\ \lambda_c e^{-\left(\beta X_{ik}+\varepsilon_{ik}+U_k\right)/\gamma_c}\right) \) |
Time until mortality (censoring distribution) for the ith individual from the kth cluster | \( T_{ik}^m \sim \mathrm{Weibull}\left(\gamma_m,\ \lambda_m e^{-\left(\varepsilon_{ik}+U_k\right)/\gamma_m}\right) \) |
Common baseline hazard function for time until CVD event | \( h^c(t) = \gamma_c t^{\gamma_c-1}/\lambda_c^{\gamma_c},\quad \gamma_c = 1.2,\ \lambda_c = 36 \) |
Common baseline hazard function for time until mortality | \( h^m(t) = \gamma_m t^{\gamma_m-1}/\lambda_m^{\gamma_m},\quad \gamma_m = 1.2,\ \lambda_m = 55 \) |
Individual hazard function for time until CVD event | \( h_{ik}^c(t) = h^c(t)\, e^{\left(\varepsilon_{ik}+U_k+\beta X_{ik}\right)} \) |
Individual hazard function for time until mortality | \( h_{ik}^m(t) = h^m(t)\, e^{\left(\varepsilon_{ik}+U_k\right)} \) |
Individual level random effects | \( \varepsilon_{ik} \sim N(0, \sigma_{\varepsilon}^2) \) |
Cluster level random effects | \( U_k \sim N(0, \sigma_u^2) \) |
Intervention effect | \( \beta = -0.32 \) |
Ten year risk of a CVD event | \( r_{ik} = P(T_{ik}^c < 10 \mid X_{ik} = 0, \varepsilon_{ik}, U_k) \) |
Individual and average probability of patient refusing treatment | \( p_{ik},\ p = \sum_{i,k} p_{ik}/N \) |
Individual and average probability of clinician refusing to offer treatment | \( q_{ik},\ q = \sum_{i,k} q_{ik}/N \) |
Correlation between patient refusal probability and patient risk | \( \rho_p \) |
Correlation between clinician refusal probability and patient risk | \( \rho_q \) |
Censoring indicator | \( C_{ik} = I\left(T_{ik}^c \ge \min\left(T_{ik}^m, T_{max}\right)\right) \) |
Trial follow up time | \( T_{max} = 3 \) |
Random variable observed for each patient | \( Y_{ik} = \min\left(T_{ik}^c, T_{ik}^m, T_{max}\right) \) |
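The survival model in Table 1 can be sketched directly. The following is a Python illustration (not the paper's SAS code) that generates event and censoring times with shared frailties, using the variance pair for the ICC = 0.025 scenario quoted later in the text and setting everyone to standard care (\( X_{ik} = 0 \)):

```python
import numpy as np

rng = np.random.default_rng(42)

# Parameters from Table 1; the variance pair is the ICC = 0.025 scenario.
gamma_c, lam_c = 1.2, 36.0   # CVD event: shape, scale
gamma_m, lam_m = 1.2, 55.0   # mortality (censoring): shape, scale
var_eps, var_u = 0.6, 0.2

n_clusters, J = 50, 620
n = n_clusters * J
U = np.repeat(rng.normal(0.0, np.sqrt(var_u), n_clusters), J)  # cluster frailty
eps = rng.normal(0.0, np.sqrt(var_eps), n)                     # individual frailty
eta = eps + U                                                  # X_ik = 0 throughout

# T ~ Weibull(shape, scale) with scale = lam * exp(-eta/shape); numpy's
# weibull(a) draws with unit scale, so multiply by the scale.
T_c = lam_c * np.exp(-eta / gamma_c) * rng.weibull(gamma_c, n)
T_m = lam_m * np.exp(-eta / gamma_m) * rng.weibull(gamma_m, n)

risk_10yr = (T_c < 10).mean()        # should sit near the published 21.1 %
corr = np.corrcoef(T_c, T_m)[0, 1]   # shared frailties induce dependence
```

Because \( \varepsilon_{ik} + U_k \) enters both hazards, event and censoring times are positively correlated, which is how the informative censoring described above arises.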
For each scenario detailed previously, the following procedure was implemented:
-
For j = 1, 2, …, 1000:
1)
Generate the random effects \( \varepsilon_{ik} \) and \( U_k \) for each patient and cluster, i = 1, 2, …, J, k = 1, 2, …, K.
2)
For each patient, calculate the unique 10 year risk (under the counterfactual scenario of receiving standard care) of a CVD event, \( r_{ik} \).
3)
Assign patient and clinician refusal probabilities \( p_{ik} \) and \( q_{ik} \):
3a)
Order patients by their risk, \( r_{ik} \).
3b)
Assign refusal probabilities sequentially in a linear fashion between the lower limit (LL) and upper limit (UL) such that \( \sum p_{ik}/N = p \) and \( \sum q_{ik}/N = q \).
4)
Randomise treatment allocation \( Z_k \) to control or intervention on a 4:1 basis, with \( Z_k = 0/1 \) if assigned to control/intervention.
5)
Generate the treatment received, where \( X_{ik} = 0/1 \) if control/intervention is received. If \( Z_k = 0 \) then \( X_{ik} = 0 \); if \( Z_k = 1 \) then \( X_{ik} = \min\left\{\mathrm{Bernoulli}\left(1-p_{ik}\right), \mathrm{Bernoulli}\left(1-q_{ik}\right)\right\} \).
6)
Apply the intervention effect β and random effects to the hazard function, \( h_{ik}^c(t) = h^c(t)\, e^{\left(\beta X_{ik}+\varepsilon_{ik}+U_k\right)} \).
7)
Generate survival times \( T_{ik}^c \) and \( T_{ik}^m \) from the survival distributions corresponding to the respective hazard functions.
8)
Generate the censoring indicator \( C_{ik} \). The total observed trial data are then \( \left\{Y_{ik}, C_{ik}, Z_k, X_{ik}\right\} \), a set of censored survival data, treatment allocations and treatments received.
9)
Fit a Cox proportional hazards model to the data with respect to each of the four analysis methods ITT, PP, 2SRI and 2SPS, to produce an estimate \( {\widehat{\beta}}_j \) of the intervention effect β, which is the log of the hazard ratio, and record the p-value, \( p_j \).
When j = 1000, calculate the mean \( \overline{\beta} = {\displaystyle {\sum}_j {\widehat{\beta}}_j/1000} \), the percentage bias \( \left(\overline{\beta}-\beta\right)/\beta \) and the statistical power \( {\displaystyle {\sum}_j I\left(p_j < 0.05\right)/1000} \). Also calculate a parametrically bootstrapped standard error of the individual estimate \( s.e.\left(\widehat{\beta}\right) = s.d.\left({\widehat{\beta}}_j\right) \), the standard error of the mean \( s.e.\left(\overline{\beta}\right) = s.d.\left({\widehat{\beta}}_j\right)/\sqrt{1000} \), and a confidence interval for the percentage bias \( \mathrm{CI} = \left[100\left(\left(\overline{\beta}-1.96\, s.e.\left(\overline{\beta}\right)\right)-\beta\right)/\beta,\ 100\left(\left(\overline{\beta}+1.96\, s.e.\left(\overline{\beta}\right)\right)-\beta\right)/\beta\right] \).
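Steps 3a and 3b can be sketched as follows. This is a hypothetical Python illustration (the function and its `spread` argument are illustrative names, not from the paper's code) of assigning refusal probabilities linearly over the risk ordering:

```python
import numpy as np

rng = np.random.default_rng(7)

def assign_refusal(risk, p, spread):
    """Steps 3a-3b: order patients by risk r_ik and assign refusal
    probabilities linearly between LL = p*(1 - spread) and UL = p*(1 + spread),
    so the average is exactly p and the correlation with risk is controlled
    by the spread. spread in {0, 1/3, 2/3, 1} reproduces the (LL, UL) pairs
    given in the text."""
    order = np.argsort(risk)
    probs = np.empty(len(risk))
    probs[order] = np.linspace(p * (1 - spread), p * (1 + spread), len(risk))
    return probs

# Illustrative risks, roughly matching the published mean and sd
risk = rng.normal(0.21, 0.086, 5000)
p_ik = assign_refusal(risk, p=0.2, spread=2 / 3)   # the "medium" scenario
mean_p = p_ik.mean()
rho_p = np.corrcoef(risk, p_ik)[0, 1]
```

Because the assigned probabilities increase monotonically with risk, widening (LL, UL) raises the correlation between refusal probability and risk while the mean stays fixed at p.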
Different scenarios are created by varying the following variables. The intra-cluster correlation coefficient (ICC) takes values 0.025 and 0.05, simulated by \( \left(\sigma_{\varepsilon}^2, \sigma_u^2\right) = (0.6, 0.2) \) and (0.57, 0.27) respectively. The average patient and clinician refusal probabilities p and q take values 0.1, 0.2 and 0.3. The correlation between patient refusal probability and risk, \( \rho_p \), takes values zero, low, medium and high, simulated by setting the lower and upper limits for individual refusal probabilities to \( \left(LL,UL\right)\in \left\{\left(p,p\right),\left(\frac{2p}{3},\frac{4p}{3}\right),\left(\frac{p}{3},\frac{5p}{3}\right),\left(0,2p\right)\right\} \). The correlation between clinician refusal and risk, \( \rho_q \), takes the same set of values. This structure gives control over the correlation between individual risk and refusal probabilities. The treatment effect is fixed at β = −0.32, which equates on average to a 25 % reduction in 10 year risk of CVD. 1000 independent sets of trial data are generated for each scenario.
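The correspondence between β = −0.32 and the 25 % risk reduction can be checked directly from the baseline hazard in Table 1. The following short calculation (random effects set to zero, i.e. a typical patient) uses only numbers given in the text:

```python
import math

# Numbers from the text: baseline Weibull (shape 1.2, scale 36) and
# beta = -0.32, with the random effects set to zero for a typical patient.
gamma_c, lam_c, beta = 1.2, 36.0, -0.32

def risk_10yr(linpred):
    # S(t) = exp(-(t/lam_c)^gamma_c * e^linpred), so risk = 1 - S(10)
    return 1.0 - math.exp(-((10 / lam_c) ** gamma_c) * math.exp(linpred))

control = risk_10yr(0.0)                    # about 0.19
treated = risk_10yr(beta)
reduction = (control - treated) / control   # roughly 0.25
```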
Sample sizes are calculated at a fixed ratio of 4:1 control to study intervention, with a type 1 error of 0.05 and required power of 0.8. Sample sizes are calculated through simulation [33], as sample size formulas for informatively censored clustered survival data are not common. Trial characteristics (effect size, refusal rate, baseline risk) are assumed to be known. Trial data are simulated using the above process and analysed using ITT. For each combination of refusal rates, the smallest N (that is a multiple of J = 620) such that the proportion of p-values < 0.05 is at least 80 % is chosen as the required sample size in that scenario. There are then two recruitment methods which alter the required sample size. Recruitment method 1 calculates the sample size assuming no refusal. Recruitment method 2 factors the refusal rate into the sample size calculation (assuming refusal to be non-informative and independent of individual risks). All simulation scenarios are run using both recruitment methods. The realised power varies from 0.8 because we use the smallest number of clusters achieving power of at least 0.8; in recruitment method 2 this changes depending on the refusal rate.
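The simulation-based sample size search can be sketched as below. In the paper, power at each candidate N is estimated by simulating full trials and analysing them with ITT; here a cheap analytic stand-in power function (a normal-approximation z-test, not the paper's method) is substituted so the search loop itself is runnable:

```python
import math

def required_sample_size(estimate_power, J=620, target=0.8, max_clusters=200):
    """Return the smallest N that is a multiple of the cluster size J
    with estimated power >= target."""
    for m in range(1, max_clusters + 1):
        N = m * J
        if estimate_power(N) >= target:
            return N
    return None

def stand_in_power(n, effect=0.1, z_alpha=1.959964):
    # Illustrative stand-in (not the paper's method): normal-approximation
    # power for detecting a mean shift of `effect` sd with n observations.
    z = effect * math.sqrt(n) - z_alpha
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

N_req = required_sample_size(stand_in_power)   # a multiple of 620
```

Rounding N up to a whole number of clusters is what makes the realised power exceed 0.8, as noted above.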
The outcome of interest is the time until a CVD event, so Cox proportional hazards models are fitted to produce estimates of the intervention effect. To account for the clustering of the data, three types of Cox proportional hazards model are fitted: marginal, lognormal frailty, and gamma frailty models [34, 35]. The lognormal model is correctly specified because the generated random effects (frailties) are normally distributed (Table 1), whereas the gamma frailty model is misspecified. The output from the robust marginal model has a different interpretation to the frailty models, in that the hazard ratio returned is between any two randomly selected patients from the population, as opposed to between any two patients randomly selected from the same cluster [35]. Clustering is not taken into account in the first stage of the IV model, as the inclusion of residuals in the second stage model (2SRI) is expected to account for variation in refusal rates between clusters.