In this section, we focus on six methods that estimate
τ. We describe each statistical model using the same set of population mean, variance, and covariance parameters defined in the
Methods section, for homogeneous and heterogeneous scenarios separately. For each method, we present the closed-form expressions of the point estimator of the treatment effect and its variance. It often goes unnoticed in practice that different statistical methods have different types of variances (i.e., conditional vs. unconditional variances) associated with their treatment effect estimators. For example, the OLS model-based variances for ANCOVA are conditional because OLS assumes the baseline weight is fixed. Generally speaking, the baseline weight is random because we rarely enroll participants into randomized trials based on predetermined values of the baseline weight. Thus, the unconditional variance and the corresponding unconditional inference are of greater interest because we want the findings derived from the current sample to be generalizable to the population of interest. We will discuss in detail whether the OLS model-based conditional inference (i.e., test statistics and
p-values from standard statistical software) for ANCOVA is still valid for unconditional hypothesis testing, and the potential fixes that we can use to draw valid unconditional inference if the usual OLS model-based inference is biased.
When the study population is homogeneous
Method 1: ANOVA modeling the post-treatment measure (“ANOVA-Post”). We model the post-treatment body weight
\( {Y}_{ij{t}_1} \) using the binary treatment indicator
Gij (1 if in the treatment arm; 0 if in the control arm) as follows:
$$ {Y}_{ij{t}_1}={\beta}_0^{(1)}+{\beta}_1^{(1)}{G}_{ij}+{e}_{ij}^{(1)},i=1,2,\dots, {n}_j;j=0,1; $$
(1)
$$ {e}_{ij}^{(1)}\sim N\left(0,{\sigma}_1^2\right), $$
where \( {\beta}_0^{(1)}={\mu}_{0{t}_1} \), \( {\beta}_1^{(1)}={\mu}_{1{t}_1}-{\mu}_{0{t}_1}=\tau \), and \( {e}_{ij}^{(1)} \) is independently and identically distributed (i.i.d.) random error. \( {\beta}_1^{(1)} \) represents the treatment effect. Model (1) is homoscedastic with a constant residual variance \( {\sigma}_1^2 \).
We can fit an ordinary least squares (OLS) regression to estimate the coefficients and standard errors of model (1). The closed-form expressions of the OLS estimator
\( {\hat{\beta}}_{1, ols}^{(1)} \) and its “unconditional” variance, denoted by
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \), are presented in Table
1.
\( {\hat{\beta}}_{1, ols}^{(1)} \) is estimated by the sample group mean difference in the post-treatment weight between two arms.
\( {\hat{\beta}}_{1, ols}^{(1)} \) is unbiased for
τ. The OLS model-based variance of
\( {\hat{\beta}}_{1, ols}^{(1)} \) assuming known
\( {\sigma}_1^2 \) is:
$$ {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right)=\frac{\sigma_1^2}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({G}_{ij}-{G}_{..}\right)}^2}, $$
where
\( {G}_{..}=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{G}_{ij}}{n_0+{n}_1}=\frac{n_1}{n_0+{n}_1} \).
\( {\sigma}_1^2 \) is estimated by
$$ {\hat{\sigma}}_1^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}^{(1)}\right)}^2}{\left({n}_0+{n}_1-2\right)}, $$
where
\( {\hat{y}}_{ij{t}_1}^{(1)}={\hat{\beta}}_{0, ols}^{(1)}+{\hat{\beta}}_{1, ols}^{(1)}{G}_{ij} \) is the predicted value from model (1). We let
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \) denote the OLS model-based variance estimator with
\( {\hat{\sigma}}_1^2 \) substituted for
\( {\sigma}_1^2 \), which is output by standard statistical software (Table
1). Since
\( {\sum}_{j=0}^1{\sum}_{i=1}^{n_j}{\left({G}_{ij}-{G}_{..}\right)}^2=\frac{n_0{n}_1}{n_0+{n}_1} \), it follows that
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right)=\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \). It is well established that
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \) is unbiased for
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \). Thus,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(1)}\right) \). The usual OLS model-based inference (i.e., test statistics
\( t=\frac{{\hat{\beta}}_{1, ols}^{(1)}}{\sqrt{{\hat{va\mathrm{r}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right)\ }} \) and the associated
p-value) is valid for testing
Ho :
τ = 0 unconditionally.
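As a quick numerical check, the closed-form estimator in Table 1 coincides with the OLS coefficient of Gij from model (1), and the two algebraic forms of its variance agree. The sketch below uses simulated data; the sample sizes, means, and σ1 are illustrative assumptions, not values from any trial:

```python
import numpy as np

rng = np.random.default_rng(0)
n0, n1 = 40, 60           # illustrative arm sizes
sigma1 = 2.0              # assumed common post-treatment SD
y_ctrl = rng.normal(80.0, sigma1, n0)   # control-arm post-treatment weights
y_trt = rng.normal(78.0, sigma1, n1)    # treatment-arm post-treatment weights

# closed-form estimator: sample group mean difference
beta1_closed = y_trt.mean() - y_ctrl.mean()

# the same estimate from an OLS fit of model (1): Y = b0 + b1*G + e
G = np.concatenate([np.zeros(n0), np.ones(n1)])
X = np.column_stack([np.ones(n0 + n1), G])
y = np.concatenate([y_ctrl, y_trt])
beta1_ols = np.linalg.lstsq(X, y, rcond=None)[0][1]

# the model-based and unconditional variance expressions are identical, since
# sum (G - G_bar)^2 = n0*n1/(n0+n1)
var_model = sigma1**2 / (n0 * n1 / (n0 + n1))
var_uncond = sigma1**2 / n0 + sigma1**2 / n1
```

The exact agreement of `var_model` and `var_uncond` is the algebraic identity behind the claim that the usual OLS inference for ANOVA-Post is valid unconditionally.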
Table 1
Estimators of treatment effect and variance estimators in a homogeneous study population
ANOVA-Post | \( {\hat{\beta}}_{1, ols}^{(1)}={\overline{y}}_{.1{t}_1}-{\overline{y}}_{.0{t}_1} \) | U | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(1)}\right)=\frac{\sigma_1^2}{n_0}+\frac{\sigma_1^2}{n_1} \) | \( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(1)}\right)=\frac{{\hat{\sigma}}_1^2}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({G}_{ij}-{G}_{..}\right)}^2} \) \( {\hat{\sigma}}_1^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}\right)}^2}{\left({n}_0+{n}_1-2\right)} \) |
ANCOVA-Post I | \( {\hat{\beta}}_{1, ols}^{(2)}=\left({\overline{y}}_{.1{t}_1}-{\overline{y}}_{.0{t}_1}\right)-{\hat{\beta}}_{2, ols}^{(2)}\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right) \) | C | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}+\frac{{\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}^2}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_0}-{\overline{y}}_{.j{t}_0}\right)}^2}\right){\sigma}_{\epsilon^{(2)}}^2 \), \( {\sigma}_{\epsilon^{(2)}}^2=\left(1-{\rho}^2\right){\sigma}_1^2 \) | \( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}+\frac{{\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}^2}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_0}-{\overline{y}}_{.j{t}_0}\right)}^2}\right){\hat{\sigma}}_{e_{ij}^{(2)}}^2 \), \( {\hat{\sigma}}_{e_{ij}^{(2)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}\right)}^2}{\left({n}_0+{n}_1-3\right)} \) |
U | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}\right)\left(1-{\rho}^2\right){\sigma}_1^2 \) | |
RM | \( {\hat{\gamma}}_{3,\kern0.5em gls}^{(3)}=\left({\overline{y}}_{.1{t}_1}-{\overline{y}}_{.1{t}_0}\right)-\left({\overline{y}}_{.0{t}_1}-{\overline{y}}_{.0{t}_0}\right) \) | U | \( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(3)}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}\right)\left({\sigma}_1^2+{\sigma}_0^2-2\rho {\sigma}_0{\sigma}_1\right) \) | |
cRM | \( {\hat{\gamma}}_{3,\kern0.5em gls}^{(4)}=\left({\overline{y}}_{.1{t}_1}-{\overline{y}}_{.0{t}_1}\right)-\frac{\rho {\sigma}_0{\sigma}_1}{\sigma_0^2}\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right) \) | U | \( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3, gls}^{(4)}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}\right)\left(1-{\rho}^2\right){\sigma}_1^2 \) | |
ANOVA-Change | \( {\hat{\beta}}_{1, ols}^{(5)}=\left({\overline{y}}_{.1{t}_1}-{\overline{y}}_{.1{t}_0}\right)-\left({\overline{y}}_{.0{t}_1}-{\overline{y}}_{.0{t}_0}\right) \) | U | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(5)}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}\right)\left({\sigma}_1^2+{\sigma}_0^2-2\rho {\sigma}_0{\sigma}_1\right) \) | \( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(5)}\right)=\frac{{\hat{\sigma}}_{\epsilon^{(5)}}^2}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({G}_{ij}-{G}_{..}\right)}^2}, \) \( {\hat{\sigma}}_{\epsilon^{(5)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({\Delta }_{ij}-{\hat{\Delta }}_{ij}^{(5)}\right)}^2}{\left({n}_0+{n}_1-2\right)} \) |
Method 2: ANCOVA modeling the post-treatment measure (“ANCOVA I”): We model the post-treatment weight
\( {Y}_{ij{t}_1} \) using the binary treatment indicator
Gij and the baseline weight
\( {Y}_{ij{t}_0} \).
$$ {Y}_{ij{t}_1}={\beta}_0^{(2)}+{\beta}_1^{(2)}{G}_{ij}+{\beta}_2^{(2)}{Y}_{ij{t}_0}+{e}_{ij}^{(2)},i=1,2,\dots, {n}_j;j=0,1; $$
(2)
$$ {e}_{ij}^{(2)}\sim N\left(0,{\sigma}_{\epsilon^{(2)}}^2\right)\ \mathrm{and}\ {\sigma}_{\epsilon^{(2)}}^2=\left(1-{\rho}^2\right){\sigma}_1^2, $$
where \( {\beta}_0^{(2)}={\mu}_{0{t}_1}-\rho \frac{\sigma_1}{\sigma_0}{\mu}_{t_0} \), \( {\beta}_1^{(2)}=\tau \), \( {\beta}_2^{(2)}=\rho \frac{\sigma_1}{\sigma_0} \), and \( {e}_{ij}^{(2)} \) is i.i.d. random error. \( {\beta}_1^{(2)} \) measures the treatment effect τ and \( {\beta}_2^{(2)} \) represents the slope of the pre-post association between \( {Y}_{ij{t}_1} \) and \( {Y}_{ij{t}_0} \). Model (2) has a common residual variance \( {\sigma}_{\epsilon^{(2)}}^2 \) and implicitly assumes that the two arms share the common baseline mean \( {\mu}_{t_0} \).
The coefficients and standard errors of model (2) are also estimated using an OLS regression. The OLS estimator
\( {\hat{\beta}}_{1, ols}^{(2)} \) is derived as the sample mean difference in the post-treatment weight adjusting for the sample mean difference in the baseline weight between two arms. The group mean difference in the baseline weight can be seen as chance imbalance in a randomized trial.
\( {\hat{\beta}}_{1, ols}^{(2)} \) is unbiased for
τ both conditional on
\( {Y}_{ij{t}_0} \) and unconditionally. The formulas of
\( {\hat{\beta}}_{1, ols}^{(2)} \) and its “unconditional” variance
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right) \) are listed in Table
1. However, OLS assumes that the baseline weight
\( {Y}_{ij{t}_0} \) is fixed. OLS targets the conditional variance of
\( {\hat{\beta}}_{1, ols}^{(2)} \), denoted by
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \), instead of
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right) \). The formula of
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \) with a known common residual variance
\( {\sigma}_{\epsilon^{(2)}}^2 \) is presented in Table
1. Since
\( {\sigma}_{\epsilon^{(2)}}^2 \) is generally unknown, it is estimated by the following sample residual variance:
$$ {\hat{\sigma}}_{e_{ij}^{(2)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}^{(2)}\right)}^2}{\left({n}_0+{n}_1-3\right)}, $$
where
\( {\hat{y}}_{ij{t}_1}^{(2)}={\hat{\beta}}_{0, ols}^{(2)}+{\hat{\beta}}_{1, ols}^{(2)}{G}_{ij}+{\hat{\beta}}_{2, ols}^{(2)}{Y}_{ij{t}_0} \), the predicted value from model (2). We let
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \) denote the OLS model-based variance estimator with
\( {\hat{\sigma}}_{\epsilon^{(2)}}^2 \) substituted for
\( {\sigma}_{\epsilon^{(2)}}^2 \). Note that
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \) is reported by standard statistical software (e.g., “proc reg” in SAS). Its formula is presented in Table
1.
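The closed-form expression in Table 1 can be verified against a direct OLS fit of model (2). The sketch below uses simulated bivariate normal data; all parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n0, n1 = 50, 50                         # illustrative arm sizes
rho, s0, s1 = 0.7, 3.0, 3.0             # assumed correlation and SDs
cov = np.array([[s0**2, rho * s0 * s1],
                [rho * s0 * s1, s1**2]])
x0, y0 = rng.multivariate_normal([80.0, 80.0], cov, n0).T   # control (baseline, post)
x1, y1 = rng.multivariate_normal([80.0, 78.0], cov, n1).T   # treatment (baseline, post)

# OLS fit of model (2): Y_t1 = b0 + b1*G + b2*Y_t0 + e
G = np.concatenate([np.zeros(n0), np.ones(n1)])
x = np.concatenate([x0, x1])
y = np.concatenate([y0, y1])
X = np.column_stack([np.ones(n0 + n1), G, x])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Table 1 closed form: post-treatment mean difference adjusted for the
# chance imbalance in baseline means
b1_closed = (y1.mean() - y0.mean()) - b2 * (x1.mean() - x0.mean())
```

The equality of `b1` and `b1_closed` is exact: it follows from the OLS normal equations for the intercept and the treatment indicator.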
Since we want to generalize our conclusions to a general population and
\( {Y}_{ij{t}_0} \) can take different values from those collected in the current sample, we may wonder whether significance tests based on the model-based conditional variance assuming
\( {Y}_{ij{t}_0} \) is fixed (e.g.,
\( t=\frac{{\hat{\beta}}_{1, ols}^{(2)}}{\sqrt{{\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)\kern0.5em }} \)) is comparable to unconditional inference (e.g.,
\( t=\frac{{\hat{\beta}}_{1, ols}^{(2)}}{\sqrt{\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right)}} \)), in which
\( {Y}_{ij{t}_0} \) is treated as a random variable, for testing
Ho :
τ = 0. To establish this equivalence, we need to show: i)
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \); ii)
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right) \). The first part is well established in a homoscedastic linear model. The second part holds because we can show that
\( \kern0.50em \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right) \) =E(
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right) \)) using the law of total variance formula and the fact that
\( {\hat{\beta}}_{1, ols}^{(2)} \) is unbiased for
τ. That is, the unconditional variance of
\( {\hat{\beta}}_{1, ols}^{(2)} \) is the average of its conditional variance over the distribution of the baseline weight. Therefore, the usual model-based standard errors and associated
p-values are valid for unconditional inference [
3,
5,
17].
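The law-of-total-variance argument can be written out in one line. Since \( E\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)=\tau \) is a constant, the second term below vanishes:

$$ \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}\right)=E\left[\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)\right]+\mathit{\operatorname{var}}\left[E\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)\right]=E\left[\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)\right]+\mathit{\operatorname{var}}\left(\tau \right)=E\left[\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(2)}|{Y}_{ij{t}_0}\right)\right]. $$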
Method 3: Repeated measures model (“RM”): RM models the baseline and post-treatment weights (
\( {Y}_{ij{t}_0} \),
\( {Y}_{ij{t}_1} \)) jointly using the binary treatment indicator
Gij, the binary time factor
Tij, the time by treatment interaction
Gij ×
Tij as follows:
$$ {Y}_{ij t}={\gamma}_0^{(3)}+{\gamma}_1^{(3)}{G}_{ij}+{\gamma}_2^{(3)}{T}_{ij}+{\gamma}_3^{(3)}{G}_{ij}\times {T}_{ij}+{e}_{ij t}^{(3)},i=1,2,\dots, {n}_j;j=0,1;t={t}_0,{t}_1, $$
(3)
$$ \left(\begin{array}{c}{e}_{ij{t}_0}^{(3)}\\ {}{e}_{ij{t}_1}^{(3)}\end{array}\right)\sim N\left(\left[\begin{array}{c}0\\ {}0\end{array}\right],\sum \right). $$
When t0 = 0 and t1 = 1, \( {\gamma}_0^{(3)}={\mu}_{0{t}_0} \), \( {\gamma}_1^{(3)}={\mu}_{1{t}_0}-{\mu}_{0{t}_0} \), \( {\gamma}_2^{(3)}={\mu}_{0{t}_1}-{\mu}_{0{t}_0} \), and \( {\gamma}_3^{(3)}=\left({\mu}_{1{t}_1}-{\mu}_{1{t}_0}\right)-\left({\mu}_{0{t}_1}-{\mu}_{0{t}_0}\right) \). \( {\gamma}_0^{(3)} \) represents the mean baseline weight of the control arm, \( {\gamma}_1^{(3)} \) represents the difference in the mean baseline weights of the treatment and control arms, \( {\gamma}_2^{(3)} \) represents the mean change from baseline in the control arm, and \( {\gamma}_3^{(3)} \) is generally interpreted as the difference in the mean change from baseline in a unit time interval between the treatment and control arms (“difference in differences”), also known as the difference in slopes. We have \( {\mu}_{1{t}_0}={\mu}_{0{t}_0} \) from random allocation, and it follows that \( {\gamma}_1^{(3)}=0 \) and \( {\gamma}_3^{(3)}={\mu}_{1{t}_1}-{\mu}_{0{t}_1}=\tau . \) Thus, testing \( {H}_o:{\gamma}_3^{(3)}=0 \) is equivalent to testing Ho : τ = 0.
The generalized least squares (GLS) model with correlated outcomes is routinely used to estimate the coefficients and standard errors of model (3). The GLS estimator of the treatment effect
\( {\hat{\gamma}}_{3,\kern0.5em gls}^{(3)} \) and its variance
\( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(3)}\right) \) given known variance and covariance parameters are presented in Table
1.
\( {\hat{\gamma}}_{3,\kern0.5em gls}^{(3)} \) is estimated by the sample mean difference in body weight change between two arms and is unbiased for τ in a large sample. The variance and covariance parameters are generally unknown and need to be estimated using restricted maximum likelihood (REML). The conventional maximum likelihood estimation (MLE) should be avoided. The REML variance estimator
\( {\hat{\mathit{\operatorname{var}}}}_{reml}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(3)}\right) \) is derived by plugging the REML estimators of the variance and covariance parameters (i.e.,
\( {\sigma}_0^2,{\sigma}_1^2,\rho {\sigma}_0{\sigma}_1 \)) into the formula of
\( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(3)}\right) \). We use the Kenward-Roger method [
18] (“ddfm = kenwardroger” in the SAS PROC MIXED procedure) to adjust for the potential finite-sample bias in
\( {\hat{\mathit{\operatorname{var}}}}_{reml}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(3)}\right) \), which arises because this estimator fails to incorporate the variability of the REML estimators of the variance and covariance parameters. This adjustment inflates the variance-covariance matrix and computes adjusted approximate degrees of freedom.
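Because the mean structure of model (3) is saturated over the four arm-by-time cells, the GLS point estimate with complete data reduces to the difference in mean changes, and even an unweighted least-squares fit recovers the same coefficient. A minimal sketch with simulated data (all values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n0, n1 = 30, 30                                   # illustrative arm sizes
cov = np.array([[9.0, 5.0], [5.0, 9.0]])          # assumed within-subject covariance
pre0, post0 = rng.multivariate_normal([80.0, 79.0], cov, n0).T   # control
pre1, post1 = rng.multivariate_normal([80.0, 76.0], cov, n1).T   # treatment

# closed-form GLS estimate: difference in mean change between arms
gamma3_closed = (post1.mean() - pre1.mean()) - (post0.mean() - pre0.mean())

# long-format fit of model (3); the mean model is saturated over the four
# arm-by-time cells, so plain least squares already recovers the sample cell
# means, and the interaction coefficient equals the difference in differences
y = np.concatenate([pre0, post0, pre1, post1])
G = np.concatenate([np.zeros(2 * n0), np.ones(2 * n1)])
T = np.concatenate([np.zeros(n0), np.ones(n0), np.zeros(n1), np.ones(n1)])
X = np.column_stack([np.ones_like(y), G, T, G * T])
gamma3_fit = np.linalg.lstsq(X, y, rcond=None)[0][3]
```

REML still matters for the standard error, which depends on the estimated within-subject covariance; the point estimate does not.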
Method 4: Constrained repeated measures model (“cRM”): By including
\( {\gamma}_1^{(3)} \) in the model, the
RM model (3) allows the mean baseline weight to differ between two arms. Liang and Zeger [
8] proposed the following
cRM model by fixing
\( {\gamma}_1^{(3)}=0 \) to force the treatment and control arms to have the same intercept. Intuitively,
cRM is more efficient than
RM because
cRM estimates one less parameter. Formally, we model the baseline and post-treatment weights (
\( {Y}_{ij{t}_0} \),
\( {Y}_{ij{t}_1} \)) jointly using the binary time factor
Tij, a time by treatment interaction
Gij ×
Tij in the following
cRM model:
$$ {Y}_{ij t}={\gamma}_0^{(4)}+{\gamma}_2^{(4)}{T}_{ij}+{\gamma}_3^{(4)}{G}_{ij}\times {T}_{ij}+{e}_{ij t}^{(4)},i=1,2,\dots, {n}_j;j=0,1;t={t}_0,{t}_1 $$
(4)
$$ \left(\begin{array}{c}{e}_{ij{t}_0}^{(4)}\\ {}{e}_{ij{t}_1}^{(4)}\end{array}\right)\sim N\left(\left[\begin{array}{c}0\\ {}0\end{array}\right],\sum \right), $$
where
\( {\gamma}_0^{(4)}={\mu}_{t_0},{\gamma}_2^{(4)}={\mu}_{0{t}_1}-{\mu}_{0{t}_0} \), and
\( {\gamma}_3^{(4)}=\tau \). Interpretations of
\( {\gamma}_0^{(4)} \),
\( {\gamma}_2^{(4)} \), and
\( {\gamma}_3^{(4)} \) are the same as their counterparts in
RM. The formulas of the GLS point estimator
\( {\hat{\gamma}}_{3,\kern0.5em gls}^{(4)} \) and its variance
\( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(4)}\right) \) are listed in Table
1.
\( {\hat{\gamma}}_{3,\kern0.5em gls}^{(4)} \) is unbiased for
τ asymptotically. The empirical or the model-based variance estimate for
\( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(4)}\right) \) is derived using REML in the same way as a regular
RM model.
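The connection between cRM and ANCOVA can be illustrated numerically: plugging pooled within-arm sample moments into the closed-form cRM expression in Table 1 reproduces the ANCOVA coefficient exactly, because the OLS slope in model (2) is that same pooled within-arm slope. A sketch with simulated data (values are illustrative; REML estimates would differ slightly because cRM also pools the baseline means):

```python
import numpy as np

rng = np.random.default_rng(3)
n0, n1 = 45, 55                                   # illustrative arm sizes
cov = np.array([[9.0, 5.4], [5.4, 9.0]])          # assumed covariance (rho = 0.6)
x0, y0 = rng.multivariate_normal([80.0, 79.0], cov, n0).T   # control
x1, y1 = rng.multivariate_normal([80.0, 76.0], cov, n1).T   # treatment

# plug-in for rho*sigma0*sigma1/sigma0^2: pooled within-arm covariance over
# pooled within-arm baseline variation
sxy = ((x0 - x0.mean()) * (y0 - y0.mean())).sum() + ((x1 - x1.mean()) * (y1 - y1.mean())).sum()
sxx = ((x0 - x0.mean()) ** 2).sum() + ((x1 - x1.mean()) ** 2).sum()
slope = sxy / sxx

# cRM closed form (Table 1): post mean difference minus slope * baseline imbalance
gamma3_crm = (y1.mean() - y0.mean()) - slope * (x1.mean() - x0.mean())

# the ANCOVA coefficient from model (2) uses the same pooled within-arm slope,
# so the two estimates match
G = np.concatenate([np.zeros(n0), np.ones(n1)])
X = np.column_stack([np.ones(n0 + n1), G, np.concatenate([x0, x1])])
b1_ancova = np.linalg.lstsq(X, np.concatenate([y0, y1]), rcond=None)[0][1]
```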
Method 5: ANOVA with change score (“ANOVA-Change”): We model the change score
\( {\Delta }_{ij}={Y}_{ij{t}_1}-{Y}_{ij{t}_0} \) using the binary treatment indicator
Gij as follows:
$$ {\Delta }_{ij}={\beta}_0^{(5)}+{\beta}_1^{(5)}{G}_{ij}+{e}_{ij}^{(5)},i=1,2,\dots, {n}_j;j=0,1; $$
(5)
$$ {e}_{ij}^{(5)}\sim N\left(0,{\sigma}_{\epsilon^{(5)}}^2\right)\ \mathrm{and}\ {\sigma}_{\epsilon^{(5)}}^2={\sigma}_1^2+{\sigma}_0^2-2\rho {\sigma}_0{\sigma}_1, $$
where
\( {\beta}_0^{(5)}={\mu}_{0{t}_1}-{\mu}_{0{t}_0} \),
\( {\beta}_1^{(5)}=\left({\mu}_{1{t}_1}-{\mu}_{1{t}_0}\right)-\left({\mu}_{0{t}_1}-{\mu}_{0{t}_0}\right) \), and
\( {e}_{ij}^{(5)} \) is i.i.d. random error.
\( {\beta}_0^{(5)} \) measures the mean difference score in the control arm.
\( {\beta}_1^{(5)} \) measures the treatment effect
\( \overset{\sim }{\tau } \). Since
\( {\mu}_{1{t}_0}={\mu}_{0{t}_0} \) due to randomization at baseline,
\( {\beta}_1^{(5)} \) is reduced to
τ. The closed-form expressions of
\( {\hat{\beta}}_{1, ols}^{(5)} \) and
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(5)}\right) \) are listed in Table
1.
\( {\hat{\beta}}_{1, ols}^{(5)} \) is derived as the sample mean difference in the change score between two arms (“difference in difference”) and is unbiased for
τ. The OLS model-based variance of
\( {\hat{\beta}}_{1, ols}^{(5)} \) assuming known
\( {\sigma}_{\epsilon^{(5)}}^2 \) is
$$ {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(5)}\right)=\frac{\sigma_{\epsilon^{(5)}}^2}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({G}_{ij}-{G}_{..}\right)}^2}, $$
where
\( {G}_{..}=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{G}_{ij}}{n_0+{n}_1}=\frac{n_1}{n_0+{n}_1} \).
\( {\sigma}_{\epsilon^{(5)}}^2 \) is estimated by
$$ {\hat{\sigma}}_{\epsilon^{(5)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({\Delta }_{ij}-{\hat{\Delta }}_{ij}^{(5)}\right)}^2}{\left({n}_0+{n}_1-2\right)}, $$
where
\( {\hat{\Delta }}_{ij}^{(5)} \) is the fitted value from model (5). We let
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(5)}\right) \) denote the OLS model-based variance estimator with
\( {\hat{\sigma}}_{\epsilon^{(5)}}^2 \) substituted for
\( {\sigma}_{\epsilon^{(5)}}^2 \) (Table
1), which is reported by standard statistical software. Since
\( {\sum}_{j=0}^1{\sum}_{i=1}^{n_j}{\left({G}_{ij}-{G}_{..}\right)}^2=\frac{n_0{n}_1}{n_0+{n}_1} \), it follows that
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(5)}\right)=\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(5)}\right) \). It is well established that
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(5)}\right) \) is unbiased for
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(5)}\right) \), and thus for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(5)}\right) \). The usual OLS model-based inference is valid for unconditional hypothesis testing.
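The residual-variance identity \( {\sigma}_{\epsilon^{(5)}}^2={\sigma}_1^2+{\sigma}_0^2-2\rho {\sigma}_0{\sigma}_1 \) also holds exactly for the corresponding sample quantities within an arm, which is easy to confirm numerically (simulated data; all values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200                                            # illustrative arm size
cov = np.array([[9.0, 5.4], [5.4, 16.0]])          # assumed (baseline, post) covariance
x, y = rng.multivariate_normal([80.0, 79.0], cov, n).T

d = y - x                                          # change scores
s0, s1 = x.std(ddof=1), y.std(ddof=1)
r = np.corrcoef(x, y)[0, 1]

# sample analogue of sigma_eps^2 = sigma1^2 + sigma0^2 - 2*rho*sigma0*sigma1
var_identity = s1**2 + s0**2 - 2 * r * s0 * s1
var_direct = d.var(ddof=1)
```

The agreement is an algebraic identity for sample variances and covariances, not a large-sample approximation.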
When the study population is heterogeneous
Method 6: ANCOVA II: Different variance and covariance structures in the treatment and control arms suggest adding a baseline-by-treatment interaction term to the ANCOVA model [
2,
3,
9,
10]. To estimate
τ using an interaction model, we first compute the mean centered baseline weight
\( {\overset{\sim }{Y}}_{ij{t}_0} \) by subtracting the overall mean baseline weight from individual baseline weights, i.e.,
\( {\overset{\sim }{Y}}_{ij{t}_0}={Y}_{ij{t}_0}-{\mu}_{t_0} \). We then model the post-treatment body weight
\( {Y}_{ij{t}_1} \) using the binary treatment indicator
Gij, the mean centered baseline weight
\( {\overset{\sim }{Y}}_{ij{t}_0} \), and the baseline weight by treatment interaction
\( {G}_{ij}\times {\overset{\sim }{Y}}_{ij{t}_0} \) as follows:
$$ {Y}_{ij{t}_1}={\beta}_0^{(6)}+{\beta}_1^{(6)}{G}_{ij}+{\beta}_2^{(6)}{\overset{\sim }{Y}}_{ij{t}_0}+{\beta}_3^{(6)}{G}_{ij}\times {\overset{\sim }{Y}}_{ij{t}_0}+{e}_{ij}^{(6)},i=1,2,\dots, {n}_j;j=0,1; $$
(6)
$$ {e}_{i0}^{(6)}\sim N\left(0,\kern0.5em {\sigma}_{\epsilon_0^{(6)}}^2\right)\kern0.50em \mathrm{and}\ {\sigma}_{\epsilon_0^{(6)}}^2=\left(1-{\rho}_0^2\right){\sigma}_{01}^2 $$
$$ {e}_{i1}^{(6)}\sim N\left(0,\kern0.5em {\sigma}_{\epsilon_1^{(6)}}^2\right)\ \mathrm{and}\ {\sigma}_{\epsilon_1^{(6)}}^2=\left(1-{\rho}_1^2\right){\sigma}_{11}^2 $$
where \( {\beta}_0^{(6)}={\mu}_{0{t}_1} \), \( {\beta}_1^{(6)}=\tau \), \( {\beta}_2^{(6)}={\rho}_0\frac{\sigma_{01}}{\sigma_0} \), and \( {\beta}_3^{(6)}={\rho}_1\frac{\sigma_{11}}{\sigma_0}-{\rho}_0\frac{\sigma_{01}}{\sigma_0} \). \( {e}_{i0}^{(6)} \) and \( {e}_{i1}^{(6)} \) are i.i.d. random errors in the control and treatment arms, respectively. \( {\beta}_1^{(6)} \) measures the treatment effect. \( {\beta}_2^{(6)} \) is the regression slope of the baseline body weight in the control arm. \( {\beta}_3^{(6)} \) measures the difference in the regression slopes of the baseline weight between the treatment and control arms. Model (6) is heteroscedastic because the error terms in the treatment and control arms have different residual variances.
As presented in Table
2, the OLS estimator
\( {\hat{\beta}}_{1, ols}^{(6)} \) is the adjusted mean difference in the post-treatment body weights controlling for a weighted mean difference of the baseline body weights between two arms with unequal weighting coefficients for treatment and control arms (i.e.,
\( {\hat{\beta}}_{2, ols}^{(6)}+{\hat{\beta}}_{3, ols}^{(6)} \) for the treatment group, and
\( {\hat{\beta}}_{2, ols}^{(6)} \) for the control group).
\( {\hat{\beta}}_{1, ols}^{(6)} \) is unbiased for
τ. The conditional variance of
\( {\hat{\beta}}_{1, ols}^{(6)} \), denoted by
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \), incorporates two different residual variances
\( {\sigma}_{\epsilon_0^{(6)}}^2 \) and
\( \kern0.5em {\sigma}_{\epsilon_1^{(6)}}^2 \) (Table
2). Standard statistical software such as SAS does not output
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) because OLS incorrectly assumes a common residual variance
\( {\sigma}_{\epsilon^{(6)}}^2 \), which is the following weighted average of
\( \kern0.5em {\sigma}_{\epsilon_0^{(6)}}^2 \) and
\( \kern0.5em {\sigma}_{\epsilon_1^{(6)}}^2 \):
$$ {\sigma}_{\epsilon^{(6)}}^2=\frac{n_0}{n_0+{n}_1}\kern0.5em {\sigma}_{\epsilon_0^{(6)}}^2+\frac{n_1}{n_0+{n}_1}{\sigma}_{\epsilon_1^{(6)}}^2 $$
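Because model (6) fits a separate intercept and slope in each arm, its pooled residual sum of squares splits exactly into the two arm-specific residual sums of squares; the single OLS residual variance \( {\hat{\sigma}}_{\epsilon^{(6)}}^2 \) is therefore a pooled (weighted) version of the two arm-specific residual variances. A sketch with simulated heterogeneous data (arm sizes, means, and covariances are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n0, n1 = 40, 80                                    # illustrative, unbalanced arms
# heterogeneous arms: different correlations and post-treatment variances
x0, y0 = rng.multivariate_normal([80.0, 79.0], [[9.0, 4.5], [4.5, 9.0]], n0).T
x1, y1 = rng.multivariate_normal([80.0, 76.0], [[9.0, 7.2], [7.2, 16.0]], n1).T

x = np.concatenate([x0, x1])
xc = x - x.mean()                                  # mean-centered baseline
G = np.concatenate([np.zeros(n0), np.ones(n1)])
y = np.concatenate([y0, y1])

# OLS fit of the interaction model (6)
X = np.column_stack([np.ones(n0 + n1), G, xc, G * xc])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
sse_pooled = (resid**2).sum()

def arm_sse(xa, ya):
    # residual sum of squares from a simple regression within one arm
    Xa = np.column_stack([np.ones(len(xa)), xa])
    ra = ya - Xa @ np.linalg.lstsq(Xa, ya, rcond=None)[0]
    return (ra**2).sum()

# the interaction model fits each arm its own line, so the pooled SSE splits
# exactly into the two arm-specific SSEs
sse_arms = arm_sse(xc[:n0], y[:n0]) + arm_sse(xc[n0:], y[n0:])
```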
Table 2
Estimators of treatment effect and variance estimators in a heterogeneous study population
ANCOVA-Post II | \( {\hat{\beta}}_{1, ols}^{(6)}=\left({\overline{y}}_{.1{t}_1}-\left({\hat{\beta}}_{2, ols}^{(6)}+{\hat{\beta}}_{3, ols}^{(6)}\right){\overline{\overset{\sim }{y}}}_{.1{t}_0}\right)-\left({\overline{y}}_{.0{t}_1}-{\hat{\beta}}_{2, ols}^{(6)}{\overline{\overset{\sim }{y}}}_{.0{t}_0}\right) \) | C | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)=\left(\frac{1}{n_0}+\frac{{\overline{\overset{\sim }{y}}}_{.0{t}_0}^2}{\sum_{i=1}^{n_0}{\left(\tilde{y}_{i0{t}_0}-{\overline{\overset{\sim }{y}}}_{.0{t}_0}\right)}^2}\right)\ {\sigma}_{\epsilon_0^{(6)}}^2+\left(\frac{1}{n_1}+\frac{{\overline{\overset{\sim }{y}}}_{.1{t}_0}^2}{\sum_{i=1}^{n_1}{\left(\tilde{y}_{i1{t}_0}-{\overline{\overset{\sim }{y}}}_{.1{t}_0}\right)}^2}\right)\ {\sigma}_{\epsilon_1^{(6)}}^2 \) \( \kern0.5em {\sigma}_{\epsilon_0^{(6)}}^2=\left(1-{\rho}_0^2\right){\sigma}_{01}^2 \), \( \kern0.5em {\sigma}_{\epsilon_1^{(6)}}^2=\left(1-{\rho}_1^2\right){\sigma}_{11}^2 \) | \( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}+\frac{{\overline{\overset{\sim }{y}}}_{.0{t}_0}^2}{\sum_{i=1}^{n_0}{\left(\tilde{y}_{i0{t}_0}-{\overline{\overset{\sim }{y}}}_{.0{t}_0}\right)}^2}+\frac{{\overline{\overset{\sim }{y}}}_{.1{t}_0}^2}{\sum_{i=1}^{n_1}{\left(\tilde{y}_{i1{t}_0}-{\overline{\overset{\sim }{y}}}_{.1{t}_0}\right)}^2}\right){\hat{\sigma}}_{\epsilon^{(6)}}^2 \) \( {\hat{\sigma}}_{\epsilon^{(6)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}\right)}^2}{\left({n}_0+{n}_1-4\right)} \) |
| | U | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}\right)=\frac{1}{n_0}\left(1-{\rho}_0^2\right){\sigma}_{01}^2+\frac{1}{n_1}\left(1-{\rho}_1^2\right){\sigma}_{11}^2+{\left({\rho}_1\frac{\sigma_{11}}{\sigma_0}-{\rho}_0\frac{\sigma_{01}}{\sigma_0}\right)}^2\frac{\sigma_0^2}{n_0+{n}_1} \) | |
ANCOVA-Post I | \( {\hat{\beta}}_{1, ols}^{(7)}=\left({\overline{y}}_{.1{t}_1}-{\overline{y}}_{.0{t}_1}\right)-{\hat{\beta}}_{2, ols}^{(7)}\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right) \) | C | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)=\left(\frac{1}{n_0}+\frac{\sum \limits_{i=1}^{n_0}{\left({y}_{i1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}^2\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_0}-{\overline{y}}_{.j{t}_0}\right)}^2}\right)\kern0.5em {\sigma}_{\epsilon_0^{(7)}}^2+\left(\frac{1}{n_1}+\frac{\sum \limits_{i=1}^{n_1}{\left({y}_{i1{t}_0}-{\overline{y}}_{.1{t}_0}\right)}^2\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}{\sum \limits_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{i1{t}_0}-{\overline{\overset{\sim }{y}}}_{.1{t}_0}\right)}^2}\right)\kern0.5em {\sigma}_{\epsilon_1^{(7)}}^2 \) \( \kern0.5em {\sigma}_{\epsilon_0^{(7)}}^2=\left(1-{\rho}_0^2\right){\sigma}_{01}^2 \), \( \kern0.5em {\sigma}_{\epsilon_1^{(7)}}^2=\left(1-{\rho}_1^2\right){\sigma}_{11}^2 \) | \( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)=\left(\frac{1}{n_0}+\frac{1}{n_1}+\frac{\sum \limits_{i=1}^{n_0}{\left({y}_{i1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}^2\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_0}-{\overline{y}}_{.j{t}_0}\right)}^2}+\frac{\sum \limits_{i=1}^{n_1}{\left({y}_{i1{t}_0}-{\overline{y}}_{.1{t}_0}\right)}^2\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{.0{t}_0}\right)}{\sum \limits_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{i1{t}_0}-{\overline{\overset{\sim }{y}}}_{.1{t}_0}\right)}^2}\right){\hat{\sigma}}_{\epsilon^{(7)}}^2 \) \( {\hat{\sigma}}_{\epsilon^{(7)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}\right)}^2}{\left({n}_0+{n}_1-4\right)} \) |
| | U | \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}\right)=\frac{1}{n_0}\left[\left(1-{\rho}_0^2\right){\sigma}_{01}^2+{\left(\left({\rho}_1\frac{\sigma_{11}}{\sigma_0}-{\rho}_0\frac{\sigma_{01}}{\sigma_0}\right){p}_1\right)}^2{\sigma}_0^2\right]+\frac{1}{n_1}\left[\left(1-{\rho}_1^2\right){\sigma}_{11}^2+{\left(\left({\rho}_1\frac{\sigma_{11}}{\sigma_0}-{\rho}_0\frac{\sigma_{01}}{\sigma_0}\right){p}_0\right)}^2{\sigma}_0^2\right] \) | |
cRM | \( {\hat{\gamma}}_{3,\kern0.5em gls}^{(4)}=\left({\overline{y}}_{.1{t}_1}-{\overline{y}}_{.0{t}_1}\right)-\left(\frac{\rho_0{\sigma}_0{\sigma}_{01}}{\sigma_0^2}\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{..{t}_0}\right)-\frac{\rho_1{\sigma}_0{\sigma}_{11}}{\sigma_0^2}\left({\overline{y}}_{.1{t}_0}-{\overline{y}}_{..{t}_0}\right)\right) \) | U | \( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{3,\kern0.5em gls}^{(4)}\right)=\frac{1}{n_0}\left[\left(1-{\rho}_0^2\right){\sigma}_{01}^2+{\left(\left({\rho}_1\frac{\sigma_{11}}{\sigma_0}-{\rho}_0\frac{\sigma_{01}}{\sigma_0}\right){p}_1\right)}^2{\sigma}_0^2\right]+\frac{1}{n_1}\left[\left(1-{\rho}_1^2\right){\sigma}_{11}^2+{\left(\left({\rho}_1\frac{\sigma_{11}}{\sigma_0}-{\rho}_0\frac{\sigma_{01}}{\sigma_0}\right){p}_0\right)}^2{\sigma}_0^2\right] \) | |
We let
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) denote the OLS model-based conditional variance of
\( {\hat{\beta}}_{1, ols}^{(6)} \) incorporating
\( {\sigma}_{\epsilon^{(6)}}^2 \) (Table
2). Since
\( {\sigma}_{\epsilon^{(6)}}^2 \) is generally unknown,
\( {\sigma}_{\epsilon^{(6)}}^2 \) is estimated by
$$ {\hat{\sigma}}_{\epsilon^{(6)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}\right)}^2}{\left({n}_0+{n}_1-4\right)}, $$
where
\( {\hat{y}}_{ij{t}_1} \) is the predicted value of
\( {y}_{ij{t}_1} \). We let
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) denote the OLS model-based variance estimator of
\( {\hat{\beta}}_{1, ols}^{(6)} \) with
\( {\hat{\sigma}}_{\epsilon^{(6)}}^2 \) substituted for
\( {\sigma}_{\epsilon^{(6)}}^2 \), with \( {\mu}_{t_0} \) treated as a known constant (Table
2).
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is reported by standard statistical software (e.g., “proc reg” in SAS). To assess the validity of the model-based standard errors and
p-values from a regular
ANCOVA II model for unconditional inference, we need to examine: i) whether
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \); ii) whether
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}\right) \).
First,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is unbiased for
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right). \) However, the unbiasedness of
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) as an estimator of
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) depends on the relationship between
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) and
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \). Asymptotically, we have
$$ {\displaystyle \begin{array}{c}{\Delta}_{{\hat{\beta}}_{1, ols}^{(6)}}={\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)-\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)\\ {}=\left({\sigma}_{\epsilon_0^{(6)}}^2-{\sigma}_{\epsilon_1^{(6)}}^2\right)\left(\frac{1}{n_1}-\frac{1}{n_0}\right)\end{array}} $$
It can be shown that in a balanced design (
n0 =
n1),
$$ {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)\approx \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right). $$
Thus,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is nearly unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) [3]. When the design is unbalanced (
n0 ≠
n1),
$$ {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)\ne \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right). $$
Hence,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is biased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \). Due to heteroscedasticity,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) over-estimates
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) if the arm with the larger residual variance also has the larger sample size, and underestimates it otherwise [
3,
4].
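As a quick numeric check of the direction of this bias, the sign of the asymptotic difference can be evaluated for a few hypothetical values (a sketch only; the residual variances and sample sizes below are illustrative, not from any trial):

```python
# Direction of the asymptotic bias Delta = (s2_e0 - s2_e1) * (1/n1 - 1/n0)
# of the OLS model-based conditional variance (illustrative values only).
def delta_bias(s2_e0, s2_e1, n0, n1):
    return (s2_e0 - s2_e1) * (1.0 / n1 - 1.0 / n0)

# Control arm has the larger residual variance AND the larger sample size:
# Delta > 0, so the OLS model-based variance over-estimates.
print(delta_bias(4.0, 1.0, 150, 50))

# Control arm has the larger residual variance but the SMALLER sample size:
# Delta < 0, so the OLS model-based variance under-estimates.
print(delta_bias(4.0, 1.0, 50, 150))

# Balanced design: Delta = 0 regardless of heteroscedasticity.
print(delta_bias(4.0, 1.0, 100, 100))
```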
Second, the common mean baseline weight
\( {\mu}_{t_0} \) is generally unknown. We need to estimate
\( {\mu}_{t_0} \) in
\( {\overset{\sim }{Y}}_{ij{t}_0} \) using the overall sample mean
\( {\hat{\mu}}_{t_0}=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{Y}_{ij{t}_0}}{n_0+{n}_1} \) but ANCOVA treats
\( {\hat{\mu}}_{t_0} \) as fixed and thus fails to capture this additional variability in the conditional variances. As shown below,
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) underestimates
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}\right) \) by the amount
\( {\beta}_{3, ols}^{(6)2}\mathit{\operatorname{var}}\left({\hat{\mu}}_{t_0}\right) \) [
3]:
$$ \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}\right)=E\left(\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)\right)+{\beta}_{3, ols}^{(6)2}\mathit{\operatorname{var}}\left({\hat{\mu}}_{t_0}\right). $$
Thus, the OLS model-based conditional inference is biased for unconditional hypothesis testing because of heteroscedasticity and the neglected sampling variability in
\( {\hat{\mu}}_{t_0} \). To fix these two problems, we can use the following adjusted heteroscedasticity-consistent (HC) variance estimator to replace
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) for valid unconditional inference:
$$ {\hat{\mathit{\operatorname{var}}}}_{aHC}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)={\hat{\mathit{\operatorname{var}}}}_{HC}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)+{\hat{\beta}}_{3, ols}^{(6)2}\frac{{\hat{\sigma}}_0^2}{n_0+{n}_1}, $$
where
\( {\hat{\mathit{\operatorname{var}}}}_{HC}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is an HC variance estimator for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) [
19] and can be output from standard software. HC variance estimators are consistent (i.e., asymptotically unbiased). Among the available HC variance estimators, HC2 was shown to have the best finite-sample performance [
3,
4] (e.g., “HCCMETHOD = 2” in proc reg or “EMPIRICAL” in proc mixed, SAS).
\( {\hat{\beta}}_{3, ols}^{(6)} \) is the OLS estimator of
\( {\beta}_3^{(6)} \), and
\( {\hat{\sigma}}_0^2 \) is the overall sample variance of the baseline body weight. It follows directly that
\( {\hat{\mathit{\operatorname{var}}}}_{aHC}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right) \) is asymptotically unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(6)}\right) \) and we can construct a valid test
\( t=\frac{{\hat{\beta}}_{1, ols}^{(6)}}{\sqrt{{\hat{\mathit{\operatorname{var}}}}_{aHC}\left({\hat{\beta}}_{1, ols}^{(6)}|{\overset{\sim }{Y}}_{ij{t}_0}\right)}\ } \) for testing
Ho :
τ = 0 unconditionally.
Method 7: ANCOVAI. We model the post-treatment weight
\( {Y}_{ij{t}_1} \) using the binary treatment
G and the baseline weight
\( {Y}_{ij{t}_0} \):
$$ {Y}_{ij{t}_1}={\beta}_0^{(7)}+{\beta}_1^{(7)}{G}_{ij}+{\beta}_2^{(7)}{Y}_{ij{t}_0}+{e}_{ij}^{(7)} $$
(7)
$$ {e}_{i0}^{(7)}\sim N\left(0,\kern0.5em {\sigma}_{\epsilon_0^{(7)}}^2\right)\ \mathrm{and}\kern0.75em {\sigma}_{\epsilon_0^{(7)}}^2=\left(1-{\rho}_0^2\right){\sigma}_{01}^2+{\left({\beta}_3^{(6)}{p}_1\right)}^2{\sigma}_0^2 $$
$$ {e}_{i1}^{(7)}\sim N\left(0,\kern0.5em {\sigma}_{\epsilon_1^{(7)}}^2\right)\ \mathrm{and}\kern0.75em {\sigma}_{\epsilon_1^{(7)}}^2=\left(1-{\rho}_1^2\right){\sigma}_{11}^2+{\left({\beta}_3^{(6)}{p}_0\right)}^2{\sigma}_0^2 $$
where \( {\beta}_0^{(7)}={\beta}_0^{(6)}-{\beta}_3^{(6)}{p}_0{\mu}_0, \) and \( {\beta}_1^{(7)}=\tau \). \( {e}_{i0}^{(7)} \) and \( {e}_{i1}^{(7)} \) are random errors in the control and treatment arms. Since \( {e}_{i0}^{(7)} \) and \( {e}_{i1}^{(7)} \) have different variances in general, model (7) is heteroscedastic, and the severity of heteroscedasticity is determined by the correlation coefficients, the variances of the post-treatment weights in the two arms, and whether the design is balanced.
As shown in Table
2, the OLS estimator
\( {\hat{\beta}}_{1, ols}^{(7)} \) is an adjusted mean difference in the post-treatment weights, controlling for the mean difference in the baseline weights between the two arms with a common weighting coefficient (i.e.,
\( {\hat{\beta}}_{2, ols}^{(7)} \)) for both arms.
\( {\hat{\beta}}_{1, ols}^{(7)} \) is unbiased for
τ. The true conditional variance
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) incorporates two different residual variances. Similar to
ANCOVAII, the OLS model-based inference for
ANCOVAI also mistakenly assumes a constant residual variance
\( {\sigma}_{\epsilon^{(7)}}^2 \), which is a weighted average of
\( \kern0.5em {\sigma}_{\epsilon_0^{(7)}}^2 \) and
\( \kern0.5em {\sigma}_{\epsilon_1^{(7)}}^2 \), as follows:
$$ {\sigma}_{\epsilon^{(7)}}^2=\frac{n_0}{n_0+{n}_1}\kern0.5em {\sigma}_{\epsilon_0^{(7)}}^2+\frac{n_1}{n_0+{n}_1}{\sigma}_{\epsilon_1^{(7)}}^2. $$
Since
\( {\sigma}_{\epsilon^{(7)}}^2 \) is unknown, it is estimated by
$$ {\hat{\sigma}}_{\epsilon^{(7)}}^2=\frac{\sum_{j=0}^1{\sum}_{i=1}^{n_j}{\left({y}_{ij{t}_1}-{\hat{y}}_{ij{t}_1}\right)}^2}{n_0+{n}_1-3}, $$
where
\( {\hat{y}}_{ij{t}_1} \) is the predicted value of
\( {y}_{ij{t}_1} \) from model (7). The closed-form expressions of the OLS model-based conditional variance
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) incorporating
\( {\sigma}_{\epsilon^{(7)}}^2 \) and the OLS model-based variance estimator
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) with
\( {\hat{\sigma}}_{\epsilon^{(7)}}^2 \) substituted for
\( {\sigma}_{\epsilon^{(7)}}^2 \) are given in Table
2. Recall that standard statistical software reports
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \). To assess whether the model-based standard errors and
p-values are valid for unconditional inference, we need to examine: i) whether
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \); ii) whether
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) is unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}\right) \).
First,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) is unbiased for
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) but the unbiasedness of
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) as an estimator of
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) depends on the relationship between
\( {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) and
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \). Asymptotically, we have
$$ {\displaystyle \begin{array}{c}{\Delta}_{{\hat{\beta}}_{1, ols}^{(7)}}={\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)-\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)\\ {}=\left({\sigma}_{\epsilon_0^{(7)}}^2-{\sigma}_{\epsilon_1^{(7)}}^2\right)\left(\frac{1}{n_1}-\frac{1}{n_0}\right)\end{array}} $$
When sample sizes are equal between two arms, we have
$$ {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)\approx \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right). $$
Thus,
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) is nearly unbiased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) in a balanced design [
3]. When sample sizes are not equal between two arms,
$$ {\mathit{\operatorname{var}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)\ne \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right), $$
it follows directly that
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) is biased for
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) due to heteroscedasticity.
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) may over-estimate
\( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) when the arm with the larger residual variance also has the larger sample size, and may underestimate it otherwise [
3,
4].
ANCOVAI is robust against heteroscedasticity in a balanced design, but not in an unbalanced design.
Second, different from ANCOVAII, \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) is unbiased for \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}\right) \) because \( \mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}\right)=E\left(\mathit{\operatorname{var}}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right)\right) \).
Thus, the model-based standard errors and
p-values are valid for unconditional inference in a balanced design but are biased in an unbalanced design, solely due to heteroscedasticity. This bias can be easily corrected by replacing
\( {\hat{\mathit{\operatorname{var}}}}_{ols}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \) with an HC variance estimator
\( {\hat{\mathit{\operatorname{var}}}}_{HC}\left({\hat{\beta}}_{1, ols}^{(7)}|{Y}_{ij{t}_0}\right) \)[
4,
19] and corrected
ANCOVAI will provide valid unconditional inference.
Constrained Repeated Measures heterogeneous variance model (“cRM”): We model the baseline and post-treatment weights (
\( {Y}_{ij{t}_0}, \)\( {Y}_{ij{t}_1} \)) jointly using the binary time indicator
Tij and the time-by-treatment interaction
Gij ×
Tij:
$$ {Y}_{ij t}={\gamma}_0^{(8)}+{\gamma}_1^{(8)}{T}_{ij}+{\gamma}_2^{(8)}{G}_{ij}\times {T}_{ij}+{e}_{ij t}^{(8)},\kern0.5em j=0,1;i=1,2,\dots, {n}_j. $$
(8)
$$ \left(\begin{array}{c}{e}_{i0{t}_0}^{(8)}\\ {}{e}_{i0{t}_1}^{(8)}\end{array}\right)\sim N\left(\left[\begin{array}{c}0\\ {}0\end{array}\right],{\sum}_0\right)\ \mathrm{in}\ \mathrm{the}\ \mathrm{control}\ \mathrm{arm}, $$
$$ \left(\begin{array}{c}{e}_{i1{t}_0}^{(8)}\\ {}{e}_{i1{t}_1}^{(8)}\end{array}\right)\sim N\left(\left[\begin{array}{c}0\\ {}0\end{array}\right],{\sum}_1\right)\ \mathrm{in}\ \mathrm{the}\ \mathrm{treatment}\ \mathrm{arm}, $$
where
\( {\gamma}_0^{(8)}={\mu}_{t_0},{\gamma}_1^{(8)}={\mu}_{0{t}_1}-{\mu}_{0{t}_0} \), and
\( {\gamma}_2^{(8)}=\tau \). Noting that subjects in the treatment and control arms have different variance-covariance structures for the association between the pre- and post-treatment weights, we fit a
cRM heterogeneous variance GLS model with group-specific variance-covariance structure (“repeated/group=” in the SAS proc mixed procedure specifies a distinct variance-covariance structure for each treatment arm). The formulas of
\( {\hat{\gamma}}_{2, gls}^{(8)} \) and
\( \mathit{\operatorname{var}}\left({\hat{\gamma}}_{2, gls}^{(8)}\right) \) are listed in Table
2. The GLS estimator
\( {\hat{\gamma}}_{2, gls}^{(8)} \) is asymptotically unbiased for
\( {\gamma}_2^{(8)} \). REML is used to derive the empirical or model-based variance estimator
\( {\hat{\mathit{\operatorname{var}}}}_{reml}\left({\hat{\gamma}}_{2, gls}^{(8)}\right) \).
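The GLS machinery behind cRM can be sketched directly. The Python sketch below (an illustration, not the REML fit itself: plug-in per-arm sample covariances stand in for the REML covariance estimates, and all names and simulated values are assumptions) stacks each subject's (pre, post) pair and solves the GLS normal equations under the cRM constraint of a common baseline mean:

```python
import numpy as np

def crm_gls(pre_ctrl, post_ctrl, pre_trt, post_trt):
    """GLS fit of the cRM model Y = g0 + g1*T + g2*(G x T) + e with an
    arm-specific 2x2 covariance for each subject's (pre, post) pair.
    Plug-in sample covariances stand in for the REML estimates."""
    XtVX = np.zeros((3, 3))
    XtVy = np.zeros(3)
    for pre, post, g in [(pre_ctrl, post_ctrl, 0.0), (pre_trt, post_trt, 1.0)]:
        Vinv = np.linalg.inv(np.cov(np.vstack([pre, post])))  # arm 2x2 covariance
        Xi = np.array([[1.0, 0.0, 0.0],    # baseline row: T = 0
                       [1.0, 1.0, g]])     # post row: T = 1, G x T = g
        for yp, yq in zip(pre, post):
            yi = np.array([yp, yq])
            XtVX += Xi.T @ Vinv @ Xi
            XtVy += Xi.T @ Vinv @ yi
    gamma = np.linalg.solve(XtVX, XtVy)    # (g0, g1, g2); g2 estimates tau
    var_gamma = np.linalg.inv(XtVX)        # model-based covariance of estimates
    return gamma, var_gamma

# Illustration on simulated data with true tau = 1.0 and a larger
# post-treatment variance in the treatment arm.
rng = np.random.default_rng(2)
n = 2000
pre_c = rng.normal(0.0, 1.0, n)
post_c = 0.5 * pre_c + rng.normal(0.0, 1.0, n)
pre_t = rng.normal(0.0, 1.0, n)
post_t = 0.5 * pre_t + 1.0 + rng.normal(0.0, 1.5, n)
gamma, var_gamma = crm_gls(pre_c, post_c, pre_t, post_t)
```

Because both arms contribute to the common intercept γ0, the estimator borrows baseline information across arms, which is what distinguishes cRM from fitting the two arms separately.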