Appendix
A general model for outcomes from n-of-1 trials arranged in cycles can be expressed as follows.
$${Y}_{ir s}={\lambda}_i+{\beta}_{ir}+{\varepsilon}_{ir s}+{Z}_{ir s}{\tau}_i,$$
(1)
where \(Y_{irs}\) is the measured outcome for occasion \(s\), \(s = 1, 2\), of cycle \(r\), \(r = 1, 2, \dots, k_i\), for patient \(i\), \(i = 1, 2, \dots, n\). Here, \(\lambda_i \sim N(\Lambda, \phi^2)\) is a random effect for patient \(i\), \(\beta_{ir} \sim N(0, \gamma^2)\) is a random effect for cycle \(r\) within patient \(i\), \(\varepsilon_{irs} \sim N(0, \sigma^2)\) is a random error term for occasion \(s\) of cycle \(r\) for patient \(i\), and \(\tau_i \sim N(T, \psi^2)\) is a random treatment effect for patient \(i\), with \({Z}_{irs}=-\frac{1}{2},\frac{1}{2}\), depending on whether the patient was assigned A or B on that occasion in that cycle. All stochastic terms are assumed independent of each other.
It is worth drawing attention here to a potential point of confusion. If we study the variation of the difference between treatments A and B for a given patient, the variance of these differences will be expected to be \(2\sigma^2\), because each within-cycle difference has a contribution from two errors, \(\varepsilon_{ir1}\) and \(\varepsilon_{ir2}\). In a matched pairs analysis, \(2\sigma^2\) is estimated directly, whereas in a linear model one would estimate the variance of the \(\varepsilon_{irs}\) terms, which is \(\sigma^2\). There is therefore a danger that readers may misunderstand what an author means by referring to a within-cycle variance. If variance terms are picked up from a paper for planning purposes, the necessary sample size may be miscalculated by a factor of either two or one half. The moral is that it is best to be explicit, and indeed, in an earlier version of this article (as noticed by a referee), both conventions were used.
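The two conventions can be checked numerically. The following is a minimal sketch, assuming one patient with a fixed treatment effect and arbitrary illustrative parameter values; the function name `simulate_cycle_differences` and all numbers are hypothetical and are not taken from the article. It simulates model (1) using only the Python standard library and compares the variance of the within-cycle differences with \(2\sigma^2\).

```python
import random
import statistics

def simulate_cycle_differences(n_cycles, sigma, gamma, lam, tau, seed=0):
    """Simulate B-minus-A within-cycle differences for one patient under model (1).

    lam and tau are this patient's (fixed) level and treatment effect;
    all values used below are illustrative assumptions.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_cycles):
        beta = rng.gauss(0.0, gamma)      # cycle effect, shared by both occasions
        z1 = rng.choice([-0.5, 0.5])      # treatment code on occasion 1 (A = -1/2, B = 1/2)
        z2 = -z1                          # the other treatment on occasion 2
        y1 = lam + beta + rng.gauss(0.0, sigma) + z1 * tau
        y2 = lam + beta + rng.gauss(0.0, sigma) + z2 * tau
        diffs.append((y2 - y1) / (z2 - z1))   # always B minus A
    return diffs

sigma = 1.5
diffs = simulate_cycle_differences(n_cycles=50_000, sigma=sigma, gamma=3.0, lam=10.0, tau=2.0)
var_diff = statistics.variance(diffs)

# Matched-pairs convention: the variance of the differences estimates 2 * sigma^2.
print("variance of differences:", var_diff, "vs 2*sigma^2 =", 2 * sigma**2)
# Linear-model convention: half of that estimates sigma^2.
print("half of that:", var_diff / 2, "vs sigma^2 =", sigma**2)
```

The patient level and the cycle effect cancel in each difference, so the large value chosen for `gamma` has no influence on the result.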
If we reduce everything to within-cycle differences first, then the random patient and cycle terms are eliminated, and only \(\sigma^2\) and \(\psi^2\) are relevant to calculating our estimates. We can take
$$\hat{\tau}=\frac{\sum_{i=1}^n{\sum}_{r=1}^{k_i}\frac{Y_{ir2}-{Y}_{ir1}}{Z_{ir2}-{Z}_{ir1}}}{\sum_{i=1}^n{k}_i}$$
(2)
as an estimate of \(T\). Since \(Z_{ir2}-Z_{ir1}=1\) or \(-1\), depending on whether A is given on the first occasion in a cycle or the second, this is simply the sum of all the within-cycle differences for treatment B minus treatment A divided by the total number of cycles. If we have the same number of cycles, \(k\), per patient, this simplifies to
$$\hat{\tau}=\frac{\sum_{i=1}^n{\sum}_{r=1}^k\frac{Y_{ir2}-{Y}_{ir1}}{Z_{ir2}-{Z}_{ir1}}}{nk}.$$
(3)
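Estimator (2) can be sketched in a few lines. This is an illustration only: the function names `within_cycle_difference` and `pooled_estimate` and the small data set are hypothetical, and each patient's data are assumed to have already been reduced to a list of B-minus-A differences, one per cycle.

```python
def within_cycle_difference(y1, y2, z1, z2):
    """Return the B-minus-A difference for one cycle, as in the summand of (2).

    z1 and z2 are the treatment codes (-1/2 for A, 1/2 for B) on occasions 1 and 2.
    """
    return (y2 - y1) / (z2 - z1)

def pooled_estimate(diffs_by_patient):
    """Sum of all within-cycle differences divided by the total number of cycles."""
    total = sum(sum(d) for d in diffs_by_patient)
    n_cycles = sum(len(d) for d in diffs_by_patient)
    return total / n_cycles

# Hypothetical example: patient 1 has three cycles, patient 2 has two.
diffs_by_patient = [[2.0, 1.0, 3.0], [0.5, 1.5]]
tau_hat = pooled_estimate(diffs_by_patient)
print(tau_hat)
```

Because every cycle counts equally, a patient with more cycles contributes more to the estimate; with the same number of cycles per patient this reduces to (3).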
The appropriate variance of this estimator depends on what we consider it to be an estimate of, that is to say, what we consider \(T\) to be. For example, if we take it to be an estimate of the mean treatment effect for these patients, then this is fixed for the sample. We shall refer to this as the local purpose. We then have that the variance is
$$Var\left(\hat{\tau}\right)=\frac{2{\sigma}^2}{\sum_{i=1}^n{k}_i}.$$
(4)
Note, as discussed above, the appearance of the factor 2: the variance of a within-cycle difference has a contribution from each of two error terms.
In the balanced case, where \(k_i = k\) for all \(i\), we have
$$Var\left(\hat{\tau}\right)=\frac{2{\sigma}^2}{nk}.$$
(5)
On the other hand, if we take \(T\) to be the mean treatment effect in a population of patients from whom the patients studied may be taken to be a random sample, then we have
$$Var\left(\hat{\tau}\right)=\frac{\psi^2}{n}+\frac{2{\sigma}^2}{\sum_{i=1}^n{k}_i},$$
(6)
with, in the balanced case,
$$Var\left(\hat{\tau}\right)=\frac{\psi^2}{n}+\frac{2{\sigma}^2}{nk}.$$
(7)
We refer to this as the global purpose. Note that for the global purpose, (a) this estimator is only optimal in the balanced case or if \(\psi^2 = 0\), and (b) whether or not it is optimal, the variance for the global purpose is only the same as for the local purpose if \(\psi^2 = 0\).
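The practical difference between the two purposes can be seen by evaluating the balanced-case formulas (5) and (7). The sketch below uses hypothetical function names and arbitrary illustrative values for \(\sigma^2\), \(\psi^2\), \(n\) and \(k\); none of these numbers come from the article.

```python
def var_local(sigma2, n, k):
    """Balanced-case variance for the local purpose, formula (5)."""
    return 2.0 * sigma2 / (n * k)

def var_global(psi2, sigma2, n, k):
    """Balanced-case variance for the global purpose, formula (7)."""
    return psi2 / n + 2.0 * sigma2 / (n * k)

# Illustrative assumptions: sigma^2 = 4, psi^2 = 1, n = 10 patients.
sigma2, psi2, n = 4.0, 1.0, 10
for k in (1, 3, 10, 100):
    print(k, var_local(sigma2, n, k), var_global(psi2, sigma2, n, k))
```

Under these assumptions, adding cycles drives the local variance towards zero, whereas the global variance cannot fall below \(\psi^2/n\); only recruiting more patients reduces that component.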
An alternative approach to estimation starts with the individual patient estimates,
$${\hat{\tau}}_i=\frac{\sum_{r=1}^{k_i}\frac{Y_{ir2}-{Y}_{ir1}}{Z_{ir2}-{Z}_{ir1}}}{k_i}.$$
(8)
For the global purpose, these have variances
$$Var\left({\hat{\tau}}_i\right)={\psi}^2+\frac{\sigma_d^2}{k_i},$$
(9)
where \({\sigma}_d^2=2{\sigma}^2\), with the subscript \(d\) standing for difference.
These estimates may then be combined in a weighted sum to produce an estimate
$${\hat{T}}_{global}={\sum}_{i=1}^n{w}_i{\hat{\tau}}_i,$$
(10)
where
$${w}_i=\frac{\frac{1}{\psi^2+\frac{\sigma_d^2}{k_i}}}{\sum_{i=1}^n\left(\frac{1}{\psi^2+\frac{\sigma_d^2}{k_i}}\right)},$$
(11)
that is to say, with weights inversely proportional to the variance and summing to one. Note that (9), (10), and (11) define an estimate that has the same general form as a random effects meta-analysis estimator, the only practical difference being that \(\sigma^2\) should be estimated globally, rather than individually patient by patient. The variance of (10) is given by
$$Var\left({\hat{T}}_{global}\right)=\frac{1}{\sum_{i=1}^n\frac{1}{Var\left({\hat{\tau}}_i\right)}}.$$
(12)
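Note also that if \(k_i = k\) for all \(i = 1, \dots, n\), we have from (10) that \({\hat{T}}_{global}=\frac{\sum_{i=1}^n{\hat{\tau}}_i}{n}\) and from (12) that \(Var\left({\hat{T}}_{global}\right)=\frac{Var\left({\hat{\tau}}_i\right)}{n}\), where \(Var\left({\hat{\tau}}_i\right)\) is given by (9) and is then the same for every patient.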
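Formulas (9) to (12) translate directly into code. The following is a minimal sketch that treats \(\psi^2\) and \(\sigma^2\) as known, which in practice they are not; the function name `global_estimate` and the example values are hypothetical.

```python
def global_estimate(tau_hats, ks, psi2, sigma2):
    """Inverse-variance weighted combination of patient estimates, (9) to (12).

    tau_hats: per-patient estimates; ks: number of cycles per patient.
    psi2 and sigma2 are assumed known here for illustration.
    Returns (estimate, variance, weights).
    """
    sigma_d2 = 2.0 * sigma2                               # variance of one difference
    variances = [psi2 + sigma_d2 / k for k in ks]         # formula (9)
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    weights = [p / total_precision for p in precisions]   # formula (11)
    estimate = sum(w * t for w, t in zip(weights, tau_hats))  # formula (10)
    variance = 1.0 / total_precision                      # formula (12)
    return estimate, variance, weights

# Hypothetical example with unequal numbers of cycles.
est, var, w = global_estimate([1.0, 2.0, 3.0], ks=[1, 2, 4], psi2=1.0, sigma2=1.0)
print(est, var, w)

# Balanced case: the weights are equal and the estimate is the simple mean.
est_bal, var_bal, w_bal = global_estimate([1.0, 2.0, 3.0], ks=[2, 2, 2], psi2=1.0, sigma2=1.0)
print(est_bal, var_bal, w_bal)
```

In the unbalanced example the patient with four cycles receives the largest weight, but less than four times the weight of the patient with one cycle, because \(\psi^2\) contributes equally to every patient's variance.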
In the ‘Estimates of effects for individual patients’ section, the formula for shrunk estimates was given as
$$shrunk=w\times personal+\left(1-w\right) global.$$
(13)
If we assume that a suitably large number of patients have been studied, then the global estimate as a prediction for the long-term average may be assumed to have a variance of \(\psi^2\), whereas the local estimate for patient \(i\) may be assumed to have a variance of \(\frac{2{\sigma}^2}{k_i}\). These two estimates should be weighted in proportion to the inverse of their variances, so we have
$$w=\frac{\psi^2}{\psi^2+\frac{2{\sigma}^2}{k_i}},1-w=\frac{\frac{2{\sigma}^2}{k_i}}{\psi^2+\frac{2{\sigma}^2}{k_i}}.$$
(14)
Since \(w\) is the weight for the personal element and \(\psi^2\) is the variation in the true treatment effect from patient to patient, we can see that, other things being equal, as this variation becomes more important, more weight is given to the personal estimate. Similarly, since \(1 - w\) is the weight for the global estimate, we can see that as the within-patient variation \(\sigma^2\) gets larger, more weight will be given to the global estimate, although this can be reduced by increasing the number of cycles \(k_i\) in which the patient is observed.
Finally, we have as a formula for the variance of the shrunk estimate,
$$Var(shrunk)=\frac{2{\psi}^2{\sigma}^2}{k_i{\psi}^2+2{\sigma}^2}.$$
(15)
Note that if we have no local information on patient \(i\), so that \(k_i = 0\), then (15) is equal to \(\psi^2\), which, since we must rely on global information only, is to be expected. On the other hand, as \(\psi^2 \to \infty\), we have that (15) \(\to \frac{2{\sigma}^2}{k_i}\), which is the personal variance. This again is only to be expected, since the results from other patients then contribute no information. In general, however, (15) is lower than either the global or the personal variance. Thus, an advantage of the shrunk estimate is the reduction in variance that it brings.
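The shrinkage calculation in (13) to (15) can be sketched as follows. The function name `shrink` and the numerical values are hypothetical, and \(\psi^2\) and \(\sigma^2\) are again assumed known. The weight is written as \(k_i\psi^2/(k_i\psi^2+2\sigma^2)\), which is algebraically the same as (14) but remains defined when \(k_i = 0\).

```python
def shrink(personal, global_, k_i, psi2, sigma2):
    """Shrunk estimate for one patient, following (13), (14) and (15).

    Returns (shrunk estimate, weight w on the personal estimate, variance).
    psi2 and sigma2 are assumed known here for illustration.
    """
    denom = k_i * psi2 + 2.0 * sigma2
    w = k_i * psi2 / denom                            # formula (14), rearranged
    shrunk = w * personal + (1.0 - w) * global_       # formula (13)
    variance = 2.0 * psi2 * sigma2 / denom            # formula (15)
    return shrunk, w, variance

# Hypothetical example: personal estimate 4, global estimate 2, four cycles.
shrunk, w, variance = shrink(personal=4.0, global_=2.0, k_i=4, psi2=1.0, sigma2=2.0)
print(shrunk, w, variance)

# With no cycles for this patient, the shrunk estimate is the global one.
print(shrink(personal=0.0, global_=2.0, k_i=0, psi2=1.0, sigma2=2.0))
```

In the first example the personal variance \(2\sigma^2/k_i\) and \(\psi^2\) both equal 1, so the two estimates are weighted equally and the variance of the shrunk estimate is half of either.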
A further point to note is that the formula does not allow for uncertainty in the global estimate itself. The uncertainty in using the global estimate as a prediction of the effect for a given patient reflects the variation of the individual patient effects from the supposed true average global value. In practice, this global value itself is subject to uncertainty and, as a referee has pointed out, since the values from an individual also contribute to the global estimate, there will also be a small correlation between the two, which is also in practice ignored. In short, this approach works best when one has data from many patients. See also my paper on sample size determination [22] for further discussion.