Open Access | 01.12.2009 | Research article

Design effect in multicenter studies: gain or loss of power?

Authors: Emilie Vierron, Bruno Giraudeau

Published in: BMC Medical Research Methodology | Issue 1/2009

Abstract

Background

In a multicenter trial, responses of subjects belonging to the same center are correlated. Such clustering is usually assessed through the design effect, defined as a ratio of two variances. The aim of this work was to describe and understand situations where the design effect involves a gain or a loss of power.

Methods

We developed a design effect formula for a multicenter study aimed at testing the effect of a binary factor (which thus defines two groups) on a continuous outcome, and explored this design effect for several designs (from individually stratified randomized trials to cluster randomized trials, and for other designs such as matched pair designs or observational multicenter studies).

Results

The design effect depends on the intraclass correlation coefficient (ICC), which assesses the correlation between data for two subjects from the same center, but also on a statistic S, which quantifies the heterogeneity of the group distributions among centers (that is, the level of association between the binary factor and the center), and on the degree of global imbalance between the two group sizes. This design effect may induce either a loss or a gain in power, depending on whether the S statistic is greater or less than 1, respectively.

Conclusion

We provide a global design effect formula that applies to any multicenter study and allows identification of the factors – the ICC and the distribution of the group proportions among centers – that are associated with a gain or a loss of power in such studies.
Notes

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2288-9-39) contains supplementary material, which is available to authorized users.


Background

Multicenter studies induce correlation in the data because subjects from the same center are more similar than subjects from different centers [1]. Such correlation potentially affects the power of standard statistical tests, and conclusions drawn under the assumption of independent data can be invalidated.
A usual measure of the effect of clustering on an estimator (often a treatment or group effect) is the design effect (Deff). The Deff is defined as the ratio of two variances: the variance of the estimator when the center effect is taken into account over the variance of the estimator under the hypothesis of a simple random sample [2, 3]. The Deff is thus the factor by which the sample size needs to be multiplied to account for the design of the study. Ignoring clustering can lead to over- (Deff < 1) or underpowered (Deff > 1) studies.
In cluster randomized trials, clustering produces a loss of power, and Donner and Klar proposed a method to inflate the sample size to take the data correlation into account [4]. Conversely, in individually randomized trials with equal treatment arm sizes, a center effect induces a gain in power, and the sample size can be reduced [5]. Thus, in some situations correlation in the data induces a loss of power, and in others a gain in power. To our knowledge, complete explanations for this striking discrepancy are lacking.
We aimed to produce a measure of clustering in multicenter studies testing the effect of a binary factor on a continuous outcome. We first present the statistical model used and the associated design-effect formula. Then we explore the general form of this design effect under particular study designs. Finally, we give examples to illustrate our results.

Methods and results

Theoretical Issues

The mixed-effects model

Let us consider a multicenter study aimed at comparing two groups on a continuous outcome. Several situations are possible. If subjects are randomly assigned to a group (e.g., a treatment arm), the study is a randomized trial; otherwise, it is an observational study, and group membership reflects exposure to a binary risk factor. Data are distributed as follows:
Y_{ijk} = \mu + \alpha_i + B_j + \varepsilon_{ijk}
(1)
where Y_{ijk} denotes the response of the kth subject of the ith group in the jth center (j = 1, ..., Q). The overall response mean is μ. Each center is of size m_j = m_{1j} + m_{2j}, and each group is of size n_i = Σ_{j} m_{ij}, with N = n_1 + n_2 being the total number of subjects in the study. The group effects {α_i} are fixed, with Σ_i α_i = 0. We assume that centers are a random sample from a large population of centers, so the center effects {B_j} are independent and identically distributed (iid) N(0, σ_B^2). The residual errors {ε_{ijk}} are assumed to be iid N(0, σ_ε^2) and independent of {B_j}. The center effect, quantified by the intraclass correlation coefficient (ICC) ρ and defined as the proportion of the total variance that is due to the between-center variability, can be expressed from model (1) as follows [6]:
\rho = \frac{\sigma_B^2}{\sigma_B^2 + \sigma_\varepsilon^2}
(2)
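
As an illustration of model (1) and of definition (2), the short sketch below (ours, not from the paper, with hypothetical parameter values and a hypothetical balanced design) simulates clustered responses and shows the ICC implied by the chosen variance components.

```python
# Illustrative sketch (not from the paper): simulating responses from model (1)
# with hypothetical parameter values, and the ICC implied by definition (2).
import numpy as np

rng = np.random.default_rng(0)

mu = 10.0                      # overall mean
alpha = {1: 0.5, 2: -0.5}      # fixed group effects, summing to zero
sigma_B, sigma_e = 1.0, 3.0    # between-center and residual standard deviations
Q = 50                         # number of centers
m = {1: 10, 2: 10}             # subjects per group in each center (balanced here)

rows = []                      # (group i, center j, response Y_ijk) triples
for j in range(Q):
    B_j = rng.normal(0.0, sigma_B)            # random center effect B_j ~ N(0, sigma_B^2)
    for i in (1, 2):
        eps = rng.normal(0.0, sigma_e, m[i])  # residual errors
        rows += [(i, j, y) for y in mu + alpha[i] + B_j + eps]

icc = sigma_B**2 / (sigma_B**2 + sigma_e**2)  # definition (2)
print(len(rows), "observations, true ICC =", round(icc, 3))  # 1000 observations, ICC = 0.1
```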

Group effect variance

Two-way ANOVA
The group effect variance can be shown to equal (Appendix 1):
\mathrm{var}(\hat{\alpha}_1 - \hat{\alpha}_2) = \sigma_\varepsilon^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right) + \sigma_B^2\sum_{j=1}^{Q}\left(\frac{m_{1j}}{n_1}-\frac{m_{2j}}{n_2}\right)^2
(3)
One-way ANOVA
Ignoring the center effect, model (1) reduces to:
Y_{ik} = \mu + \alpha_i + \tilde{\varepsilon}_{ik}
(4)
where Y_{ik} represents the response of the kth subject in the ith group. The random errors {\tilde{ε}_{ik}} are iid N(0, \tilde{σ}^2). Thus, the variance of the group effect is as follows:
\mathrm{var}(\hat{\alpha}_1 - \hat{\alpha}_2) = \tilde{\sigma}^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)
(5)
and we have (Table 1):
Table 1
One-way ANOVA for data distributed according to the two-way mixed-effects model (1).
| Source | DF | SS | E(MS) |
| Group | 2 − 1 | \sum_{i=1}^{2} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 | \frac{n_1 n_2}{N}(\alpha_1-\alpha_2)^2 + \sigma_\varepsilon^2 + \sigma_B^2\,\frac{n_1 n_2}{N}\sum_{j=1}^{Q}\left(\frac{m_{1j}}{n_1}-\frac{m_{2j}}{n_2}\right)^2 |
| Residual | N − 2 | \sum_{i=1}^{2}\sum_{k=1}^{n_i} (Y_{ik} - \bar{Y}_{i\cdot})^2 | E(\hat{\tilde{\sigma}}^2), given in (6) |
| Total | N − 1 | \sum_{i=1}^{2}\sum_{k=1}^{n_i} (Y_{ik} - \bar{Y}_{\cdot\cdot})^2 | |
When data are distributed according to the mixed model (1) but analyzed with a one-way ANOVA – as if they were distributed according to model (4) – the expectation of the residual mean square (denoted \hat{\tilde{σ}}^2 in the framework of model (4)) can actually be expressed as a function of σ_B^2 and σ_ε^2, the variance components associated with the true underlying statistical model (i.e., the mixed model (1)):
E(\hat{\tilde{\sigma}}^2) = \sigma_\varepsilon^2 + \sigma_B^2\,\frac{N - \sum_{i=1}^{2}\sum_{j=1}^{Q} m_{ij}^2/n_i}{N - 2}
(6)

The Design Effect

The Deff measures the effect of clustering on the group effect variance. It is defined as the ratio of the group effect variances (3) over (5). Using equation (6) we have:
\mathrm{Deff} = \frac{\sigma_\varepsilon^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right) + \sigma_B^2\sum_{j=1}^{Q}\left(\frac{m_{1j}}{n_1}-\frac{m_{2j}}{n_2}\right)^2}{\left[\sigma_\varepsilon^2 + \sigma_B^2\,\frac{N - \sum_{i}\sum_{j} m_{ij}^2/n_i}{N-2}\right]\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}
(7)
Multicenter randomized trials often recruit a large number of subjects. Assuming a large total sample size and numerous centers, the {m_ij} are small in comparison with N, and the term (N − Σ_i Σ_j m_{ij}^2/n_i)/(N − 2) can be approximated by 1. Expression (7) then becomes:
\mathrm{Deff} = 1 + (S - 1)\rho
(8)
where ρ is the ICC as defined in (2) and S = \frac{n_1 n_2}{N}\sum_{j=1}^{Q}\left(\frac{m_{1j}}{n_1}-\frac{m_{2j}}{n_2}\right)^2.
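
To make formula (8) concrete, here is a minimal sketch (our illustration, not the authors' code) that computes S and the approximate Deff from the per-center group sizes m_1j and m_2j; the two hypothetical designs correspond to the extreme cases discussed in the next sections.

```python
# Minimal sketch of the approximate design effect (8): Deff = 1 + (S - 1) * rho,
# with S computed from the per-center group sizes m_1j and m_2j.
import numpy as np

def design_effect(m1, m2, rho):
    """m1, m2: group-1 and group-2 sizes per center; rho: intraclass correlation."""
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    n1, n2 = m1.sum(), m2.sum()
    N = n1 + n2
    S = (n1 * n2 / N) * np.sum((m1 / n1 - m2 / n2) ** 2)
    return S, 1.0 + (S - 1.0) * rho

# Balanced stratified design: m_1j = m_2j in every center, so S = 0 and Deff = 1 - rho
print(design_effect([10, 10, 10], [10, 10, 10], rho=0.10))   # S = 0, Deff = 0.9
# Cluster design, 3 clusters of 20 per arm: S equals the cluster size, Deff = 1 + 19*rho
print(design_effect([20, 20, 20, 0, 0, 0], [0, 0, 0, 20, 20, 20], rho=0.10))  # S = 20, Deff = 2.9
```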

Simulation study

We first conducted a simulation study aimed at validating the approximate formula we propose. We considered equal and varying center sizes for 12 combinations of total sample size and number of centers (100 subjects for 5, 10 or 20 centers; 200 subjects for 5, 10, 20 or 50 centers; 500 subjects for 5, 10, 20, 50 or 100 centers), 4 group distributions (from groups balanced within centers to randomization of centers, which are then nested within groups) and two ICC values (0.01 and 0.10). One thousand simulations were conducted with SAS 9.1 (SAS Institute, Cary, NC) for each combination of parameters. Table 2 presents the average exact design effect estimate and the average relative difference between the exact and approximate design effect calculations for all these situations, with varying center sizes (20% of centers recruit 80% of subjects). Although such extreme imbalance in center sizes is unlikely to occur (and not advisable, especially in cluster designs including very few centers, such as 5 or 10), it allows testing the robustness of our formula even in such extreme situations. Similar results were found for equal center sizes (data not shown). The approximate design effect formula always slightly underestimates the exact one, since all relative differences are positive. These differences increase with the ICC and decrease, as expected, as the number of centers increases, but they are not influenced by the total number of subjects. Moreover, they globally increase with the design effect. All relative differences are at most 0.0771, indicating that our formula applies in the majority of multicenter designs, with better accuracy (relative differences less than 0.052) for designs including more than 10 centers.
Table 2
Validation of the approximate design effect formula.
Columns: total number of subjects / number of centers.

ICC = 0.01

| | 100/5 | 100/10 | 100/20 | 200/5 | 200/10 | 200/20 | 200/50 | 500/5 | 500/10 | 500/20 | 500/50 | 500/100 |
| S1 Deff | 0.9969 | 0.9938 | 0.9921 | 0.9966 | 0.9936 | 0.9922 | 0.9911 | 0.9965 | 0.9933 | 0.9919 | 0.9913 | 0.9908 |
| S1 rdiff | 0.0065 | 0.0032 | 0.0016 | 0.0065 | 0.0032 | 0.0016 | 0.0006 | 0.0065 | 0.0032 | 0.0016 | 0.0006 | 0.0003 |
| S2 Deff | 0.9972 | 0.9949 | 0.9928 | 0.9972 | 0.9956 | 0.9938 | 0.9917 | 0.9980 | 0.9989 | 0.9956 | 0.9931 | 0.9918 |
| S2 rdiff | 0.0065 | 0.0032 | 0.0014 | 0.0065 | 0.0032 | 0.0016 | 0.0005 | 0.0065 | 0.0033 | 0.0016 | 0.0006 | 0.0003 |
| S3 Deff | 1.0102 | 1.0306 | 1.0147 | 1.0217 | 1.0622 | 1.0431 | 1.0132 | 1.0575 | 1.1788 | 1.1143 | 1.0487 | 1.0204 |
| S3 rdiff | 0.0066 | 0.0035 | 0.0016 | 0.0066 | 0.0036 | 0.0018 | 0.0006 | 0.0066 | 0.0036 | 0.0019 | 0.0007 | 0.0003 |
| S4 Deff | 1.1038 | 1.0323 | 1.0285 | 1.2026 | 1.0538 | 1.0604 | 1.0184 | 1.4788 | 1.1290 | 1.1588 | 1.0559 | 1.0186 |
| S4 rdiff | 0.0077 | 0.0051 | 0.0027 | 0.0077 | 0.0052 | 0.0030 | 0.0011 | 0.0077 | 0.0053 | 0.0030 | 0.0013 | 0.0006 |

ICC = 0.10

| | 100/5 | 100/10 | 100/20 | 200/5 | 200/10 | 200/20 | 200/50 | 500/5 | 500/10 | 500/20 | 500/50 | 500/100 |
| S1 Deff | 0.9655 | 0.9356 | 0.9197 | 0.9642 | 0.9337 | 0.9209 | 0.9105 | 0.9631 | 0.9313 | 0.9177 | 0.9124 | 0.9076 |
| S1 rdiff | 0.0643 | 0.0318 | 0.0155 | 0.0649 | 0.0320 | 0.0160 | 0.0061 | 0.0649 | 0.0324 | 0.0161 | 0.0063 | 0.0031 |
| S2 Deff | 0.9709 | 0.9469 | 0.9269 | 0.9696 | 0.9547 | 0.9359 | 0.9171 | 0.9793 | 0.9827 | 0.9549 | 0.9300 | 0.9174 |
| S2 rdiff | 0.0656 | 0.0318 | 0.0142 | 0.0648 | 0.0323 | 0.0157 | 0.0053 | 0.0651 | 0.0325 | 0.0161 | 0.0063 | 0.0028 |
| S3 Deff | 1.1101 | 1.3018 | 1.1721 | 1.2095 | 1.6471 | 1.4256 | 1.1337 | 1.6662 | 2.7175 | 2.1685 | 1.4965 | 1.2049 |
| S3 rdiff | 0.0654 | 0.0349 | 0.0166 | 0.0659 | 0.0354 | 0.0182 | 0.0063 | 0.0662 | 0.0358 | 0.0185 | 0.0074 | 0.0034 |
| S4 Deff | 2.0718 | 1.3360 | 1.2725 | 3.1669 | 1.5750 | 1.6252 | 1.1934 | 6.2708 | 2.5759 | 2.5886 | 1.5687 | 1.2017 |
| S4 rdiff | 0.0768 | 0.0507 | 0.0272 | 0.0770 | 0.0517 | 0.0299 | 0.0110 | 0.0771 | 0.0513 | 0.0299 | 0.0126 | 0.0059 |
ICC: Intraclass Correlation Coefficient
Simulations were conducted with varying center sizes: 20% of centers recruit 80% of subjects. The average exact design effect estimate (Deff) and the average relative difference (rdiff) between the exact and approximate design effect formulas are given for 4 situations (Si, i = 1, 2, 3, 4) and two ICC values, each based on 1000 simulations.
S1: Equal group sizes. In each center, the probability, for a subject, to be in group 1 is 1/2
S2: Slight variations in group 1 proportions among centers. The ratio between the sizes of group 1 and group 2 varies uniformly between 0.8 and 1.25 among centers
S3: Important variations in group 1 proportions among centers. The ratio between the sizes of group 1 and group 2 varies uniformly between 0.1 and 10 among centers
S4: "Cluster design". The center is nested within the group and the probability, for each center, to be in group 1 is 1/2
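
A rough Monte Carlo check in the spirit of this simulation study can be written in a few lines. The sketch below (ours, with hypothetical parameters and a deliberately cluster-like design) compares the empirical variance of the unadjusted difference in group means with the variance expected for independent data; the ratio should approach the approximate Deff of formula (8).

```python
# Monte Carlo sketch with hypothetical parameters: the variance of the unadjusted
# group-mean difference under model (1), relative to the variance expected for
# independent data, should be close to Deff = 1 + (S - 1) * rho.
# Fixed effects (mu, alpha_i) are omitted since only the variance matters here.
import numpy as np

rng = np.random.default_rng(1)
sigma_B, sigma_e = 1.0, 3.0
rho = sigma_B**2 / (sigma_B**2 + sigma_e**2)          # ICC = 0.1

# cluster-like design: 10 centers of 20 subjects, each center entirely in one group
m1 = np.array([20] * 5 + [0] * 5, float)
m2 = np.array([0] * 5 + [20] * 5, float)
n1, n2 = m1.sum(), m2.sum()
N = n1 + n2
S = (n1 * n2 / N) * np.sum((m1 / n1 - m2 / n2) ** 2)

diffs = []
for _ in range(20000):
    B = rng.normal(0.0, sigma_B, size=len(m1))        # one random effect per center
    y1 = np.repeat(B, m1.astype(int)) + rng.normal(0, sigma_e, int(n1))
    y2 = np.repeat(B, m2.astype(int)) + rng.normal(0, sigma_e, int(n2))
    diffs.append(y1.mean() - y2.mean())

var_indep = (sigma_B**2 + sigma_e**2) * (1 / n1 + 1 / n2)
print(np.var(diffs) / var_indep, 1 + (S - 1) * rho)   # both close to 2.9 for this design
```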

Some specific designs

Stratified Multicenter Individually Randomized Trial
Assuming that randomization is balanced and stratified on centers, the group sizes are equal (n_1 = n_2 = N/2) and each center contains the same number of subjects in the two groups (∀ j = 1, ..., Q, m_{1j} = m_{2j} = m_j/2). The Deff reduces to:
\mathrm{Deff} = 1 - \rho
(9)
In a stratified multicenter individually randomized trial, the Deff is thus smaller than 1 and decreases as the ICC increases, which implies a gain in power and allows a reduction in sample size, as shown by Vierron et al. [5].
Matched Pair Design
Some studies yield individually matched observations, such as cross-over trials, trials on matched subjects (matched, for example, by age or sex) or on matched data (e.g., the two eyes of the same subject), and before-after studies. Assuming pairs of matched data, pairs can be considered as centers, leading to a particular case of the stratified multicenter individually randomized trial with m_{1j} = m_{2j} = 1. The Deff then equals:
\mathrm{Deff} = 1 - \rho
(10)
In a matched pair design, the variance of the differences between paired responses equals:
\sigma_d^2 = 2\sigma^2(1 - \rho)
(11)
where σ 2 is the variance of observations in a standard parallel group design.
Then, correcting the classical sample size formula for two independent samples with the Deff (1 − ρ) and replacing the σ^2(1 − ρ) term by σ_d^2/2 leads to the sample size formula used for paired data studies [7]:
n = \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2 \sigma_d^2}{d^2}
(12)
where d is the difference in mean responses from the two groups.
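
A hypothetical numeric application of the paired-design calculation just described (our sketch, with assumed values for σ, ρ, the significance level and the power):

```python
# Hypothetical worked example of the paired-design sample size calculation:
# number of pairs n = (z_{1-a/2} + z_{1-b})^2 * sigma_d^2 / d^2.
from scipy.stats import norm

alpha, power = 0.05, 0.80
sigma, rho = 2.0, 0.5                           # assumed outcome SD and within-pair correlation
d = 1.0                                         # difference in means to detect
sigma_d2 = 2 * sigma**2 * (1 - rho)             # variance of paired differences, formula (11)
z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
n_pairs = z**2 * sigma_d2 / d**2
print(round(n_pairs, 1))                        # about 31.4, i.e. 32 pairs after rounding up
```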
Cluster Randomized Trial and Expertise-based Randomized Trial
In a cluster randomized trial, clusters rather than subjects are randomly assigned to a treatment group. Considering centers as clusters, for each center we then have m 1j = 0 or m 2j = 0. Such a design is also encountered in individually randomized trials in which clustering is imposed by the intervention design and is nested within groups, such as when subjects are assigned to two treatment arms for which the intervention is delivered by several physicians, each participating in only one arm of the study [8, 9]. In this case, equation (8) reduces to:
\mathrm{Deff} = 1 + \left[\frac{n_1 n_2}{N}\sum_{j=1}^{Q}\left(\frac{m_{1j}^2}{n_1^2} + \frac{m_{2j}^2}{n_2^2}\right) - 1\right]\rho
(13)
With roughly equal cluster sizes and the same number of subjects in each arm (n_1 = n_2 = N/2), the Deff can be approximated as follows:
\mathrm{Deff} \approx 1 + (\bar{m} - 1)\rho
(14)
where \bar{m} = N/Q is the mean cluster size. This value is the inflation factor [4] used for sample size calculation in cluster randomized trials.
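
The sketch below (ours, with hypothetical cluster sizes) applies the general statistic S of formula (8) to a cluster design and compares the resulting design effect with the equal-cluster-size approximation (14).

```python
# Sketch with hypothetical cluster sizes: design effect of a cluster randomized trial
# from the general S of formula (8), versus the approximation 1 + (mbar - 1) * rho of (14).
import numpy as np

rho = 0.05
sizes_arm1 = np.array([18, 22, 20, 25, 15], float)   # clusters randomized to group 1
sizes_arm2 = np.array([21, 19, 24, 16, 20], float)   # clusters randomized to group 2

n1, n2 = sizes_arm1.sum(), sizes_arm2.sum()
N = n1 + n2
S = (n1 * n2 / N) * (np.sum(sizes_arm1**2) / n1**2 + np.sum(sizes_arm2**2) / n2**2)
m_bar = N / (len(sizes_arm1) + len(sizes_arm2))

print(1 + (S - 1) * rho)        # design effect using the per-cluster sizes (about 1.97 here)
print(1 + (m_bar - 1) * rho)    # approximation (14): 1 + (20 - 1) * 0.05 = 1.95
```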
Multicenter Observational Study
In a multicenter observational study, group sizes are likely to differ, either at the center level (i.e., m_{1j} ≠ m_{2j}) or globally (i.e., n_1 ≠ n_2). Nevertheless, with identical group distributions among centers (i.e., the proportion of subjects in group 1 is p ∈ (0, 1), whatever the center), the design effect reduces to:
\mathrm{Deff} = 1 - \rho
(15)
Thus, in an observational study in which all centers have identical group distributions – even if the global group sizes are unequal (i.e., even if n_1 ≠ n_2) – taking the center effect into account leads to increased power, as with stratified individually randomized trials.
No design effect: Deff = 1.
From formula (8), Deff = 1 leads to:
S = \frac{n_1 n_2}{N}\sum_{j=1}^{Q}\left(\frac{m_{1j}}{n_1} - \frac{m_{2j}}{n_2}\right)^2 = 1
(16)
Rewriting S as S = \frac{N}{n_1 n_2}\sum_{j=1}^{Q}\left(m_{1j} - \frac{n_1 m_j}{N}\right)^2, we obtain a statistic that measures, for group 1, the discrepancy between the observed group size in each center (i.e., m_{1j}) and its expected value under the assumption that all centers have identical group proportions (i.e., n_1 m_j / N). Therefore, when this statistic – a measure of the heterogeneity of the group distributions among centers, and thus of the level of association between the group and the center – is below 1, the Deff is also below 1, and using a statistical model that takes the center effect into account leads to increased power. Conversely, when the group distributions differ strongly among centers, the S statistic, and therefore the Deff, is greater than 1, leading to a loss of power. In the extreme case where centers are totally nested within groups, the loss of power can be substantial, and it has been shown that omitting the center effect from the analysis leads to type I error inflation [4].
The link between the power of a multicenter study and the design effect can be established as follows. Let n_i be the size of group i, ES the expected effect size, and z_γ the quantile of the standard normal distribution such that P(Z ≤ z_γ) = γ (Z being N(0,1)). The sample size calculation formula for testing the group effect on a continuous outcome, corrected for the design effect, is [7, 10]:
n_1 = \left(1 + \frac{n_1}{n_2}\right)\frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{ES^2} \times \mathrm{Deff}
(17)
Then, the power of any multicenter study depends on the design effect according to the following relation:
1 - \beta = \Phi\left(\sqrt{\frac{ES^2}{\mathrm{Deff}\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} - z_{1-\alpha/2}\right)
(18)
where Φ(·) is the cumulative distribution function of the N(0,1) distribution. As the design effect increases above 1, the power decreases and the sample size has to be inflated to reach the nominal power. Conversely, when the design effect is below 1, the power is larger than the nominal one, allowing a reduction in the required sample size.
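
A small illustrative sketch of relation (18), with hypothetical inputs, shows how the power of a fixed design moves with the design effect.

```python
# Power of a two-group comparison as a function of the design effect, following
# relation (18); effect size and group sizes are hypothetical.
from scipy.stats import norm

def power(es, n1, n2, deff, alpha=0.05):
    """Power = Phi( sqrt(ES^2 / (Deff * (1/n1 + 1/n2))) - z_{1-alpha/2} )."""
    return norm.cdf((es**2 / (deff * (1 / n1 + 1 / n2)))**0.5 - norm.ppf(1 - alpha / 2))

# 2 x 100 subjects, effect size 0.4: a Deff below 1 raises the power,
# a Deff above 1 lowers it relative to the independent-data case.
for deff in (0.9, 1.0, 1.5):
    print(deff, round(power(0.4, 100, 100, deff), 3))
```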

Example

Table 3 presents data for hypothetical studies of 10 centers of unequal sizes. In each case, the overall proportion of subjects in group 1 equals 25%, but this proportion varies more or less among centers according to the design of the study. The imbalance in center sizes is deliberately less pronounced than in the simulation study and represents a more realistic design. This example clearly shows that when the proportion of subjects in group 1 varies only slightly around the global proportion (the "quite homogeneous" column), the design effect is below 1, indicating a gain in power. On the contrary, when this proportion varies strongly (the "heterogeneous" column), the design effect exceeds 1, implying a loss of power. The last column presents the extreme case where centers are nested within the groups. This situation, which can be identified with that of a cluster randomized trial, leads to a substantial loss in power, as shown by the very large design effect.
Table 3
Design effects calculations for three different group distributions among centers.
| Group distribution among centers | Quite homogeneous | | | Heterogeneous | | | Cluster design | | |
| Group size per center | m_1j | m_2j | %* | m_1j | m_2j | %* | m_1j | m_2j | %* |
| Center 1 (n = 57) | 16 | 41 | 28 | 11 | 46 | 19 | 0 | 57 | 0 |
| Center 2 (n = 38) | 10 | 28 | 26 | 24 | 14 | 63 | 38 | 0 | 100 |
| Center 3 (n = 44) | 11 | 33 | 25 | 7 | 37 | 16 | 0 | 44 | 0 |
| Center 4 (n = 15) | 3 | 12 | 20 | 1 | 14 | 7 | 0 | 15 | 0 |
| Center 5 (n = 41) | 9 | 32 | 22 | 8 | 33 | 20 | 0 | 41 | 0 |
| Center 6 (n = 19) | 5 | 14 | 26 | 10 | 9 | 53 | 19 | 0 | 100 |
| Center 7 (n = 37) | 8 | 29 | 22 | 9 | 28 | 24 | 0 | 37 | 0 |
| Center 8 (n = 52) | 12 | 40 | 23 | 4 | 48 | 8 | 0 | 52 | 0 |
| Center 9 (n = 12) | 3 | 9 | 25 | 1 | 11 | 8 | 0 | 12 | 0 |
| Center 10 (n = 28) | 8 | 20 | 29 | 10 | 18 | 36 | 28 | 0 | 100 |
| S | 0.14 | | | 5.79 | | | 33.77 | | |
| Deff (ρ = 0.10) | 0.91 | | | 1.48 | | | 4.28 | | |
*group 1 proportion in each center
The global proportion of subjects in group 1 is 25%, for each group distribution, and the Intraclass Correlation Coefficient is equal to 0.10.
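
As a cross-check (ours, not part of the original example), applying the S statistic of formula (8) to the "cluster design" column of Table 3 reproduces the reported values.

```python
# Cross-check of the Table 3 "cluster design" column: group 1 consists of the whole
# of centers 2, 6 and 10, and the ICC is 0.10.
import numpy as np

m  = np.array([57, 38, 44, 15, 41, 19, 37, 52, 12, 28], float)   # center sizes
m1 = np.array([ 0, 38,  0,  0,  0, 19,  0,  0,  0, 28], float)   # group-1 sizes per center
m2 = m - m1                                                      # group-2 sizes per center

n1, n2, N = m1.sum(), m2.sum(), m.sum()
S = (n1 * n2 / N) * np.sum((m1 / n1 - m2 / n2) ** 2)
print(round(S, 2), round(1 + (S - 1) * 0.10, 2))    # 33.77 and 4.28, as in Table 3
```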
To illustrate the impact of heterogeneity between the global group sizes on the design effect, we considered hypothetical situations in which 10 centers each recruit 20 subjects, for balanced designs (i.e., n_1 = n_2, Table S4a in Additional file 1) and imbalanced designs (i.e., n_1 ≠ n_2, Table S4b in Additional file 1), for different levels of heterogeneity of the group distributions among centers and two ICC values. As expected, the Deff increases with S and with the ICC. Moreover, focusing on the "strongly heterogeneous" column, we observe a higher Deff with imbalance between the two groups (Table S4b in Additional file 1, Deff = 1.757 for ρ = 0.1) than with balanced groups (Table S4a in Additional file 1, Deff = 1.620 for ρ = 0.1), which can be explained analytically (Appendix 2). Thus, the impact of heterogeneity of the group distributions among centers is greater when the two group sizes are more imbalanced. See Additional file 1 for the results of this example.

Discussion and conclusion

In a multicenter study, the design effect measures the effect of the clustering due to multisite recruitment of subjects. As shown in formula (18), the power of such a study is directly affected by the design effect value. Our work aimed to explain why some multicenter situations, such as individually randomized trials, lead to a gain in power whereas others, such as cluster randomized trials, lead to a loss of power.
We derived a simple formula assessing the clustering effect in a multicenter study aiming to estimate the effect of a binary factor on a continuous outcome, through an individual-level analysis with a mixed-effects model: Deff = 1 + (S − 1)ρ. The design effect depends on ρ, the correlation between observations from the same center. It also depends on S, a statistic that quantifies the degree of heterogeneity of the group distributions among centers – in other words, the level of association between the binary factor and the center. S increases with the heterogeneity of the group distributions among centers, which leads to an increased Deff and a loss of power, and falls below 1 when the group distributions are identical across centers, thus leading to a Deff below 1 and a gain in power. It is known that balanced designs such as individually randomized trials gain power when the center effect is included in the analysis [5], and that cluster randomized trials should increase their sample size to reach the nominal power and account for the center effect in the analysis to protect against type I error inflation [4]. Our simple formula sheds light on the relation between these two situations and allows calculation of the design effect for any multicenter design.
In our developments we used a weighted method to assess the group effect: this method gives equal weight to each subject, whatever the size of his or her center. Different methods of analysis could be used. In the framework of multicenter randomized trials, Lin and Senn discuss this point and show that a weighted analysis is more powerful than an unweighted one, particularly when sample sizes are imbalanced between centers [11, 12]. The weighted method is therefore often recommended for the analysis of data from multicenter randomized trials, which justifies our choice for model (1) [13]. However, in cluster randomized trials, Kerry and Bland show that minimum variance weights are the most efficient weights for estimating the design effect in the presence of important imbalance between cluster sizes, but that weighting the clusters by their sizes gives similar – though overestimated – results, except when clusters are large [14]. Our formula aims to apply to any multicenter study, whatever its design, from individually randomized to cluster randomized trials. It may therefore not rely on the most powerful method of calculation for some particular multicenter designs, but it has the great advantage of being simple and general.
Apart from the mixed-effects model (1) we described, we did not develop the practical aspects of the analysis stage of a multicenter study. Several statistical software packages are available to analyze correlated data such as data from multicenter designs. Zhou et al. and Murray et al. review many of these programs and detail appropriate procedures and the options available for specifying the data model [15, 16]. Moreover, some tutorials present step-by-step illustrations of the use of the SAS and SPSS mixed model procedures [17, 18]. Lastly, Pinheiro and Bates provide an overview of the application of mixed-effects models in S and S-PLUS, which is easily transposable to the R software [19].
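
As a purely illustrative sketch (not the authors' code), an individual-level analysis of model (1) with a random center effect can, for instance, be specified with the Python statsmodels package; the simulated data set and the column names "y", "group" and "center" below are hypothetical.

```python
# Illustrative fit of a mixed-effects model with a fixed group effect and a random
# center effect, on hypothetical simulated data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
centers = np.repeat(np.arange(20), 20)                 # 20 centers of 20 subjects
group = rng.integers(1, 3, size=centers.size)          # group 1 or 2 within centers
y = (10 + 0.5 * (group == 1)                           # overall mean and group effect
     + rng.normal(0, 1, 20)[centers]                   # random center effects
     + rng.normal(0, 3, centers.size))                 # residual errors
data = pd.DataFrame({"y": y, "group": group, "center": centers})

fit = smf.mixedlm("y ~ C(group)", data, groups="center").fit()
print(fit.summary())          # fixed group effect plus the center variance component
```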
In the field of cluster randomized trials, several authors have worked on the planning of studies through the design effect and sample size calculations and have proposed extensions of the classical formulas, for example to account for imbalance in cluster sizes [20, 21]. Our formula does not aim to substitute for these more specific and precise formulas but to connect several multicenter designs through a single design effect formula. This result helps in understanding the impact of correlation on the power of multicenter studies, whatever their design, and is particularly useful for observational studies, for which the center effect is often not taken into account at the planning and/or analysis stages [22, 23]. However, when extended design effect formulas exist for a particular problem, such as imbalanced cluster sizes in cluster randomized trials, we recommend using them.
This simple result could now be extended to designs including, for example, several nested or crossed levels of correlation. One can consider cluster-cluster randomization, cluster then individual randomization, and observational designs including multiple levels of correlation between outcomes. Such designs could yield a mixture of gains and losses of power, according to the multiple correlation levels considered. For example, Diehr et al. studied the case of matched-pair cluster designs and Giraudeau et al. the case of cluster randomized cross-over designs [24, 25]. Many such situations could be explored to extend our result to more complex designs.
To conclude, clustering of data is a logical consequence of multicenter designs [26, 27]. Some designs allow certain factors to be controlled (e.g., balancing and homogenizing the treatment distribution in individually randomized trials), whereas others exclude such a possibility. The latter situation occurs mainly in observational studies, for which there is no way to control the prevalence or distribution of any factor. Since multicenter studies range from homogeneous and balanced designs to "cluster" distribution designs, the design effect can induce either a gain or a loss of power, as we described. The main advantage of the design effect formula we propose is its simplicity and its applicability to any multicenter study. Its potential weakness is the difficulty, for an investigator planning a multicenter study, of obtaining an accurate estimate of S, the degree of heterogeneity of the group distributions between centers, and of the ICC. In the field of cluster randomized trials, important efforts have been made to improve the reporting of ICC estimates, and they should now be extended to any multicenter study [28, 29]. Similarly, the reporting of the Deff calculation, or of the S statistic, should be encouraged in publications from any multicenter study. Associated with an ICC estimate, this information could help researchers in planning new multicenter – particularly observational – studies.

Appendix 1

Calculation of the group effect variance with a two-way ANOVA

In the mixed-effects model (1), the variance of the mean response in group i is as follows:
\mathrm{var}(\bar{Y}_{i\cdot\cdot}) = \frac{1}{n_i^2}\left[\sum_{j=1}^{Q}\sum_{k=1}^{m_{ij}} \mathrm{var}(Y_{ijk}) + \sum_{(j,k)\neq(j',k')} \mathrm{cov}(Y_{ijk},\, Y_{ij'k'})\right]
The group effect variance is defined as follows:
\mathrm{var}(\hat{\alpha}_1 - \hat{\alpha}_2) = \mathrm{var}(\bar{Y}_{1\cdot\cdot}) + \mathrm{var}(\bar{Y}_{2\cdot\cdot}) - 2\,\mathrm{cov}(\bar{Y}_{1\cdot\cdot},\, \bar{Y}_{2\cdot\cdot})
Since the centers are independent, we have:
corr(Y_{ijk}; Y_{i'j'k'}) = 0 for j ≠ j', and
corr(Y_{ijk}; Y_{i'jk'}) = ρ for responses from the same center. Then, writing σ^2 = σ_B^2 + σ_ε^2:
\mathrm{var}(\bar{Y}_{i\cdot\cdot}) = \frac{\sigma^2}{n_i^2}\left[n_i + \rho\sum_{j=1}^{Q} m_{ij}(m_{ij}-1)\right], \qquad \mathrm{cov}(\bar{Y}_{1\cdot\cdot},\, \bar{Y}_{2\cdot\cdot}) = \frac{\rho\,\sigma^2}{n_1 n_2}\sum_{j=1}^{Q} m_{1j} m_{2j}
which leads to:
\mathrm{var}(\hat{\alpha}_1 - \hat{\alpha}_2) = \sigma_\varepsilon^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right) + \sigma_B^2\sum_{j=1}^{Q}\left(\frac{m_{1j}}{n_1}-\frac{m_{2j}}{n_2}\right)^2

Appendix 2

Rewriting the S statistic with the between-center group size variances

Assuming centers are of equal sizes, ∀ j = 1, ..., Q, m_j = N/Q, and we have:
S = \frac{N\,Q}{n_1 n_2}\, V_1
where V_1 = \frac{1}{Q}\sum_{j=1}^{Q}(m_{1j} - \bar{m}_1)^2 is the between-center variance for the sizes of group 1. Let \bar{m}_i = n_i/Q be the mean size for group i; since m_{1j} = m_j − m_{2j}, V_1 can be rewritten as follows:
V_1 = V_m + V_2 - 2\,\mathrm{cov}(m_j,\, m_{2j})
where V_m is the center size variance and V_2 is the between-center variance for the sizes of group 2. Assuming centers are of equal sizes, we have ∀ j = 1, ..., Q, m_j = N/Q; thus V_m = 0 and V_1 = V_2. The statistic is then:
S = \frac{N\,Q}{n_1 n_2}\, V_1 = \frac{Q\,V_1}{N}\left(\frac{n_1}{n_2} + \frac{n_2}{n_1} + 2\right)
Hence, assuming equal center sizes, for a given total sample size N, number of centers Q, and between-center group size variance V_i, the further the ratio n_1/n_2 is from 1, the higher the statistic S. The Deff therefore increases with the degree of imbalance between the two group sizes. This result generalizes to designs with unequal center sizes, because the S statistic always depends on n_1/n_2. However, quantitative prediction of the impact of the n_1/n_2 ratio on the Deff is then not straightforward, because the center size variance, V_m, and the covariance term between m_j and m_{2j} are, in this case, not null.
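
The following numeric sketch (ours, with hypothetical group sizes) illustrates this observation: holding the center sizes and the between-center spread of the group 1 sizes fixed, S increases as the overall group sizes become more imbalanced.

```python
# Numeric sketch of the Appendix 2 observation: equal center sizes, identical
# between-center spread of the group-1 sizes, increasingly imbalanced total group sizes.
import numpy as np

def S_stat(m1, m2):
    n1, n2 = m1.sum(), m2.sum()
    N = n1 + n2
    return (n1 * n2 / N) * np.sum((m1 / n1 - m2 / n2) ** 2)

m_j = 20.0                                          # equal center sizes (10 centers of 20)
offsets = np.array([-2, -1, 0, 1, 2] * 2, float)    # same between-center spread each time
for mean_m1 in (10.0, 6.0, 3.0):                    # group 1 proportion 50%, 30%, 15%
    m1 = mean_m1 + offsets
    m2 = m_j - m1
    S = S_stat(m1, m2)
    print(mean_m1, round(S, 3), round(1 + (S - 1) * 0.10, 3))   # S grows with imbalance
```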

Acknowledgements

EV was supported by a doctoral fellowship from the Ministère de l'Enseignement Supérieur et de la Recherche, France.
The authors would like to thank the two referees for their helpful and constructive comments.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

The study was designed by EV and BG. EV performed the statistical analysis and drafted the article, which was then revised by BG. All authors approved the final manuscript.
References
1. Localio AR, Berlin JA, Ten Have TR, Kimmel SE: Adjustments for center in multicenter studies: an overview. Ann Intern Med. 2001, 135: 112-123.
3. Kish L: Survey sampling. 1965, New York: John Wiley
4. Donner A, Klar N: Design and Analysis of Cluster Randomization Trials in Health Research. 2000, London: Arnold
5. Vierron E, Giraudeau B: Sample size calculation for multicenter randomized trial: taking the center effect into account. Contemp Clin Trials. 2007, 28: 451-458. 10.1016/j.cct.2006.11.003
6. Fleiss JL: The Design and Analysis of Clinical Experiments. 1986, New York: Wiley
7. Machin D, Campbell M, Fayers P, Pinol A: Sample size tables for clinical studies. 2nd edition. 1997, London: Blackwell Science
8.
9. Devereaux PJ, Bhandari M, Clarke M, Montori VM, Cook DJ, Yusuf S, Sackett DL, Cina CS, Walter SD, Haynes B, Schunemann HJ, Norman GR, Guyatt GH: Need for expertise based randomised controlled trials. BMJ. 2005, 330: 88. 10.1136/bmj.330.7482.88
10. Julious SA: Sample sizes for clinical trials with normal data. Stat Med. 2004, 23: 1921-1986. 10.1002/sim.1783
11. Lin Z: An issue of statistical analysis in controlled multi-centre studies: how shall we weight the centres?. Stat Med. 1999, 18: 365-373. 10.1002/(SICI)1097-0258(19990228)18:4<365::AID-SIM46>3.0.CO;2-2
12. Senn S: Some controversies in planning and analysing multi-centre trials. Stat Med. 1998, 17: 1753-1765. 10.1002/(SICI)1097-0258(19980815/30)17:15/16<1753::AID-SIM977>3.0.CO;2-X
13. ICH Topic E 9: Note for guidance on statistical principles for clinical trials. The European Agency for the Evaluation of Medicinal Products. 1998
14. Kerry SM, Bland JM: Unequal cluster sizes for trials in English and Welsh general practice: implications for sample size calculations. Stat Med. 2001, 20: 377-390. 10.1002/1097-0258(20010215)20:3<377::AID-SIM799>3.0.CO;2-N
15. Murray DM, Varnell SP, Blitstein JL: Design and analysis of group-randomized trials: a review of recent methodological developments. Am J Public Health. 2004, 94: 423-432. 10.2105/AJPH.94.3.423
16. Zhou X, Perkins A, Hui S: Comparison of software packages for generalized linear multilevel models. American Statistician. 1999, 53: 282-290. 10.2307/2686112
17. Peugh J, Enders C: Using the SPSS mixed procedure to fit cross-sectional and longitudinal multilevel models. Educational and Psychological Measurement. 2005, 65: 717-741. 10.1177/0013164405278558
18. Singer J: Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics. 1998, 24: 323-355.
19. Pinheiro J, Bates D: Mixed-Effects Models in S and S-PLUS. 2000, New York: Springer
20. Eldridge SM, Ashby D, Kerry S: Sample size for cluster randomized trials: effect of coefficient of variation of cluster size and analysis method. Int J Epidemiol. 2006, 35: 1292-1300. 10.1093/ije/dyl129
21. Guittet L, Ravaud P, Giraudeau B: Planning a cluster randomized trial with unequal cluster sizes: practical issues involving continuous outcomes. BMC Med Res Methodol. 2006, 6: 17. 10.1186/1471-2288-6-17
22. DeLong ER, Coombs LP, Ferguson TB, Peterson ED: The evaluation of treatment when center-specific selection criteria vary with respect to patient risk. Biometrics. 2005, 61: 942-949. 10.1111/j.1541-0420.2005.00358.x
23. Greenfield S, Kaplan SH, Kahn R, Ninomiya J, Griffith JL: Profiling care provided by different groups of physicians: effects of patient case-mix (bias) and physician-level clustering on quality assessment results. Ann Intern Med. 2002, 136: 111-121.
24. Diehr P, Martin DC, Koepsell T, Cheadle A: Breaking the matches in a paired t-test for community interventions when the number of pairs is small. Stat Med. 1995, 14: 1491-1504. 10.1002/sim.4780141309
25. Giraudeau B, Ravaud P, Donner A: Sample size calculation for cluster randomized cross-over trials. Stat Med. 2008, 27: 5578-5585. 10.1002/sim.3383
26. Chuang JH, Hripcsak G, Heitjan DF: Design and analysis of controlled trials in naturally clustered environments: implications for medical informatics. J Am Med Inform Assoc. 2002, 9: 230-238. 10.1197/jamia.M0997
27. Lee KJ, Thompson SG: The use of random effects models to allow for clustering in individually randomized trials. Clin Trials. 2005, 2: 163-173. 10.1191/1740774505cn082oa
28. Campbell MK, Elbourne DR, Altman DG: CONSORT statement: extension to cluster randomised trials. BMJ. 2004, 328: 702-708. 10.1136/bmj.328.7441.702
29. Campbell MK, Grimshaw JM, Elbourne DR: Intracluster correlation coefficients in cluster randomized trials: empirical insights into how should they be reported. BMC Med Res Methodol. 2004, 4: 9. 10.1186/1471-2288-4-9