Sample sizes required to detect interactions between two binary fixed-effects in a mixed-effects linear regression model

https://doi.org/10.1016/j.csda.2008.06.010

Abstract

Mixed-effects linear regression models have become more widely used for analysis of repeatedly measured outcomes in clinical trials over the past decade. There are formulae and tables for estimating sample sizes required to detect the main effects of treatment and the treatment by time interactions for those models. A formula is proposed to estimate the sample size required to detect an interaction between two binary variables in a factorial design with repeated measures of a continuous outcome. The formula is based, in part, on the fact that the variance of an interaction is fourfold that of the main effect. A simulation study examines the statistical power associated with the resulting sample sizes in a mixed-effects linear regression model with a random intercept. The simulation varies the magnitude (Δ) of the standardized main effects and interactions, the intraclass correlation coefficient (ρ), and the number (k) of repeated measures within-subject. The results of the simulation study verify that the sample size required to detect a 2×2 interaction in a mixed-effects linear regression model is fourfold that to detect a main effect of the same magnitude.

Introduction

The mixed-effects linear regression model (Harville, 1977; Laird and Ware, 1982) is widely used in observational studies and randomized controlled clinical trials (RCTs) in which there are repeated measures over time. In designing a study, the Ethical Guidelines of the American Statistical Association (1999) advise statisticians to provide informed recommendations for sample size such that a research protocol will propose neither an inadequate nor an excessive number of subjects to detect a scientifically noteworthy result with acceptable statistical power. Several authors have examined the sample sizes required to detect the main effects and interaction of treatment and time in longitudinal studies with repeated measures (e.g., Hsieh (1988), Rochon (1991), Overall and Doyle (1994), Hedeker et al. (1999), Raudenbush and Liu (2001) and Diggle et al. (2002)). Yet a study that is designed to detect the main effect of treatment will not have sufficient power to detect the interaction between two binary fixed effects. In a 2×2 factorial fixed-effects ANOVA with equal cell sizes and an assumption of independence among observations, for instance, the sample size required to detect an interaction is four times that for a main effect of the same magnitude (Fleiss, 1986). However, we are not aware of formulae to estimate the sample size needed to detect an interaction between two binary fixed effects in a mixed-effects linear regression model for analysis of repeatedly measured correlated data.
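The fourfold relationship in the independent-observations ANOVA case can be sketched with a short variance calculation. The notation is an assumption introduced here for illustration: each of the four cells holds n independent observations with common variance σ², and the cell means are written as \bar{y}_{ab}, with a indexing the first factor and b the second.

```latex
% Assumed notation: \bar{y}_{ab} is the mean of the n observations in cell (a,b),
% so \operatorname{Var}(\bar{y}_{ab}) = \sigma^2 / n and the four cell means are independent.
\begin{align*}
\hat{\Delta}_{\text{main}} &= \tfrac{1}{2}\left(\bar{y}_{11} + \bar{y}_{10}\right)
                            - \tfrac{1}{2}\left(\bar{y}_{01} + \bar{y}_{00}\right),
  & \operatorname{Var}\left(\hat{\Delta}_{\text{main}}\right)
  &= 4 \cdot \tfrac{1}{4} \cdot \frac{\sigma^2}{n} = \frac{\sigma^2}{n}, \\
\hat{\Delta}_{\text{int}} &= \left(\bar{y}_{11} - \bar{y}_{10}\right)
                           - \left(\bar{y}_{01} - \bar{y}_{00}\right),
  & \operatorname{Var}\left(\hat{\Delta}_{\text{int}}\right)
  &= 4 \cdot \frac{\sigma^2}{n} = \frac{4\sigma^2}{n}.
\end{align*}
```

For a fixed significance level and power, the required sample size is proportional to the variance of the estimate divided by the squared effect, so a fourfold variance implies a fourfold sample size for an effect of the same magnitude.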

The objective of this manuscript is to examine the sample size required to detect a 2×2 interaction of two binary fixed effects in mixed-effects linear regression analyses. The model, described in detail in Section 2, also incorporates a time-varying covariate, but that covariate does not interact with group membership. We sought to determine if, as with the fixed-effects factorial ANOVA, the sample size needed to detect an interaction in a repeated measures design is fourfold that of a main effect. A formula for the sample size required to detect an interaction is presented below. A simulation study then examines the statistical power of the resulting sample sizes to detect interactions of various magnitudes in a 2×2 factorial design with repeated measures of a continuous outcome.

Mixed-effects linear regression model and sample size determination

A mixed-effects linear regression model of repeated measures of a continuous dependent variable, y_{ij}, is specified as:

y_{ij} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \beta_4 t_j + \upsilon_i + \varepsilon_{ij}    (1)

for subject i (i = 1, …, N) at time j (j = 1, …, k), where β0 is the intercept term, x1 represents the treatment contrast (x1 = −1/2 if placebo; x1 = 1/2 if investigational treatment), x2 represents the moderator contrast (x2 = −1/2 if effect moderator is absent; x2 = 1/2 if effect moderator is present), x1x2 represents the treatment by moderator
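One way to see what the contrast coding buys is to compute the four cell means implied by the fixed part of the model and compare them with the coefficients. The sketch below assumes the two levels of each binary variable are coded −1/2 and +1/2; the function names and the coefficient values are hypothetical, and the time term is omitted because it is common to all four cells at any given time point.

```python
def cell_mean(beta, x1, x2):
    """Fixed-effects mean for one cell (time term omitted: it is shared by all cells)."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def contrasts(beta):
    """Return (treatment main effect, treatment-by-moderator interaction) from cell means."""
    lo, hi = -0.5, 0.5  # assumed contrast codes for the two levels of each factor
    m00 = cell_mean(beta, lo, lo)  # placebo, moderator absent
    m01 = cell_mean(beta, lo, hi)  # placebo, moderator present
    m10 = cell_mean(beta, hi, lo)  # treatment, moderator absent
    m11 = cell_mean(beta, hi, hi)  # treatment, moderator present
    main_treatment = (m11 + m10) / 2 - (m01 + m00) / 2
    interaction = (m11 - m10) - (m01 - m00)
    return main_treatment, interaction


# Hypothetical coefficients (beta0, beta1, beta2, beta3), chosen only for illustration.
main, inter = contrasts((1.0, 0.3, 0.2, 0.25))
print(main, inter)
```

With this coding, the main-effect contrast recovers β1 and the difference of differences recovers β3, so a hypothesis test on β3 is a test of the 2×2 interaction.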

Simulation study

The primary focus of this simulation study was to examine whether the statistical power to detect an interaction of two fixed effects in a 2×2 factorial design with repeated measures of a continuous outcome in model (1) is consistent with the sample sizes derived from (4). The statistical power to detect a main effect with the sample sizes derived from (3) was also examined. A Wald test with a two-tailed alpha-level of .05 was used to test each of two hypotheses: H01: β1 = 0 and H02: β3 = 0.

The
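A power simulation of this kind can be sketched in a few lines of NumPy. This is not the authors' simulation code, and several choices are assumptions made here: the total variance is fixed at 1 so that effects are in standardized units (random-intercept variance ρ, residual variance 1 − ρ), the time values and time slope are hypothetical, and a normal critical value is used for the Wald test. Instead of fitting a mixed model, the sketch fits ordinary least squares to the per-subject means; in a balanced design where every subject is observed at the same k time points, this gives the same estimates of β1 and β3 and is much faster.

```python
import numpy as np


def simulate_power(n_total, k, rho, delta_main, delta_inter, n_reps=2000, seed=0):
    """Empirical power of two-sided Wald tests (alpha = .05) for beta1 and beta3."""
    if n_total % 4 != 0:
        raise ValueError("n_total must be divisible by 4 (equal cell sizes)")
    rng = np.random.default_rng(seed)
    n_cell = n_total // 4
    # Contrast-coded binary factors (-1/2, +1/2), fully crossed with equal cells.
    x1 = np.repeat([-0.5, -0.5, 0.5, 0.5], n_cell)
    x2 = np.repeat([-0.5, 0.5, -0.5, 0.5], n_cell)
    X = np.column_stack([np.ones(n_total), x1, x2, x1 * x2])
    xtx_inv = np.linalg.inv(X.T @ X)
    t = np.arange(k, dtype=float)  # hypothetical time covariate values
    beta4 = 0.1                    # hypothetical time slope
    fixed = delta_main * x1 + delta_inter * x1 * x2  # beta0 = beta2 = 0
    z_crit = 1.959964              # two-sided .05 normal critical value
    rejections = np.zeros(2)
    for _ in range(n_reps):
        u = rng.normal(0.0, np.sqrt(rho), size=n_total)             # random intercepts
        e = rng.normal(0.0, np.sqrt(1.0 - rho), size=(n_total, k))  # residuals
        y = fixed[:, None] + beta4 * t[None, :] + u[:, None] + e
        ybar = y.mean(axis=1)                # per-subject means
        b = xtx_inv @ (X.T @ ybar)           # OLS on subject means
        resid = ybar - X @ b
        s2 = resid @ resid / (n_total - 4)
        se = np.sqrt(s2 * np.diag(xtx_inv))
        z = b / se
        rejections += np.abs(z[[1, 3]]) > z_crit
    return rejections / n_reps  # [power for beta1, power for beta3]


# Hypothetical scenario: no main effect, standardized interaction of 0.25.
print(simulate_power(n_total=808, k=4, rho=0.2, delta_main=0.0, delta_inter=0.25))
```

The first returned value estimates the rejection rate for H01 and the second for H02; setting an effect to zero gives an empirical check of the type I error rate for that test.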

Simulation results

Empirical power estimates for each specification of the main effect models (Table 1 for 80% power; Table 2 for 90% power; Table 3 for 95% power) are consistent with the sample size N(Δ1) calculation based on Eq. (3). Furthermore, the required sample sizes N(Δ3) for an interaction are indeed fourfold that of a main effect of the same magnitude. For example, for 80% power, with ρ=0.20 and k=4 observations per subject, N(Δ3)=808 subjects in total (or 202/cell) are needed for power of 80% to detect
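The fourfold relationship can be illustrated with a small calculator. Eqs. (3) and (4) are not reproduced in this excerpt, so the formula below is an assumption: a commonly used approximation for a two-group comparison with a random intercept, N = 4(z₁₋α/₂ + z₁₋β)²(1 + (k − 1)ρ) / (kΔ²), rounded up to an even total, with the interaction sample size taken as four times the main-effect sample size. The function names and the value Δ = 0.25 are hypothetical.

```python
import math
from statistics import NormalDist


def n_main_effect(delta, rho, k, power=0.80, alpha=0.05):
    """Total N for a main effect (assumed approximation, not necessarily Eq. (3))."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    design_effect = (1 + (k - 1) * rho) / k  # variance of a subject mean, total variance 1
    n = 4 * (z_alpha + z_beta) ** 2 * design_effect / delta ** 2
    return 2 * math.ceil(n / 2)  # round up so both groups have equal size


def n_interaction(delta, rho, k, power=0.80, alpha=0.05):
    """Total N for a 2x2 interaction: four times the main-effect N."""
    return 4 * n_main_effect(delta, rho, k, power, alpha)


# Hypothetical effect size; rho and k follow the example in the text.
print(n_main_effect(0.25, rho=0.20, k=4))   # 202
print(n_interaction(0.25, rho=0.20, k=4))   # 808
```

Under these assumptions a standardized effect of 0.25 gives a total of 808 for the interaction, or 202 per cell, which is the same as the total needed for the main effect alone. The rounding convention matters at the margin: rounding the interaction sample size directly, rather than multiplying the rounded main-effect total by four, would give a slightly smaller number.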

Application

There is a recent NIH initiative (NIH: RFA-MH-09-010) to identify personalized treatments by designing clinical trials that test not only the effect of treatment, but also moderators of the treatment effect. The goal of such a trial would be to test whether a hypothesized subject characteristic (i.e., the moderator) is associated with enhanced or inhibited treatment response. In either case, a treatment by moderator interaction could test an important clinical question, in that it would help the clinician

Discussion

This simulation study examined required sample sizes for the main effects and interaction of two binary fixed effects in a mixed-effects linear regression model with a random intercept. The results indicate that, for a given set of design specifications, four times as many subjects are required to detect an interaction as for a main effect, as specified in our formula (4). The formula was verified by simulation for 80%, 90%, and 95% statistical power. This relationship did not depend on the

Acknowledgements

This research was supported, in part, by grants from the National Institutes of Health (MH060447 and MH068638).
