Sample sizes required to detect interactions between two binary fixed-effects in a mixed-effects linear regression model

doi:10.1016/j.csda.2008.06.010

Computational Statistics & Data Analysis

Volume 53, Issue 3, 15 January 2009, Pages 603-608

Abstract

Mixed-effects linear regression models have become more widely used for analysis of repeatedly measured outcomes in clinical trials over the past decade. Formulae and tables exist for estimating the sample sizes required to detect the main effects of treatment and the treatment by time interactions for those models. A formula is proposed to estimate the sample size required to detect an interaction between two binary variables in a factorial design with repeated measures of a continuous outcome. The formula is based, in part, on the fact that the variance of an interaction is fourfold that of the main effect. A simulation study examines the statistical power associated with the resulting sample sizes in a mixed-effects linear regression model with a random intercept. The simulation varies the magnitude ($Δ$) of the standardized main effects and interactions, the intraclass correlation coefficient ($ρ$), and the number ($k$) of repeated measures within subject. The results of the simulation study verify that the sample size required to detect a 2×2 interaction in a mixed-effects linear regression model is fourfold that required to detect a main effect of the same magnitude.

Introduction

The mixed-effects linear regression model (Harville, 1977; Laird and Ware, 1982) is widely used in observational studies and randomized controlled clinical trials (RCTs) in which there are repeated measures over time. In designing a study, the Ethical Guidelines of the American Statistical Association (1999) advise statisticians to provide informed recommendations for sample size such that a research protocol will propose neither an inadequate nor an excessive number of subjects to detect a scientifically noteworthy result with acceptable statistical power. Several authors have examined the sample sizes required to detect the main effects and interaction of treatment and time in longitudinal studies with repeated measures (e.g., Hsieh, 1988; Rochon, 1991; Overall and Doyle, 1994; Hedeker et al., 1999; Raudenbush and Liu, 2001; Diggle et al., 2002). Yet a study that is designed to detect the main effect of treatment will not have sufficient power to detect the interaction between two binary fixed effects. In a 2×2 factorial fixed-effects ANOVA with equal cell sizes and an assumption of independence among observations, for instance, the sample size required to detect an interaction is four times that for a main effect of the same magnitude (Fleiss, 1986). However, we are not aware of formulae to estimate the sample size needed to detect an interaction between two binary fixed effects in a mixed-effects linear regression model for analysis of repeatedly measured correlated data.
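
The fourfold relationship in the independent-observations case follows from the variances of the two contrasts. The sketch below assumes equal cell sizes $n$, a common within-cell variance $σ^{2}$, and cell means $\bar{y}_{ab}$ (first index for treatment, second for moderator). This notation is introduced here for illustration and is not taken from the original.

```latex
\begin{align*}
\hat{\delta}_{\text{main}} &= \tfrac{1}{2}\left[(\bar{y}_{11} - \bar{y}_{01}) + (\bar{y}_{10} - \bar{y}_{00})\right],
& \operatorname{Var}(\hat{\delta}_{\text{main}}) &= \tfrac{1}{4}\cdot\frac{4\sigma^{2}}{n} = \frac{\sigma^{2}}{n}, \\
\hat{\delta}_{\text{int}} &= (\bar{y}_{11} - \bar{y}_{01}) - (\bar{y}_{10} - \bar{y}_{00}),
& \operatorname{Var}(\hat{\delta}_{\text{int}}) &= \frac{4\sigma^{2}}{n} = 4\operatorname{Var}(\hat{\delta}_{\text{main}}).
\end{align*}
```

For a fixed effect size, alpha level, and power, the required sample size is proportional to the variance of the estimated contrast, so a fourfold variance implies a fourfold sample size.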

The objective of this manuscript is to examine the sample size required to detect a 2×2 interaction of two binary fixed effects in mixed-effects linear regression analyses. The model, described in detail in Section 2, also incorporates a time-varying covariate, but that covariate does not interact with group membership. We sought to determine if, as with the fixed-effects factorial ANOVA, the sample size needed to detect an interaction in a repeated measures design is fourfold that of a main effect. A formula for the sample size required to detect an interaction is presented below. A simulation study then examines the statistical power of the resulting sample sizes to detect interactions of various magnitudes in a 2×2 factorial design with repeated measures of a continuous outcome.
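
The fourfold rule can be illustrated numerically. The sketch below is not the paper's Eq. (3) or (4), which are not reproduced in this excerpt. It assumes the standard two-arm formula for repeated measures under compound symmetry, $n = 2(z_{1-α/2} + z_{\text{power}})^{2}(1 + (k-1)ρ)/(kΔ^{2})$ per arm, rounds the per-arm size up, and multiplies the main-effect total by four. The function names are hypothetical, and the effect size $Δ = 0.25$ is an assumed value, since the excerpt does not state it.

```python
from math import ceil
from statistics import NormalDist


def n_main_effect(delta, rho, k, power=0.80, alpha=0.05):
    """Total N (two equal arms) to detect a standardized main effect `delta`.

    Assumed formula (compound symmetry, k repeated measures per subject):
    n per arm = 2 * (z_{1-alpha/2} + z_{power})^2 * (1 + (k - 1) * rho) / (k * delta^2)
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    per_arm = 2 * z ** 2 * (1 + (k - 1) * rho) / (k * delta ** 2)
    return 2 * ceil(per_arm)  # round each arm up, then double


def n_interaction(delta, rho, k, power=0.80, alpha=0.05):
    """Total N for a 2x2 interaction of the same magnitude (fourfold rule)."""
    return 4 * n_main_effect(delta, rho, k, power, alpha)


# Assumed inputs: delta = 0.25, rho = 0.20, k = 4, 80% power, two-tailed alpha = .05
print(n_main_effect(0.25, rho=0.20, k=4))  # 202
print(n_interaction(0.25, rho=0.20, k=4))  # 808
```

Under these assumptions the interaction total of 808 (202 per cell) agrees with the figure reported in the Simulation results for $ρ = 0.20$, $k = 4$, and 80% power. Rounding before multiplying by four is a choice made here to match that figure.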

Mixed-effects linear regression model and sample size determination

A mixed-effects linear regression model of repeated measures of a continuous dependent variable, $y_{ij}$, is specified as: $y_{ij} = β_{0} + β_{1} x_{1i} + β_{2} x_{2i} + β_{3} x_{1i} x_{2i} + β_{4} t_{j} + υ_{i} + ε_{ij}$ for subject $i$ ($i = 1, \dots, N$) at time $j$ ($j = 1, \dots, k$), where $β_{0}$ is the intercept term, $x_{1}$ represents the treatment contrast ($x_{1} = -1/2$ if placebo; $x_{1} = 1/2$ if investigational treatment), $x_{2}$ represents the moderator contrast ($x_{2} = -1/2$ if effect moderator is absent; $x_{2} = 1/2$ if effect moderator is present), $x_{1} x_{2}$ represents the treatment by moderator

Simulation study

The primary focus of this simulation study was to examine whether the statistical power to detect an interaction of two fixed effects in a 2×2 factorial design with repeated measures of a continuous outcome in model (1) is consistent with the sample sizes derived from (4). The statistical power to detect a main effect with the sample sizes derived from (3) was also examined. A Wald test with a two-tailed alpha level of 0.05 was used to test each of two hypotheses: $H_{01}: β_{1} = 0$ and $H_{02}: β_{3} = 0$.
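
A power simulation of this kind can be sketched in a few lines. The code below is a simplified stand-in, not the authors' simulation. It assumes a total variance of 1 (so $ρ$ is the random-intercept variance), omits the time effect $β_{4} t_{j}$ because it cancels from between-subject contrasts in a balanced design, and replaces the mixed-model Wald test with a z test on cell means of subject averages. The function name, seed, and default replication count are hypothetical.

```python
import random
from statistics import NormalDist, fmean

# (x1, x2) contrast codes for the four cells of the 2x2 design
CELLS = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]


def simulate_power(n_cell, delta1, delta3, rho, k, reps=400, alpha=0.05, seed=2009):
    """Return (power for beta1, power for beta3) from a seeded Monte Carlo run."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sd_u, sd_e = rho ** 0.5, (1 - rho) ** 0.5  # total variance standardized to 1
    hits1 = hits3 = 0
    for _ in range(reps):
        cell_mean, ss_within = {}, 0.0
        for x1, x2 in CELLS:
            mu = delta1 * x1 + delta3 * x1 * x2
            # Subject average of k repeated measures sharing one random intercept.
            subj = [mu + rng.gauss(0, sd_u)
                    + fmean(rng.gauss(0, sd_e) for _ in range(k))
                    for _ in range(n_cell)]
            m = fmean(subj)
            cell_mean[(x1, x2)] = m
            ss_within += sum((s - m) ** 2 for s in subj)
        # Pooled within-cell variance of subject averages, divided by cell size.
        var_mean = ss_within / (4 * (n_cell - 1)) / n_cell
        hi_hi, hi_lo = cell_mean[(0.5, 0.5)], cell_mean[(0.5, -0.5)]
        lo_hi, lo_lo = cell_mean[(-0.5, 0.5)], cell_mean[(-0.5, -0.5)]
        b1 = ((hi_hi - lo_hi) + (hi_lo - lo_lo)) / 2  # main effect of x1
        b3 = (hi_hi - lo_hi) - (hi_lo - lo_lo)        # difference of differences
        hits1 += abs(b1) / var_mean ** 0.5 > z_crit        # Var(b1) = var_mean
        hits3 += abs(b3) / (4 * var_mean) ** 0.5 > z_crit  # Var(b3) = 4 * var_mean
    return hits1 / reps, hits3 / reps
```

Calling `simulate_power(202, 0.25, 0.25, 0.20, 4)` with an assumed $Δ_{1} = Δ_{3} = 0.25$ should give interaction power near 0.80 and main-effect power close to 1, up to Monte Carlo error. Fitting the full mixed-effects model in each replication would be closer to the study described here, at a much higher computational cost.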

The

Simulation results

Empirical power estimates for each specification of the main effect models (Table 1 for 80% power; Table 2 for 90% power; Table 3 for 95% power) are consistent with the sample size $N(Δ_{1})$ calculation based on Eq. (3). Furthermore, the required sample sizes $N(Δ_{3})$ for an interaction are indeed fourfold those for a main effect of the same magnitude. For example, with $ρ = 0.20$ and $k = 4$ observations per subject, $N(Δ_{3}) = 808$ subjects in total (or 202/cell) are needed for power of 80% to detect

Application

There is a recent NIH initiative (NIH: RFA-MH-09-010) to identify personalized treatments by designing clinical trials that test not only the effect of treatment, but also moderators of the treatment effect. The goal of such a trial would be to test whether a hypothesized subject characteristic (i.e., the moderator) is associated with enhanced or inhibited treatment response. In either case, a treatment by moderator interaction could test an important clinical question, in that it would help the clinician

Discussion

This simulation study examined required sample sizes for the main effects and interaction of two binary fixed effects in a mixed-effects linear regression model with a random intercept. The results indicate that, for a given set of design specifications, four times as many subjects are required to detect an interaction as for a main effect, as specified in our formula (4). The formula was verified by simulation for 80%, 90%, and 95% statistical power. This relationship did not depend on the

Acknowledgements

This research was supported, in part, by grants from the National Institutes of Health (MH060447 and MH068638).

References (14)

J.E. Overall et al. (1994). Estimating sample sizes for repeated measurement designs. Controlled Clinical Trials.
A.J. Rush et al. (2003). The 16-item Quick Inventory of Depressive Symptomatology (QIDS), clinician rating (QIDS-C), and self-report (QIDS-SR): A psychometric evaluation in patients with chronic major depression. Biological Psychiatry.
American Statistical Association (1999). Ethical guidelines for statistical practice: Executive summary. Amstat News.
P.J. Diggle et al. (2002). Analysis of Longitudinal Data.
A. Donner et al. (1981). Randomization by cluster: Sample size requirements and analysis. American Journal of Epidemiology.
A. Donner et al. (2000). Design and Analysis of Cluster Randomization Trials in Health Research.
J.L. Fleiss (1986). The Design and Analysis of Clinical Experiments.

There are more references available in the full text version of this article.
