Background
The randomised controlled trial (RCT) is widely considered to be the gold standard for assessing comparative clinical efficacy, effectiveness and safety, as well as providing an important vehicle to assess cost-effectiveness [1]. RCTs are routinely used to evaluate a wide range of interventions and have been used successfully in a variety of health care settings. Central to the design of an RCT is an a priori sample size calculation, which ensures that the study has a high probability of achieving its pre-specified objectives.
A compromise is required when designing an RCT to balance the possibility of being misled by chance when there is no true difference between treatments (type I error) against the risk of failing to identify a treatment difference when one treatment is truly superior to the other (type II error) [2]. Under the conventional (sometimes referred to as Neyman-Pearson) approach, the probabilities of these two errors are controlled by setting the significance level (the type I error rate) and the statistical power (1 − the type II error rate) at appropriate levels. Once these two inputs have been set, the sample size can be determined, given the magnitude of the between-group difference in the outcome that is to be detected.
The difference between groups used to calculate a trial’s sample size, that is, the ‘target difference’, is the magnitude of difference that the RCT is designed to reliably detect. It can be expressed as an absolute difference (e.g., mean difference) or a relative difference (e.g., hazard ratio or risk ratio), and it is also often referred to as the trial’s effect size. The required sample size is very sensitive to the target difference. Under the conventional approach, halving the target difference quadruples the sample size for a two-arm 1:1 parallel-group trial with a continuous outcome which is assumed to be normally distributed [2]. Appropriate sample size formulae vary, depending upon the proposed trial design and statistical analysis, although the overall approach is consistent. In addition to the conventional approach, other statistical approaches to calculating the sample size can be used, such as Fisherian/precision-based approaches, Bayesian and Bayesian decision-theoretic approaches, along with a hybrid of the Bayesian and Neyman-Pearson approaches [3–7]. However, a relatively recent review of 215 RCTs in leading medical journals identified only the Neyman-Pearson approach in use [4].
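The sensitivity of the sample size to the target difference can be illustrated with a short calculation. The sketch below is not taken from the DELTA guidance; it assumes the standard normal-approximation formula for a two-arm 1:1 parallel-group trial with a continuous outcome, a common standard deviation and a two-sided test. The function name `n_per_group` and all input values are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Unrounded per-group sample size under the Neyman-Pearson approach.

    Assumes a two-arm 1:1 parallel-group trial, a normally distributed
    outcome with common standard deviation `sd`, and a two-sided
    significance level `alpha`:

        n = 2 * (z_{1 - alpha/2} + z_{power})^2 * sd^2 / delta^2
    """
    if delta == 0:
        raise ValueError("target difference must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    return 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2


# Illustrative inputs only (not drawn from any real trial).
n_full = n_per_group(delta=10.0, sd=20.0)
n_half = n_per_group(delta=5.0, sd=20.0)  # target difference halved

# Round up to whole participants per group.
print(math.ceil(n_full), math.ceil(n_half))
# Ratio of the unrounded sample sizes.
print(n_half / n_full)
```

With these inputs the calculation gives 85 participants per group for the larger target difference and 337 for the halved one. The ratio of the unrounded values is 4 (up to floating-point error), because the target difference enters the formula as an inverse square. In practice, further adjustments (for example, for anticipated dropout or use of the t-distribution) are usually applied.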
A comprehensive methodological review conducted by the original Difference ELicitation in TriAls (DELTA) group [8, 9] highlighted the available methods and limitations in current practice. It showed that despite there being many different approaches available, some are rarely used in practice [10]. Although relevant to all types of outcomes, a substantial amount of research has been carried out on patient-reported quality-of-life outcomes, reflecting not only that patients may find specifying an important difference more difficult than clinicians but also the general challenge of interpreting quality-of-life measures and the value of the patient’s perspective [11, 12]. In practice, the target difference is often not formally based upon these concepts and in many cases appears, at least judging from trial reports, to be chosen for convenience or on some other informal basis [13].
Recent surveys of practice among researchers involved in clinical trials have demonstrated that determination of the sample size, including specification of the target difference, is a more complex process than the trial reports suggest [10]. Initial guidance has been prepared for non-adaptive superiority two-arm parallel-group trials which are to be analysed according to the Neyman-Pearson approach [14]. However, this guidance does not cover trials of alternative hypotheses (i.e., equivalence/non-inferiority trials), more complex designs (e.g., multi-arm trials) or alternative statistical approaches (Bayesian and precision-based) to choosing the target difference and reporting the sample size calculation. There are signs that the recent work led by the DELTA group has begun to influence practice through citations, presentations and anecdotal experience [15, 16]. However, it is clear that limitations in the scope and conception (because it was developed primarily for researchers) of the initial DELTA guidance mean that it does not fully meet the needs of funders and researchers in terms of understanding the role of the target difference in various designs and the options available to inform its choice.
Aim and objectives
The overall aim of the project is to produce updated guidance for researchers and funders on specifying and reporting the target difference (‘effect size’) in the sample size calculation of an RCT. The following are the specific objectives:
1. To review existing guidance provided by funders to researchers and scientific review panel/board members
2. To identify key methodological developments or changes in practice which have emerged since the comprehensive DELTA review [8, 9] was undertaken and update the DELTA method guidance
3. To determine the scope of guidance that would aid researchers and address funders’ needs
4. To achieve consensus on what structured guidance for choosing the target difference (effect size) should comprise
5. To identify future research needs
To achieve these objectives, we will systematically review the methodological literature for approaches to determining the target difference in RCTs which have been published since the DELTA review was completed in 2011 (stage 1). In addition, experts will be asked about recent methodological developments and changes in practice (stage 2). Following this, a Delphi study involving key stakeholders will be undertaken to gather views on the scope and focus of the guidance needed (stage 3). Embedded within the Delphi study will be a 2-day consensus workshop (stage 4), which will bring together key stakeholders to reach agreement on key aspects of the structured guidance for researchers and funders that will be prepared. Following completion of the Delphi study, this guidance will be reviewed, finalised and disseminated (stage 5).
Discussion
Researchers face a number of difficult decisions when designing an RCT, including the choice of trial design, primary outcome and sample size. The sample size is driven largely by the choice of target difference (‘effect size’), although other aspects of sample size determination also contribute. Existing guidance on determination of the target difference is limited, and there has been growing recognition of the need for greater guidance for funders and researchers, as well as other key stakeholders, such as patients and the respective clinical communities. DELTA2 is seeking to produce practical and comprehensive guidance which is applicable to the vast majority of trials to bridge the gap between existing guidance and the needs of researchers.
Acknowledgements
The Health Services Research Unit, Institute of Applied Health Sciences, University of Aberdeen, is core-funded by the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The funders had no involvement in study design; collection, analysis and interpretation of data; reporting; or the decision to publish.
DELTA2 project group: Jonathan Cook, William Sones, Joanne Rothwell, Luke Vale, Craig Ramsay, Lisa Hampson, Richard Emsley, Stephen Walters, Catherine Hewitt, Martin Bland, Dean Fergusson, Jesse Berlin, Doug Altman, and Steven Julious.