Background
Most randomized trials report the intention-to-treat (ITT) effect as the primary, or only, measure of the comparative effect of the studied interventions. A focus on the ITT effect is attractive for several reasons [1, 2]. However, the ITT effect may not be the effect of interest for patients and clinicians when there is a high rate of non-compliance or when the rate of non-compliance in the trial differs from that expected outside the trial setting. In such circumstances, the per-protocol effect, i.e., the effect that would have been observed had all trial participants followed the trial protocol, may be of greater interest [1, 2]. Unfortunately, when patient characteristics associated with non-compliance are also related to patient outcomes, the naïve approach to estimating this effect, a “per-protocol analysis” restricted to those who follow the protocol in each arm of the trial, will be biased. In these cases, identifying the per-protocol effect in a randomized trial requires strong assumptions (e.g., no unmeasured confounding) and methods that are commonly used in the analysis of non-randomized studies [2].
An alternative to estimating the per-protocol effect under these strong assumptions is to estimate lower and upper limits or “bounds” for the per-protocol effect under weaker, but perhaps more realistic, assumptions [3-8]. While effect bounding, known as “partial identification of the effect”, has been attempted in observational studies (particularly in the social sciences), it is rarely implemented in randomized trials. This is surprising because partial identification methods can capitalize on assumptions that are expected to hold in many randomized trials.
Here we provide a guide to the use of partial identification methods in randomized trials with dichotomous outcomes and point interventions, i.e., interventions that are not sustained over time. As an example, we demonstrate the estimation of bounds for the per-protocol effect of colorectal cancer (CRC) screening on the 10-year risk of CRC incidence and death in the Norwegian Colorectal Cancer Prevention (NORCCAP) trial.
Discussion
We have demonstrated how combining data with various sets of assumptions helps to bound the per-protocol effect of point interventions (i.e., interventions that are not sustained over time) in randomized trials with dichotomous outcomes. In our application to a trial of CRC screening, we showed how bounds for both the per-protocol risk difference and risk ratio are achievable. Our application illustrates three key benefits of an approach based on partial identification with progressively stronger assumptions.
First, this approach illuminates our reliance on unverifiable assumptions. In our trial, the wide bounds under no assumptions make clear that we cannot learn much at all about the effectiveness of screening without bringing in prior knowledge about the study design or our subject matter.
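To make the width of the no-assumption bounds concrete, the following is a minimal sketch of the standard worst-case bounds for a risk difference with a dichotomous outcome. It is an illustration under stated assumptions, not necessarily the exact computation used in our analysis: the names `Bounds` and `no_assumption_bounds` are hypothetical, and the inputs in the example call are made-up probabilities, not NORCCAP data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    lower: float
    upper: float

    @property
    def width(self) -> float:
        return self.upper - self.lower


def no_assumption_bounds(p_y1_a1: float, p_y1_a0: float, p_a1: float) -> Bounds:
    """Worst-case bounds on E[Y^(a=1)] - E[Y^(a=0)] for a binary outcome Y.

    p_y1_a1: P(Y=1, A=1), joint probability of the outcome and receiving treatment
    p_y1_a0: P(Y=1, A=0), joint probability of the outcome and not receiving it
    p_a1:    P(A=1), probability of receiving treatment
    """
    p_a0 = 1.0 - p_a1
    # The risk under treatment is observed only among the treated; among the
    # untreated the unobserved counterfactual risk can be anywhere in [0, 1].
    risk_treated = Bounds(p_y1_a1, p_y1_a1 + p_a0)
    # The same argument applies, mirrored, to the risk under no treatment.
    risk_untreated = Bounds(p_y1_a0, p_y1_a0 + p_a1)
    return Bounds(
        lower=risk_treated.lower - risk_untreated.upper,
        upper=risk_treated.upper - risk_untreated.lower,
    )


# Hypothetical inputs chosen only for illustration.
example = no_assumption_bounds(p_y1_a1=0.004, p_y1_a0=0.009, p_a1=0.3)
print(example.lower, example.upper, example.width)
```

Whatever the inputs, these bounds have a width of one (up to floating-point rounding) and therefore always include the null value, which is why additional assumptions are needed before the data become informative.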
Second, this approach provides the range of effect sizes we are most confident in under fairly reasonable assumptions. In our trial we could estimate relatively informative lower bounds that quantify the maximum benefit of screening. For example, had everybody been screened, at most we would expect CRC risk to decrease by 0.6 percentage points. This number provides a limit for how much our ITT effect estimate (−0.2 %) might underestimate the effectiveness under perfect adherence, and a boundary that could be helpful in evaluating the cost-effectiveness of screening or informing clinical or policy decisions. We know less about the upper bound (minimum effectiveness or even possible harm) of the screening program without making more debatable assumptions, but the type of analyses presented in Fig. 1 provides a template for discussing what level of assumptions may be reasonable and how much differing opinions may lead to differing conclusions.
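The asymmetry between an informative lower bound and a wide upper bound can be illustrated with a small sketch of bounds that intersect arm-specific worst-case limits, which is valid when randomization is an instrument for treatment received. This is a sketch under stated assumptions: the function name `natural_bounds` is hypothetical, the expressions are one well-known set of valid bounds and need not be the narrowest available or identical to those in our analysis, and the numbers are invented rather than taken from NORCCAP.

```python
from typing import Dict, Tuple

# Per randomization arm z: (P(Y=1, A=1 | Z=z), P(Y=1, A=0 | Z=z), P(A=1 | Z=z))
ArmData = Tuple[float, float, float]


def natural_bounds(arms: Dict[int, ArmData]) -> Tuple[float, float]:
    """Bounds on E[Y^(a=1)] - E[Y^(a=0)] assuming the instrumental conditions.

    Under those conditions each arm yields valid worst-case limits for the
    same counterfactual risks, so the limits can be intersected across arms.
    """
    risk1_lo, risk1_hi = 0.0, 1.0
    risk0_lo, risk0_hi = 0.0, 1.0
    for p_y1_a1, p_y1_a0, p_a1 in arms.values():
        p_a0 = 1.0 - p_a1
        # Keep the tightest limits seen so far for each counterfactual risk.
        risk1_lo = max(risk1_lo, p_y1_a1)
        risk1_hi = min(risk1_hi, p_y1_a1 + p_a0)
        risk0_lo = max(risk0_lo, p_y1_a0)
        risk0_hi = min(risk0_hi, p_y1_a0 + p_a1)
    return risk1_lo - risk0_hi, risk1_hi - risk0_lo


# Hypothetical trial with one-sided non-compliance: nobody in the control
# arm (z=0) is screened, and 60% of the screening arm (z=1) attends.
hypothetical_arms = {
    0: (0.0, 0.020, 0.0),
    1: (0.009, 0.008, 0.6),
}
lower, upper = natural_bounds(hypothetical_arms)
print(lower, upper)
```

With these invented inputs the ITT risk difference is −0.003, while the bounds run from about −0.011 to about 0.389. Because no one in the control arm is screened, the risk under no screening is pinned down exactly, and all remaining uncertainty comes from the unknown risk that non-attenders would have had if screened.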
Third, this approach can demonstrate our confidence, or lack thereof, in the effect sizes for certain subpopulations [13-15]. In our trial, the estimates support the benefit of CRC screening for nearly two-thirds of the study population (the “compliers”), and in this case we can describe which individuals are included in this group. In randomized trials with non-compliance in both arms, we can only obtain a point estimate for the effect in the “compliers” if we assume there are no “defiers”. However, we would not know who the “compliers” are and membership in this group may vary across studies. Because of this, the common practice of presenting this subgroup effect alone is of questionable interest for clinical or policy decision-making [17] as there is no obvious way of applying the results of the study to that particular subgroup. When presented alongside bounds for the effect in the full study population, however, investigators may sometimes be able to discern whether certain subpopulations are likely to receive more benefit or harm than others. In trials with one-sided non-compliance, like the NORCCAP trial, such practice is sometimes actionable because we can describe the subpopulation of “compliers” based on measured pre-randomization characteristics.
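For readers less familiar with the effect in the “compliers”, a minimal sketch of the usual ratio estimator follows. It assumes the instrumental conditions and no “defiers” (which holds automatically when non-compliance is one-sided); the function name `complier_effect` and the example inputs are hypothetical and are not NORCCAP estimates.

```python
def complier_effect(
    risk_assigned: float,
    risk_control: float,
    p_treated_assigned: float,
    p_treated_control: float = 0.0,
) -> float:
    """Risk difference among compliers: ITT effect divided by the compliance gap.

    risk_assigned:      P(Y=1 | Z=1), risk in the arm assigned to the intervention
    risk_control:       P(Y=1 | Z=0), risk in the control arm
    p_treated_assigned: P(A=1 | Z=1), proportion treated in the intervention arm
    p_treated_control:  P(A=1 | Z=0), zero under one-sided non-compliance
    """
    compliance_gap = p_treated_assigned - p_treated_control
    if compliance_gap <= 0.0:
        raise ValueError("assignment must increase the probability of treatment")
    itt_effect = risk_assigned - risk_control
    return itt_effect / compliance_gap


# Hypothetical inputs: ITT effect of -0.003 with 60% attendance when invited.
print(complier_effect(risk_assigned=0.017, risk_control=0.020, p_treated_assigned=0.6))
```

The ratio rescales the ITT effect by the proportion of “compliers”, so with these invented inputs the effect in that subgroup is about −0.005, larger in magnitude than the ITT effect of −0.003.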
Investigators considering employing these methods in randomized trials with point interventions and dichotomous outcomes should consider how features of their particular study design may affect which sets of assumptions we describe in Table 2 are reasonable. The instrumental conditions are expected to hold in placebo-controlled, double-blinded randomized trials of point interventions where there is no loss to follow-up, no placebo effect, and double-blinding is successfully maintained, but the instrumental conditions are suspect in head-to-head randomized trials and whenever double-blinding is not successfully maintained or there is a possible placebo effect. The homogeneity conditions, on the other hand, are not expected to hold based on any study design feature and thus should be weighed judiciously when applied to the analysis of any randomized trial. A similar caveat applies to conditions about the distribution of or effects within compliance types when there is non-compliance in both treatment arms [5].
Our discussion of bounding the per-protocol effect focused on dichotomous outcomes and point interventions. Similar bounds under the instrumental conditions can be identified for continuous outcomes if one assumes the outcomes are finitely bounded [8], and the point-identification expressions under effect homogeneity conditions can also be restated to apply to continuous outcomes [6, 7, 16]. Because we can choose to estimate cumulative risk up through any point in time in follow-up, we could also extend these bounds to bounding the survival curve for time-to-event outcomes [18]. Partial identification strategies can also be applied to trials with substantial attrition by further incorporating methods to account for selection bias, e.g., inverse probability weighting [19]. In trials that involve an intervention sustained over time, accounting for non-adherence can be more complicated as participants may discontinue the intervention at different times during follow-up and time-varying patient characteristics may inform and be affected by these decisions. More research is needed on how to generalize partial identification strategies to such settings, although the point-identification results can be expanded upon using structural nested models under related homogeneity and instrumental conditions [7, 20]. Finally, our example and discussion have focused on identification, but there is a growing body of literature on how to incorporate random variability [21]. Specifically, there has been recent development in methods for estimating confidence intervals around the bounds [22-26] as well as estimating confidence intervals for the partially identified treatment effect itself [27, 28]. Incorporating random variability into the presentation of partial identification results in randomized trials is critical; however, more research is needed as there is currently no consensus in the statistical literature on the optimal approach, nor readily available software for it.
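The extension to finitely bounded continuous outcomes mentioned above follows the same logic as the dichotomous case: unobserved counterfactual outcomes are replaced by the smallest and largest values the outcome can take. The sketch below shows this for a single trial arm and deliberately ignores sampling variability; the function name `counterfactual_mean_bounds`, its arguments, and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def counterfactual_mean_bounds(
    outcomes: Sequence[float],
    received: Sequence[int],
    level: int,
    y_min: float,
    y_max: float,
) -> Tuple[float, float]:
    """Worst-case bounds on the mean outcome had everyone in this arm
    received treatment `level`, for an outcome known to lie in [y_min, y_max].
    """
    n = len(outcomes)
    if n == 0 or n != len(received):
        raise ValueError("outcomes and received must be non-empty and equal in length")
    # Outcomes are observed under `level` only for those who received it.
    observed_sum = sum(y for y, a in zip(outcomes, received) if a == level)
    # Everyone else contributes the extreme values the outcome could take.
    n_unobserved = sum(1 for a in received if a != level)
    lower = (observed_sum + y_min * n_unobserved) / n
    upper = (observed_sum + y_max * n_unobserved) / n
    return lower, upper


# Toy data: four participants, two of whom received the intervention.
print(counterfactual_mean_bounds([2.0, 4.0, 6.0, 8.0], [1, 1, 0, 0], level=1, y_min=0.0, y_max=10.0))
```

Under the instrumental conditions, limits obtained this way in each arm can be intersected across arms. The width of the resulting bounds scales with the assumed range of the outcome, which is why a finite range is required.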
The per-protocol effect is often of greater interest than, or complementary with, the ITT effect [1, 2]. In trials like the NORCCAP trial with essentially no loss to follow-up, we can easily compute an unbiased estimate for the ITT effect. However, the ITT effect quantifies the effect of assignment to treatment. From a patient’s perspective, deciding whether or not to take treatment requires knowledge about the effect of the treatment when received as intended rather than the effect of merely being assigned to treatment [1, 2]. Further, the ITT effect is study-specific because it depends on the magnitude and type of observed adherence to the intervention among study participants. That the per-protocol effect is independent of the observed adherence makes it interesting from a societal perspective too. For example, were the screening made available in the future to the Norwegian population, the actual adherence to the intervention could be different from that observed in the trial (not least because the trial itself contributed to establishing the efficacy of screening). As a result, the ITT effect from the trial would be outdated as a tool for decision-making, e.g., for cost-effectiveness analyses. On the other hand, unbiased estimates for the per-protocol effect, while potentially more relevant for decision-making, are not achievable from the data alone: investigators need to combine the data with assumptions based on the study design and subject matter expertise. Historically, this has deterred many investigators from estimating the per-protocol effect as expert knowledge is, by definition, provisional and fallible.
Competing interests
The authors report no competing interests.
Authors’ contributions
SS, ØH, ML, MK, MB, GH, EA, and MH contributed to the conception of the current study. SS performed the data analyses and drafted the manuscript. ØH, ML, MK, MB, GH, EA, and MH contributed to the interpretation of the data and provided critical revisions. All authors read and approved the final manuscript.