Published in: European Journal of Epidemiology 7/2022

Open Access 31 May 2022 | METHODS

Avoiding collider bias in Mendelian randomization when performing stratified analyses

Authors: Claudia Coscia, Dipender Gill, Raquel Benítez, Teresa Pérez, Núria Malats, Stephen Burgess

Abstract

Mendelian randomization (MR) uses genetic variants as instrumental variables to investigate the causal effect of a risk factor on an outcome. A collider is a variable influenced by two or more other variables. Naive calculation of MR estimates in strata of the population defined by a collider, such as a variable affected by the risk factor, can result in collider bias. We propose an approach that allows MR estimation in strata of the population while avoiding collider bias. This approach constructs a new variable, the residual collider, as the residual from regression of the collider on the genetic instrument, and then calculates causal estimates in strata defined by quantiles of the residual collider. Estimates stratified on the residual collider will typically have an equivalent interpretation to estimates stratified on the collider, but they are not subject to collider bias. We apply the approach in several simulation scenarios considering different characteristics of the collider variable and strengths of the instrument. We then apply the proposed approach to investigate the causal effect of smoking on bladder cancer in strata of the population defined by bodyweight. The new approach generated unbiased estimates in all the simulation settings. In the applied example, we observed a trend in the stratum-specific MR estimates at different bodyweight levels that suggested stronger effects of smoking on bladder cancer among individuals with lower bodyweight. The proposed approach can be used to perform MR studying heterogeneity among subgroups of the population while avoiding collider bias.
Notes

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1007/s10654-022-00879-0.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Mendelian randomization (MR) is the use of genetic variants as instrumental variables to assess the causal relationship between a risk factor and an outcome [1, 2]. A valid instrumental variable (IV), or genetic instrument, must meet the following assumptions [3]: IV1, the instrument is associated with the risk factor; IV2, the instrument cannot affect the outcome directly, only potentially indirectly via the risk factor; and IV3, the instrument is not associated with any measured or unmeasured confounders (Fig. 1A). If these assumptions are satisfied, an association of the instrument with the outcome is indicative of a causal effect of the risk factor on the outcome [1, 4]. For point estimation of a causal effect, a further parametric assumption (known as IV4) is required. Two common assumptions are (1) monotonicity: the effect of the IV on the exposure is in the same direction (either an increase or a decrease) for all individuals in the population; or (2) homogeneity: a sufficient assumption is that the causal effect of the exposure on the outcome is constant for all individuals in the population [4]. Under monotonicity, the IV estimate can be interpreted as a local average causal effect; under homogeneity, it can be interpreted as an average causal effect [5]. If either the IV2 or IV3 assumption is not satisfied, then the instrument could be associated with the outcome in the absence of a causal effect of the risk factor. However, only the IV1 assumption can be verified based on measured data [6].
Collider bias can occur when conditioning on a collider, defined as a variable that is a common effect of two or more variables [7–10]. The existence of a collider can be recognized in a causal diagram when there are two arrows pointing at the same variable; the node at which the arrowheads “collide” is a collider. For example, in the standard MR diagram, the risk factor is a collider as it is affected by both the instrument and the confounders. Moreover, any variable that is a causal descendant of a collider is also affected by the same variables and so is itself a collider; hence in MR any variable influenced by the risk factor is a collider (Fig. 1B). Even if the variables influencing a collider are independent, they will typically become dependent when conditioning on the collider. Hence conditioning on a variable affected by the risk factor will typically generate a conditional association between the instrument and the confounders, violating the IV3 assumption, and biasing Mendelian randomization estimates of the effect of the risk factor on the outcome.
Selection bias is a form of collider bias that occurs when selection of individuals into a dataset is dependent on a collider. For example, when disease progression is considered as an outcome, only patients who have already developed the disease would be recruited into the study [7]. If risk of developing the disease is influenced by the risk factor, then it is a collider when considering disease progression as the outcome, and selection of the study sample would result in collider bias. Several papers related to selection bias in the context of IV analysis and MR have already been published [11–15]. Inverse probability weighting on the probability of selection has been proposed as a method to avoid selection bias [11, 13].
Collider bias could also occur when stratifying the population based on a collider. As an example, we consider investigating the causal effect of the risk factor on the outcome for individuals with specific levels of a stratifying variable. Stratification is important for identifying whether there are subgroups of the population for which causal effects of the risk factor are different, and so the outcome would be affected more strongly by an intervention on the risk factor. However, if the stratifying variable is a collider, an association between the instrument and the outcome in strata of the population could arise due to collider bias, invalidating the results. In particular, collider bias could affect some estimates more than others, leading to heterogeneity in the stratum-specific causal estimates even if the true causal effect is the same across strata. Although several previous papers have considered collider bias arising due to differential selection into the study sample [11, 13, 16], including when selection is driven by differential survival (a specific example of collider bias known as survival bias) [17–19], we are not aware of previous work considering the impact of stratification on a collider variable.
The aim of this paper is to present an MR approach that obtains estimates in strata of the population that do not suffer from collider bias. The structure of this paper is as follows: first, we demonstrate the bias that arises from conditioning on a collider; second, we propose an approach to calculate MR estimates in strata of the population and evaluate heterogeneity between estimates in the different strata; third, we illustrate this new technique in simulation studies and an applied example using the UK Biobank resource; and finally, we discuss the interpretation of estimates and limitations of the approach.

Methods

Illustration of collider bias

The simplest MR method to estimate the causal effect of a risk factor X on outcome Y with a genetic instrument G is the ratio method [2]. With a single instrument, a continuous risk factor and outcome, and under assumptions of linearity and no effect modification, the ratio estimate is defined as: \(\widehat{\theta }= \frac{{\widehat{\beta }}_{YG}}{{\widehat{\beta }}_{XG}}\), where \({\widehat{\beta }}_{YG}\) is the coefficient from regressing Y on G, and \({\widehat{\beta }}_{XG}\) is the coefficient from regressing X on G [2]. If data on G, X, and Y are available in the same individuals (known as “one-sample MR”), the same estimate with a single IV can be obtained using the two-stage-least-squares method.
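As a minimal sketch of the ratio method, assuming individual-level data held in plain Python lists, the estimate can be computed from two simple regressions. The helper names `ols_slope` and `ratio_estimate`, the seed, and the simulated coefficients are illustrative assumptions rather than values taken from the paper.

```python
import random

def ols_slope(x, y):
    """Slope from simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def ratio_estimate(g, x, y):
    """Ratio estimate: (coefficient of Y on G) / (coefficient of X on G)."""
    return ols_slope(g, y) / ols_slope(g, x)

# Illustrative data (assumed values): the true causal effect of X on Y is 0.5,
# and U confounds the X-Y association.
rng = random.Random(1)
n = 20000
g = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.3 * gi + 0.8 * ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
y = [0.5 * xi + 0.8 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

theta_hat = ratio_estimate(g, x, y)   # targets the causal effect
naive_slope = ols_slope(x, y)         # observational slope, confounded by U
print(round(theta_hat, 2), round(naive_slope, 2))
```

In this toy setting the ratio estimate should sit near the true effect, whereas the direct regression of Y on X is inflated by the confounder.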
Collider bias will occur when adjusting for a collider variable C in the regression models for the ratio estimate, since an association between the instrument and the outcome will occur through conditioning on the collider. To demonstrate the impact and magnitude of collider bias, we performed a simulation study in which we compared estimates when no adjustment on C is made versus when the outcome regression is adjusted for C. It is also possible to adjust the risk factor regression for C; however, while this will distort estimates, this adjustment alone will not bias causal estimates when the true causal effect is null. Under the causal null, the genetic association with the outcome will tend towards zero, and so the expectation of the IV estimate will be zero even if the genetic association with the risk factor is misestimated.
In Pearl’s language of d-separation (open and closed paths), the conditions for a valid instrument are: (1) there must be an open pathway from the instrument to the exposure, and (2) all pathways between the instrument and outcome must be closed in a modified graph where all edges out of the exposure are removed [20]. A path is blocked if it contains a node in a chain (that is, M in the graph \(A\to M\to B\)) or a fork (that is, M in the graph \(A\leftarrow M\to B\)) that is conditioned on, or a collider (that is, M in the graph \(A\to M\leftarrow B\)) such that we condition on neither the collider nor a descendant of the collider [10]. In this case, if we stratify on the exposure or a descendant of the exposure, then the pathway \(G\to X\leftarrow U\to Y\) in Fig. 1B is now open. As this is a pathway between the instrument and outcome that does not contain an edge out of the exposure, this path being open invalidates the instrumental variable assumptions.
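The opening of the path \(G\to X\leftarrow U\) can be illustrated numerically. The sketch below assumes a simulated confounder U that is observable (which it would not be in practice), uses illustrative coefficients, and conditions on X through linear adjustment rather than stratification; the helper names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope from simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(y, z):
    """Residuals from simple linear regression of y on z."""
    b = ols_slope(z, y)
    mz = sum(z) / len(z)
    my = sum(y) / len(y)
    return [yi - my - b * (zi - mz) for yi, zi in zip(y, z)]

def slope_adjusted(x, y, z):
    """Coefficient of x in the regression of y on x and z,
    obtained by residualizing both on z (Frisch-Waugh-Lovell theorem)."""
    return ols_slope(residualize(x, z), residualize(y, z))

rng = random.Random(2)
n = 20000
g = [rng.gauss(0, 1) for _ in range(n)]          # instrument
u = [rng.gauss(0, 1) for _ in range(n)]          # confounder, independent of G
x = [0.3 * gi + 0.8 * ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]

marginal = ols_slope(g, u)             # no G-U association marginally
conditional = slope_adjusted(g, u, x)  # association induced by conditioning on X
print(round(marginal, 3), round(conditional, 3))
```

The marginal coefficient should be close to zero, while the coefficient conditional on the collider X is clearly negative, which is the violation of IV3 described above.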

Stratification in Mendelian randomization

To further illustrate the impact of collider bias, we performed a simulation study in which we calculated causal estimates using the ratio method within strata of the population defined using a variable that is influenced by the risk factor, and hence is a collider. We compared two approaches: first, we stratified directly on the collider C, and second, we stratified on a new variable C0, referred to as the “residual collider”. The residual collider was generated as the residual from regression of the collider on the genetic instruments:
\({C}_{0}=C- \widehat{C}\), where \(\widehat{C}\) are the fitted values from regression of C on G.
The residual collider C0 is not associated with the instrument, and hence it is not itself a collider. It is influenced by the component of the risk factor that is not a function of G (defined as X0), but not by the component that is a function of G, as shown in Fig. 2, which displays an augmented graph demonstrating that conditioning on C0 does not lead to invalidity of the instrumental variable assumptions. Moreover, provided that the genetic instrument does not explain much of the variance in the risk factor (as is typical in an MR application), it is likely not to explain much of the variance in the collider, and so the residual collider will be highly correlated with the collider. Hence, while stratifying on the residual collider is important to avoid bias, the strata defined by stratifying on the collider or residual collider are likely to be similar and so any difference in the interpretation of stratum-specific estimates is minimal. If the genetic instrument explains a substantial portion of variance in the risk factor, then the residual collider will not be as highly correlated with the collider, and so differences in the strata defined by the residual collider and collider would be more substantial. Even so, stratum-specific estimates represent Mendelian randomization estimates in strata of the population with different average levels of the collider, which can be meaningfully compared.
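The construction of the residual collider can be sketched as follows, assuming a single instrument and a linear regression of C on G with an intercept. The simulated coefficients and the helper names `residual_collider` and `correlation` are illustrative assumptions.

```python
import math
import random

def ols_slope(x, y):
    """Slope from simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residual_collider(c, g):
    """C0 = C - fitted values from the regression of C on G."""
    b = ols_slope(g, c)
    mg = sum(g) / len(g)
    mc = sum(c) / len(c)
    return [ci - mc - b * (gi - mg) for ci, gi in zip(c, g)]

def correlation(a, b):
    """Pearson correlation coefficient."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

rng = random.Random(3)
n = 10000
g = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.1 * gi + 0.8 * ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
c = [1.0 * xi + 0.5 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

c0 = residual_collider(c, g)
slope_c0_on_g = ols_slope(g, c0)   # zero up to rounding, by construction
corr_c_c0 = correlation(c, c0)     # close to 1 when the instrument is weak
print(slope_c0_on_g, round(corr_c_c0, 4))
```

With a weak instrument the two stratifying variables are almost perfectly correlated, so quantile strata of C0 contain nearly the same individuals as quantile strata of C.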
Here we considered estimates in four strata of the population defined by quartiles of the distribution of the collider or residual collider; however, in practice any number of strata could be considered. We estimated genetic associations with the outcome in each stratum separately. We estimated genetic associations with the risk factor in the full dataset, although if it is believed that these associations vary between strata, it would be possible to estimate these within each stratum as well. The stratum-specific estimate is calculated as the ratio of the stratum-specific genetic association with the outcome divided by the genetic association with the risk factor. The interpretation of stratum-specific estimates is equivalent to that of IV estimates obtained in the whole population; depending on the version of the IV4 assumption, they either target an average or a local average causal effect [4]. We also investigated heterogeneity between the stratum-specific estimates using Cochran’s Q statistic [21], and (in the applied example) we examined the presence of a trend in the estimates by meta-regression of the stratum-specific estimates on the median value of the collider in each stratum [22].
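The stratified procedure and the heterogeneity test described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the standard error of each stratum-specific estimate uses the first-order approximation se(\(\widehat{\beta}_{YG}\)) / |\(\widehat{\beta}_{XG}\)|, the simulated coefficients are illustrative, and all function names are hypothetical.

```python
import math
import random

def ols_fit(x, y):
    """Slope and its standard error from simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sxx
    rss = sum((c - my - b * (a - mx)) ** 2 for a, c in zip(x, y))
    return b, math.sqrt(rss / (n - 2) / sxx)

def residual_collider(c, g):
    """Residuals from the regression of the collider C on the instrument G."""
    b, _ = ols_fit(g, c)
    mg, mc = sum(g) / len(g), sum(c) / len(c)
    return [ci - mc - b * (gi - mg) for ci, gi in zip(c, g)]

def stratum_estimates(g, x, y, s, k=4):
    """(estimate, SE) of the ratio estimate in k quantile strata of s.

    The G-X association is taken from the full sample; the SE is a
    first-order approximation (an assumption of this sketch).
    """
    b_xg, _ = ols_fit(g, x)
    order = sorted(range(len(s)), key=lambda i: s[i])
    out = []
    for j in range(k):
        idx = order[j * len(s) // k:(j + 1) * len(s) // k]
        b_yg, se_yg = ols_fit([g[i] for i in idx], [y[i] for i in idx])
        out.append((b_yg / b_xg, se_yg / abs(b_xg)))
    return out

def chi2_sf(q, df):
    """Upper tail probability of a chi-squared variable with integer df."""
    h = q / 2.0
    if df % 2 == 0:
        return math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(df // 2))
    tail = sum(h ** (j - 0.5) / math.gamma(j + 0.5)
               for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail

def cochran_q(estimates):
    """Cochran's Q and its p-value for a list of (estimate, SE) pairs."""
    w = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(wi * est for wi, (est, _) in zip(w, estimates)) / sum(w)
    q = sum(wi * (est - pooled) ** 2 for wi, (est, _) in zip(w, estimates))
    return q, chi2_sf(q, len(estimates) - 1)

# Illustrative data: constant causal effect of 0.5, collider affected by X and U.
rng = random.Random(4)
n = 20000
g = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.3 * gi + 0.8 * ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
y = [0.5 * xi + 0.8 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
c = [1.0 * xi + 0.5 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

by_c = stratum_estimates(g, x, y, c)                          # naive stratification
by_c0 = stratum_estimates(g, x, y, residual_collider(c, g))   # proposed approach
q_c0, p_c0 = cochran_q(by_c0)
print([round(e, 2) for e, _ in by_c], [round(e, 2) for e, _ in by_c0], round(p_c0, 3))
```

Estimating the G-X association once in the full sample mirrors the choice described above; swapping in stratum-specific G-X associations would only require moving the `ols_fit(g, x)` call inside the loop.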

Simulation set-up

To investigate the impact of collider bias in realistic scenarios, we generated simulated data using the following data-generating model:
\({\text{G}},{\text{ U}},\varepsilon _{{\text{X}}} ,\varepsilon _{{\text{Y}}} ,\varepsilon _{{\text{C}}} \sim {\text{ N}}\left( {0,1} \right)\) independently
$${\text{X}} = \alpha _{0} + \alpha _{1} {\text{G}} + \alpha _{2} {\text{U}} + \varepsilon _{{\text{X}}}$$
$${\text{Y}} = \beta _{0} + \beta _{1} {\text{X}} + \beta _{2} {\text{U}} + \varepsilon _{{\text{Y}}}$$
$${\text{C}} = \mu _{0} + \mu _{1} {\text{X}} + \mu _{2} {\text{U}} + \varepsilon _{{\text{C}}}$$
We simulated the instrument G, the confounder U, and the error terms for X, Y and C (\({\upvarepsilon }_{\mathrm{X}}\), \({\upvarepsilon }_{\mathrm{Y}}\) and \({\upvarepsilon }_{\mathrm{C}}\)) as independent normally distributed variables. The risk factor X is defined as a linear combination of the instrument, the confounder, and the error term \({\upvarepsilon }_{\mathrm{X}}\). The outcome Y and the collider C are both linear combinations of the risk factor, confounder, and their error terms. In each simulated dataset, we also generated the residual collider C0 as the residual from regression of C on G as previously described.
The causal estimate of interest is \({\beta }_{1}\), while \({\alpha }_{2}\) and \({\beta }_{2}\) represent the effects of U on X and Y respectively; \({\alpha }_{1}\) is the effect of G on X; and \({\mu }_{1}\) and \({\mu }_{2}\) are the effects of X and U on C, respectively.
We considered three scenarios based on the parameter \({\beta }_{1}\): Scenario A1, where there is a null causal effect of X on Y (\({\upbeta }_{1}=0\)); Scenario A2, where the effect is constant and positive (\({\upbeta }_{1}=0.5\)); and Scenario A3, where the effect depends on C (\({\upbeta }_{1}=0.5+0.2\mathrm{C}\)). In Scenario A1, we considered estimates from the ratio method with and without adjustment for the collider. In Scenarios A2 and A3, we consider stratum-specific estimates from stratification on the collider C or the residual collider C0.
We varied the other parameters to consider the impact of different settings on collider bias: (i) \({\mathrm{\alpha }}_{1}\) = 0.05, 0.1, or 0.3, to study the impact of the strength of the instrument on estimates; (ii) positive confounding (\({\mathrm{\alpha }}_{2}=0.8, {\upbeta }_{2}=0.8\)), negative confounding (\({\mathrm{\alpha }}_{2}=-0.8, {\upbeta }_{2}=-0.8\)), and mixed confounding (\({\mathrm{\alpha }}_{2}=0.8, {\upbeta }_{2}=-0.8\)), to study how the direction of confounding affects the estimates; and (iii) \({\upmu }_{1}\) and \({\upmu }_{2}\) each taking the values −1, −0.5, 0, 0.5, and 1, to study how the strength of the collider effects influences bias.
We also considered scenarios where the collider is a common effect of X and Y (Fig. 1C). In these scenarios, the collider is generated as \(\mathrm{C}= {\upmu }_{0}+{\upmu }_{1}\mathrm{X}+ {\upmu }_{2}\mathrm{U}+ {\upmu }_{3}\mathrm{Y}+ {\upvarepsilon }_{\mathrm{C}}\), where \({\upmu }_{2}=0.3\) and \({\upmu }_{3}=(-1, -0.5, 0, 0.5, 1)\). In Scenario B1, the causal effect of X on Y is null (\({\upbeta }_{1}=0\)), in Scenario B2, the causal effect is constant and positive (\({\upbeta }_{1}=0.5\)), and in Scenario B3, the causal effect depends on U (\({\upbeta }_{1}=0.5+0.2U\)), as it is not possible for the causal effect to depend on C when C is a function of Y. Finally, we investigated additional scenarios with a binary outcome Y. We generate Y from a Binomial distribution where the probability is obtained from a logit transformation as: \(\mathrm{logit}(P\left(\mathrm{Y}=1\right))= {\upbeta }_{0}+ {\upbeta }_{1}\mathrm{X}+ {\upbeta }_{2}\mathrm{U}\), where \({\upbeta }_{0}=0.5\). In Scenario C1, the causal effect of X on Y is null (\({\upbeta }_{1}=0\)), in Scenario C2, the causal effect is constant and positive (\({\upbeta }_{1}=0.5\)) and in Scenario C3, the causal effect depends on C (\({\upbeta }_{1}=0.5+0.2\mathrm{C}\)). In the binary outcome scenarios, genetic associations with the outcome were estimated by logistic regression. For these additional scenarios, we only consider \({\mathrm{\alpha }}_{1}=0.1\) and the positive confounding values; otherwise, we consider all parameters as in scenarios A1–A3.
We considered a sample size of n = 10,000 and m = 500 replications for each set of parameter values. A directed acyclic graph illustrating the simulation parameters is shown in Fig. 1D.
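The data-generating model and the Scenario A1 comparison can be sketched as follows. The intercepts are assumed to be zero (their values are not stated for the continuous-outcome scenarios), only 30 replications are run to keep the example fast (the paper uses 500), and the function names are hypothetical.

```python
import random
import statistics

def ols_slope(x, y):
    """Slope from simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residualize(y, z):
    """Residuals from simple linear regression of y on z."""
    b = ols_slope(z, y)
    mz = sum(z) / len(z)
    my = sum(y) / len(y)
    return [yi - my - b * (zi - mz) for yi, zi in zip(y, z)]

def simulate(n, a1, a2, b1, b2, m1, m2, rng):
    """Draw one dataset (G, X, Y, C) from the linear model, intercepts set to 0."""
    g = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [a1 * gi + a2 * ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
    y = [b1 * xi + b2 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    c = [m1 * xi + m2 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    return g, x, y, c

def scenario_a1(reps, n, rng, a1=0.1, a2=0.8, b2=0.8, m1=1.0, m2=1.0):
    """Median ratio estimates under a null causal effect, without and with
    adjustment of the Y-on-G regression for the collider C."""
    unadjusted, adjusted = [], []
    for _ in range(reps):
        g, x, y, c = simulate(n, a1, a2, 0.0, b2, m1, m2, rng)
        b_xg = ols_slope(g, x)
        unadjusted.append(ols_slope(g, y) / b_xg)
        # Coefficient of G in the regression of Y on G and C.
        b_yg_given_c = ols_slope(residualize(g, c), residualize(y, c))
        adjusted.append(b_yg_given_c / b_xg)
    return statistics.median(unadjusted), statistics.median(adjusted)

median_unadj, median_adj = scenario_a1(reps=30, n=10000, rng=random.Random(5))
print(round(median_unadj, 2), round(median_adj, 2))
```

With positive confounding and \(\mu_1 = \mu_2 = 1\), the unadjusted median should be near zero and the adjusted median clearly negative, although a run with so few replications is noisier than the full simulation.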

Applied example: effect of tobacco smoking on bladder cancer risk across bodyweight strata

We applied the proposed MR stratification approach to investigate the causal effect of tobacco smoking on bladder cancer across strata of the population defined by bodyweight. Tobacco smoking is one of the strongest risk factors for cancer, and it has already been reported to be causally associated with bladder cancer risk in a previous Mendelian randomization study [23]. With our current example, the objective was to investigate whether the effect of smoking on the risk of developing bladder cancer is homogeneous across the bodyweight distribution of the population, while avoiding potential collider bias by applying our new stratification approach.
We performed analyses in the UK Biobank study, a population-based cohort of more than 500,000 United Kingdom residents recruited between 2006 and 2010 [24]. For our analysis, we restricted to unrelated European ancestry participants, resulting in a final sample size of 367,643 individuals following sample selection and quality control procedures as described previously [23]. The risk factor is a binary variable representing the smoking behaviour, defined as being a current smoker versus a former or never smoker; the stratifying variable is bodyweight, measured in kg; and the binary outcome is bladder cancer status, defined based on the data from national registries (International Classification of Diseases 9th edition codes: 188, 189.1, 189.2, V10.51, V10.53; or International Classification of Diseases 10th edition codes: C67, C65, C66, Z85.51, Z85.54, Z85.53), and self-reported information from an interview with a nurse practitioner. The instrument for smoking was a weighted genetic risk score comprising 378 conditionally independent SNPs obtained from a genome-wide association study (GWAS) assessing associations with smoking initiation (i.e., probability of ever smoked regularly), and weighted by the associations with smoking initiation [25]. Genetic associations with the risk factor and outcome were obtained by logistic regression in UK Biobank with adjustment for age, sex, and 10 genomic principal components. While age, sex, and principal components cannot logically be colliders as they are not affected by the risk factor or outcome, bodyweight is likely to be a collider, as it is influenced by smoking status [26].

Results

Illustration of collider bias

Results from Scenario A1 (\({\upbeta }_{1}=0\), null causal effect) are presented in Table 1 for \({\mathrm{\alpha }}_{1}=0.1\) (corresponding to \(R^{2}\) = 0.006 for the mean proportion of variance in the risk factor explained by the instrument and a mean F statistic of 60.8) and Supplementary Tables 1 and 2 for \({\mathrm{\alpha }}_{1}=0.3\) (corresponding to \(R^{2}\) = 0.051, mean F statistic of 548.6) and \({\mathrm{\alpha }}_{1}=0.05\) (corresponding to \(R^{2}\) = 0.001, mean F statistic of 15.3). In each case, we report the median estimate of \({\upbeta }_{1}\) across simulations, and the empirical type I error rate, representing the proportion of simulated datasets where the 95% confidence interval for the ratio estimate excludes zero. With no adjustment for the collider, median estimates were close to zero and the empirical type I error rate was close to the expected value of 5%. When adjusting for the collider in the regression of Y on G, estimates were biased, and type I error rates were substantially above 5%. The only exception was for \({\upmu }_{1}=0\); in this case, the variable C is not a function of the risk factor, and so does not act as a collider. Bias and type I error rates generally increased for more extreme values of \({\upmu }_{1}\) and \({\upmu }_{2}\) (both positive and negative values). The direction of bias depended on \({\upmu }_{1}\) and \({\upmu }_{2}\) and the direction of confounding.
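As a rough check on the instrument-strength figures for \({\mathrm{\alpha }}_{1}=0.1\), the population values implied by the data-generating model can be worked out by hand. This assumes unit variances for G, U and \({\upvarepsilon }_{\mathrm{X}}\), \(|{\mathrm{\alpha }}_{2}| = 0.8\) and n = 10,000, and uses the standard single-instrument approximation for the F statistic; the simulated means will differ slightly.

$$R^{2} = \frac{\alpha_{1}^{2}}{\alpha_{1}^{2} + \alpha_{2}^{2} + 1} = \frac{0.01}{1.65} \approx 0.00606, \qquad F \approx \frac{R^{2}\,(n-2)}{1-R^{2}} \approx \frac{0.00606 \times 9998}{0.99394} \approx 61$$

Both values are in line with the reported \(R^{2}\) of 0.006 and mean F statistic of 60.8.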
Table 1
Median of \({\upbeta }_{1}\) estimates and empirical Type I error rates for Scenario A1 (null causal effect, \({\upbeta }_{1}=0\)) with positive, negative, and mixed confounding, and \({\mathrm{\alpha }}_{1}=0.1\)
Pos = positive confounding (α2 and β2 = 0.8); Neg = negative confounding (α2 and β2 = −0.8); Mixed = mixed confounding (α2 = 0.8 and β2 = −0.8). Unadj. = no adjustment for collider; Adj. = Y/G regression adjusted for collider. Est. = median estimate; T1E = Type I error rate (%).

| µ1 | µ2 | Pos unadj. est. | Pos unadj. T1E (%) | Pos adj. est. | Pos adj. T1E (%) | Neg unadj. est. | Neg unadj. T1E (%) | Neg adj. est. | Neg adj. T1E (%) | Mixed unadj. est. | Mixed unadj. T1E (%) | Mixed adj. est. | Mixed adj. T1E (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| −1 | −1 | 0.01 | 7 | −0.27 | 70 | 0.01 | 5 | 0.09 | 10 | 0.00 | 5 | 0.28 | 69 |
| −1 | −0.5 | 0.00 | 6 | −0.28 | 66 | −0.01 | 3 | −0.12 | 15 | 0.00 | 6 | 0.29 | 69 |
| −1 | 0 | 0.00 | 5 | −0.24 | 50 | −0.01 | 4 | −0.26 | 53 | 0.01 | 5 | 0.25 | 54 |
| −1 | 0.5 | 0.01 | 6 | −0.10 | 15 | 0.01 | 6 | −0.28 | 69 | 0.00 | 6 | 0.12 | 15 |
| −1 | 1 | 0.00 | 5 | 0.08 | 8 | 0.01 | 3 | −0.27 | 68 | 0.00 | 4 | −0.08 | 11 |
| −0.5 | −1 | 0.01 | 6 | −0.16 | 30 | 0.00 | 6 | 0.15 | 21 | 0.00 | 5 | 0.17 | 34 |
| −0.5 | −0.5 | 0.00 | 6 | −0.18 | 32 | 0.00 | 5 | 0.04 | 7 | 0.01 | 4 | 0.18 | 30 |
| −0.5 | 0 | 0.00 | 5 | −0.11 | 16 | 0.00 | 3 | −0.12 | 14 | 0.00 | 3 | 0.11 | 14 |
| −0.5 | 0.5 | −0.01 | 5 | 0.02 | 5 | 0.00 | 6 | −0.17 | 30 | 0.00 | 4 | −0.03 | 7 |
| −0.5 | 1 | 0.01 | 5 | 0.15 | 25 | 0.00 | 6 | −0.18 | 34 | 0.01 | 4 | −0.15 | 21 |
| 0 | −1 | −0.01 | 7 | 0.00 | 6 | 0.00 | 6 | 0.00 | 6 | 0.00 | 6 | 0.00 | 6 |
| 0 | −0.5 | 0.00 | 7 | 0.00 | 6 | 0.00 | 6 | 0.00 | 6 | 0.00 | 5 | −0.01 | 5 |
| 0 | 0 | 0.00 | 6 | 0.01 | 6 | 0.01 | 6 | 0.01 | 6 | 0.00 | 6 | 0.00 | 6 |
| 0 | 0.5 | −0.01 | 6 | 0.00 | 6 | 0.00 | 5 | 0.00 | 6 | 0.00 | 7 | 0.00 | 7 |
| 0 | 1 | 0.00 | 5 | 0.01 | 6 | 0.00 | 4 | 0.00 | 4 | −0.01 | 6 | −0.01 | 5 |
| 0.5 | −1 | 0.01 | 4 | 0.16 | 22 | 0.01 | 6 | −0.17 | 35 | 0.00 | 4 | −0.15 | 23 |
| 0.5 | −0.5 | −0.01 | 5 | 0.02 | 5 | −0.01 | 7 | −0.19 | 36 | 0.01 | 6 | −0.03 | 7 |
| 0.5 | 0 | 0.00 | 5 | −0.11 | 14 | 0.00 | 4 | −0.11 | 15 | −0.01 | 4 | 0.10 | 15 |
| 0.5 | 0.5 | 0.01 | 4 | −0.17 | 28 | 0.00 | 5 | 0.03 | 7 | 0.00 | 5 | 0.18 | 34 |
| 0.5 | 1 | 0.01 | 6 | −0.17 | 31 | 0.01 | 5 | 0.15 | 23 | 0.00 | 5 | 0.17 | 33 |
| 1 | −1 | 0.01 | 5 | 0.08 | 8 | 0.01 | 5 | −0.27 | 70 | 0.01 | 3 | −0.07 | 9 |
| 1 | −0.5 | 0.01 | 4 | −0.10 | 13 | 0.01 | 5 | −0.27 | 64 | −0.01 | 5 | 0.11 | 18 |
| 1 | 0 | 0.01 | 6 | −0.24 | 52 | 0.00 | 5 | −0.24 | 50 | −0.01 | 3 | 0.24 | 48 |
| 1 | 0.5 | 0.01 | 5 | −0.27 | 66 | 0.01 | 4 | −0.11 | 15 | 0.00 | 4 | 0.28 | 66 |
| 1 | 1 | 0.01 | 4 | −0.26 | 68 | 0.00 | 4 | 0.08 | 10 | 0.02 | 5 | 0.29 | 75 |
Empirical Type I error rate represents the proportion of simulated datasets where the null hypothesis is rejected, that is, where the 95% confidence interval for the ratio estimate excludes zero.

Stratification in Mendelian randomization

Results from Scenario A2 (\({\upbeta }_{1}=0.5\), constant positive effect) are presented in Table 2 for \({\mathrm{\alpha }}_{1}=0.1\) with positive confounding. Supplementary Table 3 shows results for \({\mathrm{\alpha }}_{1}=0.1\) with negative and mixed confounding, and Supplementary Tables 4 and 5 for \({\mathrm{\alpha }}_{1}=0.3\) and \({\mathrm{\alpha }}_{1}=0.05\). We report the median estimate of \({\upbeta }_{1}\) in four strata of the sample defined by quartiles of the collider C or residual collider C0, and the proportion of simulated datasets for which the heterogeneity test rejects the null hypothesis of homogeneity. When stratifying on the collider, median estimates were somewhat variable between the strata, although the proportion of datasets in which the heterogeneity test rejects the null hypothesis of homogeneity was not much above 5% in any scenario, reaching a maximum of 11% when \({\mathrm{\alpha }}_{1}=0.3\). However, if we considered stronger instruments or larger sample sizes, we would see this proportion considerably exceed 5% (see Supplementary Table 6 where we first set \({\mathrm{\alpha }}_{1}=0.5\) and n = 10,000, and then set \({\mathrm{\alpha }}_{1}=0.1\) and n = 50,000, and the type I error rate reached 16% in each case). This was due to increased precision of estimates; the magnitude of bias did not depend strongly on instrument strength. While increasing the effect of the IV on the exposure increases the strength of the association of the IV with the confounder conditional on the collider, and hence increases the coefficient for the association of the IV with the outcome conditional on the collider (the numerator in the ratio estimate), it also increases the coefficient for the association of the IV with the exposure conditional on the collider (the denominator in the ratio estimate). These increases cancel out, and the result is that the bias in the ratio estimate is independent of the strength of the IV.
Median estimates differed substantially from the true value of 0.5 across strata, especially when the collider was strongly affected by the risk factor. In contrast, when stratifying on the residual collider, median estimates of \({\upbeta }_{1}\) were close to 0.5 throughout, and there was no suggestion in any case that the heterogeneity test rejected the null above the expected 5% rate.
Table 2
Median of causal estimates in different quartiles, and proportion of datasets in which the homogeneity test was rejected for Scenario A2 (fixed causal effect of \({\upbeta }_{1}=0.5\)) with positive confounding and \({\mathrm{\alpha }}_{1}=0.1\)
Positive confounding (α2 and β2 = 0.8). C = stratifying on collider; C0 = stratifying on residual collider. Rej. = proportion homogeneity rejected (%); Q1 to Q4 = median estimates in quartiles 1 to 4.

| µ1 | µ2 | C: Rej. (%) | C: Q1 | C: Q2 | C: Q3 | C: Q4 | C0: Rej. (%) | C0: Q1 | C0: Q2 | C0: Q3 | C0: Q4 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| −1 | −1 | 8 | 0.11 | −0.01 | 0.01 | 0.10 | 8 | 0.49 | 0.48 | 0.49 | 0.50 |
| −1 | −0.5 | 6 | 0.09 | −0.05 | −0.05 | 0.09 | 5 | 0.53 | 0.48 | 0.51 | 0.53 |
| −1 | 0 | 7 | 0.07 | 0.01 | −0.04 | 0.07 | 6 | 0.50 | 0.52 | 0.50 | 0.49 |
| −1 | 0.5 | 7 | 0.19 | 0.10 | 0.09 | 0.17 | 6 | 0.49 | 0.50 | 0.49 | 0.48 |
| −1 | 1 | 4 | 0.41 | 0.37 | 0.40 | 0.40 | 4 | 0.50 | 0.48 | 0.52 | 0.49 |
| −0.5 | −1 | 7 | 0.30 | 0.21 | 0.23 | 0.26 | 5 | 0.53 | 0.53 | 0.52 | 0.48 |
| −0.5 | −0.5 | 5 | 0.24 | 0.16 | 0.18 | 0.25 | 5 | 0.48 | 0.47 | 0.51 | 0.48 |
| −0.5 | 0 | 4 | 0.29 | 0.23 | 0.21 | 0.29 | 4 | 0.50 | 0.47 | 0.45 | 0.49 |
| −0.5 | 0.5 | 6 | 0.47 | 0.42 | 0.47 | 0.46 | 6 | 0.50 | 0.50 | 0.50 | 0.50 |
| −0.5 | 1 | 3 | 0.59 | 0.65 | 0.64 | 0.60 | 5 | 0.48 | 0.50 | 0.51 | 0.50 |
| 0 | −1 | 6 | 0.50 | 0.50 | 0.48 | 0.48 | 6 | 0.51 | 0.50 | 0.48 | 0.47 |
| 0 | −0.5 | 4 | 0.48 | 0.48 | 0.52 | 0.50 | 4 | 0.47 | 0.47 | 0.52 | 0.50 |
| 0 | 0 | 4 | 0.49 | 0.48 | 0.51 | 0.52 | 4 | 0.49 | 0.49 | 0.51 | 0.51 |
| 0 | 0.5 | 5 | 0.52 | 0.49 | 0.49 | 0.53 | 6 | 0.54 | 0.49 | 0.48 | 0.53 |
| 0 | 1 | 4 | 0.48 | 0.48 | 0.51 | 0.49 | 4 | 0.48 | 0.47 | 0.52 | 0.50 |
| 0.5 | −1 | 4 | 0.62 | 0.63 | 0.66 | 0.63 | 4 | 0.51 | 0.49 | 0.52 | 0.52 |
| 0.5 | −0.5 | 5 | 0.45 | 0.44 | 0.44 | 0.47 | 4 | 0.49 | 0.50 | 0.49 | 0.51 |
| 0.5 | 0 | 4 | 0.32 | 0.23 | 0.23 | 0.29 | 4 | 0.51 | 0.51 | 0.47 | 0.49 |
| 0.5 | 0.5 | 5 | 0.25 | 0.18 | 0.19 | 0.25 | 3 | 0.51 | 0.49 | 0.52 | 0.50 |
| 0.5 | 1 | 5 | 0.26 | 0.23 | 0.20 | 0.28 | 5 | 0.46 | 0.51 | 0.49 | 0.50 |
| 1 | −1 | 5 | 0.41 | 0.37 | 0.34 | 0.40 | 5 | 0.49 | 0.50 | 0.46 | 0.49 |
| 1 | −0.5 | 5 | 0.16 | 0.11 | 0.09 | 0.21 | 4 | 0.49 | 0.49 | 0.50 | 0.51 |
| 1 | 0 | 6 | 0.12 | −0.03 | 0.00 | 0.11 | 6 | 0.50 | 0.50 | 0.53 | 0.51 |
| 1 | 0.5 | 6 | 0.04 | −0.03 | −0.02 | 0.07 | 4 | 0.47 | 0.51 | 0.51 | 0.50 |
| 1 | 1 | 6 | 0.14 | 0.00 | 0.03 | 0.10 | 6 | 0.54 | 0.49 | 0.50 | 0.49 |
Proportion homogeneity rejected represents the proportion of simulated datasets where the null hypothesis of homogeneity is rejected
Results from Scenario A3 (variable effect) are presented in Table 3 for \({\mathrm{\alpha }}_{1}=0.1\) with positive confounding. Supplementary Table 7 shows results for \({\mathrm{\alpha }}_{1}=0.1\) with negative and mixed confounding, and Supplementary Tables 8 and 9 for \({\mathrm{\alpha }}_{1}=0.3\) and \({\mathrm{\alpha }}_{1}=0.05\). Estimates differed somewhat when stratifying on the collider versus the residual collider, although in both cases median estimates increased across the four strata. The proportion of datasets in which the heterogeneity test was rejected, which in this case represents the empirical power to detect heterogeneity in the stratum-specific estimates, was consistently higher when stratifying on the residual collider, indicating that true differences in the stratum-specific estimates were better detected when stratifying on the residual collider.
Table 3
Median of causal estimates in different quartiles, and proportion of datasets in which the homogeneity test was rejected for Scenario A3 (varying causal effect) with positive confounding and \({\mathrm{\alpha }}_{1}=0.1\)
Positive confounding (α2 and β2 = 0.8). C = stratifying on collider; C0 = stratifying on residual collider. Rej. = proportion homogeneity rejected (%); Q1 to Q4 = median estimates in quartiles 1 to 4.

| µ1 | µ2 | C: Rej. (%) | C: Q1 | C: Q2 | C: Q3 | C: Q4 | C0: Rej. (%) | C0: Q1 | C0: Q2 | C0: Q3 | C0: Q4 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| −1 | −1 | 48 | −0.39 | −0.07 | 0.14 | 0.58 | 95 | −0.46 | 0.19 | 0.64 | 1.27 |
| −1 | −0.5 | 30 | −0.33 | −0.06 | 0.00 | 0.44 | 88 | −0.36 | 0.24 | 0.59 | 1.15 |
| −1 | 0 | 19 | −0.25 | −0.08 | 0.00 | 0.38 | 78 | −0.25 | 0.22 | 0.55 | 1.09 |
| −1 | 0.5 | 15 | −0.11 | 0.05 | 0.14 | 0.46 | 61 | −0.19 | 0.24 | 0.56 | 0.99 |
| −1 | 1 | 16 | 0.09 | 0.32 | 0.44 | 0.63 | 40 | −0.09 | 0.28 | 0.54 | 0.88 |
| −0.5 | −1 | 36 | −0.11 | 0.15 | 0.32 | 0.71 | 68 | −0.07 | 0.34 | 0.64 | 1.08 |
| −0.5 | −0.5 | 19 | −0.07 | 0.14 | 0.30 | 0.58 | 48 | −0.01 | 0.38 | 0.64 | 0.94 |
| −0.5 | 0 | 14 | 0.08 | 0.23 | 0.36 | 0.57 | 25 | 0.11 | 0.41 | 0.59 | 0.87 |
| −0.5 | 0.5 | 11 | 0.22 | 0.46 | 0.55 | 0.76 | 16 | 0.16 | 0.43 | 0.56 | 0.83 |
| −0.5 | 1 | 16 | 0.35 | 0.58 | 0.77 | 0.98 | 16 | 0.19 | 0.40 | 0.57 | 0.81 |
| 0 | −1 | 24 | 0.24 | 0.49 | 0.66 | 0.95 | 24 | 0.24 | 0.51 | 0.70 | 0.94 |
| 0 | −0.5 | 14 | 0.33 | 0.52 | 0.67 | 0.90 | 15 | 0.33 | 0.50 | 0.66 | 0.89 |
| 0 | 0 | 13 | 0.34 | 0.52 | 0.66 | 0.86 | 13 | 0.35 | 0.53 | 0.66 | 0.87 |
| 0 | 0.5 | 13 | 0.33 | 0.51 | 0.69 | 0.88 | 13 | 0.33 | 0.52 | 0.69 | 0.89 |
| 0 | 1 | 25 | 0.24 | 0.51 | 0.69 | 0.98 | 26 | 0.25 | 0.52 | 0.70 | 0.99 |
| 0.5 | −1 | 18 | 0.45 | 0.69 | 0.87 | 1.07 | 18 | 0.38 | 0.59 | 0.77 | 1.01 |
| 0.5 | −0.5 | 15 | 0.34 | 0.50 | 0.64 | 0.88 | 18 | 0.37 | 0.61 | 0.79 | 1.03 |
| 0.5 | 0 | 14 | 0.17 | 0.30 | 0.41 | 0.68 | 26 | 0.30 | 0.61 | 0.77 | 1.07 |
| 0.5 | 0.5 | 19 | 0.05 | 0.22 | 0.37 | 0.72 | 45 | 0.20 | 0.57 | 0.84 | 1.17 |
| 0.5 | 1 | 34 | 0.00 | 0.25 | 0.41 | 0.81 | 63 | 0.13 | 0.56 | 0.83 | 1.27 |
| 1 | −1 | 16 | 0.25 | 0.46 | 0.55 | 0.88 | 40 | 0.26 | 0.68 | 0.90 | 1.33 |
| 1 | −0.5 | 12 | 0.03 | 0.16 | 0.30 | 0.58 | 53 | 0.19 | 0.65 | 0.97 | 1.37 |
| 1 | 0 | 18 | −0.10 | −0.02 | 0.11 | 0.49 | 70 | 0.13 | 0.63 | 0.97 | 1.46 |
| 1 | 0.5 | 23 | −0.16 | −0.02 | 0.11 | 0.60 | 83 | 0.04 | 0.59 | 0.98 | 1.54 |
| 1 | 1 | 35 | −0.23 | 0.01 | 0.20 | 0.71 | 94 | −0.07 | 0.55 | 1.04 | 1.65 |
Proportion homogeneity rejected represents the proportion of simulated datasets where the null hypothesis of homogeneity is rejected

Additional scenarios

In Scenarios B1 (\({\upbeta }_{1}=0\)), B2 (\({\upbeta }_{1}=0.5\)) and B3 (\({\upbeta }_{1}=0.5+0.2\mathrm{U}\)), where the collider was a function of both the risk factor and the outcome, results were similar: collider bias was evident when conditioning on the collider (Supplementary Table 10) and when stratifying on the collider (Supplementary Table 11). Collider bias in Scenarios B1 and B2 was greater than in Scenarios A1 and A2, where the collider was a function of the risk factor only. As before, bias was not observed when stratifying on the residual collider (Supplementary Table 11). For Scenario B3, the power of the homogeneity test was lower than in Scenario A3 (Supplementary Table 11), as the dependence of effect heterogeneity on the collider was weaker; however, heterogeneity was detected more often when stratifying on the residual collider than on the collider.
For Scenarios C1 (\({\upbeta }_{1}=0\)), C2 (\({\upbeta }_{1}=0.5\)) and C3 (\({\upbeta }_{1}=0.5+0.2\mathrm{C}\)), where the outcome was binary, results were again similar: collider bias was evident when conditioning on the collider in Scenario C1 (Supplementary Table 12) and when stratifying on the collider in Scenarios C2 and C3 (Supplementary Table 13). Bias was smaller than with a continuous outcome, although a direct comparison is somewhat unfair, as estimates with a binary outcome were obtained from logistic regression and so represent log odds ratios. Estimates when stratifying on the residual collider were slightly attenuated from 0.5 due to the non-collapsibility of the odds ratio [27, 28]. Despite this, in Scenario C2 we observed similar estimates across the different strata of C0 for each set of parameter values. Similarly, in Scenario C3 the median stratum-specific estimates increased across the four strata when stratifying on either the collider or the residual collider. Power to detect heterogeneity was lower than in Scenario A3, as the stratum-specific estimates are less precise, although power was again consistently higher when stratifying on the residual collider.
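The attenuation attributed to non-collapsibility can be illustrated with a small worked calculation. The sketch below is not one of the simulation scenarios: it assumes a hypothetical logistic model in which a binary covariate U (taking values −1 and 1 with equal probability) is independent of the risk factor X, so no confounding is present. Even so, the marginal log odds ratio for X is smaller than the conditional value of 0.5.

```python
import math


def expit(z: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


CONDITIONAL_LOG_OR = 0.5  # log odds ratio for X at any fixed value of U
U_VALUES = (-1.0, 1.0)    # hypothetical covariate, each value with probability 1/2


def marginal_risk(x: float) -> float:
    """P(Y = 1 | X = x), averaging the conditional risk over U."""
    return sum(expit(CONDITIONAL_LOG_OR * x + u) for u in U_VALUES) / len(U_VALUES)


# Marginal log odds ratio for a one-unit increase in X (from 0 to 1).
marginal_log_or = logit(marginal_risk(1.0)) - logit(marginal_risk(0.0))

print(f"conditional log OR: {CONDITIONAL_LOG_OR:.3f}")
print(f"marginal log OR:    {marginal_log_or:.3f}")
```

Because U is independent of X in this toy model, the gap between the two values is due to averaging over U on the probability scale, not to bias in the usual sense.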

Applied example: effect of tobacco smoking on bladder cancer risk across bodyweight strata

Estimates for the causal effect of smoking on bladder cancer in strata of bodyweight and residual bodyweight are shown in Table 4. Estimates represent the odds ratio for bladder cancer per one unit increase in the log odds of being a current smoker. Odds ratios exceeded 1 in all strata, but were larger in strata 1 and 2 for both bodyweight and residual bodyweight, and 95% confidence intervals excluded the null in these strata only. Although the null hypothesis of homogeneity was not rejected for either stratifying variable (p value = 0.151 and p value = 0.084 for bodyweight and residual bodyweight, respectively), there was evidence of a trend in the stratum-specific estimates for residual bodyweight from meta-regression on the mean value of bodyweight in each stratum (p value = 0.019). These results suggest that the effect of smoking on bladder cancer is stronger for subgroups of the population with lower bodyweight.
Table 4
Applied example using UK Biobank to investigate the effect of smoking status on bladder cancer risk in different bodyweight strata

| | Bodyweight Q1, OR [95% CI] | Bodyweight Q2, OR [95% CI] | Bodyweight Q3, OR [95% CI] | Bodyweight Q4, OR [95% CI] | Heterogeneity test p-value | Trend test p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Stratifying on bodyweight | 1.59 [1.08; 2.33] | 1.58 [1.16; 2.14] | 1.13 [0.87; 1.45] | 1.11 [0.88; 1.41] | 0.151 | 0.051 |
| Stratifying on residual bodyweight | 1.61 [1.09; 2.37] | 1.73 [1.28; 2.34] | 1.25 [0.97; 1.62] | 1.10 [0.87; 1.39] | 0.084 | 0.019 |

Bodyweight Q1, Q2, Q3 and Q4 represent the four quartiles of the collider (bodyweight) or the residual collider (residual bodyweight) within which the causal effect of smoking on bladder cancer risk is estimated
Odds ratios (OR) and 95% confidence intervals (95% CI) for bladder cancer represent estimates per one unit increase in the log odds of being a current smoker
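The heterogeneity p-values in Table 4 can be approximately recovered from the reported odds ratios and confidence intervals. The sketch below assumes that the homogeneity test is Cochran's Q applied to the four stratum-specific log odds ratios; the test is defined outside this section, so this is an assumption. Standard errors are back-calculated from the rounded 95% confidence limits, so the results should come out close to, but not exactly equal to, the reported values. The helper names are illustrative.

```python
import math

Z_975 = 1.959964  # standard normal quantile behind a 95% confidence interval


def log_or_and_se(odds_ratio, lower, upper):
    """Recover the log odds ratio and its standard error from an OR and its 95% CI."""
    return math.log(odds_ratio), (math.log(upper) - math.log(lower)) / (2 * Z_975)


def cochran_q(estimates):
    """Cochran's Q statistic for a list of (estimate, standard error) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))


def chi2_sf_3df(x):
    """Upper tail probability of a chi-squared distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# (OR, lower limit, upper limit) for quartiles Q1 to Q4, copied from Table 4.
table4 = {
    "bodyweight": [(1.59, 1.08, 2.33), (1.58, 1.16, 2.14),
                   (1.13, 0.87, 1.45), (1.11, 0.88, 1.41)],
    "residual bodyweight": [(1.61, 1.09, 2.37), (1.73, 1.28, 2.34),
                            (1.25, 0.97, 1.62), (1.10, 0.87, 1.39)],
}

p_values = {}
for name, rows in table4.items():
    q = cochran_q([log_or_and_se(*row) for row in rows])
    p_values[name] = chi2_sf_3df(q)  # four strata give 3 degrees of freedom
    print(f"{name}: Q = {q:.2f}, p = {p_values[name]:.3f}")
```

The closed-form tail probability is specific to four strata (3 degrees of freedom); with a different number of strata, a general chi-squared routine would be needed.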

Discussion

In this paper, we have demonstrated that conditioning or stratifying on a variable that is a collider can have a serious impact on MR estimates. We have introduced a simple approach that constructs a new variable, the residual collider, which is typically highly correlated with the collider, but is independent of the instrument. Estimates obtained from stratification on the residual collider did not suffer from bias in a range of simulation studies. Stratification on the residual collider allows investigators to explore causal estimation in relevant subgroups of the population. We applied our new approach to show that MR estimates for the effect of smoking on bladder cancer differ across strata of bodyweight, suggesting that the effect of smoking is stronger for subgroups of the population with lower bodyweight.
The approach of stratifying on the residual collider follows the same logic as a previously proposed method for non-linear MR, in which causal estimates are obtained in strata of the population defined by the “residual risk factor” or “IV-free exposure” [29, 30]. This variable is defined similarly to the residual collider, except that the stratifying variable is the risk factor itself. This method has been used previously to estimate the causal effect of blood pressure on coronary heart disease risk within strata of blood pressure, resulting in a curve that represents the shape of the causal relationship between the risk factor and the outcome [31]. This paper extends that method, showing that the same idea can be used to provide causal estimates stratified on a separate variable even if that variable is a collider. A strength of this method is that its implementation does not depend on the causal structure of the data, in particular the relationships between the collider and other variables in the model.
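The construction of the residual collider can be sketched in a few lines. The example below is a minimal illustration, not the simulation code used in the paper: the data-generating model, the coefficient values, the sample size and the use of the ratio (Wald) estimator within quartiles are all assumptions chosen for brevity. Here G is the instrument, U a confounder, X the risk factor, C the collider and Y the outcome, with a true causal effect of 0.5.

```python
import random

random.seed(1)
N = 20_000
TRUE_EFFECT = 0.5


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


# Hypothetical model: G -> X -> Y, U confounds X and Y, and C depends on X and U.
g = [float(sum(random.random() < 0.3 for _ in range(2))) for _ in range(N)]
u = [random.gauss(0, 1) for _ in range(N)]
x = [0.5 * gi + ui + random.gauss(0, 1) for gi, ui in zip(g, u)]
c = [xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]
y = [TRUE_EFFECT * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

# Residual collider: residual from the linear regression of C on G.
slope = cov(c, g) / cov(g, g)
intercept = mean(c) - slope * mean(g)
c0 = [ci - intercept - slope * gi for ci, gi in zip(c, g)]


def stratified_ratio_estimates(stratifier, n_strata=4):
    """Ratio estimate cov(Y, G) / cov(X, G) within quantile strata of `stratifier`."""
    order = sorted(range(N), key=lambda i: stratifier[i])
    size = N // n_strata
    estimates = []
    for k in range(n_strata):
        idx = order[k * size:(k + 1) * size]
        gs, xs, ys = ([v[i] for i in idx] for v in (g, x, y))
        estimates.append(cov(ys, gs) / cov(xs, gs))
    return estimates


on_collider = stratified_ratio_estimates(c)
on_residual = stratified_ratio_estimates(c0)
print("stratified on C: ", [round(e, 2) for e in on_collider])
print("stratified on C0:", [round(e, 2) for e in on_residual])
```

By construction the residual collider has zero sample covariance with the instrument, which is the property that protects the stratum-specific estimates from collider bias in this linear setting.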
There are some limitations to this approach. First, while the independence of the residual collider from the instrument is theoretically justified, we demonstrated the validity of our approach through simulation studies. Although we considered a range of different scenarios and parameter values, it is not possible to consider every data-generating mechanism by which a collider could arise. Second, in practice, the relationships between variables are unknown, and so it may be unclear whether a proposed stratifying variable is a collider. However, even if the variable is not a collider, stratification on the residual variable is unlikely to lead to invalid estimates, so the approach can also be used with stratifying variables that are not colliders. This was demonstrated in the simulation study when the effect of the risk factor on the “collider” was zero (\({\upmu }_{1}=0\)), and so the stratifying variable was not a collider. One exception is if the stratifying variable is on the causal pathway from the risk factor to the outcome. Stratification on such a variable (a “mediator”) will lead to biased estimates even in the proposed approach. Third, the degree of collider bias depends on the strength of the effects of the risk factor and confounder on the collider, and the direction of confounding. Previous work provides an analytical solution to estimate the magnitude of selection bias [32]. It is possible that collider bias may not be substantial in practice, as observed in the applied example, where estimates were broadly similar when stratifying on bodyweight or residual bodyweight. However, the power to detect heterogeneity in stratum-specific estimates in the simulation study was greater when stratifying on the residual collider, especially when the proportion of variance of the risk factor explained by the instrument was higher. This was also observed in the applied example, where p-values for both the heterogeneity test and the trend test were lower when stratifying on residual bodyweight.
Fourth, the residual collider differs from the collider. While strata defined by the residual collider will typically be similar to those defined by the collider, there may be some differences, particularly if the genetic variants explain a substantial proportion of variance in the collider. This means that the strata in which estimates are obtained are not so clearly defined, as stratum membership for an individual near the boundary between two strata would only be evident if their genotype were known. Values of the residual collider can only be calculated when data on the relevant genetic variants are available. Finally, we assumed that the IV assumptions hold; if they do not, estimates will typically be biased. However, several estimation methods that are robust to IV violations are available that allow for consistent estimation under a weaker set of assumptions [33].
The finding that the effect of smoking on bladder cancer is greater in lower bodyweight subgroups is plausible, because for any given level of cigarette consumption smaller individuals will tend to be exposed to greater concentrations of carcinogens [34]. An alternative explanation is that the genetic variants could associate more strongly with smoking intensity in individuals of lower bodyweight. However, we would be cautious not to interpret estimates in the higher bodyweight quartiles as implying an absence of a causal effect in heavier individuals; it is possible that the estimates in these quartiles, whose confidence intervals include the null, reflect limited power. Another possible explanation for the results observed is differential survival bias induced by the age of UK Biobank participants. However, as UK Biobank participants were recruited at a relatively young age (40–65 years), substantial survival bias is unlikely. Limitations of the applied example are the overlap between the discovery dataset for the genetic variants and the dataset used in the MR analysis, which can lead to winner’s curse, and the one-sample setting, which can lead to weak instrument bias.
In conclusion, we recommend that researchers performing MR to investigate causal effects in strata of a population defined by a collider stratify on residual values of the collider rather than stratifying on the collider directly.

Declarations

Conflict of interest

DG is employed part time by Novo Nordisk, outside of the submitted work.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Supplementary Information

Below is the link to the electronic supplementary material.
References
1. Haycock PC, Burgess S, Wade KH, Bowden J, Relton C, Smith GD. Best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies. Am J Clin Nutr. 2016;103(4):965–78.
2. Burgess S, Small DS, Thompson SG. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res. 2015.
3. Burgess S, Butterworth AS, Thompson JR. Beyond Mendelian randomization: how to interpret evidence of shared genetic predictors. J Clin Epidemiol. 2016;69:208–16.
4. Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s dream? Epidemiology. 2006.
5. Swanson SA, Hernán MA. Commentary: how to report instrumental variable analyses (suggestions welcome). Epidemiology. 2013;24(3):370–4.
6. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4):304–14.
7. Paternoster L, Tilling K, Davey SG. Genetic epidemiology and Mendelian randomization for informing disease therapeutics: conceptual and methodological challenges. PLoS Genet. 2017;13(10):e1006944.
8. Munafò MR, Tilling K, Taylor AE, Evans DM, Smith GD. Collider scope: when selection bias can substantially influence observed associations. Int J Epidemiol. 2018;47(1):226–35.
9. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15(5):615–25.
10. Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82(4):669–88.
11. Gkatzionis A, Burgess S. Contextualizing selection bias in Mendelian randomization: How bad is it likely to be? Int J Epidemiol. 2019;48(3):691–701.
12. Hughes RA, Davies NM, Davey Smith G, Tilling K. Selection bias when estimating average treatment effects using one-sample instrumental variable analysis. Epidemiology. 2019;30(3):350–7.
13. Canan C, Lesko C, Lauc B. Instrumental variable analyses and selection bias. Epidemiology. 2017;28(3):396–8.
14. Boef AGC, Le Cessie S, Dekkers OM. Mendelian randomization studies in the elderly. Epidemiology. 2015;26(2):e15–6.
15. Smit RAJ, Trompet S, Dekkers OM, Jukema JW, Le Cessie S. Survival bias in mendelian randomization studies: a threat to causal inference. Epidemiology. 2019;30(6):813–6.
16. Swanson SA. A practical guide to selection bias in instrumental variable analyses [Internet]. Vol. 30, Epidemiology. Lippincott Williams and Wilkins; 2019. p. 345–9.
17. Tchetgen EJT, Walter S, Vansteelandt S, Martinussen T, Glymour M. Instrumental variable estimation in a survival context. Epidemiology. 2015;26(3):402–10.
19. Hu A, Mustillo SA. Recent development of propensity score methods in observational studies: multi-categorical treatment, causal mediation, and heterogeneity background: propensity score methods in the counterfactual framework. Curr Sociol Imbens Rubin Imbens Wooldridge. 2016;64(1):60–82.
20. Brito C, Pearl J. Generalized Instrumental Variables. In: Uncertainty in artificial intelligence, proceedings of the eighteenth conference. 2002. p. 85–93.
21. Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. Br Med J. 2003;327:557–60.
22. Thompson SG, Higgins JPT. How should meta-regression analyses be undertaken and interpreted? Stat Med. 2002;21(11):1559–73.
23. Larsson SC, Carter P, Kar S, Vithayathil M, Mason AM, Michaëlsson K, et al. Smoking, alcohol consumption, and cancer: a mendelian randomisation study in UK Biobank and international genetic consortia participants. PLoS Med. 2020;17(7):1–14.
24. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):1–10.
25. Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Association studies of up to 12 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat Genet. 2019;51(2):237–44.
26. Taylor AE, Richmond RC, Palviainen T, Loukola A, Wootton RE, Kaprio J, et al. The effect of body mass index on smoking behaviour and nicotine metabolism: a Mendelian randomization study. Hum Mol Genet. 2019;28(8):1322–30.
27. Burgess S. Identifying the odds ratio estimated by a two-stage instrumental variable analysis with a logistic regression model. Stat Med. 2013;32(27):4726–47.
28. Burgess S. Estimating and contextualizing the attenuation of odds ratios due to non collapsibility. Commun Stat - Theory Methods. 2017;46(2):786–804.
29. Staley JR, Burgess S. Semiparametric methods for estimation of a nonlinear exposure-outcome relationship using instrumental variables with application to Mendelian randomization. Genet Epidemiol. 2017;41(4):341–52.
30. Burgess S, Davies NM, Thompson SG. Instrumental variable analysis with a nonlinear exposure-outcome relationship. Epidemiology. 2014;25(6):877–85.
31. Malik R, Georgakis MK, Vujkovic M, Damrauer SM, Elliott P, Karhunen V, et al. Relationship between blood pressure and incident cardiovascular disease: linear and nonlinear mendelian randomization analyses. Hypertension. 2021;77:2004–13.
32. Elwert F, Segarra E. Instrumental variables with treatment-induced selection: exact bias results. arXiv. 2020.
33. Slob EAW, Burgess S. A comparison of robust Mendelian randomization methods using summary data. Genet Epidemiol. 2020;44(4):313–29.
34. Luo J, Horn K, Ockene JK, Simon MS, Stefanick ML, Tong E, et al. Interaction between smoking and obesity and the risk of developing breast cancer among postmenopausal women. Am J Epidemiol. 2011;174(8):919–28.
Metadata
Title: Avoiding collider bias in Mendelian randomization when performing stratified analyses
Authors: Claudia Coscia, Dipender Gill, Raquel Benítez, Teresa Pérez, Núria Malats, Stephen Burgess
Publication date: 31.05.2022
Publisher: Springer Netherlands
Published in: European Journal of Epidemiology, Issue 7/2022
Print ISSN: 0393-2990
Electronic ISSN: 1573-7284
DOI: https://doi.org/10.1007/s10654-022-00879-0
