BMC Medical Research Methodology

Open Access 01.12.2022 | Research

Case-only approach applied in environmental epidemiology: 2 examples of interaction effect using the US National Health and Nutrition Examination Survey (NHANES) datasets

Authors: Jinyoung Moon, Hwan-Cheol Kim

Published in: BMC Medical Research Methodology | Issue 1/2022

Abstract

Introduction

By substituting the general ‘susceptibility factor’ concept for the conventional ‘gene’ concept in the case-only approach for gene-environment interaction, the case-only approach can also be used in environmental epidemiology. Under the independence between the susceptibility factor and environmental exposure, the case-only approach can provide a more precise estimate of an interaction effect.

Methods

Two analysis examples of the case-only approach in environmental epidemiology are provided using the 2015–2016 and 2017–2018 US National Health and Nutrition Examination Survey (NHANES): (i) the negative interaction effect between blood chromium level and glycohemoglobin level on albuminuria and (ii) the positive interaction effect between blood cobalt level and old age on albuminuria. The second part of the methods (theoretical backgrounds) summarizes the logic and equations provided in previous studies about the case-only approach.

Results

(i) When a 1 μg/L difference in blood chromium level and a 1% difference in blood glycohemoglobin level coincide, the multiplicative interaction contrast ratio (ICR_c/nc) was 0.72 (95% CI 0.35–1.60), which was not statistically significant. However, when only the cases were analyzed, the case-only ICR (ICR_co) was 0.59 (95% CI 0.28–0.95), which was statistically significant (a negative interaction effect). (ii) When a 1 μg/L difference in blood cobalt level and a 1-year difference in age coincide, the multiplicative interaction contrast ratio (ICR_c/nc) was 1.13 (95% CI 0.99–1.37), which was not statistically significant. However, when only the cases were analyzed, the case-only ICR (ICR_co) was 1.21 (95% CI 1.06–1.51), which was statistically significant (a positive interaction effect).

Discussion

The discussion presents the theoretical background and previous literature on the possible protective interaction effect between blood chromium levels and blood glycohemoglobin levels on the incidence of albuminuria and the possible aggravating interaction effect between blood cobalt levels and increasing age on the incidence of albuminuria. If the independence assumption between a susceptibility factor and an environmental exposure holds in a study with cases and non-cases, the case-only approach can provide a more precise interaction effect estimate than conventional approaches with both cases and non-cases.

Additional file 1: Supplementary material A. The used R codes for the statistical analyses. Supplementary material B. The S-E independence in the controls cannot replace the S-E independence in the population with cases and non-cases [1]. Supplementary material C. How strong a rare disease assumption is required for the equality between S-E ORc/nc and S-E ORcontrol [1]. Supplementary material D. Violation of independence: confounder [1].

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s12874-022-01706-6.

Introduction

The estimation of an interaction effect has often been conducted in cohort or case-control studies using information from both cases and controls [1‐4]. However, a case-only approach can be a valid alternative and, under certain circumstances, may even have advantages over conventional approaches that use information from both cases and controls.

The case-only approach is used to calculate the interaction effect estimate. This approach is mainly used in gene-environment and gene-gene interaction studies in genetic epidemiology [5‐8]. However, if the ‘gene’ in a gene-environment interaction is regarded as one type of ‘susceptibility factor,’ the term ‘gene-environment interaction’ in genetic epidemiology can be replaced with ‘susceptibility factor-environmental exposure interaction’ in environmental epidemiology.

The case-only approach can provide 2 benefits over a study with cases and non-cases, such as a conventional cohort or case-control study, when estimating the interaction effect between a susceptibility factor and an environmental exposure [5, 7‐13]. The first is that a more precise interaction effect estimate can be calculated. The second is that this approach can estimate the interaction effect when appropriate controls are unavailable. However, the case-only approach requires an important condition between the susceptibility factor and the environmental exposure studied: independence [5, 14]. If this independence assumption between a susceptibility factor and an environmental exposure is not fulfilled, the case-only interaction estimate might be severely biased relative to the interaction effect estimate acquired from a study with cases and non-cases.

This study will summarize all logic, definitions, and equations about the case-only approach through various study types, including case-only studies and a study with cases and non-cases, including case-control and cohort studies. In addition, this study will deal with important assumptions and the relationship among these assumptions, which are required for the reliable estimation of the interaction effect in the case-only approach. Possible corrective strategies for the violation of the independence assumption will also be dealt with. Finally, 2 analysis examples of the case-only approach will be illustrated using the US NHANES dataset. This study can clarify the logic and equations of the case-only approach and contribute to applying the case-only approach of genetic epidemiology to environmental epidemiology.

Methods: application for real data – 2 examples

In this study, 2 analysis examples using the US National Health and Nutrition Examination Survey (NHANES) data are provided (https://www.cdc.gov/nchs/nhanes/index.htm). The case-only approach applied in environmental epidemiology is explained using this dataset.

The preventive (negative) interaction effect between blood chromium level and glycohemoglobin level on albuminuria (micro and macro)

The laboratory data of NHANES 2015–2016 and NHANES 2017–2018 datasets were used. The blood chromium levels (mcg/L) were used as the environmental exposure variable, and the glycohemoglobin levels (%) were used as the susceptibility factor variable. The albumin creatinine ratio (mg/g) was the outcome (disease) variable.

The chromium level of 1.4 mcg/L was set as the cutoff between normal and abnormal chromium levels. The albumin creatinine ratio of 300 mg/g was set as the cutoff between normal and albuminuria (micro and macro). Both micro-albuminuria and macro-albuminuria were placed in a single ‘albuminuria’ category. Glycohemoglobin level was used as a continuous variable without conversion to a categorical variable. Because of possible confounding due to diabetes treatment (glucose-lowering medications), all respondents who answered ‘yes’ to the question ‘take diabetic pills to lower blood sugar’ were excluded from the analysis.

The aggravating (positive) interaction effect between blood cobalt level and old age on albuminuria (micro and macro)

The laboratory data and demographics data of NHANES 2015–2016 and NHANES 2017–2018 datasets were used. The blood cobalt level (mcg/L) in laboratory data was used as the environmental exposure variable, and age in years in demographics data was used as the susceptibility factor variable. Albumin creatinine ratio (mg/g) in laboratory data was used as the outcome variable.

The cobalt level of 1.8 mcg/L was set as the cutoff between normal and abnormal cobalt levels. The albumin creatinine ratio of 300 mg/g was set as the cutoff between normal and albuminuria. Both micro-albuminuria and macro-albuminuria were placed in a single ‘albuminuria’ category. Age in years was used as a continuous variable without conversion to a categorical variable.

Calculation of estimates

All abbreviations used in this article are provided in Table 1. The estimates were calculated in 5 steps.

First, the estimate, with an appropriate confidence interval, for the fold-difference in the odds of albuminuria associated with a unit difference in the environmental exposure was calculated: the blood chromium level in the first example and the blood cobalt level in the second example.

Second, the estimate, with an appropriate confidence interval, for the fold-difference in the odds of albuminuria associated with a unit difference in the susceptibility factor was calculated: the blood glycohemoglobin level in the first example and age in years in the second example.

Third, the estimate, with an appropriate confidence interval, for the multiplicative ICR associated with a difference of one unit in both variables was calculated: the blood chromium level and the blood glycohemoglobin level in the first example, and the blood cobalt level and age in years in the second example.

Fourth, the independence between the environmental exposure and the susceptibility factor was assessed in the whole sample, including cases and non-cases: between the blood chromium level and the blood glycohemoglobin level in the first example, and between the blood cobalt level and age in years in the second example.

Fifth, if the independence assessed in the fourth step was plausible, the multiplicative ICR was calculated using only cases. If the independence was not plausible, the multiplicative ICR calculated from only cases was adjusted based on theoretical equations (multiplied by the S-E OR_c/nc).

After these steps, the authors concluded whether the estimate derived from only cases is more precise than the estimate obtained from both cases and non-cases.

Table 1

Abbreviations

Abbreviations	Definition	Equation
RR	Relative Risk
OR	Odds Ratio
S	Susceptibility factor
E	Environmental exposure
ICR_c/nc	The interaction contrast ratio (ICR) in a study with cases and non-cases	${\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}=\frac{{\mathrm{RR}}_{\mathrm{s}\mathrm{e}}}{{\mathrm{RR}}_{\mathrm{s}}{\mathrm{RR}}_{\mathrm{e}}}=\left(\frac{\mathrm{ag}}{\mathrm{c}\mathrm{e}}\right)\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)$
ICR_cc	The ICR in a case-control study	${\mathrm{ICR}}_{\mathrm{cc}}=\frac{{\mathrm{OR}}_{\mathrm{s}\mathrm{e}}}{{\mathrm{OR}}_{\mathrm{s}}{\mathrm{OR}}_{\mathrm{e}}}=\left(\frac{\mathrm{ag}}{\mathrm{ce}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)$
ICR_co	The ICR in a case-only study	${\mathrm{ICR}}_{\mathrm{co}}=\left(\frac{\mathrm{ag}}{\mathrm{ce}}\right)$
S-E OR_c/nc	Susceptibility factor-Environmental exposure odds ratio in a study with cases and non-cases	S-E OR_c/nc = $\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)$
S-E OR_control	Susceptibility factor-Environmental exposure odds ratio in the control population	S-E OR_control$=\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)=\frac{\mathrm{df}}{\mathrm{bh}}$

Statistical method and software

A logistic regression model was applied for the calculation of odds ratios. The R software version 4.0.3 was used. Package ‘dplyr’ and ‘data.table’ were used for the pre-processing of the datasets. The used R codes are provided in Supplementary material A.

Methods: theoretical backgrounds

Basic assumption: the joint effect and the ICR on the multiplicative scale

Statistical interactions between the effects of susceptibility factors and those of environmental factors can be assessed as departures from multiplicativity of effects or as departures from additivity of effects. Table 2 shows an example of a study with cases and non-cases. With the unexposed group without the susceptibility factor (E-S-) set as the reference group, we can calculate the relative risk (RR) and odds ratio (OR) for the other 3 groups.

Table 2

An example of a study with cases and non-cases

Environment (E)	Susceptibility (S)	Disease	No disease	Total	Relative Risk (RR)	Odds Ratio (OR)
–	–	a	B	a + B	1.0 (reference)	1.0 (reference)
–	+	c	D	c + D	RR_s=$\frac{\mathrm{c}\left(\mathrm{a}+\mathrm{B}\right)}{\mathrm{a}\left(\mathrm{c}+\mathrm{D}\right)}$	OR_s=$\frac{\mathrm{cB}}{\mathrm{aD}}$
+	–	e	F	e + F	RR_e=$\frac{\mathrm{e}\left(\mathrm{a}+\mathrm{B}\right)}{\mathrm{a}\left(\mathrm{e}+\mathrm{F}\right)}$	OR_e=$\frac{\mathrm{eB}}{\mathrm{aF}}$
+	+	g	H	g + H	RR_se=$\frac{\mathrm{g}\left(\mathrm{a}+\mathrm{B}\right)}{\mathrm{a}\left(\mathrm{g}+\mathrm{H}\right)}$	OR_se=$\frac{\mathrm{gB}}{\mathrm{aH}}$

Under additive scale: ICR_c/nc = RR_se − (RR_s + RR_e − 1), ICR_c/nc = OR_se − (OR_s + OR_e − 1)

Under multiplicative scale: ICR_c/nc = RR_se/(RR_s × RR_e), ICR_c/nc = OR_se/(OR_s × OR_e)

The joint RR for the susceptibility factor and environmental exposure (RR_se) can be compared with the RR for environmental exposure alone (RR_e) or with the RR for the susceptibility factor alone (RR_s). Likewise, the joint OR (OR_se) can be compared with the OR for environmental exposure alone (OR_e) or with the OR for the susceptibility factor alone (OR_s). In the joint RR model on the additive scale, the ICR (ICR_c/nc) indicates the departure from the sum of the individual RRs minus one (ICR_c/nc = RR_se − (RR_s + RR_e − 1)). This quantity is called the ‘relative excess risk due to interaction (RERI)’ in the epidemiologic literature [15]. In the joint OR model on the additive scale, the ICR indicates the departure from the sum of the individual ORs minus one (ICR_c/nc = OR_se − (OR_s + OR_e − 1)). In the joint RR model on the multiplicative scale, the ICR indicates the departure from the product of the individual RRs (ICR_c/nc = RR_se/(RR_s × RR_e)). In the joint OR model on the multiplicative scale, the ICR indicates the departure from the product of the individual ORs (ICR_c/nc = OR_se/(OR_s × OR_e)). In this article, we used only the joint RR or the joint OR model on the multiplicative scale to estimate the ICR_c/nc.
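The definitions in Table 2 can be sketched in a few lines of Python. This is only an illustration: the function name `table2_measures` and the cell counts are hypothetical (they are not NHANES data), and the sketch assumes binary S and E with no empty cells.

```python
def table2_measures(a, B, c, D, e, F, g, H):
    """Effect measures for the Table 2 layout.

    Lowercase letters are diseased counts and uppercase letters are
    non-diseased counts for (E-,S-), (E-,S+), (E+,S-), (E+,S+), in that order.
    """
    risk_ref = a / (a + B)                 # risk in the reference group (E-,S-)
    rr_s = (c / (c + D)) / risk_ref
    rr_e = (e / (e + F)) / risk_ref
    rr_se = (g / (g + H)) / risk_ref
    or_s = (c * B) / (a * D)
    or_e = (e * B) / (a * F)
    or_se = (g * B) / (a * H)
    return {
        "RR_s": rr_s, "RR_e": rr_e, "RR_se": rr_se,
        "OR_s": or_s, "OR_e": or_e, "OR_se": or_se,
        "ICR_mult_RR": rr_se / (rr_s * rr_e),      # multiplicative scale, RR
        "ICR_mult_OR": or_se / (or_s * or_e),      # multiplicative scale, OR
        "ICR_add_RR": rr_se - (rr_s + rr_e - 1),   # additive scale (RERI)
        "ICR_add_OR": or_se - (or_s + or_e - 1),   # additive scale, OR
    }


# Hypothetical counts, invented for illustration only.
measures = table2_measures(a=80, B=3920, c=40, D=960, e=60, F=1940, g=60, H=440)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the RR-based and OR-based multiplicative ICRs differ slightly, because the disease is not rare in the doubly exposed group.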

The ICR in a case-only study and the ICR in a study with cases and non-cases

Table 2 illustrates the composition of a study with cases and non-cases. To generate case-only data from this source population, we extracted only the ‘Disease’ (case) column of Table 2, as shown in Table 3.

Table 3

An example of a case-only study

		Susceptibility (S)
		–	+
Environment (E)	–	a	c
Environment (E)	+	e	g

The ICR in a case-only study will be as follows:

$${\mathrm{ICR}}_{\mathrm{c}\mathrm{o}}=\frac{\left[\frac{\left\{\frac{\mathrm{a}}{\mathrm{a}+\mathrm{e}}\right\}}{\left\{\frac{\mathrm{e}}{\mathrm{a}+\mathrm{e}}\right\}}\right]}{\left[\frac{\left\{\frac{\mathrm{c}}{\mathrm{c}+\mathrm{g}}\right\}}{\left\{\frac{\mathrm{g}}{\mathrm{c}+\mathrm{g}}\right\}}\right]}=\left(\frac{\mathrm{a}\mathrm{g}}{\mathrm{c}\mathrm{e}}\right)$$

(1)

The ICR in a study with cases and non-cases will be as follows:

$${\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}=\frac{{\mathrm{RR}}_{\mathrm{s}\mathrm{e}}}{{\mathrm{RR}}_{\mathrm{s}}{\mathrm{RR}}_{\mathrm{e}}}=\left(\frac{\mathrm{ag}}{\mathrm{c}\mathrm{e}}\right)\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)=\left({\mathrm{ICR}}_{\mathrm{c}\mathrm{o}}\right)\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)$$

(2)

In Eq. (2), (ag/ce) is converted into ICR_co obtained in the case-only study. ICR_c/nc is the ICR calculated in a study with cases and non-cases. From Eq. (2), the requirement for the equality between the ICR acquired from a study with cases and non-cases and the ICR acquired from the case-only study is as follows:

$$\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)=\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}/\mathrm{nc}}=1$$

(3)

Equation (3) means that the environmental exposure and the susceptibility factor must be independent in a study with cases and non-cases for the ICR acquired from that study to equal the ICR acquired from the case-only study. From Eqs. (2) and (3), we should note that this equality does not require a rare disease assumption (a low prevalence of the disease).
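The identity in Eq. (2) can be checked numerically. The sketch below uses two hypothetical populations (invented counts): one in which the group totals make the ratio in Eq. (3) equal to 1, and one in which they do not. The function names are illustrative only.

```python
def icr_case_only(a, c, e, g):
    """Eq. (1): ICR_co = ag / (ce), using cases only."""
    return (a * g) / (c * e)


def se_ratio_cnc(a, B, c, D, e, F, g, H):
    """The ratio in Eq. (3), computed from group totals (cases + non-cases)."""
    return ((c + D) * (e + F)) / ((a + B) * (g + H))


def icr_cases_noncases(a, B, c, D, e, F, g, H):
    """Eq. (2), left-hand side: RR_se / (RR_s * RR_e)."""
    def risk(cases, noncases):
        return cases / (cases + noncases)
    r0 = risk(a, B)
    return (risk(g, H) / r0) / ((risk(c, D) / r0) * (risk(e, F) / r0))


# Hypothetical populations; the risks are the same, only the size of the
# (E+,S+) group differs between the two.
independent = dict(a=80, B=3920, c=40, D=960, e=60, F=1940, g=60, H=440)
dependent = dict(a=80, B=3920, c=40, D=960, e=60, F=1940, g=120, H=880)

for label, t in (("independent", independent), ("dependent", dependent)):
    co = icr_case_only(t["a"], t["c"], t["e"], t["g"])
    ratio = se_ratio_cnc(**t)
    cnc = icr_cases_noncases(**t)
    print(f"{label}: ICR_co={co:.2f}, ratio={ratio:.2f}, ICR_c/nc={cnc:.2f}")
```

In the first population ICR_co equals ICR_c/nc; in the second, ICR_co must be multiplied by the ratio from Eq. (3) to recover ICR_c/nc.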

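The above equations in this subsection can also be understood in the context of a logistic model, which allows other covariates to be adjusted for. The following equations indicate a conventional logistic regression model for a case-only study, fitted among the cases only, where S = 1 denotes the presence of the susceptibility factor: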

$$\mathrm{logit}\ \mathrm{P}\left(\mathrm{S}=1\right)={\upgamma}_0+{\upgamma}_1\ \mathrm{E}$$

(4)

$${\mathrm{ICR}}_{\mathrm{co}}=\exp \left({\upgamma}_1\right)$$

(5)

When E is a categorical or continuous variable for environmental exposure status, a case-only estimate for the interaction effect can be obtained using Eq. (5).
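As a minimal sketch of Eqs. (4) and (5), the following pure-Python Newton-Raphson routine fits the case-only logistic model to grouped counts. It assumes binary S and E (the simplest case), and the function name and counts are hypothetical. For a binary exposure, exp(γ₁) should reproduce ag/(ce) from Eq. (1).

```python
import math


def fit_case_only_logit(a, c, e, g, iterations=25):
    """Fit logit P(S=1) = gamma0 + gamma1 * E among cases by Newton-Raphson.

    a, c: cases with E- and S-, S+; e, g: cases with E+ and S-, S+ (Table 3).
    Returns (gamma0, gamma1, estimated variance of gamma1).
    """
    n0, y0 = a + c, c      # E-: number of cases, number of them with S+
    n1, y1 = e + g, g      # E+: number of cases, number of them with S+
    g0 = g1 = 0.0
    for _ in range(iterations):
        p0 = 1.0 / (1.0 + math.exp(-g0))
        p1 = 1.0 / (1.0 + math.exp(-(g0 + g1)))
        w0, w1 = n0 * p0 * (1 - p0), n1 * p1 * (1 - p1)
        u0 = (y0 - n0 * p0) + (y1 - n1 * p1)    # score for gamma0
        u1 = y1 - n1 * p1                        # score for gamma1
        det = w0 * w1    # determinant of the information [[w0+w1, w1], [w1, w1]]
        g0 += (w1 * u0 - w1 * u1) / det
        g1 += (-w1 * u0 + (w0 + w1) * u1) / det
    # Variance of gamma1 from the inverse information at the final estimate.
    p0 = 1.0 / (1.0 + math.exp(-g0))
    p1 = 1.0 / (1.0 + math.exp(-(g0 + g1)))
    w0, w1 = n0 * p0 * (1 - p0), n1 * p1 * (1 - p1)
    return g0, g1, (w0 + w1) / (w0 * w1)


# Hypothetical case counts, invented for illustration only.
gamma0, gamma1, var_gamma1 = fit_case_only_logit(a=80, c=40, e=60, g=60)
print(f"ICR_co = exp(gamma1) = {math.exp(gamma1):.3f}")
print(f"Var(gamma1) = {var_gamma1:.4f}")
```

In practice a statistics package would be used for this fit, especially with continuous exposures or additional covariates; the hand-written solver is only meant to make the link between the model and the cell counts visible.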

We can also assess the independence between an environmental factor and a susceptibility factor in a study with cases and non-cases from the context of a logistic model using the following equations:

$$\mathrm{logit}\ \mathrm{P}\left(\mathrm{S}=1\right)={\upeta}_0+{\upeta}_1\mathrm{E}$$

(6)

$$\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}/\mathrm{nc}}=\exp \left({\upeta}_1\right)$$

(7)

According to the independence assumption provided in Eq. (3), the environmental exposure and the susceptibility factor must be independent in the population with cases and non-cases for the ICR obtained in that population to equal the ICR obtained in the case-only study. In the context of a logistic model, this means that the confidence interval for Eq. (7) must include 1 and that the point estimate for Eq. (7) must be close to 1.

We can also calculate the ICR obtained in the population with cases and non-cases from the context of a logistic model, using the following equation:

$$\mathrm{logit}\ \mathrm{P}\left(\mathrm{D}=1\right)={\upbeta}_0+{\upbeta}_1\mathrm{S}+{\upbeta}_2\mathrm{E}+{\upbeta}_3\mathrm{SE}$$

(8)

$${\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}=\exp \left({\upbeta}_3\right)$$

(9)
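When S and E are both binary, the model in Eq. (8) has one parameter per group, so its maximum likelihood estimates can be read directly from the cell log-odds without iterative fitting. The sketch below relies on that assumption; the function name and counts are hypothetical. Note that exp(β₃) obtained this way is the ICR on the odds ratio scale.

```python
import math


def saturated_logit_coefficients(a, B, c, D, e, F, g, H):
    """Coefficients of logit P(D=1) = b0 + b1*S + b2*E + b3*S*E for binary S, E.

    Lowercase letters are diseased counts and uppercase letters are
    non-diseased counts for (E-,S-), (E-,S+), (E+,S-), (E+,S+).
    """
    b0 = math.log(a / B)                    # log-odds in the reference group
    b1 = math.log(c / D) - b0               # log OR_s
    b2 = math.log(e / F) - b0               # log OR_e
    b3 = math.log(g / H) - b0 - b1 - b2     # log of OR_se / (OR_s * OR_e)
    return b0, b1, b2, b3


# Hypothetical counts, invented for illustration only.
b0, b1, b2, b3 = saturated_logit_coefficients(
    a=80, B=3920, c=40, D=960, e=60, F=1940, g=60, H=440
)
print(f"OR_s = {math.exp(b1):.3f}, OR_e = {math.exp(b2):.3f}")
print(f"exp(beta3) = {math.exp(b3):.3f}")
```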

The ICR in a case-control study

We can define the susceptibility-environment ICR acquired from a case-control study in the model with the multiplicative scale as follows:

$${\mathrm{ICR}}_{\mathrm{cc}}={\mathrm{OR}}_{\mathrm{s}\mathrm{e}}/\left({\mathrm{OR}}_{\mathrm{s}}\times {\mathrm{OR}}_{\mathrm{e}}\right)$$

(10)

ICR_cc: the ICR calculated in a case-control study.

ICR_cc > 1: The joint OR is larger than the product of each individual OR.

ICR_cc < 1: The joint OR is smaller than the product of each individual OR.

ICR_cc = 1: The joint OR is the same as the product of each individual OR.

If the joint OR is larger than the product of each individual OR, the ICR_cc will be larger than 1. If the joint OR is smaller than the product of each individual OR, the ICR_cc will be smaller than 1. If the joint OR is the same as the product of each individual OR, the ICR_cc will be 1.

The ICR in a case-only study and the ICR in a case-control study

To generate the case-control study data, a fraction (p) of the non-cases in each group of the population with cases and non-cases (Table 2) was selected as controls, as shown in Table 4.

Table 4

A case-control study data generated from a population with cases and non-cases

Environment (E)	Susceptibility (S)	Case	Control	Odds Ratio (OR)
–	–	a	b = pB	1.0 (ref)
–	+	c	d = pD	OR_s=$\frac{\mathrm{cb}}{\mathrm{ad}}$=$\frac{\mathrm{cB}}{\mathrm{aD}}$
+	–	e	f = pF	OR_e=$\frac{\mathrm{eb}}{\mathrm{af}}$=$\frac{\mathrm{eB}}{\mathrm{aF}}$
+	+	g	h = pH	OR_se=$\frac{\mathrm{gb}}{\mathrm{ah}}$=$\frac{\mathrm{gB}}{\mathrm{aH}}$

The ICR in a case-control study can be calculated as follows:

$${\mathrm{ICR}}_{\mathrm{cc}}=\frac{{\mathrm{OR}}_{\mathrm{s}\mathrm{e}}}{{\mathrm{OR}}_{\mathrm{s}}{\mathrm{OR}}_{\mathrm{e}}}=\left(\frac{\mathrm{ag}}{\mathrm{ce}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)=\left({\mathrm{ICR}}_{\mathrm{co}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)$$

(11)

In Eq. (11), the requirement for equality between ICR_cc and ICR_co is as follows:

$$\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)=\frac{\mathrm{df}}{\mathrm{bh}}=\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{control}}=1$$

(12)

Equation (12) means that for the equality between ICR_cc and ICR_co, the susceptibility factor and environmental exposure must be independent in the control population. A rare disease assumption is also not required for this equality.
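Table 4 and Eq. (11) can be illustrated with expected control counts b = pB, d = pD, f = pF, and h = pH. The sketch below uses hypothetical counts and shows that the sampling fraction p cancels out of every odds ratio, so ICR_cc depends only on the case counts and on DF/(BH).

```python
def icr_case_control(cases, noncases, p):
    """ICR_cc = OR_se / (OR_s * OR_e) with a fraction p of non-cases as controls.

    cases = (a, c, e, g); noncases = (B, D, F, H); expected control counts
    are used, so they need not be integers.
    """
    a, c, e, g = cases
    b, d, f, h = (p * n for n in noncases)
    or_s = (c * b) / (a * d)
    or_e = (e * b) / (a * f)
    or_se = (g * b) / (a * h)
    return or_se / (or_s * or_e)


# Hypothetical counts, invented for illustration only.
cases = (80, 40, 60, 60)
noncases = (3920, 960, 1940, 440)
a, c, e, g = cases
B, D, F, H = noncases

icr_co = (a * g) / (c * e)
control_ratio = (D * F) / (B * H)     # the ratio in Eq. (12)

for p in (0.05, 0.5, 1.0):
    print(f"p={p}: ICR_cc={icr_case_control(cases, noncases, p):.4f}")
print(f"ICR_co * DF/(BH) = {icr_co * control_ratio:.4f}")
```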

We can also calculate the ICR in a case-control study from the context of a logistic model, using the following equation:

$$\mathrm{logit}\ \mathrm{P}\left(\mathrm{D}=1\right)={\upbeta}_0+{\upbeta}_1\mathrm{S}+{\upbeta}_2\mathrm{E}+{\upbeta}_3\mathrm{SE}$$

(13)

$${\mathrm{ICR}}_{\mathrm{cc}}=\exp \left({\upbeta}_3\right)$$

(14)

The ICR in a study with cases and non-cases and the ICR in a case-control study

The equality between ICR_cc and ICR_co does not guarantee that these 2 estimates are unbiased for the ICR acquired from the population with cases and non-cases (ICR_c/nc). Based on Eqs. (2) and (11), we can get the following equation:

$${\mathrm{ICR}}_{\mathrm{c}\mathrm{c}}={\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}\ \frac{\left(\mathrm{DF}\right)}{\left(\mathrm{BH}\right)}\ \left(\frac{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}\right)$$

(15)

In Eq. (15), for the equality between ICR_cc and ICR_c/nc, the following equation or at least 1 of 2 conditions suggested below should be met:

$${\displaystyle \begin{array}{c}\frac{\left(\mathrm{DF}\right)}{\left(\mathrm{BH}\right)}\ \left(\frac{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}\right)=1\\ {}\left[\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}\mathrm{ontrol}}=\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}/\mathrm{nc}}=1\right]\ \mathrm{or}\ \left[\mathrm{the}\ \mathrm{disease}\ \mathrm{is}\ \mathrm{rare}\right]\end{array}}$$

(16)

Equation (16) means that for the equality between ICR_cc and ICR_c/nc, the susceptibility factor and the environmental exposure must be independent both in the population with cases and non-cases and in the controls. Alternatively, if the disease is rare, Eq. (16) will be satisfied. In this case, the rare disease assumption must be examined in the population with cases and non-cases.
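The role of the rare disease assumption in Eq. (16) can be seen by computing the multiplier in Eq. (15) for decreasing baseline risks. The sketch below builds expected counts from hypothetical group sizes and relative risks (all values are invented for illustration).

```python
def expected_counts(p0, rr_s, rr_e, rr_se, totals):
    """Expected Table 2 counts from a baseline risk and relative risks.

    totals = group sizes for (E-,S-), (E-,S+), (E+,S-), (E+,S+).
    Returns (a, B, c, D, e, F, g, H).
    """
    n00, n01, n10, n11 = totals
    a = n00 * p0
    c = n01 * p0 * rr_s
    e = n10 * p0 * rr_e
    g = n11 * p0 * rr_se
    return a, n00 - a, c, n01 - c, e, n10 - e, g, n11 - g


def icr_cc_over_icr_cnc(a, B, c, D, e, F, g, H):
    """The multiplier in Eq. (15): ICR_cc / ICR_c/nc."""
    return (D * F) / (B * H) * ((a + B) * (g + H)) / ((c + D) * (e + F))


# Hypothetical inputs: independent S and E in the whole population.
totals = (4000, 1000, 2000, 500)
rr_s, rr_e, rr_se = 2.0, 1.5, 6.0

ratios = {}
for p0 in (0.02, 0.002, 0.0002):
    ratios[p0] = icr_cc_over_icr_cnc(*expected_counts(p0, rr_s, rr_e, rr_se, totals))
    print(f"baseline risk {p0}: ICR_cc / ICR_c/nc = {ratios[p0]:.5f}")
```

As the baseline risk shrinks, the multiplier approaches 1, which is the second condition in Eq. (16).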

S-E independence in the population with cases and non-cases and S-E independence in the controls: one cannot replace the other

If we evaluate Eq. (16) in detail, we can find an important relationship. The S-E independence in the controls is a totally different concept from the S-E independence in the population with cases and non-cases: one cannot replace the other.

$${\mathrm{ICR}}_{\mathrm{c}\mathrm{c}}={\mathrm{ICR}}_{\mathrm{c}\mathrm{o}}={\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}$$

(17)

For the first equal sign, S-E OR_control = 1 is required according to Eq. (11).

For the second equal sign, S-E OR_c/nc = 1 is required according to Eq. (2).

If the disease is rare, the cases are negligible relative to the non-cases in every group (a + B ≈ B, c + D ≈ D, e + F ≈ F, and g + H ≈ H). Then ${\mathrm{ICR}}_{\mathrm{cc}}=\left({\mathrm{ICR}}_{\mathrm{co}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)$ according to Eq. (11), and ${\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}\approx \left({\mathrm{ICR}}_{\mathrm{c}\mathrm{o}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)$ according to Eq. (2).

$$\mathrm{Therefore},\kern0.5em {\mathrm{ICR}}_{\mathrm{c}\mathrm{c}}={\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}\ne {\mathrm{ICR}}_{\mathrm{c}\mathrm{o}}$$

(18)

If a researcher checks whether S-E OR_control equals 1, instead of whether S-E OR_c/nc equals 1, to judge whether ICR_co can validly be used in place of ICR_c/nc, this misuse can lead to either the mistaken rejection of a valid ICR_co or the mistaken acceptance of an invalid ICR_co.

In Supplementary material B, an example from Gatto et al. [8] is provided for this problem. In the first example, S and E are independent in the population, including cases and non-cases (S-E OR_c/nc = 1). The interaction estimate in the population, including cases and non-cases (i.e., ICR_c/nc) is 2.5. The ICR_co is also 2.5. In this situation, the S-E OR_control of 0.7 does not provide a reliable estimation for S-E OR_c/nc of 1.0. In the second example, the S-E OR_c/nc is 2.0, showing a non-independent relationship. The ICR_c/nc is 1.0, but ICR_co is 2.0. In this situation, the S-E OR_control of 1.0 does not provide a reliable estimation for S-E OR_c/nc of 2.0.

The rare disease assumption: for ICR_cc = ICR_c/nc and S-E OR_control = S-E OR_c/nc

The rare disease assumption provides 2 implications in this discussion of the case-only approach. The first implication is provided in Eq. (18). The second implication is the following:

S-E OR_c/nc $=\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)$ from Eq. (3), and S-E OR_control $=\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)=\frac{\mathrm{df}}{\mathrm{bh}}$ from Eq. (12).

$$\mathrm{If}\ \mathrm{the}\ \mathrm{disease}\ \mathrm{is}\ \mathrm{rare},\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}/\mathrm{nc}}=\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}\mathrm{ontrol}}=\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)$$

(19)

In this subsection, we will deal with the second implication. Equation (20) indicates the relationship between S-E OR_control and S-E OR_c/nc [8].

$$\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{control}}=\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}/\mathrm{nc}}\times \left(\frac{\left(\frac{1}{\mathrm{p}\left(\mathrm{D}|\mathrm{S}-\mathrm{E}-\right)}-1\right)\times \left(\frac{1}{\mathrm{p}\left(\mathrm{D}|\mathrm{S}-\mathrm{E}-\right)}-{\mathrm{RR}}_{\mathrm{SE}}\right)}{\left(\frac{1}{\mathrm{p}\left(\mathrm{D}|\mathrm{S}-\mathrm{E}-\right)}-{\mathrm{RR}}_{\mathrm{S}}\right)\times \left(\frac{1}{\mathrm{p}\left(\mathrm{D}|\mathrm{S}-\mathrm{E}-\right)}-{\mathrm{RR}}_{\mathrm{E}}\right)}\right)$$

(20)

In Gatto et al. [8], the authors used Eq. (20) to conduct a sensitivity analysis (Supplementary material C). The article assessed the impact of the baseline risk of disease in the population (p(D|S-E-)) and the independent effect of S (RR_S) on the S-E OR_control when the S-E OR_c/nc is 1.0. In Supplementary material C, the baseline risk of disease ranges from 0.1 to 6%. As illustrated in Supplementary material C, the S-E OR_control is similar to the S-E OR_c/nc of 1.0 when the baseline risk of disease (p(D|S-E-)) is under 1% and the independent effect of S is relatively low (RR_S < 2.5). However, as the baseline risk of disease approaches 3%, the S-E OR_control begins to diverge from the S-E OR_c/nc of 1.0. This divergence grows as the independent effect of the susceptibility factor increases.
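A sensitivity analysis of this kind can be sketched by evaluating Eq. (20) over a grid of baseline risks and RR_S values. The sketch is not the analysis in Supplementary material C: the values of RR_E and the multiplicative ICR are hypothetical, as are the function names and group sizes. One point of orientation is worth checking against reference [8] before relying on it: the factor in Eq. (20), as written, reproduces the cross-product BH/(DF) among non-cases, which is the reciprocal of the DF/(BH) form in Table 1; the two agree at the null value of 1.

```python
def se_or_control(se_or_cnc, p0, rr_s, rr_e, rr_se):
    """Eq. (20) as written in the text; p0 is the baseline risk p(D|S-E-)."""
    inv = 1.0 / p0
    factor = ((inv - 1.0) * (inv - rr_se)) / ((inv - rr_s) * (inv - rr_e))
    return se_or_cnc * factor


def cross_product_in_noncases(p0, rr_s, rr_e, rr_se, totals):
    """BH / (DF) from expected non-case counts, used only to check Eq. (20)."""
    n00, n01, n10, n11 = totals
    B = n00 * (1 - p0)
    D = n01 * (1 - p0 * rr_s)
    F = n10 * (1 - p0 * rr_e)
    H = n11 * (1 - p0 * rr_se)
    return (B * H) / (D * F)


# Hypothetical settings: RR_E = 1.5 and a multiplicative ICR of 2.
RR_E, ICR = 1.5, 2.0
grid = {}
for p0 in (0.001, 0.01, 0.03, 0.06):
    for rr_s in (1.5, 2.5, 5.0):
        rr_se = ICR * rr_s * RR_E
        grid[(p0, rr_s)] = se_or_control(1.0, p0, rr_s, RR_E, rr_se)
        print(f"p0={p0:<6} RR_S={rr_s:<4} S-E OR_control={grid[(p0, rr_s)]:.3f}")
```

Under these invented settings the value stays close to 1.0 at a baseline risk of 0.1% and drifts further from 1.0 as the baseline risk and RR_S increase, which matches the qualitative pattern described above.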

Violation of independence: confounder and subpopulation dependence

The violation of independence between S and E occurs, for example, when an individual alters his or her environmental exposure according to his or her susceptibility factor. This violation is mainly due to 2 factors: (i) a confounder and (ii) subpopulation dependence.

Gatto et al. [8] provide 2 examples of a third variable (C) that links S and E. In the first example of Supplementary material D, the family history functions as a confounder, and in the second example of Supplementary material D, the adverse reaction to alcohol functions as a mediator between the susceptibility factor and the environmental exposure. In these 2 examples, a positive multiplicative interaction (ICR_co > 1) will be biased towards the null (ICR_co ≈ 1) because of the overall negative association between S and E induced by C.

If these covariates can be adjusted, the independence between S and E can be restored.

$$\mathrm{logit}\ \mathrm{P}\left(\mathrm{S}=1\right)={\upgamma_0}^{'}+{\upgamma_1}^{'}\mathrm{E}+{\upgamma_2}^{'}\mathrm{C}$$

(21)

$$\mathrm{adjusted}\ {\mathrm{ICR}}_{\mathrm{CO}}\left(\mathrm{adjusted}\ \mathrm{for}\ \mathrm{covariate}\ \mathrm{C}\right)=\exp \left({\upgamma_1}^{'}\right)$$

(22)

However, a cautious approach is required because adjusting for covariates that are unrelated to the S-E dependence costs degrees of freedom and reduces the precision of ICR_co [8].

Another source of the violation of independence is a hidden dependence on a subpopulation. Wang et al. [9] provide a solution for this problem with the following equation:

$$\mathrm{CIR}={\mathrm{r}}_{\mathrm{S}\mathrm{E}}\ \mathrm{\times}\ {\mathrm{CV}}_{\mathrm{S}}\ \mathrm{\times}\ {\mathrm{CV}}_{\mathrm{E}}+1$$

(23)

CIR: Confounding Interaction Ratio. r_SE: the correlation coefficient between the susceptibility factor prevalence odds and the environmental exposure prevalence odds across strata. CV_S: the variation (coefficient of variation) in the susceptibility factor prevalence odds across strata. CV_E: the variation (coefficient of variation) in the environmental exposure prevalence odds across strata.

$${\mathrm{CIR}}_{\mathrm{U}}=\frac{\sqrt{\upupsilon_{\mathrm{S}}{\upupsilon}_{\mathrm{E}}}\times {\left(\sqrt{\upupsilon_{\mathrm{S}}{\upupsilon}_{\mathrm{E}}}+1\right)}^2}{\left(\sqrt{\upupsilon_{\mathrm{S}}{\upupsilon}_{\mathrm{E}}}+{\upupsilon}_{\mathrm{S}}\right)\left(\sqrt{\upupsilon_{\mathrm{S}}{\upupsilon}_{\mathrm{E}}}+{\upupsilon}_{\mathrm{E}}\right)}\ge 1,{\mathrm{CIR}}_{\mathrm{L}}=\frac{1}{{\mathrm{CIR}}_{\mathrm{U}}}\le 1$$

(24)

CIR_U: the upper bound of CIR, CIR_L: the lower bound of CIR, υ_S(υ_S ≥ 1): the ratio of the largest and the smallest susceptibility frequency odds across all strata. υ_E(υ_E ≥ 1): the ratio of the largest and the smallest exposure frequency odds across all strata.

In Eq. (23), CIR is the ratio of the crude ICR_c/nc without stratification to the ICR_c/nc with stratification. According to Eq. (23), there would be no population stratification bias (CIR = 1) if (i) the exposure prevalence odds and the susceptibility frequency odds are uncorrelated across all strata (r_SE = 0), (ii) no variation exists in the exposure prevalence odds (CV_E = 0), or (iii) no variation exists in the susceptibility frequency odds (CV_S = 0).

In Eq. (24), υ_S (υ_S ≥ 1) denotes the ratio of the largest to the smallest susceptibility frequency odds, and υ_E (υ_E ≥ 1) denotes the ratio of the largest to the smallest exposure prevalence odds across all the strata in the population. If there is no variation either in the susceptibility frequency odds (υ_S = 1) or in the exposure prevalence odds (υ_E = 1), there would be no bias (CIR_U = CIR_L = 1) according to Eq. (24). If we can calculate the CIR for a population, we can calculate the ICR_c/nc with stratification by dividing the crude ICR_c/nc by the CIR.
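Equations (23) and (24) are simple enough to evaluate directly. The sketch below is a plain transcription of the two formulas with hypothetical function names and input values; the last helper applies the definition of CIR as the crude ICR divided by the stratified ICR.

```python
import math


def cir(r_se, cv_s, cv_e):
    """Eq. (23): confounding interaction ratio."""
    return r_se * cv_s * cv_e + 1.0


def cir_bounds(v_s, v_e):
    """Eq. (24): (lower, upper) bounds of the CIR for v_s >= 1 and v_e >= 1."""
    if v_s < 1 or v_e < 1:
        raise ValueError("v_s and v_e are max/min odds ratios and must be >= 1")
    root = math.sqrt(v_s * v_e)
    upper = root * (root + 1.0) ** 2 / ((root + v_s) * (root + v_e))
    return 1.0 / upper, upper


def stratified_icr(crude_icr, cir_value):
    """CIR = crude ICR / stratified ICR, so stratified ICR = crude ICR / CIR."""
    return crude_icr / cir_value


# Hypothetical inputs, invented for illustration only.
print(cir(r_se=0.3, cv_s=0.5, cv_e=0.4))
print(cir_bounds(v_s=4.0, v_e=4.0))
print(cir_bounds(v_s=1.0, v_e=5.0))    # no variation in S: both bounds are 1
print(stratified_icr(crude_icr=1.5, cir_value=cir(0.3, 0.5, 0.4)))
```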

For the violation of S-E independence, researchers usually would try to evaluate a potential confounder based on their subject-matter knowledge. However, for subpopulation dependence, attention should be paid to the whole study population and the strata rather than finding a confounder. This important difference should be in the mind of researchers using a case-only approach.

The efficiency gained from the case-only approach

The case-only approach can yield a more precise interaction effect estimate (i.e., one with a narrower confidence interval) than a study design with cases and non-cases, such as a cohort or case-control study [16].

From Eqs. (8) and (9) and Table 2, the asymptotic variance of ${\hat{\upbeta}}_3$ in a population with cases and non-cases is as follows:

$$\mathrm{Var}\left({\hat{\upbeta}}_3\right)=\frac{1}{a}+\frac{1}{B}+\frac{1}{c}+\frac{1}{D}+\frac{1}{e}+\frac{1}{F}+\frac{1}{g}+\frac{1}{H}$$

(25)

In Eqs. (13) and (14), and Table 4, the asymptotic variance of ${\overline{\overline{\upbeta}}}_3$ in a case-control study is as follows:

$$\mathrm{Var}\left(\overline{\overline{\upbeta_3}}\right)=\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}+\frac{1}{e}+\frac{1}{f}+\frac{1}{g}+\frac{1}{h}$$

(26)

From Eqs. (4) and (5) and Table 3, the asymptotic variance of ${\hat{\gamma}}_1$ in a case-only study is as follows:

$$\mathrm{Var}\left({\hat{\gamma}}_1\right)=\frac{1}{a}+\frac{1}{c}+\frac{1}{e}+\frac{1}{g}$$

(27)

Eq. (27) retains only four of the eight terms in Eqs. (25) and (26) (1/a, 1/c, 1/e, and 1/g), and every omitted term is positive, so the variance of the case-only estimate is smaller. The case-only design can therefore provide an estimate with a narrower confidence interval than either the case-control or the cohort design (study designs with cases and non-cases). This efficiency gain comes from the independence assumption between susceptibility factor and environmental exposure (S-E OR_c/nc = 1).
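The comparison of Eqs. (26) and (27) can be illustrated with a small numerical sketch. The cell counts below are hypothetical, and the cell labels are assumed to follow the layout implied by Eq. (2), in which ICR_CO = ag/(ce) and a, c, e, and g are the four case cells. The function names are illustrative and are not part of the R code in the supplementary material.

```python
import math


def case_only_variance(a, c, e, g):
    """Asymptotic variance of the case-only log-ICR, Eq. (27)."""
    return 1 / a + 1 / c + 1 / e + 1 / g


def case_control_variance(a, b, c, d, e, f, g, h):
    """Asymptotic variance of the case-control log-ICR, Eq. (26)."""
    return sum(1 / n for n in (a, b, c, d, e, f, g, h))


def wald_ci(estimate, variance, z=1.96):
    """Wald confidence interval computed on the log scale."""
    half_width = z * math.sqrt(variance)
    log_estimate = math.log(estimate)
    return math.exp(log_estimate - half_width), math.exp(log_estimate + half_width)


# Hypothetical cell counts (not NHANES data).
cases = {"a": 40, "c": 60, "e": 50, "g": 150}
controls = {"b": 30, "d": 70, "f": 60, "h": 140}

icr_co = (cases["a"] * cases["g"]) / (cases["c"] * cases["e"])
var_co = case_only_variance(**cases)
var_cc = case_control_variance(**cases, **controls)

print("ICR_CO:", icr_co, "95% CI:", wald_ci(icr_co, var_co))
print("SE case-only:   ", math.sqrt(var_co))
print("SE case-control:", math.sqrt(var_cc))
```

Because the control cells only add positive terms, the case-control standard error is always the larger of the two for the same case counts. The narrower case-only interval is valid only if the S-E independence assumption holds.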

Methodological issues to be considered

Several issues must be considered when applying the case-only approach to estimate the interaction effect between a susceptibility factor and an environmental exposure. Firstly, case selection must follow the usual rules of case selection in a case-control study. Secondly, researchers must verify independence between the susceptibility factor and the environmental exposure in the population with cases and non-cases before substituting the ICR_CO calculated in a case-only design for the ICR_c/nc calculated in a population with cases and non-cases (according to Eqs. (2) and (3)). If there is evidence of an association between the susceptibility factor and the environmental exposure, the ICR_CO must be corrected by multiplying it by the calculated S-E OR_c/nc, as provided in Eq. (2). Thirdly, although the independence assumption might seem reasonable for various susceptibility factors and environmental exposures, some susceptibility factors can modify the likelihood of environmental exposure. Such a hidden association must be identified before a case-only approach is applied. Finally, the interaction effect estimate (ICR_CO) obtained from the case-only approach can only be interpreted as a departure from the multiplicative effect, not from the additive effect. According to previous epidemiologic literature, additive interaction corresponds more closely to mechanistic biologic interaction than to merely statistical interaction [17, 18]. Nevertheless, researchers often use the multiplicative scale to estimate interaction effects for several practical reasons [18]. This limitation should be considered when the results of this study are applied.

Summary

In summary, the case-only approach can be applied to environmental epidemiology successfully when a susceptibility factor and an environmental exposure are independent in a population with cases and non-cases. Through this approach, a more precise interaction effect estimate can be calculated.

Results

Basic information of datasets and descriptive analysis for each variable

By combining the ‘Albumin & Creatinine – Urine,’ ‘Chromium & Cobalt,’ ‘Glycohemoglobin,’ and ‘Demographic Variables and Sample Weights’ data files, a dataset with 7286 subjects was created. For the first analysis example, the respondents with the ‘yes’ answer to the question ‘take diabetic pills to lower blood sugar’ were excluded (5890 subjects), leaving 1396 subjects. For the second analysis example, all 7286 subjects were included. The descriptive analysis results for the main variables are provided in Table 5.

Table 5

Descriptive analysis for each variable used

The preventive (negative) interaction effect between blood chromium levels and glycohemoglobin levels on albuminuria (micro and macro)
Environmental exposure	Normal chromium	Abnormal chromium		NA
Number of subjects	1312	29		55
Mean	0.35 μg/L	2.02 μg/L		NA
Outcome (disease)	No albuminuria	Albuminuria		NA
Number of subjects	1089	270		37
Mean	9.49 mg/g	504.0 mg/g		NA
Susceptibility factor	Blood glycohemoglobin level (with 48 NA values)
Statistics	Min	Median	Mean	Max
Value	4.1%	6.0%	6.39%	16.5%
The aggravating (positive) interaction effect between blood cobalt levels and old ages on albuminuria (micro and macro)
Environmental exposure	Normal cobalt	Abnormal cobalt		NA
Number of subjects	6942	32		312
Mean	0.18 μg/L	6.04 μg/L		NA
Outcome (disease)	No albuminuria	Albuminuria		NA
Number of subjects	5919	1179		188
Mean	9.28 mg/g	335.0 mg/g		NA
Susceptibility factor	Age in years (with no NA value)
Statistics	Min	Median	Mean	Max
Value	40.0 yrs	60.0 yrs	60.28 yrs	80.0 yrs

NA not available (missing value), yrs years of age

The negative interaction effect between blood chromium level and glycohemoglobin level on albuminuria (micro and macro)

As the first example, Table 6 provides the sequential steps of applying the case-only approach (which will be explained in the first discussion section) to estimate the interaction effect between blood chromium level and glycohemoglobin level on albuminuria. These steps follow those provided in subsection 2.3: (i) Firstly, a 1 μg/L difference in blood chromium level corresponded to a 2.20-fold (95% CI 1.48–3.32) difference in the odds of albuminuria. (ii) Secondly, a 1% difference in blood glycohemoglobin level corresponded to a 1.57-fold (95% CI 1.44–1.73) difference in the odds of albuminuria. (iii) Thirdly, when a 1 μg/L difference in blood chromium level and a 1% difference in blood glycohemoglobin level coincide, the multiplicative interaction contrast ratio (ICR) is 0.72 (95% CI 0.35–1.60), which is not statistically significant. (iv) Fourthly, in the population with cases and non-cases, blood chromium levels and blood glycohemoglobin levels are independent of each other (S-E OR_c/nc: 0.76 (95% CI 0.47–1.06)). Therefore, the case-only ICR can be a good substitute for the ICR acquired from the population with cases and non-cases. (v) Finally, when only the cases are analyzed (case-only approach), the case-only ICR is 0.59 (95% CI 0.28–0.95), which is statistically significant (a negative interaction effect).

Table 6

The application of the case-only approach for the first and second example

Table 6-1. The application of the case-only approach for the preventive (negative) interaction effect between blood chromium levels and glycohemoglobin levels on albuminuria (micro and macro)
logit P(D = 1) = β₀ + β₂’E; OR for a 1-unit difference of environmental exposure = exp(β₂’)
OR for a 1 μg/L difference of blood chromium level: 2.20 (95% CI 1.48–3.32)	Effect estimate
For a 1 μg/L difference in blood chromium level, the odds of albuminuria differ by a factor of 2.20 (95% CI 1.48–3.32).	Explanation
logit P(D = 1) = β₀ + β₁’S; OR for a 1-unit difference of susceptibility factor = exp(β₁’)
OR for a 1% difference of glycohemoglobin level: 1.57 (95% CI 1.44–1.73)	Effect estimate
For a 1% difference in blood glycohemoglobin level, the odds of albuminuria differ by a factor of 1.57 (95% CI 1.44–1.73).	Explanation
logit P(D = 1) = β₀ + β₁S + β₂E + β₃SE; ICR_c/nc = exp(β₃)	Eq. (8), Eq. (9)
ICR_c/nc: 0.72 (95% CI 0.35–1.60)	Effect estimate
When a 1 μg/L difference in blood chromium level and a 1% difference in blood glycohemoglobin level coincide, the multiplicative ICR is 0.72 (95% CI 0.35–1.60), which is not statistically significant.	Explanation
logit P(S = 1) = η₀ + η₁E; S-E OR_c/nc = exp(η₁)	Eq. (6), Eq. (7)
S-E OR_c/nc: 0.76 (95% CI 0.47–1.06)	Effect estimate
In the population with cases and non-cases, blood chromium levels and blood glycohemoglobin levels are independent. Therefore, the case-only ICR can be a good substitute for the ICR acquired from the population with cases and non-cases.	Explanation
logit P(S = 1) = γ₀ + γ₁E; ICR_CO = exp(γ₁)	Eq. (4), Eq. (5)
ICR_CO: 0.59 (95% CI 0.28–0.95)	Effect estimate
When only the cases are analyzed (case-only approach), the case-only ICR is 0.59 (95% CI 0.28–0.95), which is statistically significant (a negative interaction effect).	Explanation
Table 6-2. The application of the case-only approach for the aggravating (positive) interaction effect between blood cobalt levels and old ages on albuminuria (micro and macro)
logit P(D = 1) = β₀ + β₂’E; OR for a 1-unit difference of environmental exposure = exp(β₂’)
OR for a 1 μg/L difference of blood cobalt level: 1.09 (95% CI 0.98–1.20)	Effect estimate
For a 1 μg/L difference in blood cobalt level, the odds of albuminuria differ by a factor of 1.09 (95% CI 0.98–1.20).	Explanation
logit P(D = 1) = β₀ + β₁’S; OR for a 1-unit difference of susceptibility factor = exp(β₁’)
OR for a 1-year difference of age: 1.05 (95% CI 1.04–1.05)	Effect estimate
For a 1-year difference in age, the odds of albuminuria differ by a factor of 1.05 (95% CI 1.04–1.05).	Explanation
logit P(D = 1) = β₀ + β₁S + β₂E + β₃SE; ICR_c/nc = exp(β₃)	Eq. (8), Eq. (9)
ICR_c/nc: 1.13 (95% CI 0.99–1.37)	Effect estimate
When a 1 μg/L difference in blood cobalt level and a 1-year difference in age coincide, the multiplicative ICR is 1.13 (95% CI 0.99–1.37), which is not statistically significant.	Explanation
logit P(S = 1) = η₀ + η₁E; S-E OR_c/nc = exp(η₁)	Eq. (6), Eq. (7)
S-E OR_c/nc: 1.06 (95% CI 1.03–1.10)	Effect estimate
In the population with cases and non-cases, blood cobalt level and age in years show a slight association (not completely independent). Therefore, the case-only ICR must be multiplied by the S-E OR_c/nc to obtain the ICR_c/nc according to Eq. (2).	Explanation
logit P(S = 1) = γ₀ + γ₁E; ICR_CO = exp(γ₁)	Eq. (4), Eq. (5)
ICR_CO: 1.14 (95% CI 1.03–1.37)	Effect estimate
When only the cases are analyzed (case-only approach), the case-only ICR is 1.14 (95% CI 1.03–1.37), which is statistically significant (a positive interaction effect).	Explanation
${\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}=\frac{{\mathrm{RR}}_{\mathrm{s}\mathrm{e}}}{{\mathrm{RR}}_{\mathrm{s}}{\mathrm{RR}}_{\mathrm{e}}}=\left(\frac{\mathrm{ag}}{\mathrm{c}\mathrm{e}}\right)\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)=\left({\mathrm{ICR}}_{\mathrm{CO}}\right)\left(\mathrm{S}-\mathrm{E}\ {\mathrm{OR}}_{\mathrm{c}/\mathrm{nc}}\right)$	Eq. (2)
ICR_CO: 1.14 (95% CI 1.03–1.37) × S-E OR_c/nc: 1.06 (95% CI 1.03–1.10)
ICR_c/nc: 1.21 (95% CI 1.06–1.51)	Effect estimate
The ICR_CO multiplied by the S-E OR_c/nc produced the ICR_c/nc of 1.21 (95% CI 1.06–1.51).	Explanation

In this example, the environmental exposure (blood chromium level) and the susceptibility factor (blood glycohemoglobin level) are independent in the population with cases and non-cases. Therefore, the case-only ICR itself can be used as the ICR acquired from the population with cases and non-cases without a conversion. (This will be explained in the first discussion section in detail.) However, the ICR acquired from the population with cases and non-cases was not statistically significant because of its relatively wide confidence interval (0.72 (95% CI 0.35–1.60)). Applying the case-only approach produced a slightly lower ICR with a narrower confidence interval that reached statistical significance (0.59 (95% CI 0.28–0.95)). A possible protective (negative) interaction effect between blood chromium levels and blood glycohemoglobin levels can be inferred from this example.

The positive interaction effect between blood cobalt level and old age on albuminuria (micro and macro)

As the second example, Table 6 provides the sequential steps of applying the case-only approach to estimate the interaction effect between blood cobalt level and age in years on albuminuria. These steps follow those provided in subsection 2.3: (i) Firstly, a 1 μg/L difference in blood cobalt level corresponded to a 1.09-fold (95% CI 0.98–1.20) difference in the odds of albuminuria, which is not statistically significant. (ii) Secondly, a 1-year difference in age corresponded to a 1.05-fold (95% CI 1.04–1.05) difference in the odds of albuminuria. (iii) Thirdly, when a 1 μg/L difference in blood cobalt level and a 1-year difference in age coincide, the multiplicative ICR is 1.13 (95% CI 0.99–1.37), which is not statistically significant. (iv) Fourthly, in the population with cases and non-cases, blood cobalt level and age in years show a slight association and are not completely independent (S-E OR_c/nc: 1.06 (95% CI 1.03–1.10)). Therefore, the case-only ICR must be multiplied by the S-E OR_c/nc to obtain the ICR_c/nc according to Eq. (2). (v) Next, when only the cases are analyzed (case-only approach), the case-only ICR is 1.14 (95% CI 1.03–1.37), which is statistically significant (a positive interaction effect). (vi) Finally, multiplying the ICR_CO by the S-E OR_c/nc produced the adjusted ICR_CO of 1.21 (95% CI 1.06–1.51).

In this example, the environmental exposure (blood cobalt level) and the susceptibility factor (age in years) are not independent in the population with cases and non-cases. Therefore, the case-only ICR must be multiplied by the S-E OR_c/nc to produce the ICR_c/nc according to Eq. (2). The ICR acquired from the population with cases and non-cases was statistically equivocal (1.13 (95% CI 0.99–1.37)). However, after applying the case-only approach, the adjusted ICR_CO was slightly higher and statistically significant (1.21 (95% CI 1.06–1.51)). Therefore, a possible aggravating (positive) interaction effect between blood cobalt levels and ages in years can be inferred from this example.
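The handling of the S-E OR_c/nc in the two examples can be summarized in a short sketch. The decision rule used here (treat S and E as independent when the 95% CI of the S-E OR_c/nc includes 1, otherwise apply the correction in Eq. (2)) is an assumption based on how the two examples are reported, and the function name `icr_from_case_only` is hypothetical. Only the point estimate is computed; the sketch does not attempt to reproduce the reported confidence intervals.

```python
def icr_from_case_only(icr_co, se_or, se_or_ci):
    """Point estimate of ICR_c/nc derived from a case-only ICR.

    icr_co:   case-only interaction contrast ratio (ICR_CO)
    se_or:    S-E odds ratio in the population with cases and non-cases
    se_or_ci: (lower, upper) 95% CI of that odds ratio

    Assumed rule: if the CI includes 1, S and E are treated as
    independent and ICR_CO is used as is; otherwise ICR_CO is
    multiplied by the S-E OR_c/nc, following Eq. (2).
    """
    lower, upper = se_or_ci
    if lower <= 1.0 <= upper:
        return icr_co
    return icr_co * se_or


# Example 1: chromium and glycohemoglobin (values reported in Table 6-1).
example_1 = icr_from_case_only(0.59, 0.76, (0.47, 1.06))

# Example 2: cobalt and age (values reported in Table 6-2).
example_2 = icr_from_case_only(1.14, 1.06, (1.03, 1.10))

print(example_1)            # used without correction
print(round(example_2, 2))  # 1.14 x 1.06, rounded to two decimals
```

For the second example the product 1.14 × 1.06 = 1.2084 rounds to the reported 1.21.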

Discussion

Many previous studies dealt with various aspects of the case-only approach, usually in the context of gene-environment interaction studies or gene-gene interaction studies [5, 7, 9, 11, 14]. Some studies compared the case-only ICR with the ICR from the case-control design, whereas others compared the case-only ICR with the ICR from the population with cases and non-cases. This study incorporated the previous literature and systematically organized the logic and equations it provides. As a result, the definitions and equations for the ICR in the case-only design can be set out alongside those for the ICR in the population with cases and non-cases (cohort/case-control studies). This systematic organization of concepts from 3 study designs is the original contribution of this study.

Furthermore, this study extended the case-only approach, which had been used usually in gene-environment interaction or gene-gene interaction studies, to a more general concept of the interaction effect estimation between susceptibility factors and environmental exposures. If the independence assumption between a susceptibility factor and an environmental exposure is fulfilled, even though the ‘gene’ is replaced with the ‘susceptibility factor,’ the same equations can be applied. Therefore, the case-only approach can also be applied to environmental epidemiology.

The preventive (negative) interaction effect between blood chromium levels and glycohemoglobin levels on albuminuria (micro and macro)

The adverse effect of chromium on kidney function was reported in some previous literature [19, 20]. Glycohemoglobin level ≥ 6.5% is a diagnostic criterion for diabetes mellitus and is naturally associated with diabetic nephropathy [21]. Albuminuria, including micro-albuminuria and macro-albuminuria, has been used both as a useful initial marker for kidney damage and a marker associated with an increased risk of progressive renal diseases [22, 23]. However, a possible protective interaction effect is being increasingly reported for the interaction effect between chromium exposure and diabetic chronic kidney disease, based on improved glucose tolerance and insulin sensitivity [24‐28].

The result of this study illustrates well a protective interaction effect between blood chromium level (environmental exposure) and blood glycohemoglobin level (susceptibility factor) on the albuminuria status (outcome). This protective interaction effect of chromium on diabetic patients with nephropathy can be used for establishing a future effective treatment strategy for diabetic nephropathy. For example, a study reports a possible positive effect of prescribing a nano chromium metal-organic framework on diabetic chronic kidney disease patients [24].

The aggravating (positive) interaction effect between blood cobalt levels and old ages on albuminuria (micro and macro)

The effect of blood cobalt levels on kidney function is not yet established, with only a few studies reporting possible adverse effects, mainly in experimental animals [29]. However, the effect of aging on decreasing kidney function is relatively well established [30, 31]. Furthermore, the fact that this aging kidney is susceptible to various toxic substances is well known through numerous studies [32‐35]. From these pieces of evidence, we can infer that the aging kidney could be more susceptible to the possible toxic effect of cobalt, even if it is almost non-toxic to the young kidney.

The result of this study illustrates well this toxin-susceptible feature of the aging kidney (susceptibility factor) to cobalt exposure (environmental exposure). As a marker of kidney damage, the proportion of albuminuria was greater in the older subjects. The result of this study can be used to devise a protective environmental health strategy for aging people with an increased possibility of exposure to heavy metals, such as cobalt.

Conclusion

This study summarized the previously reported logic and equations about the case-only approach systematically. In particular, the associated definitions and equations are collectively summarized from the cohort and case-control (study designs with cases and non-cases) to case-only studies. By substituting the ‘susceptibility factor’ concept from environmental epidemiology for the conventional ‘gene’ concept from genetic epidemiology, this study broadened the applicability of the case-only approach to broad environmental health topics. If the independence assumption between a susceptibility factor and an environmental exposure in the population with cases and non-cases is kept, this case-only approach can provide a more precise interaction effect estimate than that from study designs with cases and non-cases (cohort/case-control studies). Finally, 2 analysis examples of the case-only approach using the US NHANES datasets were explained. The protective interaction effect between blood chromium levels and blood glycohemoglobin levels and the aggravating interaction effect between blood cobalt levels and increasing ages on the incidence of albuminuria must be investigated meticulously in future studies. In summary, the case-only approach can be a useful approach not only in genetic epidemiology but also in environmental epidemiology.

Acknowledgments

The author appreciates the reviewers’ comments on this study (Reviewer #1 and David M. Thompson). In particular, the comments from David M. Thompson were of great help in improving the quality and logic of the primary manuscript. This work was supported by an INHA UNIVERSITY Research Grant.

Declarations

Ethics approval and consent to participate

This study used only the publicly available National Health and Nutrition Examination Survey (NHANES) datasets. These datasets can be accessed on the NHANES homepage (https://www.cdc.gov/nchs/nhanes/index.htm). For the datasets, the information about Ethics Review Board (ERB) approval can be found on https://www.cdc.gov/nchs/nhanes/irba98.htm.

The authors confirm that all experiments were performed in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors have no potential competing interests to disclose.

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary material A. The used R codes for the statistical analyses. Supplementary material B. The S-E independence in the controls cannot replace the S-E independence in the population with cases and non-cases [1]. Supplementary material C. How strong a rare disease assumption is required for the equality between S-E OR_c/nc and S-E OR_control [1]. Supplementary material D. Violation of independence: confounder [1].

1. Clayton D, McKeigue PM. Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet. 2001;358(9290):1356–60.

2. Hogan MD, Kupper LL, Most BM, Haseman JK. Alternatives to Rothman's approach for assessing synergism (or antagonism) in cohort studies. Am J Epidemiol. 1978;108(1):60–7.

3. Knol MJ, Egger M, Scott P, Geerlings MI, Vandenbroucke JP. When one depends on the other: reporting of interaction in case-control and cohort studies. Epidemiology. 2009;20:161–6.

4. Skrondal A. Interaction as departure from additivity in case-control studies: a cautionary note. Am J Epidemiol. 2003;158(3):251–8.

5. Dennis J, Hawken S, Krewski D, Birkett N, Gheorghe M, Frei J, et al. Bias in the case-only design applied to studies of gene-environment and gene-gene interaction: a systematic review and meta-analysis. Int J Epidemiol. 2011;40(5):1329–41.

6. VanderWeele TJ, Hernández-Díaz S, Hernán MA. Case-only gene-environment interaction studies: when does association imply mechanistic interaction? Genet Epidemiol. 2010;34(4):327–34.

7. Li D, Conti DV. Detecting gene-environment interactions using a combined case-only and case-control approach. Am J Epidemiol. 2009;169(4):497–504.

8. Gatto NM, Campbell UB, Rundle AG, Ahsan H. Further development of the case-only design for assessing gene-environment interaction: evaluation of and adjustment for bias. Int J Epidemiol. 2004;33(5):1014–24.

9. Wang L-Y, Lee W-C. Population stratification bias in the case-only study for gene-environment interactions. Am J Epidemiol. 2008;168(2):197–201.

10. Albert PS, Ratnasinghe D, Tangrea J, Wacholder S. Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol. 2001;154(8):687–93.

11. Yang Q, Khoury MJ, Sun F, Flanders WD. Case-only design to measure gene-gene interaction. Epidemiology. 1999;10(2):167–70.

12. Schmidt S, Schaid DJ. Potential misinterpretation of the case-only study to assess gene-environment interaction. Am J Epidemiol. 1999;150(8):878–85.

13. Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene environment interaction: case-control studies with no controls! Am J Epidemiol. 1996;144(3):207–13.

14. Dai JY, Liang CJ, LeBlanc M, Prentice RL, Janes H. Case-only approach to identifying markers predicting treatment effects on the relative risk scale. Biometrics. 2018;74(2):753–63.

15. Richardson DB, Kaufman JS. Estimation of the Relative Excess Risk Due to Interaction and Associated Confidence Bounds. Am J Epidemiol. 2009;169(6):756–60.

16. Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case-control studies. Stat Med. 1994;13(2):153–62.

17. Rothman KJ, Greenland S, Lash TL. Modern epidemiology. Third Edition. Philadelphia: Lippincott Williams & Wilkins; 2008.

18. VanderWeele TJ, Knol MJ. A tutorial on interaction. Epidemiol Methods. 2014;3(1):33–72.

19. Tsai T-L, Kuo C-C, Pan W-H, Chung Y-T, Chen C-Y, Wu T-N, et al. The decline in kidney function with chromium exposure is exacerbated with co-exposure to lead and cadmium. Kidney Int. 2017;92(3):710–20.

20. Wedeen RP, Qian LF. Chromium-induced kidney disease. Environ Health Perspect. 1991;92:71–4.

21. American Diabetes Association. 2. Classification and diagnosis of diabetes: standards of medical care in diabetes—2019. Diabetes Care. 2019;42(Supplement 1):S13–28.

22. Levey AS, Becker C, Inker LA. Glomerular filtration rate and albuminuria for detection and staging of acute and chronic kidney disease in adults: a systematic review. JAMA. 2015;313(8):837–46.

23. Heerspink HJL, Gansevoort RT. Albuminuria is an appropriate therapeutic target in patients with CKD: the pro view. Clin J Am Soc Nephrol. 2015;10(6):1079–88.

24. Fakharzadeh S, Kalanaky S, Argani H, Dadashzadeh S, Torbati PM, Nazaran MH, et al. Ameliorative effect of a nano chromium metal–organic framework on experimental diabetic chronic kidney disease. Drug Dev Res. 2021;82(3):393–403.

25. Huang H, Chen G, Dong Y, Zhu Y, Chen H. Chromium supplementation for adjuvant treatment of type 2 diabetes mellitus: Results from a pooled analysis. Mol Nutr Food Res. 2018;62(1):1700438.

26. Yin RV, Phung OJ. Effect of chromium supplementation on glycated hemoglobin and fasting plasma glucose in patients with diabetes mellitus. Nutr J. 2015;14(1):1–9.

27. Lewicki S, Zdanowski R, Krzyzowska M, Lewicka A, Debski B, Niemcewicz M, et al. The role of Chromium III in the organism and its possible use in diabetes and obesity treatment. Ann Agric Environ Med. 2014;21(2):331–5.

28. Sahin K, Onderci M, Tuzcu M, Ustundag B, Cikim G, Ozercan İH, et al. Effect of chromium on carbohydrate and lipid metabolism in a rat model of type 2 diabetes mellitus: the fat-fed, streptozotocin-treated rat. Metabolism. 2007;56(9):1233–40.

29. Naura AS, Sharma R. Toxic effects of hexaammine cobalt(III) chloride on liver and kidney in mice: Implication of oxidative stress. Drug Chem Toxicol. 2009;32(3):293–9.

30. Wetzels JFM, Kiemeney LALM, Swinkels DW, Willems HL, Heijer MD. Age- and gender-specific reference values of estimated GFR in Caucasians: The Nijmegen Biomedical Study. Kidney Int. 2007;72(5):632–7.

31. Coresh J, Astor BC, Greene T, Eknoyan G, Levey AS. Prevalence of chronic kidney disease and decreased kidney function in the adult US population: Third national health and nutrition examination survey. Am J Kidney Dis. 2003;41(1):1–12.

32. Wang X, Bonventre J, Parrish A. The Aging Kidney: Increased Susceptibility to Nephrotoxicity. Int J Mol Sci. 2014;15(9):15358–76.

33. Rosner MH. The pathogenesis of susceptibility to acute kidney injury in the elderly. Curr Aging Sci. 2009;2(2):158–64.

34. Schmitt R, Cantley LG. The impact of aging on kidney repair. Am J Physiol Ren Physiol. 2008;294(6):F1265–72.

35. Jerkić M, Vojvodić S, López-Novoa JM. The mechanism of increased renal susceptibility to toxic substances in the elderly. Int Urol Nephrol. 2001;32(4):539–47.

Title: Case-only approach applied in environmental epidemiology: 2 examples of interaction effect using the US National Health and Nutrition Examination Survey (NHANES) datasets
Authors: Jinyoung Moon, Hwan-Cheol Kim
Publication date: 1 December 2022
Publisher: BioMed Central
Published in: BMC Medical Research Methodology, Issue 1/2022
Electronic ISSN: 1471-2288
DOI: https://doi.org/10.1186/s12874-022-01706-6
