As a single-equation model, Equation (1) can be expressed as
\( \operatorname{logit}\left({\pi}_{ij}\right)={\gamma}_{00}+{\gamma}_{10}{X}_{ij}+{\gamma}_{01}{Z}_j+{\gamma}_{11}\left({Z}_j{X}_{ij}\right)+{u}_{1j}{X}_{ij}+{u}_{0j} \) with
\( \left[\begin{array}{c}{u}_{0j}\\ {u}_{1j}\end{array}\right]\sim N\left(\left[\begin{array}{c}0\\ 0\end{array}\right],\left[\begin{array}{cc}{\sigma}_0^2& {\sigma}_{01}\\ {\sigma}_{01}& {\sigma}_1^2\end{array}\right]\right) \) by assumption, where \( i \) denotes Level 1 units and \( j \) indexes Level 2 clusters. In (1), \( \beta \) denotes Level 1 regression coefficients, \( \gamma \) is used for Level 2 regression coefficients and \( u \) stands for a random effect. The coefficient
\( {\gamma}_{11} \) refers to a cross-level interaction. This type of interaction effect is commonplace in contextual effects modelling, where the Level 1 predictor is an individual-level variable, such as minority status or disease exposure, and the Level 2 predictor is a cluster-level variable, such as neighbourhood socioeconomic status or area-level pollution [28]. For instance, consider the hypothetical scenario in which one wishes to model the odds of a person’s infection as a function of the patient’s disease exposure (a Level 1 predictor) and area-level measures of pollution (a Level 2 predictor). A cross-level interaction between exposure and pollution (e.g., higher levels of pollution raise the odds of infection among those exposed to the disease, whereas the same pattern does not occur among individuals who were not exposed) would be an example of a contextual effects model in which an interaction between Level 2 and Level 1 predictors is needed to further understand the phenomenon being studied.
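To make the data-generating model concrete, the following is a minimal sketch of how binary outcomes for one cluster could be drawn from the single-equation model above. It is an illustration only, not the authors' simulation code (which was written in R). The function names, the intercept value of 0, the random-effect correlation `rho` and the 50/50 split for the binary predictor are assumptions made for this example.

```python
import math
import random


def linear_predictor(x, z, gamma, u0, u1):
    """logit(pi_ij) = g00 + g10*X + g01*Z + g11*(Z*X) + u1*X + u0."""
    g00, g10, g01, g11 = gamma
    return g00 + g10 * x + g01 * z + g11 * (z * x) + u1 * x + u0


def simulate_cluster(n_units, z_j, gamma, sd_u0, sd_u1, rho=0.0, p_x=0.5, rng=random):
    """Draw (x, z, y) triples for one Level 2 cluster.

    rho (correlation between random intercept and slope) and p_x
    (probability that the binary Level 1 predictor equals 1) are
    illustrative assumptions.
    """
    # Correlated random intercept and slope from two independent normals.
    e0, e1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    u0 = sd_u0 * e0
    u1 = sd_u1 * (rho * e0 + math.sqrt(1.0 - rho ** 2) * e1)

    rows = []
    for _ in range(n_units):
        x = 1 if rng.random() < p_x else 0
        eta = linear_predictor(x, z_j, gamma, u0, u1)
        pi = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
        y = 1 if rng.random() < pi else 0
        rows.append((x, z_j, y))
    return rows


# Medium effect sizes; the intercept of 0 is an assumption.
gamma_medium = (0.0, 0.5, 0.3, 0.3)
rows = simulate_cluster(
    n_units=20,
    z_j=0.3,
    gamma=gamma_medium,
    sd_u0=math.sqrt(math.pi ** 2 / 7),
    sd_u1=math.sqrt(0.3),
    rng=random.Random(1),
)
```

Repeating the call once per cluster, with a fresh value of `z_j` each time, would give one simulated dataset for a given combination of Level 1 and Level 2 sample sizes.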
The degree of between-cluster relatedness was set through the intraclass correlation coefficient (ICC), calculated from an intercept-only model, \( \operatorname{logit}\left({\pi}_{ij}\right)={\gamma}_{00}+{u}_{0j} \), using the formula \( ICC=\frac{\sigma_0^2}{\sigma_0^2+{\sigma}_e^2} \), where \( {\sigma}_e^2=\frac{\pi^2}{3} \) denotes the variance of a standard logistic distribution. Medium and large effect sizes, as defined in Cohen [29], were used to populate Equation (1). The effect sizes for the binary predictor are expressed in standardized mean difference units, whereas those for the continuous predictor use the correlation metric, matching the recommendations presented in Cohen [29].
- \( {\sigma}_0^2=\frac{\pi^2}{7} \) (medium effect size) as the variance of the random intercept, which results in an ICC of \( \frac{\frac{\pi^2}{7}}{\frac{\pi^2}{7}+\frac{\pi^2}{3}}=0.3 \), and \( {\sigma}_0^2=\frac{\pi^2}{3} \) (large effect size) for an ICC of \( \frac{\frac{\pi^2}{3}}{\frac{\pi^2}{3}+\frac{\pi^2}{3}}=0.5 \).
- \( {\sigma}_1^2=0.3 \) (medium effect size) and \( {\sigma}_1^2=0.5 \) (large effect size) for the variance of the random slope.
- \( {\gamma}_{10}=0.5 \) (medium effect size) and \( {\gamma}_{10}=0.8 \) (large effect size) for the regression coefficient of the binary predictor.
- \( {\gamma}_{01}=0.3 \) (medium effect size) and \( {\gamma}_{01}=0.5 \) (large effect size) for the regression coefficient of the continuous predictor.
- \( {\gamma}_{11}=0.3 \) (medium effect size) and \( {\gamma}_{11}=0.5 \) (large effect size) for the cross-level interaction effect.
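The relation between the random-intercept variance and the ICC can be checked numerically. The sketch below applies the ICC formula with the logistic residual variance \( \frac{\pi^2}{3} \) and also inverts it to obtain the intercept variance implied by a target ICC. The function names are hypothetical and chosen only for this illustration.

```python
import math

# Variance of the standard logistic distribution (Level 1 residual variance).
LOGISTIC_VAR = math.pi ** 2 / 3


def icc(var_intercept):
    """ICC of the intercept-only logistic model for a given intercept variance."""
    return var_intercept / (var_intercept + LOGISTIC_VAR)


def intercept_variance_for(target_icc):
    """Invert the ICC formula: sigma_0^2 = ICC * sigma_e^2 / (1 - ICC)."""
    return target_icc * LOGISTIC_VAR / (1.0 - target_icc)


print(round(icc(math.pi ** 2 / 7), 3))  # 0.3
print(round(icc(math.pi ** 2 / 3), 3))  # 0.5
```

Solving the formula for the variance shows that an ICC of 0.3 corresponds to \( {\sigma}_0^2=\frac{\pi^2}{7} \) and an ICC of 0.5 to \( {\sigma}_0^2=\frac{\pi^2}{3} \), that is, an intercept variance equal to the residual variance.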
For the continuous predictor distribution, three levels of skewness were used: normally-distributed predictors (i.e., skewness of 0), a chi-square distribution with 5 degrees of freedom (i.e., moderate skewness of \( \sqrt{\frac{8}{5}} \)) and a chi-square distribution with 1 degree of freedom (i.e., extreme skewness of \( \sqrt{8} \)). The levels of skewness are similar to those encountered in real datasets as reported by Micceri [30], Blanca et al. [31] and Cain, Zhang and Yuan [24].
For the binary categorical predictor, three conditions were studied: balanced (i.e., a 50/50 split between the incidence group marked as 1 and the no-incidence group marked as 0), moderate imbalance (i.e., a 30/70 split with 30% of the sample showing incidence) and extreme imbalance (i.e., a 10/90 split with only 10% of the sample showing incidence).
Three cases with representative scenarios were studied in an attempt to better understand the relationship between power and distributional assumptions:
- Case (1): Three scenarios: (a) a “benchmark scenario” with a standard, normally-distributed Level 2 predictor (\( Z \)) and an evenly-balanced, dummy-coded Level 1 predictor (\( X \)); (b) a normally-distributed Level 1 predictor (\( X \)) and an extremely unbalanced Level 2 binary predictor (\( Z \)); and (c) an extremely skewed Level 1 predictor (\( X \)) and a perfectly-balanced Level 2 predictor (\( Z \)). Medium effect sizes were used throughout.
- Case (2): Moderately and extremely unbalanced Level 1 predictor (\( X \)) with moderately and extremely skewed Level 2 predictor (\( Z \)). Medium and large effect sizes were used.
- Case (3): Moderately and extremely skewed Level 1 predictor (\( X \)) with moderately and extremely unbalanced Level 2 predictor (\( Z \)). Medium effect sizes were used.
The Level 1 sample sizes were set to \( {N}_1 \) = 10, 11, 12, … , 99, 100 and the Level 2 sample sizes to \( {N}_2 \) = 10, 30, 50, 70, 90, 110.
Note that the Level 1 sample sizes are nested within the Level 2 sample sizes. For instance, in the first simulation condition there are 10 clusters, and the number of Level 1 units per cluster takes each of the values 10, 11, 12, … , 99, 100 in turn, so that the largest total sample size in that condition is 10 clusters × 100 sample units per cluster = 1000 sample units. In the second condition there are 30 clusters, each with 10, 11, 12, … , 99, 100 units, and so on for all possible combinations of Level 1 and Level 2 sample sizes.
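The crossing of Level 1 and Level 2 sample sizes described above can be enumerated directly. The following sketch lists every combination together with its total sample size; the variable and function names are hypothetical and serve only to illustrate the design.

```python
from itertools import product

LEVEL1_SIZES = range(10, 101)             # units per cluster: 10, 11, ..., 100
LEVEL2_SIZES = (10, 30, 50, 70, 90, 110)  # number of clusters


def sample_size_grid():
    """Return (n1, n2, total) for every Level 1 x Level 2 combination."""
    return [(n1, n2, n1 * n2) for n2, n1 in product(LEVEL2_SIZES, LEVEL1_SIZES)]


grid = sample_size_grid()
print(len(grid))   # 546 combinations (91 Level 1 sizes x 6 Level 2 sizes)
print(grid[0])     # (10, 10, 100)
print(grid[-1])    # (100, 110, 11000)
```

Under this reading of the design, total sample sizes range from 100 units (10 clusters of 10) to 11000 units (110 clusters of 100).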
The simulations were all conducted in the R programming language using the simglm, paramtest and lme4 packages. Gaussian quadrature integration was used for estimation, and Wald-type standard errors and \( p \)-values were employed to calculate the power of the fixed effects. Statistical significance for the random effects was evaluated via the recommended one-degree-of-freedom likelihood-ratio test, in which a chi-square difference test is conducted between the reduced model and the extended model with the added random effects [9‐11, 20]. For each combination of simulation conditions, 1000 replications were run, and the proportion of replications in which a parameter estimate was statistically significant was taken as the empirical power for that parameter. A nominal alpha of 5% was used to test the significance of the coefficients.
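The two computations described in this paragraph, the one-degree-of-freedom likelihood-ratio test and the empirical power as a proportion of significant replications, can be sketched as follows. This is an illustration under stated assumptions rather than the simulation code itself: the function names are hypothetical, and the log-likelihoods and \( p \)-values are assumed to have been obtained already from fitted models.

```python
import math


def lrt_p_value(loglik_reduced, loglik_full):
    """p-value of a one-degree-of-freedom likelihood-ratio (chi-square difference) test.

    For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return math.erfc(math.sqrt(stat / 2.0))


def empirical_power(p_values, alpha=0.05):
    """Proportion of replications whose p-value falls below the nominal alpha."""
    p_values = list(p_values)
    return sum(p < alpha for p in p_values) / len(p_values)


# Toy usage with four made-up p-values (not results from the study).
print(empirical_power([0.01, 0.20, 0.04, 0.50]))  # 0.5
```

In a full simulation, `empirical_power` would receive the 1000 \( p \)-values collected for one parameter under one combination of conditions.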