Evidence for the reliability and validity of responses to each revised (e.g., WBSE scale) or new (e.g., WBASE scale) instrument will be evaluated prior to testing either the primary or secondary hypotheses, consistent with standards for psychological testing [51]. Exploratory structural equation modelling will be used to fit Time 1 data [52]. Latent variable reliability will be measured with coefficient H [53]. Composite score reliability will be assessed with coefficient omega [54] and coefficient alpha [55].
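To make the three reliability coefficients concrete, the following is a minimal sketch of their textbook formulas, assuming a single-factor model with standardized loadings, factor variance fixed to 1, and uncorrelated residuals. The function names and inputs are illustrative assumptions; the protocol itself obtains these quantities from the fitted exploratory structural equation model.

```python
from statistics import variance


def coefficient_h(loadings):
    """Coefficient H from standardized loadings on a single factor."""
    s = sum(l ** 2 / (1 - l ** 2) for l in loadings)
    return s / (1 + s)  # algebraically equal to 1 / (1 + 1 / s)


def coefficient_omega(loadings, residual_variances):
    """Omega for a unit-weighted composite (one factor, uncorrelated residuals)."""
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))


def coefficient_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of scores per item."""
    k = len(item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    item_var = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / variance(totals))


if __name__ == "__main__":
    loadings = [0.7, 0.7, 0.7]            # hypothetical standardized loadings
    residuals = [1 - l ** 2 for l in loadings]
    print(coefficient_h(loadings))
    print(coefficient_omega(loadings, residuals))
```

With equal loadings, as in this hypothetical example, coefficient H and omega coincide; they diverge when loadings differ, because H weights items optimally while omega assumes a unit-weighted composite.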
To test the a priori hypotheses previously stated, three general models will be fit for each primary and secondary outcome in Mplus 8.0 [56]. The estimator for each model will be maximum-likelihood with standard errors that are robust to conditional non-normality. In each model the main focus will be to examine the mean difference between the FFW group and the UC group on a proposed outcome at both Time 2 and Time 3. Alternate specifications of each model will also be considered to examine some model-based assumptions [57]. The first model (i.e., Model A) will follow an intent-to-treat approach [58] by estimating the effect of being allocated to the treatment (i.e., FFW in this case) condition (i.e., ITT or γ). The second model (i.e., Model B) will follow a complier average causal effect (i.e., CACE) approach [59‐62] by estimating the effect of being allocated to the treatment condition for compliers with the FFW intervention (i.e., γc). The third model (i.e., Model C) will follow a CACE approach by estimating the effect of being allocated to the treatment condition for non-compliers with the FFW intervention (i.e., γnt) in addition to estimating γc. Fitting Model C provides a way of evaluating the sensitivity of Model B [45]. Model B and Model C both employ CACE estimation, where non-compliers will be conceptualized as never-takers, consistent with the assumptions of CACE methodology detailed in the relevant literature [59‐62]. If the level of engagement is below 50%, a CACE approach (e.g., Models B and C) will be favored [63]. If the level of engagement is at least 50%, an ITT approach (e.g., Model A) will be favored [63]. For exploratory hypotheses, bias-corrected bootstrapped estimates of 95% confidence intervals for indirect effects within a path model will be obtained with the number of draws set to 2000 under an ITT approach [64]. Missing data (e.g., dropout) will be reported (e.g., in a flow diagram) and modeled under the missing at random assumption [65], consistent with previous FFW research [1].
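The relation between the ITT and CACE estimands can be illustrated with a small sketch. It uses the simple moment-based ratio estimator (ITT effect divided by the proportion of compliers), which holds under the exclusion restriction when non-compliers are all never-takers. This is an illustrative stand-in and not the latent class model that the protocol fits in Mplus; the function names and the example numbers are assumptions.

```python
def itt_effect(mean_ffw, mean_uc):
    """Intent-to-treat effect: difference in means by allocated group."""
    return mean_ffw - mean_uc


def cace_from_itt(itt, compliance_rate):
    """Ratio estimator of the complier average causal effect.

    Assumes never-takers receive no effect of allocation (exclusion
    restriction) and that nobody in the control arm can access treatment.
    """
    if not 0 < compliance_rate <= 1:
        raise ValueError("compliance_rate must be in (0, 1]")
    return itt / compliance_rate


def favored_approach(engagement_rate, threshold=0.50):
    """Decision rule stated in the text: CACE below 50% engagement, ITT otherwise."""
    if engagement_rate < threshold:
        return "CACE (Models B and C)"
    return "ITT (Model A)"


if __name__ == "__main__":
    itt = itt_effect(mean_ffw=3.10, mean_uc=3.00)    # hypothetical adjusted means
    print(cace_from_itt(itt, compliance_rate=0.25))  # ITT diluted by 75% never-takers
    print(favored_approach(0.316))
```

The sketch shows why low engagement matters: the same effect among compliers produces a smaller ITT effect as the share of never-takers grows.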
Effect size
Effect size will be estimated in each model by dividing the mean difference by the square root of the variance pooled across the UC and FFW groups. In Model A this effect size estimate is equivalent to Cohen’s d [67]. In Model B and Model C this effect size estimate can be regarded as an extension of Cohen’s d to a latent class framework [57]. For parsimony, we will denote the estimated effect size in each model as Cohen’s d henceforth. Similarly, we will use the heuristics put forth by Cohen [67] to describe the magnitude of the absolute value of Cohen’s d: 0.20 (small), 0.50 (medium) and 0.80 (large).
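The effect size computation described above can be sketched as follows for the Model A case. The sketch works on raw observed scores, whereas the protocol computes the effect size from model-based (adjusted) estimates. Treating Cohen's benchmarks as lower bounds of labelled bands is an assumption made only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(ffw, uc):
    """Variance pooled across the two groups, weighted by degrees of freedom."""
    n1, n2 = len(ffw), len(uc)
    return ((n1 - 1) * variance(ffw) + (n2 - 1) * variance(uc)) / (n1 + n2 - 2)


def cohens_d(ffw, uc):
    """Mean difference (FFW minus UC) divided by the pooled standard deviation."""
    return (mean(ffw) - mean(uc)) / sqrt(pooled_variance(ffw, uc))


def magnitude(d):
    """Label |d| using Cohen's 0.20 / 0.50 / 0.80 benchmarks (assumed as band floors)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


if __name__ == "__main__":
    ffw_scores = [2, 3, 4, 5, 6]   # hypothetical outcome scores
    uc_scores = [1, 2, 3, 4, 5]
    d = cohens_d(ffw_scores, uc_scores)
    print(round(d, 3), magnitude(d))
```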
Model B in more detail
Model B will impose a latent class (with two classes) regression model with CACE estimation for each proposed outcome with measures taken at Time 2 and Time 3 as the dependent variables. The first class (i.e., Class 1) will be conceptualized as never-takers. The second class (i.e., Class 2) will be conceptualized as compliers. A dichotomous indicator of latent class (where 0 = non-compliers in the FFW group, 1 = compliers in the FFW group, and a missing value for participants in the UC group) will be generated. Compliance classification, modeled as a categorical latent variable, will be regressed on covariates. Covariates, the outcome at Time 1 and group allocation will serve as predictors of the outcome at Time 2 and Time 3 and these regression coefficients will be freely estimated in Class 1 and Class 2. The two direct effects from group allocation to the outcome at Time 2 and Time 3 will be fixed to 0 in Class 1 (i.e., the exclusion restrictions: γntTime2 = γntTime3 = 0), and will be estimated freely in Class 2 (i.e., γcTime2, γcTime3). The intercepts for the outcome at Time 2 and Time 3 will be estimated freely in each class. Covariance between the error terms for the outcome at Time 2 and Time 3 will be estimated freely in each class. The focal parameters will be the direct effects from group allocation to the outcome at Time 2 and Time 3 in Class 2 (i.e., γcTime2, γcTime3).
A positive focal parameter value will convey that compliers in the FFW group had a higher adjusted mean for the outcome as compared to potential compliers in the UC group.
Model C in more detail
Model C will estimate each parameter estimated in Model B while relaxing the exclusion restriction (i.e., freely estimating γntTime2 and γntTime3), making Model B nested within Model C. The change in the robust likelihood ratio χ2 test, \( \Delta \chi_R^2 \), will formally compare the fit of these nested models. There is a substantive and a methodological rationale for evaluating the plausibility of the exclusion restriction assumption in the FFW online behavioral intervention. From a substantive standpoint the researchers may expect, based on results from the 2015 FFW efficacy trial [1], that some of the participants allocated to the FFW group may engage with the intervention at a level that yields an FFW engagement score greater than 0 (i.e., no engagement) but less than 21 (i.e., full participation). From a methodological standpoint, it is well known that the estimate of γc can be biased when the true γnt effect is not zero but is forced to equal zero, particularly when compliance with the intervention is not high [45]. Therefore, the focal parameters in this model will be both the γcTime2 and γcTime3 effects and the γntTime2 and γntTime3 effects.
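As an illustration of the nested-model comparison, the sketch below computes a scaled loglikelihood difference statistic of the kind commonly used with robust maximum-likelihood estimation, from each model's loglikelihood, scaling correction factor, and number of free parameters. Whether this exact computation is the one the authors will use is an assumption, and all input values shown are hypothetical. The p-value helper covers only even degrees of freedom, which is sufficient here because Model C frees two parameters.

```python
from math import exp, factorial


def scaled_loglik_difference(ll0, c0, p0, ll1, c1, p1):
    """Scaled difference test for nested models estimated with robust ML.

    Model 0 is the nested (restricted) model, e.g. Model B; model 1 is the
    comparison model, e.g. Model C. ll = loglikelihood, c = scaling
    correction factor, p = number of free parameters.
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)   # difference-test scaling correction
    statistic = -2.0 * (ll0 - ll1) / cd
    return statistic, p1 - p0


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form implemented for positive even df only")
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


if __name__ == "__main__":
    # Hypothetical fit results for Model B (nested) and Model C (comparison).
    stat, df = scaled_loglik_difference(ll0=-1000.0, c0=1.20, p0=20,
                                        ll1=-996.0, c1=1.25, p1=22)
    print(stat, df, chi2_sf_even_df(stat, df))
```

A small p-value would suggest that the exclusion restriction imposed in Model B worsens fit. The difference-test scaling correction can occasionally be negative in practice, in which case the statistic is not interpretable and an alternative computation is needed.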
Statistical power estimation
The probability of rejecting a false null hypothesis for each focal parameter (i.e., γcTime2, γcTime3) was estimated (N = 900) in Mplus 8.0 using Monte Carlo methods [68] under the assumption that engagement is likely to be less than 50% [1]. For each of the focal parameters in Model B the population parameter value corresponded to either a small (i.e., d = 0.20), moderate (i.e., d = 0.50), or large (i.e., d = 0.80) positive effect. A range of effect sizes was modeled, consistent with relevant recommendations in exercise science [69]. The population model assumed an engagement rate of 25%, 45%, or 65% based upon results observed in the 2015 FFW efficacy trial [1]. In the 2015 FFW efficacy trial engagement ranged from 15.6% to 54.9% with a mean of 31.6% across dimensions of well-being [1]. Our simulations assume a ~10% increase in engagement, which we believe may result from the new remuneration plan. Missing data (i.e., 35% at Time 2 and 40% at Time 3) were modeled based upon results observed in the 2015 FFW efficacy trial [1]. The number of replications requested was 10,000. Each replication was drawn from a conditionally multivariate normal distribution.
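The logic of Monte Carlo power estimation can be sketched in a few lines: repeatedly generate data from an assumed population model, test the null hypothesis in each replication, and report the proportion of rejections. The sketch below is a deliberately simplified stand-in. It uses a single outcome, no covariates, dropout that is completely at random, and a z-test on the ITT contrast (which is zero whenever the complier effect is zero under the exclusion restriction). It is not the latent class model simulated in Mplus and will not reproduce the values in Table 2; all function names are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def simulate_trial(n, d, engagement, missing, rng):
    """One replication: outcomes for retained FFW and UC participants."""
    ffw, uc = [], []
    for i in range(n):
        allocated = i % 2 == 0                  # alternate allocation, 1:1
        complier = rng.random() < engagement    # latent compliance status
        y = rng.gauss(0.0, 1.0)
        if allocated and complier:
            y += d                              # effect only for treated compliers
        if rng.random() < missing:
            continue                            # dropout, assumed completely at random
        (ffw if allocated else uc).append(y)
    return ffw, uc


def rejects_null(ffw, uc, alpha=0.05):
    """Two-sided z-test of the difference in group means."""
    se = sqrt(variance(ffw) / len(ffw) + variance(uc) / len(uc))
    z = (mean(ffw) - mean(uc)) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def estimate_power(n=900, d=0.20, engagement=0.25, missing=0.35,
                   reps=500, seed=1):
    """Proportion of replications in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    hits = sum(rejects_null(*simulate_trial(n, d, engagement, missing, rng))
               for _ in range(reps))
    return hits / reps


if __name__ == "__main__":
    for rate in (0.25, 0.45, 0.65):
        print(rate, estimate_power(d=0.50, engagement=rate))
```

Setting `d=0.0` gives a quick check of the simulation itself, since the rejection rate should then be close to the nominal alpha.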
Table 2 provides the power estimates for γcTime2 and γcTime3. Estimated power for a small effect ranged from 0.30 (25% engagement) to 0.74 (65% engagement). Estimated power for a moderate effect ranged from 0.95 (25% engagement) to 1.00 (at least 45% engagement). Estimated power for a large effect equaled 1.00. We conclude that we are likely to have low to moderate power for small effects (depending on engagement level) and high power for moderate and large effects. Budgetary constraints preclude enrollment of more than approximately 900 participants.
Table 2
Power Estimation for the Complier Average Causal Effect at Time 2 and at Time 3
Cohen’s d | Engagement | Power for γcTime2 | Power for γcTime3 |
0.20 | 25% | 0.32 | 0.30 |
0.20 | 45% | 0.54 | 0.51 |
0.20 | 65% | 0.74 | 0.71 |
0.50 | 25% | 0.96 | 0.95 |
0.50 | 45% | 1.00 | 1.00 |
0.50 | 65% | 1.00 | 1.00 |
0.80 | 25% | 1.00 | 1.00 |
0.80 | 45% | 1.00 | 1.00 |
0.80 | 65% | 1.00 | 1.00 |