Introduction
Already by the age of 16, more than one out of ten children have suffered from an anxiety disorder or depression (Costello et al.
2003). Anxiety and depression are associated with adverse effects in meaningful life areas, including friendships, school performance, and family life, which results in suffering for the child (Birmaher et al.
1996; Donovan and Spence
2000). Moreover, anxiety and depression have been shown to predict future psychiatric diagnoses and increase the risk of suicidal behavior and substance abuse (Bittner et al.
2007; Costello et al.
2003) leading to significant costs for society (Snell et al.
2013). Given the high prevalence, the severe consequences for the individual, and the high costs to society, it is important to further evaluate methods to prevent anxiety and depression, especially since only about 30% of children suffering from anxiety or depressive disorders receive any mental health services (Bienvenu and Ginsburg
2007). Universal prevention is of particular interest, as it potentially involves low costs, does not involve the stigma associated with participation in targeted interventions, and provides an ideal opportunity to access the whole population. However, universal prevention, in contrast to targeted interventions, has generally yielded small effect sizes, often not significantly larger than zero (Stice et al.
2009; Teubert and Pinquart
2011). One program of interest, and potentially more effective than average (Fisak et al.
2011), is the widely evaluated Australian program FRIENDS for Life (FFL), a cognitive behavioral prevention program aimed at promoting mental health in children (Barrett
2010). The program developer and her colleagues have conducted three cluster-randomized trials to assess FFL as a universal prevention program. In summary, these trials consistently found significantly lower anxiety symptoms in the intervention groups than in the control groups at both post-test and follow-up. However, the results for depressive symptoms have been somewhat inconsistent between studies, showing both higher and lower depressive symptoms at post-test and lower depressive symptoms at follow-up. These studies involved interventions administered by teachers (Lowry-Webster et al.
2003), psychologists (Barrett et al.
2006), and teachers or psychologists (Barrett and Turner
2001). In these studies, no differences in effects were found between psychologists and teachers as administrators, which supports the generalizability and sustainability of teacher-administered FFL. Given the possibility of effectively administering FFL using school staff, the authors argue that FFL may be cost-effective and a good alternative for providing effective prevention to communities with a shortage of trained mental health professionals. However, and in contrast to this optimistic view, more recent studies outside of Australia have failed to replicate these findings when evaluating teacher-administered FFL. Three cluster-randomized trials have been conducted, two in Canada and one in Great Britain. In the two Canadian trials (Miller et al.
2011a,
b), school personnel administered the intervention. The results did not indicate a significant difference between intervention and control groups at post- or at follow-up. The trial in Great Britain (Stallard et al.
2014) found significantly lower child-rated anxiety and depressive symptoms in the mental health personnel-administered intervention group than in the control group at the 12-month follow-up. There were no significant differences between the teacher-administered intervention group and the control group. In the trials of FFL described above, training and supervision for facilitators varied between studies. Most studies report an intensive 1-day training, but two studies report a 2-day training (Lowry-Webster et al.
2003; Stallard et al.
2014). In contrast to the trials by Miller et al. (
2011a,
b), the trial conducted by Lowry-Webster et al. (
2003) included regular supervision together with the program leader over the course of the 10-week intervention. In the trial by Stallard et al. (
2014), teachers were offered supervision every 2 weeks, but the authors report that only a few teachers attended these sessions. In summary, the difference in results between studies of teacher-administered FFL could be partially explained by differences in the levels of training and supervision of teachers.
Children’s self-ratings have generally served as the only outcome measure in earlier trials of FFL, and studies that included parent ratings have suffered from high rates of missing data. Also, earlier randomized trials of FFL have generally suffered from inadequate statistical analyses because they failed to account for the clustering effects inherent in the trials’ designs. In short, ignoring clustering yields confidence intervals that are too narrow, which implies an increased risk of type I error (Ahlen et al.
2015). Different factors (e.g., age, gender, provider credentials) moderating the effect of preventive interventions have been reported (e.g., Stice et al.
2009; Teubert and Pinquart
2011). However, when only universal prevention is examined, these results have not been replicated. Further investigation of factors enhancing the effects of universal prevention programs is therefore very important (Ahlen et al. 2015). Our study aimed to evaluate a teacher-administered intervention with multiple informants to provide a comprehensive understanding of the effect of the intervention. Further, our study evaluated whether baseline symptoms, age, gender, and levels of supervision enhanced the effect. The following research questions were addressed: Does a teacher-administered FFL universal prevention program affect (1) children’s ratings of anxiety and depressive symptoms, (2) parents’ ratings of children’s anxiety symptoms and general mental health, (3) teachers’ ratings of children’s emotional problems, pro-social behavior, and academic achievement, and (4) the incidence of anxiety and depressive disorders? Also, (5) do baseline symptoms, gender, age, or teachers’ use of supervision enhance the effect of the intervention?
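The clustering problem described above can be made concrete with a small numerical sketch. It uses the standard design-effect approximation for equal-sized clusters, DEFF = 1 + (m − 1) × ICC, and is not an analysis from this or any earlier FFL trial; the class size, intraclass correlation, estimate, and standard error below are all hypothetical values chosen for illustration.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def cluster_adjusted_ci(estimate, naive_se, cluster_size, icc, z=1.96):
    """95% CI after inflating a naive SE by the square root of the design effect."""
    adjusted_se = naive_se * math.sqrt(design_effect(cluster_size, icc))
    return estimate - z * adjusted_se, estimate + z * adjusted_se


# Hypothetical inputs: classes of 25 children, ICC of 0.05,
# and a group difference of -1.0 with a naive SE of 0.45.
deff = design_effect(cluster_size=25, icc=0.05)
naive_ci = (-1.0 - 1.96 * 0.45, -1.0 + 1.96 * 0.45)
adjusted_ci = cluster_adjusted_ci(-1.0, 0.45, cluster_size=25, icc=0.05)

print(f"design effect: {deff:.2f}")
print(f"naive 95% CI:    [{naive_ci[0]:.2f}, {naive_ci[1]:.2f}]")
print(f"adjusted 95% CI: [{adjusted_ci[0]:.2f}, {adjusted_ci[1]:.2f}]")
```

With these made-up inputs, the naive interval excludes zero while the cluster-adjusted interval does not, which is the mechanism behind the inflated type I error risk.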
Results
Attendance, Adherence, and Social Acceptability
The attendance of students was monitored in the intervention group. Class-level median non-attendance ranged from 4.2 to 6.1%. Regarding attendance in supervision, three teachers did not attend supervision at all, eight attended the first session only, six attended two sessions, and three attended all three sessions offered. Seventeen teachers conducted all ten sessions in the program, two teachers conducted only eight sessions, and one teacher conducted six sessions. Unfortunately, only three teachers recorded sessions satisfactorily. Another three teachers recorded only small portions of their sessions. The remaining 14 teachers did not record any sessions. The social acceptability measure was completed by 90% of the children in the intervention group. A total of 80% of the children in the high-supervision group enjoyed FFL “much” or “some,” compared to 68% in the low-supervision group. A total of 79% in the high-supervision group thought that they had learned much or quite a lot about what to do when feeling scared or worried, compared to 69% in the low-supervision group. Furthermore, in the high-supervision group, 33% of classes reported that they had been given homework assignments every week, 44% some weeks, and 22% had not been assigned homework. In the low-supervision group, 9% of classes reported that they had been given homework assignments every week, 9% some weeks, and 82% had not been assigned homework.
Baseline Comparisons and Attrition Analyses
At baseline, we found a difference regarding age (t(690) = 7.27, p < .001), where the intervention group was significantly older than the control group (d = 0.55). We also found a difference regarding household income (χ²(3, N = 463) = 10.02, p = .02), where the intervention group had a significantly higher income than the control group (Cramer’s V = 0.15). There were also differences regarding teachers’ ratings of the children’s emotional problems and pro-social behavior at baseline (t(432) = 5.32, p < .001; t(432) = 2.11, p = .04), where the intervention group had significantly more emotional problems (d = 0.54) and fewer pro-social behaviors (d = 0.19). Consequently, because the age difference was not trivial in size, age was included as a covariate in all analyses, and baseline scores of emotional problems were included as a covariate in the analyses of teacher-rated emotional symptoms.
Regarding children who did not complete one or several assessment points, there were no differences in patterns of attrition between the intervention and control groups. Regarding parents, there was a difference in age: parents in the intervention group who did not complete measures had older children than parents in the control group who did not complete measures, at baseline, post-assessment, and follow-up assessment (t(211) = 3.44, p < .001; t(229) = 3.82, p < .001; t(281) = 5.06, p < .001). Missing teacher ratings were more frequent in the intervention group at baseline assessment (intervention group, n = 187; control group, n = 74; χ²(1, N = 695) = 72.74, p < .001) but, conversely, more frequent in the control group at follow-up (intervention group, n = 55; control group, n = 91; χ²(1, N = 695) = 12.73, p < .001).
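The test statistics for missing teacher ratings and the Cramer’s V for household income can be retraced with a few lines of arithmetic. The sketch below is not the authors’ analysis script. It assumes group totals of 353 (intervention) and 342 (control), inferred by adding the missing counts above to the teacher-rating sample sizes in Table 3, an uncorrected Pearson chi-square, and a 2 × 4 group-by-income table implied by the three degrees of freedom.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic and the table dimensions."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


# Assumed group totals (inferred, not stated in the text): 353 + 342 = 695
N_INTERVENTION, N_CONTROL = 353, 342

# Rows: intervention, control. Columns: missing, not missing.
chi2_baseline = pearson_chi2_2x2(187, N_INTERVENTION - 187, 74, N_CONTROL - 74)
chi2_followup = pearson_chi2_2x2(55, N_INTERVENTION - 55, 91, N_CONTROL - 91)

# Household income: reported chi-square of 10.02 with N = 463
v_income = cramers_v(10.02, 463, n_rows=2, n_cols=4)

print(f"baseline:  chi2 = {chi2_baseline:.2f}")
print(f"follow-up: chi2 = {chi2_followup:.2f}")
print(f"income:    V = {v_income:.2f}")
```

Under these assumptions the computed values agree with the reported statistics to two decimals.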
Intervention Effects
Table 3 displays descriptive statistics for all outcomes and measurement points. Effect sizes are presented below as positive when in the desired direction (e.g., when the intervention group showed lower anxiety symptoms than the control group). Two separate repeated measures LMMs showed that there were no significant group*time interactions over the intervention period regarding the child-rated questionnaires, the SCAS, B = −0.38, 95% CI [−2.48, 1.37], d = 0.02, and the CDI-S, B = −0.32, 95% CI [−0.71, 0.07], d = 0.11. Likewise, four repeated measures LMMs showed that there were no significant group*time interactions over the intervention period regarding the parent-rated questionnaires, the SCAS-P, B = 0.87, 95% CI [−0.46, 2.31], d = −0.03, the SDQ-Tot, B = −0.04, 95% CI [−0.72, 0.65], d = 0.01, the SDQ-Emo, B = −0.07, 95% CI [−0.35, 0.20], d = 0.06, or the SDQ-Pro, B = −0.14, 95% CI [−0.38, 0.10], d = −0.07. Two repeated measures LMMs showed that there were no significant group*time interactions over the intervention period regarding the teacher-rated SDQ-Pro subscale, B = −0.27, 95% CI [−0.79, 0.30], d = −0.06, or AP, B = 0.05, 95% CI [−0.09, 0.19], d = 0.12. Finally, an LMM showed no main effect of group at post-assessment regarding the teacher-rated SDQ-Emo subscale, B = −0.16, 95% CI [−0.79, 0.49], d = 0.04.
Table 3
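Means, standard deviations, and number of participants for pre-, post-, and follow-up assessments, broken down per condition from raw data
Measure | Intervention pre, M (SD) | n | Intervention post, M (SD) | n | Intervention follow-up, M (SD) | n | Control pre, M (SD) | n | Control post, M (SD) | n | Control follow-up, M (SD) | n |
Child ratings |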
SCAS | 26.60 (15.72) | 333 | 21.02 (15.11) | 320 | 20.49 (13.50) | 294 | 27.26 (14.40) | 322 | 21.78 (15.76) | 317 | 20.76 (13.54) | 279 |
CDI-S | 1.77 (2.50) | 329 | 1.72 (2.47) | 315 | 1.55 (2.49) | 292 | 1.82 (2.51) | 322 | 2.02 (3.06) | 310 | 1.63 (2.54) | 278 |
Parent ratings |
SCAS-P | 15.45 (9.33) | 237 | 15.06 (10.25) | 236 | 15.35 (10.94) | 197 | 14.6 (9.55) | 244 | 13.00 (8.27) | 226 | 13.92 (10.99) | 213 |
SDQ-total difficulties | 7.03 (5.42) | 232 | 7.52 (5.66) | 235 | 7.42 (6.00) | 193 | 6.13 (5.22) | 241 | 6.47 (5.33) | 226 | 6.28 (5.40) | 213 |
Emotional problems | 1.68 (1.91) | 232 | 1.61 (1.86) | 235 | 1.72 (2.02) | 193 | 1.24 (1.73) | 241 | 1.23 (1.65) | 226 | 1.29 (1.81) | 213 |
Pro-social behavior | 8.38 (1.83) | 232 | 8.19 (1.95) | 235 | 8.23 (1.83) | 193 | 8.50 (1.53) | 241 | 8.43 (1.63) | 226 | 8.43 (1.60) | 213 |
Teacher ratings |
Emotional problems | 2.31 (2.61) | 166 | 1.47 (1.96) | 259 | 1.62 (2.30) | 298 | 1.19 (1.75) | 268 | 1.27 (1.86) | 256 | 1.43 (2.08) | 251 |
Pro-social behavior | 6.76 (2.71) | 166 | 7.29 (2.76) | 259 | 7.32 (2.78) | 298 | 7.30 (2.50) | 268 | 7.58 (2.30) | 256 | 7.29 (2.80) | 251 |
School performance | 3.11 (0.80) | 131 | 3.17 (0.76) | 201 | 3.26 (0.81) | 298 | 3.18 (0.71) | 258 | 3.23 (0.82) | 256 | 3.17 (0.82) | 251 |
Two separate repeated measures LMMs showed that there were no significant group*time interactions over the whole period regarding the SCAS, B = −0.07, 95% CI [−1.10, 0.98], d = 0.01, and the CDI-S, B = −0.09, 95% CI [−0.31, 0.13], d = 0.07. Four repeated measures LMMs showed that there were no significant group*time interactions over the whole period regarding the SCAS-P, B = −0.21, 95% CI [−0.98, 0.55], d = 0.04, the SDQ-Tot, B = 0.00, 95% CI [−0.38, 0.37], d = 0.00, the SDQ-Emo, B = −0.07, 95% CI [−0.23, 0.08], d = 0.07, or the SDQ-Pro, B = −0.07, 95% CI [−0.17, 0.10], d = −0.04. Further, two LMMs showed that there were no significant group*time interactions over the whole period regarding the teacher-rated SDQ-Pro, B = −0.32, 95% CI [−0.35, 0.20], d = −0.04, or AP, B = 0.07, 95% CI [0.00, 0.13], d = 0.15. Finally, an LMM showed no main effect of group at follow-up assessment regarding the teacher-rated SDQ-Emo subscale, B = −0.32, 95% CI [−1.38, 0.82], d = 0.05.
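As a rough descriptive cross-check on the child-rated outcomes, the raw means in Table 3 can be turned into a standardized difference in change scores. This is only a sketch: it is not the LMM reported above, it ignores the age covariate, clustering, and missing data, and it assumes that the first three column pairs of Table 3 belong to the intervention condition and the last three to the control condition. The function names are illustrative.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def standardized_change_difference(pre_i, post_i, pre_c, post_c, sd_pooled):
    """Difference in pre-post change, scaled by the pooled baseline SD.

    Positive values mean a larger symptom reduction in the intervention group,
    matching the sign convention used for d in the text.
    """
    change_difference = (post_i - pre_i) - (post_c - pre_c)
    return -change_difference / sd_pooled


# Raw pre and post values taken from Table 3 (child ratings)
sd_scas = pooled_sd(15.72, 333, 14.40, 322)
d_scas = standardized_change_difference(26.60, 21.02, 27.26, 21.78, sd_scas)

sd_cdi = pooled_sd(2.50, 329, 2.51, 322)
d_cdi = standardized_change_difference(1.77, 1.72, 1.82, 2.02, sd_cdi)

print(f"SCAS:  d = {d_scas:.2f}")
print(f"CDI-S: d = {d_cdi:.2f}")
```

Both raw values are close to zero and of the same order as the model-based effect sizes, which is consistent with the absence of significant group*time interactions.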
Subgroup Analyses
In the high-anxiety subgroup (n = 119), we received consent for participation in the MINI-KID for 55 children (46%). Eighteen children (15%) had changed schools, ten children (9%) refused to participate, and 36 (30%) did not respond to the invitation. The participating children did not differ from the non-participating children on any baseline symptom ratings, gender, age, parent’s education, or household income. At 12-month follow-up, 36% of the high-anxiety subgroup in the control condition met criteria for an anxiety disorder, compared to 20% in the intervention condition, χ²(1, N = 55) = 1.76, p = .19. No child met criteria for a depressive disorder at 12-month follow-up; consequently, we did not perform any MINI-KID analyses for the high-depressive subgroup. In the random sample (n = 100) of children with no elevated symptoms, we received consent for participation for 50 children, 14 had changed schools, six refused to participate, and 30 did not respond to the invitation. The participating children from the random sample (n = 50) did not differ from all other children with non-elevated symptoms (n = 501) on any baseline symptom ratings, gender, age, parent’s education, or household income. In the interviewed random sample, 14% of children in the control group met criteria for an anxiety disorder, compared to 9% in the intervention group at 12-month follow-up, χ²(1, N = 50) = 0.32, p = .58.
Moderation Analyses
A series of LMMs showed no short- or long-term gender*group or age*group interaction effects for any measure. However, an LMM showed a short-term baseline symptom*group interaction effect regarding the CDI, B = 0.39, 95% CI [0.26, 0.53], d = 0.43, which implies that higher levels of baseline symptoms were associated with a greater decrease in depressive symptoms between pre- and post-assessment in the intervention condition (compared to the control condition). To understand the moderation effect of baseline depressive symptoms in more depth, we conducted follow-up analyses in three subgroups. These subgroups included children with CDI baseline symptoms (1) above the median (of the current sample), (2) above the third quartile (75th percentile), and (3) above the 90th percentile. Two separate LMMs showed no significant group*time interactions over the intervention period regarding children with baseline scores above the median or the third quartile, B = −0.75, 95% CI [−1.63, 0.07], d = 0.23 and B = −1.00, 95% CI [−2.02, 0.18], d = 0.27, respectively. However, an LMM showed a significant group*time interaction over the intervention period regarding children with CDI baseline symptoms above the 90th percentile, B = −2.71, 95% CI [−5.12, −0.55], d = 0.67. Finally, no significant long-term interaction effect was found regarding the CDI, and no significant short- or long-term baseline symptoms*group interaction effects were found regarding the SCAS.
Supervision
There was a significant difference between groups divided by supervision regarding SCAS baseline symptoms, F(2, 689) = 3.279, p = .038, and a significant difference in age, F(2, 689) = 27.35, p < .001. Consequently, these variables were included as covariates in the analyses. In addition, we examined the supervision groups according to norms presented by the author of the SCAS (Spence 2010a, b). In the high-supervision group, 21 out of 134 children (16%) had elevated levels of anxiety symptoms at baseline assessment. In the low-supervision group and the control group, the corresponding proportions were 16 out of 199 (8%) and 30 out of 322 (9%), respectively. The distribution of children with elevated levels and children without elevated levels of anxiety symptoms was not significantly different between groups, χ²(2, N = 655) = 5.65, p = .06. There was no evidence of problems with outliers (defined as above the T score of 70) in the high-supervision group (n = 2, 1.5%), the low-supervision group (n = 0), or the control group (n = 3, 0.9%). Moreover, these proportions were not significantly different between groups, χ²(2, N = 655) = 2.59, p = .27. An LMM showed a larger short-term (but no long-term) reduction in anxiety symptoms in the high-supervision group compared to the low-supervision group, B = 3.27, 95% CI [0.27, 6.15], d = 0.22, and the control group, B = 2.93, 95% CI [0.11, 5.47], d = 0.21. There was no significant difference between the low-supervision group and the control group, B = −0.35, 95% CI [−2.82, 2.07], d = 0.03.
To understand the enhanced effect in the high-supervision group regarding anxiety symptoms, we examined two class-level variables that we hypothesized could serve as mediators: (1) level of homework assignments and (2) child reports on how much they thought they had learned about how to respond to fear or worry. These variables were aggregated at class level because of confidentiality at the individual level. Furthermore, as a possible individual-level mediator, we additionally examined the intermediate change in anxiety symptoms during the intervention according to the SCAS-12 (i.e., change between sessions 1–5, sessions 5–7, and sessions 7–10), in order to see whether the pre-post effect was driven by change in a specific phase of the intervention. Although classes in the high-supervision group had significantly more homework assignments than the low-supervision group (p = .02), a mediator analysis showed no significant indirect effect on change in anxiety symptoms (ACME = 0.79, 95% CI [−0.97, 3.05], p = .38). There was no significant difference in class averages regarding child reports of what they had learned about fear (p = .06), and thus, as expected, no significant indirect effect on change in anxiety symptoms (ACME = 0.76, 95% CI [−0.79, 2.92], p = .36). Regarding the individual-level mediator, no indirect effects on pre- to post-change in anxiety were found for the first two phases as mediators (sessions 1–5, ACME = 0.70, 95% CI [−0.34, 1.84], p = .16; and sessions 5–7, ACME = −0.27, 95% CI [−1.19, 0.60], p = .54). However, an indirect effect was found for the last phase (sessions 7–10) as a mediator, ACME = 0.95, 95% CI [0.05, 2.00], p = .04, suggesting that level of supervision increased the reduction of anxiety symptoms at the end of the intervention, which partially explained the difference in pre- to post-changes in anxiety between supervision levels.
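The logic of an indirect effect can be illustrated with the classical product-of-coefficients decomposition for linear models. This is a simplified sketch and not the estimation procedure behind the ACME values above, which is not specified in this section. The data are constructed toy values, not study data: `treatment` stands in for high versus low supervision, `mediator` for change during one phase of the intervention, and `outcome` for pre- to post-change in anxiety.

```python
from statistics import mean


def center_within_groups(values, groups):
    """Subtract each group's mean, i.e., partial out a binary predictor."""
    group_means = {
        g: mean(v for v, gg in zip(values, groups) if gg == g) for g in set(groups)
    }
    return [v - group_means[g] for v, g in zip(values, groups)]


def indirect_effect(treatment, mediator, outcome):
    """Product-of-coefficients estimate a * b for a binary treatment.

    a: effect of treatment on the mediator (difference in group means).
    b: slope of outcome on mediator, holding treatment constant.
    """
    a = mean(m for m, t in zip(mediator, treatment) if t == 1) - mean(
        m for m, t in zip(mediator, treatment) if t == 0
    )
    m_c = center_within_groups(mediator, treatment)
    y_c = center_within_groups(outcome, treatment)
    b = sum(m * y for m, y in zip(m_c, y_c)) / sum(m * m for m in m_c)
    return a, b, a * b


# Constructed toy data (hypothetical): built so that a = 2 and b = 3 exactly
treatment = [0, 0, 0, 0, 1, 1, 1, 1]
mediator = [0, 2, 0, 2, 2, 4, 2, 4]
outcome = [1.5, 5.5, -0.5, 7.5, 8.5, 12.5, 6.5, 14.5]

a, b, acme = indirect_effect(treatment, mediator, outcome)
print(f"a = {a:.2f}, b = {b:.2f}, indirect effect = {acme:.2f}")
```

In practice, the uncertainty of such an estimate is usually obtained by resampling or simulation rather than from a closed-form standard error, which is why indirect effects are typically reported with interval estimates as above.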
Discussion
The present study aimed to evaluate the effectiveness of FFL when delivered by classroom teachers to children 8–11 years old in Swedish schools. The results showed no effect of the intervention on any outcome in the whole population. However, when dividing the intervention group by level of supervision, we found a short-term effect on child-rated anxiety. Further, we also found an enhanced effect on child-rated depressive symptoms for children in the intervention group with elevated depressive symptoms at baseline, suggesting the intervention could be quite meaningful for a subsample of the population. Our results are similar to those of several recent trials, which have failed to find effects of FFL when implemented as teacher-administered universal prevention (Miller et al.
2011a,
b; Stallard et al.
2014). In contrast, in other trials where FFL has been administered by psychologists or mental health personnel, researchers have found significant effects of the intervention (Essau et al. 2012; Stallard et al.
2014). This is also consistent with the results of recent meta-analyses which have found larger effects of interventions administered by mental health professionals compared to school personnel both regarding anxiety (Teubert and Pinquart
2011) and depression (Stice et al.
2009). A convincing argument for implementing a universal intervention rather than a targeted intervention is its possible cost-effectiveness. Undoubtedly, one way of lowering the costs of an intervention is to let teachers administer it during school hours. However, the optimism that teachers can easily administer the intervention without deflating the effect is seriously called into question by our study, as it is by other recent trials. Although similar to recent trials outside of Australia, the results of our study do not agree with trials conducted in Australia, where teacher-administered FFL has shown significant effects. There are several possible hypotheses that could explain the disparity in results between studies. One hypothesis is that teachers in different countries may have more or less experience in working with social emotional strategies. Many schools in Sweden have in recent years incorporated the subject “life knowledge” in the curriculum. Even though no teacher in the control group in our study used any comparable program, they may still have incorporated such strategies in their teaching. A second hypothesis is that the creator of FFL in Australia has continuously developed and improved training, and provided feedback in line with the teachers’ needs. It is possible that the training and supervision of teachers in other countries did not reach the same standard, were not sufficiently tailored to teachers’ needs, or were not attended as scheduled. The analyses based on levels of supervision in our trial lend some support to this hypothesis, as a short-term effect of the intervention was evident among students whose teachers attended a larger number of supervision sessions. The mediation analyses further suggested that this effect was driven by change in the last phase of the intervention.
The result of the mediation analysis is theoretically quite plausible and strengthens the evidence that supervision possibly plays an important role in enhancing teachers’ ability to administer FFL effectively. Teachers in the low-supervision group attended at most one supervision session, which was scheduled after completing the third session. In comparison, the high-supervision group attended additional supervision sessions, which were scheduled after sessions five/six and sessions seven/eight, respectively. Our interpretation of the mediation results is that teachers in the high-supervision group to a larger extent received support in planning the latter sessions, and also better comprehended the strategies taught in these sessions. When interpreting the analyses of levels of supervision, it is important to remember that the effect cannot plainly be interpreted as a treatment effect. Although a reasonable interpretation is that teachers might be able to effectively administer FFL given a larger amount of support, it is also possible that other teacher variables (that covary with the tendency to attend supervision, e.g., engagement or persistence) drove the pre- to post-changes, rather than the treatment. Moreover, it is also important to underscore that there was no random allocation to levels of supervision. Given the baseline differences in anxiety symptoms between supervision groups, it is possible that teachers with more anxious children in their class were more interested in receiving a larger number of supervision sessions.
Limitations
One major limitation in the present trial involves the recordings of adherence, which did not go according to plan. The recordings of the classroom sessions were technically easy to implement, but the majority of the teachers perceived them as intrusive and refused to record the sessions. The lack of recordings made it impossible to provide a clear and complete account of adherence. Thus, attendance at the supervision sessions was used as a proxy, which obviously captures only a limited aspect of the multifaceted nature of adherence. We have assumed that the more teachers attend the supervision sessions, the more adherently they will deliver the intervention. Although this is also partially reflected in child ratings, we are aware of the difficulties inherent in this assumption and interpret the outcome cautiously. A continuous collection of recordings would have made us aware of the problems at an early stage and possibly given us an opportunity to discuss them with the teachers to increase the number of recordings. Further, in addition to the teachers’ low completion rates of recordings and relatively low attendance in supervision, we also encountered some difficulties in collecting parental consent to the structured interviews at follow-up. All in all, these indicators of low engagement highlight the general problem of engaging participants (e.g., teachers and parents) in large longitudinal studies. Low engagement, leading to either attrition or non-compliance or both, obviously involves serious threats to the internal validity of the results. Future trials might benefit from incorporating knowledge generated from implementation research, or even combining effectiveness studies and implementation research as suggested by some researchers (e.g., Curran et al.
2012). Moreover, regarding teacher ratings, teachers generally rated all children in a class, which meant that attrition appeared in clusters. The results of the teacher ratings should therefore be interpreted with caution, due to the different patterns of attrition between the intervention and control groups. Finally, the recent meta-analysis by Ahlen et al. (2015) reports very small effect sizes in universal trials regarding anxiety and depression. Given these results, power was a limitation in our study. Specifically, the number of schools might have been too small to estimate the standard errors of the effects with adequate precision. Also, having too few randomized units (in our case schools) tends to involve imbalances between the conditions, which in our case was evident especially regarding the mean age in the different conditions. With these limitations in mind, we conclude that if FFL is further developed and evaluated as teacher-administered universal prevention in Sweden, efforts should be made to ensure that teachers attend supervision and adhere to the overall implementation of the intervention.