Mismatch between intervention and outcome measurement age
One of the most serious methodological problems with the article concerns the age range of the children assessed. The article fails to make clear to the reader that there is a substantial mismatch between the intervention and the outcome measurement with respect to child age range and sampling. Parental participation in the intervention (i.e., the independent variable) targeted children 2-16 years of age, while the teacher-reported outcome (i.e., the dependent variable) assessed 4- and 5-year-olds. The article provided detail about the number of families participating in the intervention, and briefly mentioned that a substantial proportion (i.e., 40% or more) of the families that received parenting services had children too old to be captured by the outcome measure; the 40% figure refers to the percentage of children who were older than five at the time their parents participated in the intervention. However, the implications of attempting to detect population-level impact by assessing a diluted, marginal sample of those who actually received the intervention were not discussed.
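To make the dilution point concrete, a simple mixture argument (our illustration, not a calculation performed in the article) shows how quickly the detectable signal shrinks: if only a fraction \(q\) of the assessed cohort was actually exposed to the intervention, and the intervention shifts outcomes among exposed children by \(\Delta\), then the expected shift observable in the assessed cohort as a whole is only
\[
\Delta_{\text{population}} = q\,\Delta_{\text{exposed}},
\]
so the smaller the overlap between the families who received the intervention and the children who were assessed, the smaller and harder to detect any population-level effect becomes, even if the intervention works well for those who received it.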
Related to the age and sample mismatch, many of the families receiving the parenting intervention did not have a child within the eligible age range for the teacher-report outcome assessment, and thus were not represented in the evaluation of the intervention. Likewise, many of the children assessed via teacher report were from families who had never received the intervention. No data were provided on the proportion of assessed children who had a parent who had participated in Triple P. This omission, along with the aforementioned lack of a control condition, makes it impossible to calculate common indices of population-level impact, such as risk ratios or the number needed to treat.
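For reference, and to make clear what the missing data would be needed for, the standard definitions (in our notation) are
\[
RR = \frac{p_1}{p_0}, \qquad NNT = \frac{1}{p_0 - p_1},
\]
where \(p_1\) and \(p_0\) denote the proportions of children above a clinical threshold among those whose parents did and did not receive the intervention, respectively. Without knowing which assessed children fall into each group, and without a comparable unexposed (control) group, neither quantity can be computed.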
Narrow focus for assessing mental health impact
Triple P aims to reduce child behavioural and emotional difficulties by promoting change in parenting practices, and thus its primary focus is on producing change in the family context. Although Marryat et al. [5] claimed to evaluate the mental health impact of Triple P, they presented data related only to child difficulties at school (in this case, the nursery or pre-school context) via a routinely collected teacher-report questionnaire, the Strengths and Difficulties Questionnaire (SDQ) [6]. This narrow focus is important because (a) changes within the school setting are not the primary target of the Triple P intervention, and (b) the aims and conclusions outlined by the authors do not align with the data actually reported.
The impact of family-based interventions like Triple P on school adjustment is an important research question, and one that would be reasonable to explore. We might anticipate that significant improvements in child mental health or behavioural difficulties seen within the home context would also be seen at school, particularly if the child has significant difficulties at school in the first place. However, reliance on teacher-report data as the sole indicator of population-level impact on child mental health is seriously flawed. First, there is generally low concordance between teacher and parent reports of child difficulties, with often only modest correlations (e.g., < .30) between parents and teachers as informants, and teachers typically reporting fewer problems overall (e.g., [7, 8]).
Teacher report cannot be used as a proxy for parents’ experiences with their children at home or for parental reports of children’s mental health status. To support the decision to present only teacher data, the authors claimed that reliance on parental report can be problematic because it may introduce a measurement confound with the parent’s mental state; however, no evidence was presented that teacher-reported data provide a more realistic or reliable indication of children’s mental health or difficult behaviours than parent-reported data. Teacher-report data can make a valuable contribution within a multi-informant approach to understanding the broader impact of a parenting intervention such as Triple P, yet, as with any single-informant approach to data collection, findings should be framed within the limits of their generalisability: in this case, the extent to which teachers’ views of child behaviour within the preschool setting can be assumed to generalise to the home and to children’s general mental health. We acknowledge that pragmatic constraints can preclude the collection of data from multiple sources, but the authors failed to acknowledge this limitation. The result, unfortunately, was a set of over-reaching and inappropriately generalised conclusions regarding population-level impact on mental health.
Factual errors and misleading statements
The authors claimed that Triple P has little effect in deprived communities. This claim ignores studies showing that socioeconomic status does not moderate effect sizes for child outcomes in Triple P studies [10] and the mounting evidence that Triple P works well in low-resource communities (e.g., [2, 4, 11]). There have since been a number of high-quality studies showing the value of Triple P in a range of disadvantaged communities. Examples include: a place-based randomised trial of the Triple P system in the US showing population-level effects on child maltreatment in communities with substantial representation of disadvantaged families [2]; an RCT of low-intensity Triple P Discussion Groups in Panama showing positive effects on child and parent outcomes with parents in deprived communities [11]; an RCT of Triple P Discussion Groups with a Maori indigenous population in New Zealand [12]; evaluations of Group Triple P with Aboriginal and Torres Strait Islander samples in Australia [13]; and a trial of Triple P Online with vulnerable, disadvantaged, predominantly African American and Latino urban families in Los Angeles [14]. Qualitative studies showing high levels of consumer acceptance of Triple P principles and techniques have been conducted with homeless parents [15], vulnerable low-income families involved with child protective services [14], and women in shelters with histories of domestic violence [16]. Fives et al. [4] reported that many participants in the Irish population roll-out of Triple P were of low socioeconomic status (39% of Group Triple P participants, 33% of workshop participants, and 26% of seminar participants held a medical card, a key indicator of low SES). Contrary to Marryat et al.’s conclusion [5], Triple P has been found to be a promising intervention with many vulnerable, socially disadvantaged parents.
The paper also raised concerns about the costs of Triple P without quantifying those costs or placing them in perspective relative to the costs of not intervening or the costs of other intervention strategies. Serving 10,000 families obviously costs more than serving 100 families, but the key metric is the per-family cost, which the article ignored in making a general pronouncement (i.e., that the programme “consumes substantial resources”). Nor did it discuss the potential cost savings of brief, early, minimal intervention, or of mixing delivery formats, for example, the saving from offering group programmes that serve several families in the same amount of staff time as individual sessions.
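A per-family framing, our illustration of the metric the article omits rather than a figure from the evaluation, makes the relevant comparison explicit:
\[
\text{cost per family} = \frac{\text{total programme cost}}{\text{number of families reached}},
\]
which can then be weighed against the per-family cost of alternative delivery formats (e.g., group versus individual sessions, where one facilitator’s time is shared across several families) and against the downstream costs of not intervening at all.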
The paper also failed to take into account that, during the intervention period, other parenting interventions were being supported and implemented concurrently in the same catchment area, albeit on a smaller scale. This again highlights the need for control data to allow suitable comparisons to support conclusions about the population-level impact of any universal prevention or public health initiative.
Finally, there are some major errors in the article. First, it reports null results of “a recent cluster randomized control trial exploring the impact of Triple P levels 2 and 3 on pre-schoolers’ externalizing behaviours and parental mental health”. The references cited are Hiscock et al. [17], a study that did not evaluate a Triple P intervention, and Malti et al. [18], a study that tested a single level of Triple P (Level 4 Group). Similarly, the article refers to Prinz and Sanders [19] in relation to “previous work in which no significant improvement in child-based outcomes resulted from a public health parenting programme”, which is an incorrect citation: the article cited is a theoretical piece about population-level interventions and includes neither an evaluation nor any discussion of child outcome results. The authors also cite a study reporting a subgroup analysis of lone-parent families that showed no benefit from the Triple P intervention [20]. It is true that the study reported no difference between intervention and control parents on parenting and child behaviour based on self-report data. However, independent clinical observations reported within the same paper showed significant improvements in positive parenting behaviour and decreases in negative child behaviour for the intervention group. We find it curious that this finding was omitted from the authors’ discussion, particularly given that it reports data from an independent source, which would seem relevant in light of the authors’ prior arguments.
The paper claimed to have registered the study protocol, yet the reference list cites only a University of Glasgow webpage describing the protocol, with no trial registration number. Furthermore, the protocol as described differs substantially from the primary findings reported in the paper and in the final evaluation report.
Measurement problems
The study had a number of measurement problems. First, one of the primary outcome measures was a modified version of the Conduct Problems subscale of the SDQ. Using only three of the original five items resulted in a modified scale with low internal consistency (α = 0.66), below the commonly accepted threshold (0.7) for scientific acceptability, and one that relied on a questionably small number of items. Additionally, the authors used a weighting procedure to compute an average score for this modified subscale and then applied the standard cut-off levels intended for the full subscale. Given these measurement issues, the validity of this scale as a primary outcome variable is highly questionable.
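Two points are worth spelling out. Cronbach’s alpha is defined as
\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_{i}}{\sigma^2_{\text{total}}}\right),
\]
and, other things being equal, alpha falls as the number of items \(k\) falls, so a three-item version of a five-item scale is at a structural disadvantage from the outset. Second, if the weighted average score was in effect prorated back to the original 0-10 metric, for example as
\[
\tilde{S} = \frac{5}{3}\sum_{i=1}^{3} x_i
\]
(one plausible reading of the weighting procedure, not a formula stated in the article), then applying the published full-subscale cut-offs to \(\tilde{S}\) assumes that the two omitted items would have contributed in the same proportion as the three retained items, an assumption for which no evidence was provided.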