Introduction
The clustering of health behaviours has important consequences for health as the risks associated with engagement in any particular behaviour may increase, or decrease, depending on which other behaviours an individual engages in [1]. Where behaviours do cluster, multi-behavioural prevention and health promotion strategies may also be more effective than those targeting a single behaviour. Similarly, the effectiveness of efforts targeting one behaviour in isolation may vary depending on which other behaviours individuals engage in [1].
Analyses of the clustering of health behaviours examine whether individuals participate in each of a set of health behaviours and whether an exhaustive set of ‘clusters’ or ‘behavioural types’ can summarise the patterns of participation seen across a population [2]. For example, three clusters may broadly summarise the patterns of participation in a population: individuals either (i) smoke, drink heavily, and use illicit drugs; (ii) drink heavily; or (iii) do none of these behaviours. Analyses of clustering investigate underlying associations between concurrent behaviours [2] and they seek to exhaustively classify patterns of behaviour across the whole population rather than describing patterns in one part of the population (e.g. the tendency for illicit drug users to also smoke).
Clustered patterns of health-related behaviour often emerge in adolescence [3–6], and clusters involving multiple adverse health-related behaviours have been found to be more prevalent amongst younger adults than in older age groups [7]. A 2006 review of health-related behaviours among young people considered the relationships between alcohol, smoking, safe sex, and dietary behaviours amongst 10–18 year olds [8]. The authors found extensive evidence that smoking and alcohol consumption cluster within individuals and, to a lesser extent, found clustering of alcohol consumption, smoking and risky sexual behaviour. More recent reviews [7, 9] have focused on adult populations. In these reviews, both ‘healthy’ and ‘non-risky’ clusters were common: such clusters were characterised by low, or no, participant engagement in the risk behaviours considered by studies [7, 9]. Polarisation was also apparent: primary studies often reported engagement by some participants in all, or none, of the health-related behaviours measured [7, 9].
In addition to a lack of recent reviews of the clustering literature for adolescents, there are a number of other limitations within current evidence. First, the extent to which reviews are able to compare behavioural clusters is limited by significant heterogeneity between primary studies. Such heterogeneity is apparent in terms of the measures used and the statistical analysis techniques employed (and the sensitivity of those techniques to small variations in the data [2]). Reviews to date have not addressed this directly, tending to focus on elucidating the behaviours that consistently cluster between studies [7, 9]. Second, although many studies examine clustering of diet, physical activity, alcohol consumption and smoking, other behaviours, such as risky sexual behaviour and gambling, are given less attention [10–12]. Moreover, health-related behaviours that are emerging as areas of concern for health, such as overuse of internet-based technologies [12, 13], are not addressed at all. Third, exploration of how health-enhancing behaviours relate to health-compromising behaviours is limited [8].
Given these limitations, this study aims to systematically review the literature on the clustering of a broad range of health-related behaviours amongst 11–16 year olds. A secondary aim is to identify a method for synthesising highly heterogeneous results from clustering studies.
Methodology
Literature search
We searched the MEDLINE, CINAHL and PsycINFO databases on 24th September 2019.
Terms relating to four areas (analytical method, adolescents, health-related behaviour(s) as a general concept, and specific health-related behaviours such as alcohol use) informed a combination of free text and MeSH search terms (see Supplementary Table 3 for full search strategy). Methodological terms were selected to identify analyses of the clustering of multiple behaviours [2, 9] rather than analyses of the co-occurrence of two behaviours (e.g. bivariate correlations). No time limits were imposed on the search. The study protocol was not preregistered.
Inclusion and exclusion criteria
Included studies were from high income countries (identified in relation to World Bank criteria) to increase comparability of findings. Informed by recent youth health behavioural trends that have been limited to high income countries [14], we reasoned that differences in the lives and health behaviours of young people between high and low income countries may be substantive. We defined studies of clustering as primary studies using any of the following analytical methods: cluster analysis, latent class analysis, prevalence odds ratios, principal component analysis, and factor analysis.
We initially planned to review studies of 11–24 year-olds, but narrowed this to 11–16 year-olds after completing study selection due to the number of eligible studies identified and the heterogeneity of the age groups studied and the clusters identified within those studies. Data were typically from school surveys of 15 year olds and younger (e.g. the Health Behaviour in School-aged Children survey or the European School Survey Project on Alcohol and other Drugs), or of adults aged 18 years and above. Therefore, to reduce methodological heterogeneity across our included studies, we screened titles and abstracts for samples aged 11–24 and then screened full papers for samples aged 11 up to and including 16 years. Studies reporting data from a sample with a wider age range than 11–16 years were included if it could be determined that 50% or more of the sample were aged 11–16 years or that the mean age was 16.
We initially defined eight key health-related behavioural areas of interest: alcohol consumption, tobacco smoking, cannabis use, other illicit drug use, sexual activity, physical activity, dietary behaviours, and internet-based technology use. However, although there is increasing concern about the health and social risks associated with adolescents’ use of internet-based technologies, evidence increasingly suggests it is the mode, pattern or extent of use, not use per se, that is problematic [12]. Our initial searches revealed these aspects of use are not well-measured in the available literature and we subsequently removed internet-based technology use from our behaviours of interest to avoid weakening the analysis. Behavioural areas of interest ranged in their scope: some encompassed a single behaviour (e.g. smoking), while others, such as drug use, encompassed multiple behaviours (e.g. cocaine use, cannabis use). Consequently, included studies were required to analyse the clustering of at least three health-related behaviours across two or more of the behavioural areas of interest (e.g. studies examining alcohol drinking, heroin use and cocaine use were permissible as this covers two areas; those examining heroin, cocaine and cannabis use were not as this is a single area – drug use).
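The behaviour-count criterion above can be illustrated with a minimal sketch. The mapping `BEHAVIOUR_AREA` and the function name are hypothetical and are not taken from the review protocol; following the worked example in the text, all drug use (including cannabis) is assumed to fall within a single 'drug use' area.

```python
# Hypothetical lookup from an individual behaviour to its behavioural area.
# Assumption: all drug use is one area, as in the example given in the text.
BEHAVIOUR_AREA = {
    "alcohol drinking": "alcohol consumption",
    "tobacco smoking": "tobacco smoking",
    "cannabis use": "drug use",
    "heroin use": "drug use",
    "cocaine use": "drug use",
    "sexual activity": "sexual activity",
    "physical activity": "physical activity",
    "diet": "dietary behaviours",
}


def meets_behaviour_criterion(behaviours):
    """Return True if a study analyses at least three in-scope behaviours
    that span two or more behavioural areas of interest."""
    # Behaviours outside the review's scope (e.g. bullying) are ignored.
    in_scope = {b for b in behaviours if b in BEHAVIOUR_AREA}
    areas = {BEHAVIOUR_AREA[b] for b in in_scope}
    return len(in_scope) >= 3 and len(areas) >= 2


# Two areas (alcohol and drug use): permissible.
print(meets_behaviour_criterion(["alcohol drinking", "heroin use", "cocaine use"]))
# A single area (drug use): not permissible.
print(meets_behaviour_criterion(["heroin use", "cocaine use", "cannabis use"]))
```

This sketch covers only the behaviour-count rule; the age, country and population criteria described in this section would need separate checks.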
Analyses employing cluster transition analyses were excluded as we wished to establish the composition of behavioural clusters at a given time, rather than the pathways between behavioural clusters over time. Studies with vulnerable populations were also excluded to increase the comparability of findings. A vulnerable group was defined in relation to whether the group in question would be expected to be associated with particular groups of risky health behaviours or social marginalisation. For example, young people in the youth justice system exhibit elevated levels of substance use [15]. We acknowledge the limitations of this approach in the discussion.
Paper screening and data extraction
Two authors (VW and MO) screened paper titles and abstracts. Four, separate, random subsamples of 100 titles and abstracts (400 in total) were double coded and Cronbach’s alpha was used after each subsample to measure internal consistency. Chronologically, the results were: 0.46 (fair agreement), 0.69 (good agreement), 0.53 (fair agreement) and 1.00 (excellent agreement). The lower agreement in early subsamples reflects a lack of clarity in many titles and abstracts regarding the analytical methods used. Disagreement was overcome through group discussion and analysis of the full text.
Data extraction was undertaken by MO, VW, JB, JH, and HF. For the purposes of the analysis presented in this paper, data pertaining to the age, ethnicity, and gender of participants, the behavioural clusters identified by each study, and the geographical origin of the study were extracted. Quality appraisal of individual studies was conducted by JB using the AXIS critical appraisal tool [16]. MO double appraised studies to check for agreement.
Analyses of behavioural clusters generate a large number of numerical results and different analytical methods produce different metrics. To aid comparison of data during synthesis, we converted the primary study results into prose using a protocol agreed between the data extractors. Specifically, we converted probabilities and factor loadings into the following language: No = < 5% (or < 0.05), Very unlikely = 5 - < 15%, Unlikely = 15 - < 35%, May = 35 - < 65%, Likely = 65 - < 85%, Very likely = 85 - < 95%, All = 95%+. Where analyses provided mean scores rather than probabilities (e.g. in cluster analyses), data extractors compared the scores across clusters to decide whether they were reflective of low, medium or high on measures of different behaviours. For example, in an instance where there were 3 clusters which scored a mean of 1, 5 and 10 respectively on a measure, the first would be considered low, the second medium and the third high.
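The probability bands described above can be expressed as a simple lookup. The following is a minimal sketch rather than the extraction protocol itself: the function name is hypothetical, and it assumes values are supplied as proportions between 0 and 1 (so 35% is 0.35). The low/medium/high judgement applied to mean scores relied on extractor comparison across clusters and is not covered here.

```python
def probability_to_label(p):
    """Convert a probability or factor loading (0 to 1) into the prose
    label used during data extraction."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("expected a proportion between 0 and 1")
    # Each pair is (exclusive upper bound, label), in ascending order.
    bands = [
        (0.05, "No"),
        (0.15, "Very unlikely"),
        (0.35, "Unlikely"),
        (0.65, "May"),
        (0.85, "Likely"),
        (0.95, "Very likely"),
    ]
    for upper, label in bands:
        if p < upper:
            return label
    return "All"  # 95% and above


for value in (0.02, 0.10, 0.50, 0.90, 0.97):
    print(value, probability_to_label(value))
```

Lower bounds are inclusive and upper bounds exclusive, matching the '5 - < 15%' style of the bands in the text.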
Synthesis of clusters from included studies
Existing guidance for synthesising findings from reviews of clustering analyses is limited; we therefore followed Noble et al. [9] by tabulating which of our seven behaviours of interest were measured by each primary study. Next, we calculated the percentage of studies by the numbers and combinations of our behaviours of interest that they measured. However, we also required a method to group together clusters with apparently similar behavioural patterns identified in different studies. Through group discussion, we developed a new iterative approach that involved organising clusters into ‘archetypes’.
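The tabulation step can be sketched as follows. The study names and the behaviours assigned to them are invented purely for illustration and do not correspond to any included study; the function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical data: behavioural areas measured by four made-up studies.
STUDIES = {
    "Study A": {"alcohol", "tobacco", "diet"},
    "Study B": {"alcohol", "tobacco", "diet"},
    "Study C": {"alcohol", "tobacco", "drug use", "sex"},
    "Study D": {"alcohol", "diet", "physical activity"},
}


def percent_by_number(studies):
    """Percentage of studies by how many behaviours of interest they measured."""
    total = len(studies)
    if total == 0:
        return {}
    counts = Counter(len(areas) for areas in studies.values())
    return {n: 100 * c / total for n, c in sorted(counts.items())}


def percent_by_combination(studies):
    """Percentage of studies measuring each exact combination of behaviours."""
    total = len(studies)
    if total == 0:
        return {}
    # frozenset makes each combination hashable and order-independent.
    counts = Counter(frozenset(areas) for areas in studies.values())
    return {combo: 100 * c / total for combo, c in counts.items()}


print(percent_by_number(STUDIES))
for combo, pct in percent_by_combination(STUDIES).items():
    print(sorted(combo), pct)
```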
The process for constructing the archetypes is summarised below. Unlike previous reviews [7, 9], this relied solely on the behaviours measured and the patterns of engagement in behaviours reported within clusters. Cluster titles provide a poor basis for comparison between studies as they are often informed by the topic foci of individual studies, which were highly varied. While titles akin to ‘substance users’ were common, the measures used to define substance use were similarly varied between studies. Also, titles often referred to behaviours that were included in the analyses of an individual study, but which were outside of the scope of our review (e.g. substance using bullies). Cluster titles did not therefore inform the construction of archetypes. Our process was as follows:
1. Extract a description of all clusters identified in the included primary studies using consistent natural language to describe patterns of engagement in behaviours reported within clusters.
2. Develop an initial set of archetypes by grouping together clusters involving similar behaviours and patterns of engagement in those behaviours.
3. Refine this initial set iteratively through discussion and consensus within the research team.
4. Produce a written description of each archetype, including a name and inclusion criteria, and check all constituent clusters fit this description.
5. Discuss and resolve difficult cases that do not clearly fit within archetypes, refining archetype descriptions as necessary.
6. Review archetypes for parsimony by, for example, renaming, aggregating or disaggregating them.
7. Analyse the clusters to inform a narrative synthesis, giving particular attention to the number and key characteristics of each archetype’s constituent clusters.
Our archetypes were defined only in relation to the seven behavioural areas of interest discussed above and not with reference to other behaviours included (e.g. bullying, sleep). The seven behavioural areas were split into two categories to enable meaningful synthesis, namely: substance use (alcohol, tobacco and other drug use) and other behavioural risk indicators (diet, physical activity, gambling and sex). While some studies included measures of protected and unprotected sex, all but two samples [17, 18] included children younger than the age of consent in the country of interest in the study sample. As most papers ran clustering analyses on the full sample, disaggregation of results by age was not possible. We therefore took a conservative approach and categorised any sexual activity as a negative risk indicator. Following Delk et al. [19], we treated e-cigarette use and tobacco smoking as use of the same substance. Cannabis and synthetic cannabis were also treated as a single substance, as in Lee et al. [20].
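The two harmonisation rules and the two-category split described above can be sketched as a small normalisation step. The labels, dictionary and function names are hypothetical conveniences, not part of the review's protocol, and sexual activity is counted as a risk indicator in line with the conservative approach stated above.

```python
# Assumed labels: products treated as the same substance are mapped
# to a single canonical name.
SUBSTANCE_ALIASES = {
    "e-cigarette": "tobacco",
    "synthetic cannabis": "cannabis",
}

# The other behavioural risk indicators named in the text.
OTHER_RISK_INDICATORS = {"diet", "physical activity", "gambling", "sex"}


def harmonise(behaviours):
    """Collapse aliases so each substance is counted once."""
    return {SUBSTANCE_ALIASES.get(b, b) for b in behaviours}


def split_categories(behaviours):
    """Split harmonised behaviours into (substances, other risk indicators)."""
    harmonised = harmonise(behaviours)
    other = harmonised & OTHER_RISK_INDICATORS
    substances = harmonised - OTHER_RISK_INDICATORS
    return substances, other


substances, other = split_categories(
    ["alcohol", "tobacco", "e-cigarette", "synthetic cannabis", "sex"]
)
print(sorted(substances))  # three distinct substances after harmonisation
print(sorted(other))
```

In this sketch anything not listed as another risk indicator is treated as a substance, so out-of-scope behaviours (e.g. bullying) would need to be filtered out beforehand.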
Discussion
This review examined the clustering of a broad range of health-related behaviours in 11–16 year-olds. Eight overarching behavioural archetypes were identified by grouping the clusters described within the primary studies. These archetypes were: (1) Poly-Substance Users, (2) Single Substance Users, (3) Substance Abstainers, (4) Substance Users with No/Low Behavioural Risk Indicators, (5) Substance Abstainers with Behavioural Risk Indicators, (6) Complex Configurations, (7) Overall Unhealthy and (8) Overall Healthy.
Our eight overarching archetypes suggest three key findings. First, in the studies included in our review, most 11–16 year-olds fall into one of our ‘healthy’ archetypes which, on average, account for 51% (Substance Abstainers archetype) or 32% (Overall Healthy archetype) of the primary study populations. Second, studies consistently find that small minorities of young people engage in multiple unhealthy behaviours, including polysubstance use, or substance use alongside multiple other risk behaviours, such as having a poor diet, lacking exercise or engaging in sexual activity. These fall into archetypes that account on average for 10% (Poly-Substance Users archetype) and 10% (Overall Unhealthy archetype) of the primary study populations. As would be expected, the proportion of young people in these clusters decreases where greater numbers of substances are used, or when examining heavier use of substances. Third, substantial proportions of young people engage in varied combinations of behaviours (i.e. archetypes 4, 5 and 6) wherein both health promoting and health-risk behaviours co-occur. Young people who engage in health promoting behaviours may, therefore, simultaneously be engaging in other, unhealthy behaviours that counteract any benefits - or vice versa. Importantly, the identified combinations of unhealthy behaviours that young people engage in are diverse and inconsistent across studies. This may present a challenge to the development of effective multi-behavioural health interventions.
Strengths
This review is the first to examine clustering of health-related behaviours within 11–16 year olds and extends the focus of behaviours considered in other reviews of studies of adult and adolescent populations. A further strength is our development of a new method for synthesis of findings from the heterogeneous literature on behavioural clustering. While our approach does not directly redress the heterogeneity in the literature, it does summarise the key clusters observed in a way that can inform future research, policy and practice. Importantly, this approach facilitated the estimation of the average proportion of individuals falling into similar clusters across multiple studies in this review. Finally, unlike prior reviews which have taken the names of clusters and/or the probabilistic terminology used by primary study authors into account [7–9], our synthesis is based solely on the behaviours measured and numerically standardised probability of engagement in those behaviours.
Limitations
Drawing data from studies using varied analytical approaches creates problems in comparing results and we are not aware of any available methods for standardising numerical findings from different clustering techniques. Therefore, we used prose, rather than numerical data, to address this problem. However, comparison was still problematic in places as, for example, clusters within archetypes were often characterised by very different levels of engagement in a behaviour, such as light drinking in one cluster and frequent drunkenness in another. In places, this created a false equivalence between different patterns of behaviour that may not have comparable risks of harm.
Despite our age focus, our included studies often included a minority of participants older than 11–16 years. This reflects wide variation in age groups included in primary studies and a decision not to limit our pool of included studies by imposing more rigid age criteria. Nevertheless, the extent and patterns of youth health-related behaviours are known to change across adolescence (for example: [59, 60]); small changes in age foci may therefore result in changes in behavioural clusters, or in the proportions of study samples attributed to them.
We had insufficient data to compare clustering between population subgroups, despite arguments that socioeconomic status, region, age and gender may be important intersections [7, 9, 58, 60]. Studies also came from multiple countries and different time points (ranging from 1982 to 2016), but we did not explore the potential effects of these cultural and temporal specificities. Furthermore, we excluded samples deemed particularly vulnerable to engaging in risky behaviours, such as those young people in the criminal justice system. As such, our conclusions are limited to the general population. We acknowledge that there are demographic groups included in our definition of the ‘general population’, such as those of lower socioeconomic status, wherein prevalence of specific risk behaviours may differ from the general population. However, sub-groups defined, for example, by socioeconomic status account for much larger proportions of the population than, for example, young people in the youth justice system and we elected to include them on this basis. Further analysis of the archetypes which emerge in relation to population sub-groups would therefore be of value.
Implications for policy and practice
Our behavioural archetypes show that the combinations of health-related behaviours that young people engage in are diverse and complex. Health policy and practice, particularly those advocating multi-behavioural approaches, should therefore be sensitive to such complexity. Specific behavioural clusters identified in individual studies may therefore be insufficiently robust to inform multi-behavioural interventions that are generalizable beyond the context of the original study. In particular, while policy makers and practitioners working in the same context as our primary studies may prefer local evidence, our analysis suggests they should also consider syntheses of broader evidence. This is because researchers’ choices about which behaviours to study and which cluster analysis method to use may also markedly shape findings alongside local factors. While health outcomes were not our focus of attention in constructing archetypes, the complexity we reveal points to a need to determine the clusters associated with greater or lesser risks (or benefits) to health, over time.
Implications for research
Clustering methods are sensitive to small changes in the data and the measures used: the results of any single study should consequently be treated with caution. To reduce heterogeneity in this literature and maximise comparability across studies, it is important that researchers incorporate similar behaviours and measures in their analyses wherever feasible. Our eight behavioural archetypes can help researchers to think about how to achieve such comparability by suggesting which behaviours commonly cluster (e.g. alcohol, smoking, cannabis use) and which measures provide the most meaningful insight (e.g. measures which differentiate between the level of engagement in different behaviours rather than measures which solely focus on ‘ever use’).
Recognising that researchers will inevitably have their own research interests, we suggest that maximising comparability across studies using different datasets should be prioritised. In other words, where researchers wish to study additional or emerging behaviours (for example, social media use), we suggest these should be added to, rather than substituted for, a core list of key health-related behaviours. Studies proposing to focus on substance use alone may derive particular benefits from the inclusion of additional behaviours (for example, diet and exercise) to avoid the construction of a large ‘abstaining’ cluster which indicates what young people do not do, without additional insight into their health, or the health-related behaviours in which they do engage. Further attention should also be given to how behavioural patterns may vary between population sub-groups. In particular, variation in relation to age, gender, socio-economic status and within vulnerable groups is an important line of future enquiry.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.