Study characteristics
All included studies were published between 2000 and 2021. Qualitative studies (N = 4) comprised three survey designs and one questionnaire design. Mixed methods designs were included in this section as they report qualitative data. A total of 870 participants (ranging from 63 to 545 per study) took part across all studies, of whom 106 were female (12.2%). Ages ranged from 14 to 33, with one study looking exclusively at youth (under 15) and one study not stating its age range. Playing level ranged from junior (youth/under 15) to professional. All studies surveyed players; one study additionally surveyed coaches. No studies reported on socio-economic or racial factors. All studies reported rates of headgear use and perspectives on the level of protection provided by headgear. Four studies reported reasons for/against wearing headgear. Two studies reported players’ self-reported concussion incidence. Results are summarised in Table 1.
Table 1
Characteristics of qualitative studies
| University, ages 18–22 N = 131 | Those wearing HG: 40% (26/40) wear “often” 23% (15/40) wear “sometimes” All participants: 51% (67/131) “never” wear HG | Not reported | HG effective against concussion: Sometimes: 23% Often: 26% Always: 9% Mean perception of effectiveness: 0.840 (Likert Scale 1–5) | HG concussion: N = 32 No HG concussion: N = 104 No association measures of effect |
| Youth, U-15, ages 14–16 N = 140 | 79.3% wear HG for “most of the season” | For: safety, previous head/neck injury Against: uncomfortable, hot, don’t like to, don’t need to | Of HG wearing players: 67%: play more confidently Previous injury motivating to wear HG | Not reported |
| Youth to pro N = 545 Male: 85% Female: 15% | 67% wear HG “at some point” 36% wear regularly in matches | For: protection from minor injuries Against: uncomfortable, hot, claustrophobic, don’t need it due to position, restricted vision, unable to hear | 37% believe HG prevents head injury Youth players believe this more strongly 10% wear HG to protect brain from injury | HG users: 63% report concussion Non-users: 55% |
| University, mean age 19.5 N = 122 | 18.9% wear HG | For: previous concussion Against: reasons not stated | 38.1% believed HG prevents concussion 42.2% stated wearing HG would lead them to play more aggressively | Not reported |
| HS, univ., club, national level, ages 15–33 Players & coaches Male: 62% Female: 38% N = 63 | 27% wear HG: National: 62% HS: 26% Univ: 0% Club: 10% | For: not reported Against: not mandatory, uncomfortable, cost, too hot, gets grabbed during play | HG may prevent concussion: 62% of players (39/63) 33% of coaches (3/9) | Not reported |
Field studies (N = 7) included three prospective cohort studies (PC), two randomised controlled trials (RCT), one retrospective cohort study (RC), and one survey design. A total of 9,905 participants took part across all studies, of whom just 87 were female (0.88%). The effect studied was the incidence of concussion in players wearing headgear vs. players not wearing headgear. One study additionally assessed modified vs. standard headgear, while another additionally assessed days absent due to concussion in headgear vs. non-headgear users. Two studies reported time loss from concussion. All studies looked at concussion incidence during match-play; one study also looked at training. Although the Marshall et al. [5] study conducted telephone interviews, it was included in this category as its primary outcomes were injury rates. The Kahanov [20] study was included as it reported quantitative data (closed-ended questions) in its survey. Playing level ranged from under-15 to professional. Two studies exclusively looked at youth (under-20) populations. Results are summarised in Table 2.
Table 2
Characteristics of field studies
| PC | Non-professional, ages 15 + N = 3,207 | HG use: “always” vs. “rarely” Reports from doctor or trainer | Used mTBI, definition listed in study | Overall 7.97/1,000 player hours Always HG: IRR 0.57 (95% CI) | Not reported |
| PC | Club (amateur) N = 304 (240 male, 87 female) | HG use, injury rate ratio Telephone interviews | Injury definition used only, does not define concussion | No significant reduction in risk Adjusted RR: 1.13 (95% CI) | Not reported |
| RCT | Youth, U-15 N = 294 | HG group vs. no HG Reports, direct observation, video recording | Medical officers verified injury diagnosis | N = 9 (7 w/ HG, 2 w/o) No significant difference in rate in two study arms (p = 0.48) headgear vs. no headgear | Not reported |
| RCT | Youth, U-13, 15, 18, 20 N = 4,095 | Standard vs. modified HG use vs. no HG Reports collected in field | Concussion defined according to Vienna consensus statement | No significant difference in 3 arms of study (95% CI): No HG: 7.3/1,000 player hours Modified: 6.6/1,000 player hours Standard: 6.5/1,000 player hours | 8% of concussion cases missed at least 2 games |
| RC | Professional N = 1117 | Concussion vs. non-concussion injuries, video analysis | Physician diagnosed, operational definition for HIA | No significant effect on incidence 16% of concussion cases and non-injury controls wore HG Adjusted OR: 1.09 (95% CI) | W/ HG: 8 days W/o HG: 7 days Previous concussion increased concussion incidence four-fold (OR: 4.55, 95% CI) |
| PC | Professional N = 757 | Injury rate HG vs. no HG in verified concussive injuries Reports, video analysis | CISG defined, physician monitored, use of Maddocks questions | Concussion incidence w/ HG: 2.0/1,000 player hours Concussion incidence w/o HG: 4.6/1,000 player hours Training incidence: 0.02/1,000 player hours All 95% CI | 48% of concussed players returned to play within 7 days |
| Survey | University, ages 18–22 N = 131 | Likert scale; self-reported concussion, headgear use, perception of headgear effectiveness | Not defined | Not reported, numbers of concussions and percentages reported | Not reported |
Lab studies (N = 7) included a variety of methods for dropping headforms onto a surface. Drop heights ranged from 20 to 91.2 cm. The number of headgear tested per study ranged from two to ten. Criteria for drop testing followed standards set by World Rugby (WR) [35] or the National Operating Committee on Standards for Athletic Equipment (NOCSAE) [36]. Six studies looked at repetitive impacts. All studies used a variety of impact sites on the headforms. Five studies measured peak linear acceleration (g), two measured impact energy (J) or the Head Injury Criterion (HIC), and one measured impact velocity (m/s) or the Gadd Severity Index [37]. One study additionally measured peak rotational acceleration (rad/s²). Results are summarised in Table 3.
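For readers unfamiliar with the two severity metrics named above, the following is a minimal sketch of how the Gadd Severity Index and the HIC are conventionally computed from a resultant linear acceleration trace. It is not taken from any of the included studies: the function names, the 15 ms maximum window, the trapezoidal integration, and the rectangular sample pulse are all illustrative assumptions.

```python
# Illustrative sketch only; not the procedure used by any included study.
# Input: resultant head acceleration samples in g, uniformly spaced dt seconds apart.


def gadd_severity_index(accel_g, dt):
    """Integrate a(t)**2.5 over the whole pulse using the trapezoidal rule."""
    # Resultant acceleration should be non-negative; negative samples are clipped to 0.
    powered = [max(a, 0.0) ** 2.5 for a in accel_g]
    return sum((powered[i] + powered[i + 1]) / 2.0 * dt
               for i in range(len(powered) - 1))


def head_injury_criterion(accel_g, dt, max_window_s=0.015):
    """Maximise (t2 - t1) * (mean acceleration over [t1, t2])**2.5.

    max_window_s is an assumed upper limit on the window length (15 ms here).
    """
    # Running integral of acceleration, so each window mean is a single subtraction.
    cumulative = [0.0]
    for i in range(len(accel_g) - 1):
        cumulative.append(cumulative[-1] + (accel_g[i] + accel_g[i + 1]) / 2.0 * dt)

    max_steps = max(1, int(round(max_window_s / dt)))
    best = 0.0
    for start in range(len(accel_g) - 1):
        stop_limit = min(len(accel_g) - 1, start + max_steps)
        for stop in range(start + 1, stop_limit + 1):
            duration = (stop - start) * dt
            mean_accel = (cumulative[stop] - cumulative[start]) / duration
            if mean_accel > 0:
                best = max(best, duration * mean_accel ** 2.5)
    return best


# Hypothetical rectangular pulse: 100 g held for 10 ms, sampled at 1 kHz.
pulse = [100.0] * 11
gsi = gadd_severity_index(pulse, dt=0.001)
hic = head_injury_criterion(pulse, dt=0.001)
print(f"GSI = {gsi:.1f}, HIC = {hic:.1f}")
```

For a constant pulse both expressions reduce to duration × a^2.5, so this hypothetical pulse yields approximately 1,000 for each metric. Real impact traces are not rectangular, and the two metrics then diverge.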
Table 3
Characteristics of lab studies
| McIntosh and McCrory [45] | 20–60 cm | Fronto-lateral Centre-front | N = 8 Six manufacturers | Head injury criteria (HIC) Max headform PLA and RLA | Yes | Impact velocity (m/s) Impact energy (J) | Magnitude of accelerations increased as drop height increased Foam material completely compressed at 20 J |
| 30 cm | Parietal-lateral Occipital | N = 10 Honeycomb Vanguard | Attenuation of linear impact forces | Yes | Peak acceleration (g) Gadd Severity Index | Decreased ability to attenuate force w/ repeated impacts Type II (Vanguard): lower peak g recordings Higher peak g at occipital site |
| 30 cm 40 cm 50 cm 60 cm 80 cm | Lateral Centre-front | N = 2 Albion Headpro Canterbury (honeycomb) | Impact energy attenuation | Yes | Acceleration force (g) | Polyethylene foam: thickness increased to 16 mm improves attenuation Modified HG: significant attenuation when increasing density/thickness |
| 87 cm | Rear, top, front, side | N = 7 Commercially available | Impact energy attenuation HIC | No | Impact energy (J) | Only thickest HG (15 mm) had HIC of < 1,000 for side and front |
| 30 cm | Crown, temple, forehead | N = 3 NPro 2 controls | Linear and rotational impact energy | Yes | Peak linear acceleration (g) Peak rotational acceleration (rad/s/s) | Drop tests: NPro significant attenuation vs. control (67–72% reduction), impact to back of head not measured Pendulum tests: NPro reduced impact (34%) vs. bare headforms |
| 27.9 cm | Front, back, right | N = 7 Commercially available | Impact attenuation | Yes | Peak linear acceleration (g) | Each HG demo’d significant decreased acceleration from baseline Canterbury Ventilator most effective |
| 23.8 cm 30.0 cm 61.0 cm 91.2 cm | Rear boss, rear, side, forehead, front boss | N = 6 1–3: lightweight foam 4–6: thicker, new-style HG | Reduction of linear impact acceleration HIC | Yes | Impact velocity (m/s) Impact energy (J) Peak acceleration (g) | All HG reduced impact vs. no HG Impact attenuation decreased with increased drop height HG 4–6 had consistently larger PLA and HIC reduction than HG 1–3 |
Qualitative studies
Qualitative studies were analysed according to the Standards for Reporting Qualitative Research (SRQR) checklist [34]. Results can be seen in Table 4. Results show that common research design principles were present in most studies. Titles, abstracts, results, and discussion sections generally adhered to the checklist, although two studies [21, 38] provided only partial information for item 4 (purpose or research question). Methods sections were less adherent, with no studies providing adequate information for item 5 (qualitative approach and research paradigm) and only one study [38] providing information for item 6 (researcher characteristics and reflexivity). Information on data collection methods, instruments, and processing was minimal in two studies [19, 21]. Similarly, the ‘other’ section of the checklist revealed discrepancies: one study did not provide information on conflicts of interest [19], and only one study provided information on funding [38].
Table 4
Qualitative studies analysed through Standards for Reporting Qualitative Research (SRQR)
Title | + | + | + | + |
Abstract | + | + | + | + |
Problem formulation | + | + | + | + |
Purpose or research question | + | ± | + | ± |
Qualitative approach & research paradigm | – | – | – | – |
Researcher characteristics & reflexivity | – | – | – | + |
Context | + | + | + | + |
Sampling strategy | + | – | + | + |
Ethical issues pertaining to human subjects | + | – | – | + |
Data collection methods | + | + | + | + |
Data collection instruments & technologies | + | + | ± | + |
Units of study | + | + | + | + |
Data processing | + | – | – | + |
Data analysis | + | + | – | + |
Techniques to enhance trustworthiness | + | – | – | + |
Synthesis & interpretation | + | + | + | + |
Integration with prior work, implications, transferability, and contribution to the field | + | + | + | + |
Limitations | + | + | + | – |
Conflicts of interest | + | + | – | + |
Funding | – | – | – | + |
Field studies
Observational field studies were analysed according to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist [33]. Results can be seen in Table 5. Direct comparison of all studies was problematic due to variability in research design and concussion definition. Common limitations of field studies included unclear objectives, lack of clearly defined variables (especially potential confounders), unclear explanation of quantitative variables (such as why groupings were chosen), and inadequate descriptions of study limitations.
Table 5
Field studies analysed through Strengthening the Reporting of Observational Studies in Epidemiology (STROBE)
Title/abstract | | + | + | + | + | + | + | + |
Introduction | Background, rationale | + | + | + | + | + | + | + |
| Objectives | + | + | + | ± | ± | + | + |
Methods | Study design | + | + | + | + | + | ± | + |
| Setting | + | + | + | + | + | + | ± |
| Participants | + | + | + | + | – | + | ± |
| Variables | + | – | – | – | + | + | – |
| Data sources, measurement | + | + | + | + | + | + | + |
| Bias | + | + | + | – | – | – | – |
| Study size | + | + | + | + | + | + | – |
| Quantitative variables | + | + | + | – | + | – | ± |
| Statistical methods | + | + | + | + | + | – | + |
Results | Participants | + | + | + | + | + | + | + |
| Descriptive data | + | + | + | + | + | + | + |
| Outcome data | + | + | + | + | + | + | + |
| Main results | + | + | + | + | + | + | + |
| Other analyses | + | + | + | – | – | + | + |
Discussion | Key results | + | + | + | + | + | + | + |
| Limitations | + | – | + | – | + | + | – |
| Interpretation | + | + | + | + | + | + | + |
| Generalisability | + | + | + | + | + | + | + |
Other info | Funding | + | – | + | + | + | + | – |
Regarding headgear use, studies generally lacked specific information on headgear wearing rates, such as the number of participants who actually wore headgear in each group. The Hollis et al. and Kahanov et al. studies relied on a questionnaire or survey asking about headgear use at a single point in the season [39]. This is problematic, as the reported rates of concussion rely on an accurate description of headgear use. The Marshall et al. study was found to be susceptible to self-report bias, as the authors relied on obtaining injury data from players on a weekly basis, and such reports were gathered by telephone [40]. By contrast, other studies involved direct observation by trained personnel. Video verification was used in the Stokes et al. study (all games) [41] and the McIntosh and McCrory study (six randomly chosen games) [42].
Regarding validity, the Kemp study used a wide definition of concussion, which may have overestimated incidence rates [43]. The Marshall et al. study included wide demographic variation (different age groups, levels of play, and genders), which may decrease the specificity of its findings for a particular target population [40]. The McIntosh and McCrory study lacked control for confounding variables [42]. All other studies adjusted for confounding variables, thereby improving internal validity. External validity appeared sound; no evidence exists to suggest the staff, places, and facilities described in studies were not representative of the environments to which participants would normally be exposed.
Studies did not consider that higher risk-taking behaviour may negate the protective benefits of headgear, but there is also potential bias in incidence data if players wearing headgear played more conservatively. Training exposure was only captured in one study [40]. The Hollis et al., Stokes et al., and Marshall et al. studies included adjusted rate ratios accounting for previous injury [39–41]; the remaining three studies may therefore have over-estimated incidence by not considering this variable [42–44].
Studies consistently refer to the challenging nature of objectively defining concussion. Participants may have sustained sub-clinical concussions or other head trauma, making them more susceptible to concussion despite the use of protective equipment. There is variation in the standards used to measure injury and concussion incidence: four studies used player-hours [39, 41, 43, 44], while the remaining two used concussion time-loss [40, 42]. The Hollis et al. study lists its outcome as mTBI rather than concussion [39]. Baseline concussion history was only reported by Hollis et al. [39]. The Kahanov et al. study did not report an outcome measure of effect, relied on self-reported concussions, and did not use a denominator to calculate incidence rates [20].
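To illustrate why an exposure denominator matters, the sketch below shows one common way of expressing concussion incidence per 1,000 player-hours and of forming an unadjusted incidence rate ratio with an approximate 95% confidence interval (Wald interval on the log scale). The counts and exposure hours are hypothetical and are not drawn from any included study; the included studies that reported ratios adjusted them for covariates, which this sketch does not attempt.

```python
import math


def incidence_per_1000_hours(events, player_hours):
    """Events per 1,000 player-hours of exposure."""
    return events / player_hours * 1000.0


def rate_ratio_with_ci(events_a, hours_a, events_b, hours_b, z=1.96):
    """Unadjusted rate ratio (group A vs. group B) with a Wald CI on the log scale."""
    ratio = (events_a / hours_a) / (events_b / hours_b)
    # Standard error of log(rate ratio) for two independent Poisson counts.
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical data: 12 concussions in 4,000 player-hours with headgear,
# 20 concussions in 5,000 player-hours without headgear.
hg_rate = incidence_per_1000_hours(12, 4000)
no_hg_rate = incidence_per_1000_hours(20, 5000)
irr, ci_low, ci_high = rate_ratio_with_ci(12, 4000, 20, 5000)

print(f"Headgear: {hg_rate:.1f} per 1,000 player-hours")
print(f"No headgear: {no_hg_rate:.1f} per 1,000 player-hours")
print(f"IRR = {irr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

In this hypothetical case the point estimate is below 1, but the interval spans 1, so the data would not show a significant difference. Without player-hours (or another denominator), neither the rates nor the ratio can be calculated from raw concussion counts.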
Lab studies
Experimental lab studies were analysed. Direct comparison of all studies was problematic due to variability in the measurements used to assess impact acceleration and attenuation. Given the lack of a standard quality assessment tool for lab studies, a checklist was generated using a model adapted from Benson et al. [22]. Results can be seen in Table 6. Lab-based study designs lack external validity: game and practice conditions differ from laboratory settings, and the impact forces required to produce concussion are not agreed upon [14]. All studies reported on the validity and variability of the testing apparatus used. Levels of evidence in lab studies are dependent on multiple factors.
Table 6
Laboratory studies analysed using a model adapted from Benson et al. [22]
Title/abstract | + | + | + | + | + | + | + |
Background/rationale | + | + | + | + | + | + | + |
Objectives | + | + | + | – | + | + | + |
Study design | + | + | + | + | + | + | + |
Variables incl. environmental conditions | – | – | – | – | – | – | + |
Validity of testing apparatus | + | + | + | + | + | + | + |
Statistical methods | + | + | + | + | + | – | – |
Exposure methods incl. repetition | + | + | + | – | + | + | + |
Outcome measures incl. PLA & PRA | – | – | + | – | – | – | + |
Properties of equipment specified | + | – | – | + | + | + | + |
Equipment likely to be worn by/ accepted by participants | + | + | – | + | + | + | + |
Reporting on mechanism of injury consistent with real life examples | + | + | + | + | + | + | + |
Results regarding generalisability | + | + | – | – | + | – | + |
Data analysis | + | + | + | + | + | + | + |
Synthesis & interpretation | + | + | + | + | + | + | + |
Integration with prior work, implications, transferability, and contribution to the field | + | + | ± | + | + | + | + |
Limitations | + | + | – | + | – | + | – |
Conflicts of interest | + | + | + | – | – | + | + |
Funding | – | + | + | – | – | + | + |
All studies reported on the types of headgear studied. The McIntosh et al. study used a modified headgear model with increased bulk [45]. Commercially available headgear is more likely to be worn by players but tended to perform worse on impact tests. The three most recent studies used a no-headgear condition as a control [11, 24, 46] and therefore provided more valid comparisons of headgear performance vs. a baseline. The two McIntosh studies provided information on foam testing [45, 47]. All studies reported on the headgear used (density range 45–87 kg/m³; thickness range 7–20 mm). Two studies did not provide detail on the thickness and density of headgear used but did provide information on the brands used [24, 46]. The Knouse et al. and Draper et al. studies provided the greatest detail on the makeup of the headgear (honeycomb vs. flat panel), thickness, and density [11, 12]. The Hrysomallis and McIntosh studies (N = 3) did not report on the cellular makeup of the foam present in the headgear [45, 47, 48].
All studies except Hrysomallis [48] used repetitive impact testing and reported the rationale for its use, i.e. mirroring the repetitive impacts experienced in gameplay. The number of repetitions varied from two to ten. The Ganly and McMahon study carried out a separate repeated impact test on the NPro headgear, involving 1,920 impacts, to simulate up to three years of use [24]. All studies reported on the sites of impacts tested. Three studies reported on the lack of padding at the back of headgear [11, 12, 48], suggesting impacts to the unprotected occipital region might result in higher impact forces than front or side impacts. The Ganly and McMahon study included pendulum testing on two moving headforms, thereby replicating on-field head-to-head contacts, but did not test impact to the back of the head [24]. The Draper et al. study justified inclusion of side impacts as the most common impact site in gameplay [11].
Most studies did not report testing headgear in different environmental conditions (wet, humid, temperature). Four studies [12, 45, 47, 48] reported that testing took place at ambient temperature (20–22 °C), while the McIntosh et al. study reported testing headgear at ambient, hot (50 °C), and cold (−10 °C) temperatures [47]. All studies reported using spherical headforms in testing; McIntosh and McCrory provided more detail, describing a deformable skin more like that of a human head [45]. Four studies [11, 24, 45, 47] reported the use of a Hybrid III headform [49]. Two studies reported on details of the glycerine “brain” encased inside a sealed cranium and included details on the “neck” of the headform [12, 48]. All studies reported on the details of the impact surface used. The McIntosh et al. study used a flat, rigid force plate, which may not mimic impact of a spherical human head as effectively [47]. All studies reported on the makeup of the impact pad, with three studies reporting on the thickness of the impact pads used (1–1.6 cm, 1.3 cm, 2.5 cm, respectively) [11, 12, 47]. The Frizzell et al. study used an artificial pitch impact pad to mimic gameplay [46].
Regarding the measurement of rotational as well as linear acceleration, one study (Ganly & McMahon) reported data on peak rotational acceleration (PRA) but provided less detail on its results [24]. The Draper et al. study asserted there is no currently accepted method to measure PRA [11]. Regarding outcome values, two studies reported peak acceleration (g) values only [46, 47]. Three studies reported both peak acceleration and HIC [11, 45, 48]. The Knouse et al. study also used the Gadd Severity Index [12]. The Ganly and McMahon study also measured rotational acceleration (rad/s²) and velocity (km/h) [24], but two studies reported that PRA cannot currently be studied adequately under lab conditions [11, 46].
All studies reported on the details of the drop testing rigs used to measure linear acceleration. The Ganly and McMahon study also used a pendulum rig to simulate head-to-head player contact [24]. All studies reported on height and impact energy. The McIntosh et al. and Frizzell et al. studies did not report mass [46, 47]. The Knouse et al. and Frizzell et al. studies did not report velocity [12, 46]. The Knouse et al. and Draper et al. studies provided detail on the use of twin wires to guide the headform onto the anvil and reported on calibration of the testing rig prior to study commencement [11, 12].
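Drop height, mass, impact velocity, and impact energy are linked by elementary free-fall relations, which is why omitting mass or velocity limits comparison between rigs. The sketch below assumes an idealised frictionless drop (no losses in the guide wires or carriage) and a hypothetical 5 kg drop mass; neither assumption comes from the included studies. Only the two drop heights, the extremes of the 20 to 91.2 cm range reported above, are taken from the text.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def impact_velocity(drop_height_m):
    """Ideal free-fall velocity at impact: v = sqrt(2 * g * h)."""
    return math.sqrt(2.0 * STANDARD_GRAVITY * drop_height_m)


def impact_energy(mass_kg, drop_height_m):
    """Ideal kinetic energy at impact, equal to the potential energy m * g * h."""
    return mass_kg * STANDARD_GRAVITY * drop_height_m


ASSUMED_DROP_MASS_KG = 5.0  # hypothetical headform-plus-carriage mass

for height_cm in (20.0, 91.2):  # extremes of the reported drop-height range
    height_m = height_cm / 100.0
    velocity = impact_velocity(height_m)
    energy = impact_energy(ASSUMED_DROP_MASS_KG, height_m)
    print(f"{height_cm:5.1f} cm -> {velocity:.2f} m/s, {energy:.1f} J")
```

Under these assumptions the reported height range corresponds to impact velocities of roughly 2.0 to 4.2 m/s. Velocity can be recovered from height alone, but impact energy cannot be reproduced from height unless the drop mass is also reported.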
Regarding the citing of authority standards used in approving equipment and procedures, four studies reported using NOCSAE standards [11, 24, 46, 48]. The Frizzell et al. study did not report on World Rugby standards for headgear [46]. The three most recent studies [11, 24, 46] cited EN960 standards for the headforms used [50]. The McIntosh et al. study provided minimal information on approval specifications for the testing rig [47], while the Hrysomallis study calculated mean impacts for concussion from previous field reports [48]. Frizzell et al. cited the popularity of artificial pitches in choosing its impact pad [46].
Finally, reporting on statistical analyses varied. The two McIntosh studies did not report the statistical analysis used [45, 47]. The Hrysomallis study did not explain which statistical analysis was used in its methods section, but quoted correlation coefficients in its results section [48]. The four remaining studies used multiple data analysis methods, which may improve internal validity [11, 12, 24, 46].