Summary of key findings
This review identified six CAWH-specific HRQoL tools and evaluated their conceptual framework, psychometric properties, administration burden, and patient involvement in development using the COSMIN framework. By employing a comprehensive search strategy, dual-stage screening, and structured data extraction, the review adheres to JBI guidelines aimed at ensuring methodological transparency, comprehensive data coverage, and robust reporting of findings. A key finding is that none of the tools comprehensively captures all aspects of Quality of Life relevant to CAWH patients; patient input during tool development was particularly variable and often lacking. Additionally, psychometric validation remains incomplete for most instruments, with gaps in content and structural validity.
Although ten generic HRQoL tools were also identified, these were not evaluated in depth using the COSMIN framework as they were not designed specifically for CAWH patients and lacked the disease-specific relevance required for tailored outcome measurement in this population. Instead, these generic tools serve as broad assessments of general health status, often used in hernia research but without the specificity to fully capture the unique quality-of-life concerns associated with CAWH.
The tools vary in their measurement approach: some focus primarily on pain and function, and only a few include mental health and body image. Despite these differences, all tools face challenges in ensuring relevance, responsiveness, and usability. The absence of comparative psychometric data between instruments makes it difficult to determine which tool, if any, is most suitable for assessing HRQoL in CAWH patients. These limitations raise concerns about the appropriateness of current tools and highlight the need for a more patient-driven approach to HRQoL measurement in this population.
The conceptual and measurement model
A HRQoL tool is made up of a stem of items that measure different aspects of HRQoL. A shift in the rating of the items indicates either an improvement or deterioration in HRQoL. Shifting items are important when selecting a HRQoL measure and a ‘good’ questionnaire may benefit from many shifting items [38]. The shift of items depends mainly on what questions are included within the stem. For example, if a tool does not ask about a problem relevant to HRQoL in that patient population, the items will not shift, meaning problems and improvements go unreported; and questions relating to a mild impairment in patients’ HRQoL are also less likely to demonstrate evidence of improvement.
QoL, by its very nature, is specific to an individual [27], meaning that item relevance is important when deciding whether to use a generic HRQoL measure or a specific tool [39]. An important question is ‘are we capturing specific HRQoL data related to a patient living with a CAWH’? The generic scales such as SF-36 do not capture this in detail. Work by Heniford et al. showed that SF-36 has limited sensitivity and specificity in comparing hernia surgery outcomes between patients or changes in QoL during the postoperative period [9].
Secondly, most tools we examined compared data sets across different patients and then factor analysed these to see how items/questions compare to each other, either to reduce the number of questions and so increase acceptability to patients, or to compare them to other generic questionnaires as a way of establishing validity in the eyes of the researchers. This method is flawed when the aim is to examine specific areas of QoL related to a specific ‘disease’, as it arguably means the nuances and the items not previously identified will be lost. At its worst, factor analysis could reduce all the items down to one: ‘How do you feel?’. Though extreme, this describes the aim of many researchers and often the aim of their factor analysis.
Hyland states that he has “never come across a QoL scale that is incapable of demonstrating validating correlations with other QoL scales” [38]. One of the reasons is that “self-report measures are strongly correlated with the personality trait of negative affectivity (e.g., neuroticism, depression, anxiety)” [38]. Thus, non-specific or generic QoL scales will inter-correlate amongst themselves as they are possibly measuring the patient’s personality trait rather than comparing specific QoL issues such as “Can you bend down and touch your toes?”.
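The mechanism described above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not an analysis of any instrument in this review: the function name, the `trait_weight` parameter, and the assumption that each scale score is a weighted sum of one shared trait and independent noise are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_scale_correlation(n_patients=2000, trait_weight=0.7, seed=0):
    """Correlation between two simulated scales that share no item content.

    Assumption (illustrative only): each scale score is a weighted sum of a
    common trait (standing in for negative affectivity) and independent noise.
    """
    rng = np.random.default_rng(seed)
    trait = rng.standard_normal(n_patients)
    # Choose the noise weight so that each scale has unit variance.
    noise_weight = np.sqrt(1.0 - trait_weight**2)
    scale_a = trait_weight * trait + noise_weight * rng.standard_normal(n_patients)
    scale_b = trait_weight * trait + noise_weight * rng.standard_normal(n_patients)
    return float(np.corrcoef(scale_a, scale_b)[0, 1])


if __name__ == "__main__":
    # The expected correlation under this model is trait_weight squared.
    print(simulate_scale_correlation(trait_weight=0.7))
    print(simulate_scale_correlation(trait_weight=0.0))
```

Under these assumptions the two scales correlate at roughly the square of the trait weight even though they have no content in common, which is why a correlation between two questionnaires cannot by itself show that either one asks the right questions.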
Psychometric properties of CAWH tools
Psychometric properties are very important to consider when selecting any HRQoL tool, so it is worth bearing two things in mind. Firstly, where the psychometric properties of a scale have not been established, it should not be used. Secondly, when two similar scales are both adequate the one that is most reliable should be selected [38]. Whilst psychometric properties ensure ‘reliability’, they should also inform us whether the right questions are being asked (‘validity’) and ensure these questions are ‘responsive to changes’. At present, no study compares the psychometric properties between the outlined CAWH specific HRQoL tools. This may be because they are incomparable given that none appear to be based on any existing HRQoL model and consist of differing items. Furthermore, tools like the HerQL use mathematical modelling to justify the tool, but satisfying statistical parameters does not mean that the tool is asking relevant questions to begin with. For instance, the authors state that it was developed with AWH and groin hernia patients in mind [40, 41]. These tools are heterogeneous and, therefore, do not account for whether different things matter to groin hernia patients compared with CAWH patients in terms of HRQoL.
As with inguinal hernia measures [42] and most outcome measures used in general surgery [43], CAWH specific HRQoL tools are insufficiently validated. These tools still have utility but lack sufficient evidence, particularly concerning content and structural validity.
Content validity (Suppl. file 2) has not been established for CCS, HerQLes and AAS nor for the EuraHS-QoL or the HerQL. Therefore, we do not know whether patients perceive the items within these tools as truly and/or wholly reflective of their CAWH experience. This is pertinent to the issues highlighted in the previous section. Typically, content validity is attained via individual qualitative interviews or focus groups where patients are specifically asked about their opinions regarding the tools and their relevancy [38]. In this regard, the AHQ achieves content validity through using the Scientific Advisory Committee of the Medical Outcomes Trust and the National Quality Forum guidelines for PROM development as a framework to create the AHQ score. This process generated a framework containing 45 items, with purposively sampled hernia patients commenting on this preliminary instrument. These patients then took part in focus groups responding to the content of the questionnaire, and the focus group data were content analysed. Whilst this approach is valid, patients were not involved in the development of the initial 45 items, meaning items that are important to CAWH patients may be missing and a wholly patient-centred approach was not attained.
Structural validity (Suppl. file 2) provides evidence for construct validity by measuring and assessing the number of dimensions that comprise an instrument. For instance, the AHQ consists of three factors in the post-operative element [24]. Therefore, a three-factor model should be substantiated by statistical methods such as factor analysis, which would provide useful information about the relationship between items [44]. It is necessary to note that although HRQoL CAWH tools lack evidence for content and structural validity, this does not mean that these tools are completely insufficient or lack utility; it simply means that such evidence is inconclusive.
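As a rough illustration of what substantiating a three-factor structure involves, the sketch below simulates item responses and counts the eigenvalues of the item correlation matrix that exceed one (the Kaiser criterion). This is a simple dimensionality screen rather than a full exploratory or confirmatory factor analysis, and it uses synthetic data, not AHQ responses; the function names, the loading of 0.8, and the four items per factor are assumptions made for the example.

```python
import numpy as np


def simulate_items(n_patients=1000, n_factors=3, items_per_factor=4,
                   loading=0.8, seed=0):
    """Simulate standardised item scores where each item loads on one factor.

    All values here are illustrative assumptions, not estimates from any
    published instrument.
    """
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_patients, n_factors))
    unique_sd = np.sqrt(1.0 - loading**2)
    blocks = []
    for k in range(n_factors):
        noise = rng.standard_normal((n_patients, items_per_factor))
        # Every item in this block shares factor k plus its own unique noise.
        blocks.append(loading * factors[:, [k]] + unique_sd * noise)
    return np.hstack(blocks)


def count_dimensions(responses):
    """Count eigenvalues of the item correlation matrix greater than one."""
    corr = np.corrcoef(responses, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)
    return int(np.sum(eigenvalues > 1.0))


if __name__ == "__main__":
    responses = simulate_items()
    print(count_dimensions(responses))
```

With real questionnaire data, a count that disagrees with the number of proposed subscales would suggest that the hypothesised structure needs closer examination with a formal factor analysis.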
Some CAWH specific QoL tools are not truly HRQoL measures. Due to confusing lexicon within the literature, it is possible that some instruments are being promoted as HRQoL tools when they perhaps more accurately fall under the umbrella of functional assessment tools. For instance, the CCS is a tool used widely in CAWH and has been tested for reliability across varying cultures as well as other types of hernia [57]. It was “specifically designed to evaluate patient abdominal function as it relates to the patient’s hernia and hernia repair” [45]. However, it was also proposed as a “quality-of-life survey pertaining specifically to patients undergoing hernia repair with mesh” [9]. Undoubtedly, it is a useful tool and represents an important step towards hernia specific HRQoL tools, but its use is limited pre-operatively and it is limited to patients who have had mesh repair. Because of its focus on mesh related symptoms and pain related to movement, and its assumption that the effect of a prosthetic on a patient encompasses CAWH related QoL [45], it is perhaps best regarded as a predominantly functional assessment tool.
Similarly, the AAS, despite non-CAWH origins, is classified as a CAWH HRQoL tool in systematic reviews [4, 14]. Whilst the AAS has been used to measure HRQoL in CAWH patients [46], it has otherwise been used as a reliable and valid instrument to evaluate patient function in two different patient populations: laparoscopic and open groin hernia [36], and women post pelvic reconstruction surgery [37]. Again, this raises questions of validity and whether the AAS is better regarded as a “functional assessment tool”. It is necessary to move away from automatically equating function with HRQoL without acknowledging the broader multidimensional QoL [47, 48].
The HerQLes tool consists of several logical items, but its sole basis in expert opinion means it lacks content and structural validity. The creators of the HerQLes revised this instrument, now promoting its use in AWH [45]. Given the necessary COSMIN due diligence, such extensions may be less valid than designing instruments from scratch with key intentions and outcomes in mind.
At the time of writing, the AHQ is a relatively new tool, meaning that there is limited but developing information published pertaining to its use. The one existing study assesses user burden, test–retest reliability and longitudinal validation (against HerQLes and the generic Short Form-12) [23]. This tool is promising but, at present, there is insufficient data to draw any firm conclusions, especially given the criticism regarding the tools it has been validated against.
It is relevant for surgeons to understand the psychometric properties underpinning the different HRQoL tools so that they and their research teams select the right instrument to answer their posed question. COSMIN provides a useful taxonomy concerning what constitutes psychometric properties, which are detailed in Supplementary file 2.
Expert opinion has featured prominently in the development of CAWH QoL tools such as the EuraHS-QoL [25], while the HerQLes tool was based on a literature review by an expert panel of only four general surgeons [45]. The CCS was “initially modelled on other questionnaires measuring quality of life in other areas of surgery because there was no QOL scoring system then for hernia repair or AWR but we moved through that phase quickly. We performed the questionnaire and discussed it with patients and moved to its current form” (personal email correspondence from Dr. Todd Heniford). The AHQ tool was generated with some patient involvement via focus groups, where patients commented on an already pre-designed and itemised tool that had been based on expert opinion [50]. Finally, whilst the AAS tool interviewed patients regarding ‘function’, as well as conducting an expert panel of six surgeons [36], it did not explore the wider aspect of patient quality of life and was not based on any existing evidence-based HRQoL model [47, 48].
The lack of patient input at the time of genesis of a disease specific HRQoL tool means that we cannot be sure if the items comprising the tool are appropriate to assess change in QoL in CAWH patients. Also, when patients are asked to comment on a tool designed by experts, they may not feel able to do so because of power differentials and a ‘surgeons know best’ attitude. As such, the existing HRQoL CAWH tools may contain items important to operating surgeons, but may not include items that really matter to CAWH patients. This neglects the fact that no one knows more about how a pathology affects their own QoL than the person suffering with it, a sentiment noted in 2012 by Jon Stamford, a Parkinson’s patient and patient advocate who stated that, “it seems to be that Quality of Life is when you tell me what’s missing in my life. That seems to me to be rather odd” [51]. It is clear that current tools have not been developed from the ground up by asking CAWH patients what matters to them, and then building on this fundamental knowledge base.
Whilst the reasons are no doubt multi-causal, we note Wicks’s (2014) point that, “the primary concern of instrument developers was whether their scale would get published in a top journal or used in a clinical trial” [51]. Whilst the development of the AHQ represents a shift away from this mindset, more work needs to be done to generate tools that measure what matters most to patients suffering with CAWH. The Abdominal Hernia-Q (AHQ) is the only tool among those reviewed to include some degree of patient involvement; however, this participation was not present at the earliest stages of tool development. Patients contributed to reviewing a 45-item draft instrument and participated in focus groups and concept elicitation interviews; however, these occurred after a preliminary conceptual framework and item set had already been developed by the study team. As stated by Carney et al. in their initial paper on its development, "Semi-structured questions were developed by the research team and were based on a preliminary conceptual hernia framework developed by a team member (AB)", and the framework itself was informed "via a clinically derived set of domains… [including] plastic surgery principles relating to appearance and function as well as the senior authors’ experiences treating hernia patients." [50]
Thus, the foundational structure and content of the tool were clinician-generated prior to patient input. While patients did later complete the instrument and provide feedback ("patients independently completed the preliminary 45-item instrument") and contributed to qualitative refinement, the initial domains and items had already been defined. This is an important distinction. True patient-driven tool development begins with patients' lived experiences shaping the core domains, typically through qualitative interviews or focus groups before any items are generated [52]. Without this foundational step, tools may reflect the clinicians’ understanding of what matters, rather than the authentic and nuanced experiences of those living with the condition. As such, while the AHQ represents progress toward patient-centred design, it does not meet the criteria for a wholly patient-derived instrument. We feel that this work should start with Gram-Hanssen et al.’s (2020) premise that, “We do not know what is important to the individual patient if we do not ask them” [42]. Such a move may enable a much more relevant, honest, and useful HRQoL tool to be developed.
Study limitations
While this review provides a comprehensive evaluation of CAWH-specific HRQoL tools, several limitations should be acknowledged.
Firstly, the study relied on published literature, meaning that unpublished tools or ongoing validation studies may not have been captured. Some relevant data on psychometric properties, content validity, or tool administration may exist in grey literature, clinical trials, or industry reports that were not accessible. Additionally, despite attempts to retrieve all eligible studies, 17 full-text articles could not be accessed, which may have contained relevant findings.
Secondly, the lack of direct comparative psychometric studies between CAWH-specific HRQoL tools makes it difficult to determine which tool, if any, performs best in assessing Quality of Life. Although COSMIN provides a structured framework for evaluating HRQoL instruments, the current review was limited by heterogeneous reporting of psychometric data, making direct comparisons between tools challenging. Future research should conduct head-to-head psychometric testing in CAWH populations to address this gap.
Finally, while this study highlights the lack of patient involvement in tool development, it does not include direct patient perspectives on existing tools. Future research could engage CAWH patients in qualitative studies to validate and refine HRQoL domains based on lived experiences.
Despite these limitations, this review provides a critical assessment of existing tools, highlighting gaps in content validity, psychometric robustness, and patient-centred development. These findings reinforce the need for a more comprehensive, patient-informed HRQoL instrument for CAWH.