Introduction
The Global Burden of Disease study has highlighted that low back pain (LBP) is the leading global contributor to years lived with disability and the sixth global contributor to disability-adjusted life years [
1,
2]. The global prevalence of activity-limiting LBP was recently estimated to be approximately 39 % for lifetime prevalence and 18 % for point-prevalence [
3]. Only a small proportion of people experiencing LBP seek health care but these account for high costs that represent an important burden to society [
4,
5]. The large majority of patients with LBP are labelled as having non-specific LBP (NSLBP) because no underlying pathology or cause can be found [
6‐
8]. A wide range of health interventions exists for patients with NSLBP and related clinical trials are often summarized in systematic reviews [
9,
10]. However, authors of these reviews report that outcomes are inconsistently measured and reported across trials [
11‐
13]. This inconsistency may limit the comparison of findings among trials and hinder statistical pooling [
14]. In addition, inconsistent reporting can be due to selective reporting bias (e.g. reporting only favourable outcomes in a publication), which may strongly affect the conclusions of systematic reviews [
15].
The development and use of core outcome sets (COS) for specific health conditions has been suggested to reduce inconsistency in outcomes measured and reported across clinical trials [
14]. A COS represents an agreed set of outcomes that should be measured and reported, as a minimum, in all clinical trials for specific health conditions [
16]. Such a set does not restrict measurement or the choice of the primary outcome, but mandates collection and reporting of the COS alongside the outcomes of interest [
16]. A COS thus creates a minimum standard of outcomes reported, reducing the risk of selective reporting bias and increasing the validity and statistical power of meta-analyses [
17].
The recently launched Core Outcome Measures in Effectiveness Trials (COMET) initiative fosters methodological research and provides methodological guidance on the development of a COS [
16]. The expertise accumulated by the Outcome Measures in Rheumatology (OMERACT) initiative also provides fundamental guidance for COS development [
18]. A stepwise approach is suggested by both initiatives: first, the core outcome domains should be selected (i.e. ‘what’ to measure), and then the measurement instruments for each domain (i.e. ‘how’ to measure) [
16,
19].
In the field of LBP, recommendations for standardized reporting of outcome measurement instruments in clinical studies were formulated at an expert panel discussion held at the 1997 International Forum on LBP in Primary Care (The Hague, The Netherlands) [
20]. Specific recommendations were made for five outcome domains (i.e. ‘pain symptoms’, ‘back-related function’, ‘generic well-being’, ‘disability social role’ and ‘satisfaction with care’) [
20,
21]. A workshop discussion among LBP researchers during the 2012 International LBP Forum (Odense, Denmark) concluded that the existing recommendations needed updating [
22]. This was motivated by recent advances in understanding of construct development and measurement properties that stress the need to explore whether relevant domains are missing and to critically appraise recommended instruments [
22]. Deyo et al. [
20] also proposed a parsimonious set of six questions covering the five domains suggested for measurement in LBP clinical research. These questions were extracted from existing questionnaires and were proposed as the minimum to be used in a wide variety of settings, including routine clinical care [
20]. This brief set was labelled as ‘Core Outcome Measures Index’ (COMI) by other investigators who assessed its measurement properties and feasibility of implementation [
23,
24]. However, updating the set of questions included in the COMI for LBP is beyond the scope of this study.
The aim of this study is to update the existing standardized set of outcome domains and measurement instruments recommended for LBP [
20,
21], through the development of a COS. This COS is intended for the measurement of efficacy or effectiveness of health interventions assessed in all clinical trials for patients with NSLBP. We defined NSLBP as “low back pain not attributable to a recognizable, known specific pathology (e.g. infection, tumour, fracture, axial spondyloarthritis)” [
25]. The first step in the development of this COS, and the focus of this manuscript, was a Delphi study to reach international consensus on core outcome domains.
Methods
A detailed description of the methods of this Delphi study is presented elsewhere [
26]. An International Steering Committee with members from four continents, including researchers, care providers and patients’ representatives, worked on the development of this COS. The day-to-day conduct of the study was managed by a project team of four people (AC, CT, MB, RO) working at the same institution (VU University/VU Medical Center, Amsterdam), who designed and addressed key aspects of the study. The other members of the Committee were regularly consulted by email regarding critical decisions.
The Steering Committee decided to involve four groups of stakeholders in the Delphi study: health care researchers, health care providers, professionals working both as researchers and providers, and patients with NSLBP. Professionals from many fields of clinical research relevant for NSLBP (e.g. orthopaedics, physiotherapy, epidemiology, psychology, rheumatology, rehabilitation medicine) were involved. Patients are judged to be essential in developing COSs as they can bring the perspective of those living with a health condition [
16,
18]. Previous COS efforts involving patients or the public identified core outcome domains that were not previously identified by other stakeholders [
27‐
29].
The main advantages of a Delphi method include the involvement of informed individuals, anonymity of responses that reduces influence of prominent personalities, and the possibility for Delphi panellists to reconsider their views based on feedback reports of previous rounds [
30,
31]. As this project did not involve experiments with patients or study subjects, it was exempt from ethical approval under the Dutch Medical Research Involving Human Subjects Act (WMO). All patients involved were asked for their consent prior to participation and all procedures were conducted according to the Declaration of Helsinki.
Selection of panellists
A list of health care researchers who had extensively published on LBP over the last 10 years (2003–2013) was made by one reviewer (AC) through a structured search in Web of Science (accessed October 7, 2013) and PubMed [
26]. Other researchers and health care providers were added to this list through convenience sampling. The Steering Committee recruited patients who had sought care for a present or past episode of NSLBP and had a fluent understanding of written English. When patients willing to participate were identified, they were contacted by email, given further information on the study and asked for consent to participate. Patients agreeing to participate were sent an information document giving simplified explanations of the terminology used in the study. Members of the Committee were also selected to participate in the Delphi so that they could cast their vote on core domains. The final list of potential panellists was managed by the project team, and the names on the list were concealed from all those selected for participation.
Generation of a list of potential core domains
The Steering Committee took responsibility for drawing a list of potential core domains that was used in the Delphi study. This list resulted from a search of outcome domains measured in clinical trials included in five recent systematic reviews [
12,
13,
32,
33] (one of which is not yet published), with the addition of the (sub)domains included in the comprehensive International Classification of Functioning (ICF) core set for LBP [
34], and in a conceptual model developed to characterize the burden of LBP [
35]. This conceptual model and the ICF core set were adopted to account for the patients’ perspective in this early phase. The model on the burden of LBP was developed by asking different stakeholders (including patients) which aspects of health were the most relevant to them [
35]; the comprehensive ICF core set was shown to cover all health issues identified by patients with LBP [
36]. The OMERACT Filter 2.0 framework was used to structure the list of potential core domains, subdividing it into four core areas that encompass the complete content of what is potentially measurable in a clinical trial (“
Appendix I”) [
19]. To determine the wording and definitions of the potential core domains, terminology used in existing health frameworks or COSs was consulted: ICF [
37], Patient Reported Outcomes Measurement Information System (PROMIS) [
38], Wilson and Cleary Model [
39] and the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) [
40,
41].
Delphi procedure
Three Delphi rounds, including open- and closed-ended questions, were used to reach consensus on core outcome domains. Individuals who did not participate in one round, and who had not explicitly asked to opt out, were still invited to each subsequent round. The Delphi study was conducted using SurveyMonkey software and invitations to participate were sent by email.
In the first round, panellists were asked to judge whether each potential core domain was important enough to be included in this COS, with possible answers ‘yes’, ‘no’ and ‘unsure/not my expertise’. Panellists were given the opportunity to propose changes to the wording and definitions of domains, to indicate whether some domains had major conceptual overlap or had to be aggregated, and to suggest the inclusion of missing potential core domains. One question was asked about the ideal number of domains for this COS and another about the reporting of adverse events (AEs). Panellists were always encouraged to provide a rationale for their answers. A priori cut-off criteria were established for excluding domains that were rejected by more than 60 % and favoured by fewer than 20 % of respondents.
In the second round, exclusion was proposed for domains for which fewer than 67 % of first-round respondents had answered ‘yes’ or ‘unsure/not my expertise’. Other proposals were made for excluding or retaining domains suggested as having large conceptual overlap. Consensus for the second round was a priori set at 67 % of respondents agreeing with a proposal. Panellists were also asked to judge whether the potential core domains suggested as missing were important enough to be included in the COS, as done for the other domains in the first round.
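The screening rules applied to first-round responses can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function name `screen_domain`, the shortened answer labels (`"yes"`, `"no"`, `"unsure"`) and the choice of all respondents to a question as the denominator are assumptions made for illustration only.

```python
from collections import Counter

# Thresholds taken from the text; how they were applied in practice
# (e.g. which denominator was used) is an assumption of this sketch.
REJECT_NO_SHARE = 0.60    # 'no' from more than 60 % of respondents
REJECT_YES_SHARE = 0.20   # 'yes' from fewer than 20 % of respondents
RETAIN_SHARE = 0.67       # at least 67 % answering 'yes' or 'unsure'


def screen_domain(responses):
    """Classify one domain from its first-round answers.

    responses: list of "yes", "no" or "unsure" (hypothetical labels,
    with "unsure" standing in for 'unsure/not my expertise').
    Returns "exclude", "propose_exclusion" or "retain".
    """
    total = len(responses)
    if total == 0:
        raise ValueError("no responses supplied")
    counts = Counter(responses)
    share_yes = counts["yes"] / total
    share_no = counts["no"] / total
    share_yes_or_unsure = (counts["yes"] + counts["unsure"]) / total

    # A priori cut-off: both conditions must hold.
    if share_no > REJECT_NO_SHARE and share_yes < REJECT_YES_SHARE:
        return "exclude"
    # Otherwise, exclusion is only proposed to the panel in round two.
    if share_yes_or_unsure < RETAIN_SHARE:
        return "propose_exclusion"
    return "retain"
```

For example, a domain with 7 ‘no’, 1 ‘yes’ and 2 ‘unsure’ answers would meet the a priori cut-off, whereas one with 4 ‘no’, 5 ‘yes’ and 1 ‘unsure’ would only be put forward for exclusion in the second round.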
The remaining potential core domains were presented in the third round to ask the panellists if each was indeed core. A priori consensus was set at 67 % of the panel agreeing that a domain is core. In each round, descriptive statistics were used to summarize all the questions. All rationales provided by panellists were checked against the quantitative results to evaluate whether substantial inconsistencies emerged. Responses of the patients’ group were always analysed separately to assess whether discrepancies were emerging with the rest of the panel. In the third round, frequencies of responses for each domain were calculated for the whole panel and separately for each of the stakeholder groups.
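The third-round summary described above (response frequencies for the whole panel and for each stakeholder group, compared against the 67 % threshold) could be computed along the following lines. This is a hedged sketch rather than the procedure actually used: the function name `summarize_core_votes`, the group labels and the reading of the threshold as "at least 67 % of those voting" are assumptions.

```python
from collections import defaultdict

# Assumed interpretation: consensus means at least 67 % of votes are 'core'.
CONSENSUS_THRESHOLD = 0.67


def summarize_core_votes(votes):
    """Summarize third-round votes for a single domain.

    votes: iterable of (stakeholder_group, is_core) pairs, where
    stakeholder_group is a hypothetical label such as "patient".
    Returns the overall share of 'core' votes, the share per group,
    and whether the overall share reaches the consensus threshold.
    """
    votes = list(votes)
    if not votes:
        raise ValueError("no votes supplied")

    by_group = defaultdict(list)
    for group, is_core in votes:
        by_group[group].append(bool(is_core))

    overall = sum(bool(is_core) for _, is_core in votes) / len(votes)
    per_group = {g: sum(v) / len(v) for g, v in by_group.items()}
    return {
        "overall": overall,
        "per_group": per_group,
        "consensus": overall >= CONSENSUS_THRESHOLD,
    }
```

Reporting the per-group shares next to the overall share makes it easy to spot a domain that reaches consensus in the whole panel while being supported by a much smaller share of one group, such as patients.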
Final decisions
The project team made some proposals to the Steering Committee regarding the interpretation of the final results of the Delphi. Committee members expressed their opinion on each proposal and the opinion supported by more than 50 % of members was followed. Some proposals concerned the inclusion of a ‘death’ and a ‘pathophysiological manifestations’ domain in the COS (as recommended by the OMERACT initiative for all COSs [
19]), and the appropriate approach to reporting AEs.
Discussion
Using the methodological guidance of initiatives like COMET and OMERACT [
16,
19], we performed a Delphi study to provide an international, multidisciplinary and multistakeholder consensus-based update of an earlier standardized set of outcome domains for LBP research [
20,
21]. Sufficient agreement was reached on core outcome domains that are part of a COS intended for clinical trials assessing efficacy or effectiveness of health interventions in patients with NSLBP. The domains included in this COS are ‘physical functioning’, ‘pain intensity’, ‘health-related quality of life’ and ‘number of deaths’ (see definitions in Table
2).
The domain ‘physical functioning’ reached the highest level of consensus in this study and the definition focuses on ability to engage in daily physical activities (Table
2). Our definition of ‘physical functioning’ will be fundamental in determining which measurement instrument best measures this domain. IMMPACT recommendations for chronic pain clinical trials also suggest measuring ‘physical functioning’ as a core outcome domain [
40,
41], and this convergence strengthens its inclusion.
‘Pain intensity’ also reached a very high level of consensus for inclusion in this COS. The inclusion of a pain domain is in line with the original core set [
20,
21] and IMMPACT recommendations [
40,
41]. ‘Pain intensity’ for this COS refers to the magnitude of the pain experience, whereas other pain (sub)domains were suggested for consideration by the previous core set and/or IMMPACT (e.g. ‘bothersomeness of pain’, ‘pain quality’, ‘temporal aspects of domains’, ‘pain medications’) [
20,
21,
40,
41]. Some of those pain domains and others (i.e. ‘pain behaviour’, ‘pain interference’) were presented as potential core domains in this Delphi, but insufficient agreement was reached to consider them core (Figs.
2,
3).
‘Health-related quality of life’ included in this COS could be considered as the ‘successor’ of ‘general well-being’ included in the previous set [
20,
21]. However, a definition of ‘general well-being’ was not given for the previous set and this makes a clear comparison of the two constructs challenging. Taking into account the widely accepted bio-psycho-social model for LBP [
42,
43], it may be appropriate to have a domain like ‘health-related quality of life’ in this COS as its definition includes all components of the model (Table
2). The inclusion of all components of the bio-psycho-social model is also in line with the domains included in a conceptual framework developed to characterize the burden of LBP [
35] and with the results of a review that attempted to summarize qualitative research conducted on the impact of LBP on people’s lives [
44]. However, only when measurement instruments are chosen for this COS will it become clear whether the different components of ‘health-related quality of life’ can be treated as separate domains or as one multidimensional domain. The choice of instruments will also aim to minimize redundancy of measurement, avoiding large overlap between instruments and promoting brevity of the COS.
Another key aspect in the development of a COS is the definition of contextual factors (i.e. potential confounders and effect modifiers) that should be measured alongside core outcome domains [
19]. However, addressing contextual factors was beyond the scope of this study; for their measurement, we refer to the work of the National Institutes of Health (NIH) Task Force [
45]. This Task Force recently published a report on minimum baseline standards that should be collected in clinical studies for chronic LBP, to standardize their assessment [
45].
This COS includes refined versions of three domains included in the previous standardized set but does not incorporate the other two: ‘disability social role’ and ‘satisfaction with care’ [
20,
21]. ‘Disability social role’ referred to work absenteeism and could be replaced by the domain ‘work productivity’ used in this study, while ‘satisfaction with care’ was formulated as ‘satisfaction with treatment services’ in this study, but neither was supported by the Delphi panel (Figs.
2,
3). ‘Work productivity’ refers to indirect non-medical costs, which are the main cost drivers for LBP [
5], and it is undoubtedly an important outcome for clinical trials that include an economic evaluation. However, this domain is challenging to measure in clinical trials that assess the efficacy of interventions, in which an economic evaluation might be beyond the scope of the trial. In support of excluding ‘satisfaction with treatment services’, several panellists underlined that it could be highly influenced by factors unrelated to an intervention (e.g. waiting list, amiability of providers, unfriendly receptionist, parking difficulty) and, consequently, that it could say relatively little about the efficacy or effectiveness of that intervention.
This is the first Delphi study conducted to explore international, multistakeholder, and multidisciplinary consensus on core outcome domains to be reported in NSLBP clinical trials. This study highlighted diverging opinions on the importance of some domains and reinforced the wisdom of a comprehensive exercise to determine which outcome domains are felt by the majority to be core. The strengths of this study include methods that followed guidance of initiatives like COMET and OMERACT [
16,
19], having a large expert panel of varied stakeholders representing various disciplines and countries, giving the opportunity to Delphi panellists to provide comments for each choice, allowing panellists to reconsider their views after considering other panellists’ reasoning, attempting to address strong arguments emerging from the Delphi panel, and rigorous reporting of methods [
26] and results. One limitation of this study could be the relatively small number of patients involved in the Delphi rounds, which could have led to under- or overestimation of the importance of certain domains from their perspective. However, the goal of this study was not to develop a comprehensive range of outcome domains important to all stakeholders, but rather a core set for inclusion in all clinical trials. Patients can also be involved in trial management teams, where they can shape the range of outcome measures collected in individual trials, and this should be regarded as good practice. Finally, the definition of COSs places emphasis on the concept of a minimum set [
16,
19] and the four domains included in this COS are consistent with this definition. The existence of a small COS for NSLBP should facilitate its inclusion in clinical trials, alongside trial-specific outcomes.
The development of a COS is a stepwise approach [
16,
19] and this study determined core outcome domains for clinical trials on NSLBP. The next step will be to reach consensus on which measurement instruments should be used to measure these outcome domains. The selection of instruments will be focused on those that have demonstrated adequate measurement properties for these domains with the least participant burden. Recently published methodological guidance on this topic [
46,
47] will help to conduct the next step for this COS in NSLBP.
Acknowledgments
We would like to acknowledge all the people who participated in at least one round of the Delphi study. These people (excluding members of the Steering Committee) are listed here in alphabetical order: William A. Abdu, Luc Ailliet, Marcelo Anderson Bracht, Gunnar Andersson, Adri T. Apeldoorn, Majid Artus, Julie Ashworth, Steven J. Atlas, Roxy Azoory, Marco Barbero, Heinz Dieter Basler, David Baxter, Ramsin Benyamin, Mark D. Bishop, Paul Bishop, David Borenstein, Lex Bouter, Hilary Bradbury, Alan Breen, Jens Ivar Brox, Elaine Buchanan, Alex Burdorf, Eugene J. Carragee, John David Cassidy, Roger Chou, Aldo Ciuro, Kris Clark, Steven P. Cohen, Pierre Côté, Peter Croft, Vinicius Cunha Oliveira, Wim Dankaerts, Gavin Davis, Ric Day, Rob de Bie, Henrica C. W. de Vet, Clermont E. Dionne, Wendy T. Enthoven, Hege R. Eriksen, Felipe Fagundes, Carmen Fernandez, Silvano Ferrari, Manuela Ferreira, Paulo H. Ferreira, Timothy W. Flynn, Victoria Franzinetti, Robert Froud, Andrea Furlan, Diego Galace, Robert J. Gatchel, Steven George, Sergio Gimenez Basalotte, Hedley Griffiths, Lars Grovle, Andrew John Haig, Murray Hames, Mark Hancock, Ian Harris, Jan Hartvigsen, Anne Julsrud Haugen, Elaine Hay, Rowland G. Hazard, Standiford Helm, Rob Herbert, Jan Hildebrandt, Deirdre A. Hurley, Eric L. Hurwitz, Julia Hush, Frank Huygen, Wilco C. Jacobs, Matthew Jennings, Johan Juch, Steven J. Kamper, Jaro Karppinen, Peter Kent, Suraj Kumar, Charlotte Leboeuf-Yde, Myeong So Lee, Martyn Lewis, Patrick Loisel, Pim A. J. Luijsterburg, Jon D. Lurie, Luciana Macedo, Luciana Machado, Laxmaiah Manchikanti, Anne F. Mannion, Lynn March, Norman Marcus, Teresa Marin, James McAuley, Alison McGregor, Luciola Menezes Costa, Jan Mens, Stephan Milosavljevic, Shail K. 
Mirza, Marco Monticone, Lorimer Moseley, Paulo Nascimento, Stefano Negrini, Colin Nelson, Jo Nijs, Oystein Nygaard, John O’Dowd, Teddy Oosterhuis, Richard Osborne, Peter O’Sullivan, Adriano Pezolato, Michael Pfingsten, Serge Poiraudeau, Jan Pool, Pina Porzio, Kristen Radcliff, James Rainville, Francois Rannou, Lisa Roberts, Michael E. Robinson, Myron Rogers, Martin Roland, Ana Royuela, Tamer Sabet, Petry Saeys, Marcus Schiltenwolf, Gay Schoene, Jesus Seco Calvo, Ruth Sephton, William S. Shaw, Karen J. Sherman, Rob Smeets, Anne J. Smith, Matthew Smuck, Bart Staal, Kjersti Storheim, Liv Inger Strand, Michael Sullivan, Simo Taimela, Kazuhisa Takahashi, Judith A. Turner, Martin Underwood, Alexander Vaccaro, Allard van der Beek, Bob van der Meiracker, Danielle van der Windt, Hans van Helvoirt, Willem van Mechelen, Rodrigo Vasconcelos, Arianne Verhagen, Johan W. S. Vlaeyen, Michael von Korff, Debra K. Weiner, Harriet Wittink, Ian Wright, Gustavo Zanoli.