Background
The shoulder is one of the more complex joints in the human body, and a wide variety of conditions can affect its structures. The most common cause of shoulder pain is rotator cuff disease, which, in one primary care study, accounted for 85% of all shoulder pain presentations [1]. Adhesive capsulitis is also a common cause in middle-aged individuals, while osteoarthritis is becoming increasingly prevalent in older people. Shoulder-related disorders account for substantial medical, economic, and social costs [2, 3]. In 2000, the direct costs for the treatment of shoulder dysfunction in the USA totaled US$7 billion [4, 5]. Nearly 20 million Americans reported shoulder pain in 2005 alone, placing shoulder pain third, behind only knee and back pain [6]. A systematic review showed that the estimated prevalence of shoulder pain in the general population varies greatly among studies, with reported lifetime prevalence ranging from 7% to 67% [7]. With the aging “baby boomer” generation, we can expect the prevalence of shoulder disorders to increase significantly over the next two decades [6].
Shoulder disorders are associated with acute or chronic pain that is often disabling; some disorders also result in weakness and dysfunction of the upper extremity [8]. They have a substantial effect on quality of life, including altered sleep patterns, and adversely affect work and recreation [9]. For example, up to 30% of workers diagnosed with a new episode of shoulder pain take sick leave because of the shoulder disorder [10]. In addition, patient-reported outcomes (PROs) research suggests that shoulder disorders may compromise an individual’s health status to a degree similar to that of major medical diseases, including congestive heart failure, acute myocardial infarction, diabetes mellitus, and depression [11, 12].
There are many hundreds of controlled clinical trials for shoulder disorders, and some evidence suggests that these studies tend to use a heterogeneous array of outcome measures [13–17]. For example, four recent Cochrane reviews, limited to randomized and quasi-randomized trials investigating manual therapy and exercise or electrotherapy for adhesive capsulitis or rotator cuff disease, included 32, 19, 60, and 47 trials, respectively [13–16]. A review of the included trials found that trialists included a measure of pain in 87%, function in 72%, range of motion in 67%, adverse events in 27%, patient-reported treatment success in 24%, strength in 18%, health-related quality of life in 18%, work disability in 4%, and referral for surgery in 2% [17]. Rotator cuff disease trials more commonly included a measure of strength (26% versus 2% for adhesive capsulitis trials), whereas adhesive capsulitis trials more commonly included a measure of range of motion (82% versus 58% in rotator cuff disease trials). The measurement tools used to assess these domains also varied widely. For example, there were 35 different outcome measures for pain. Furthermore, between 1973 and 2014 there was a marked rise in the inclusion of a measure of function, accompanied by a marked decline in the use of a measure of range of motion.
It has been suggested that few outcome measures for shoulder disorders possess acceptable measurement properties (e.g., [18–21]). To be of use for clinical trials and patient care, health status measurement instruments must be valid, reliable, and responsive [22, 23]. A valid tool measures what it proposes to measure and must fulfill requirements for face, content, construct, and/or criterion validity. A reliable instrument measures some phenomenon in a predictable manner (repeatability or reproducibility), whether it is self-reported (test-retest reliability) or is measured by someone else (intra-rater and inter-rater reliability). Finally, a responsive measure is able to detect clinically important change in the underlying construct over time, even if the changes are small, and, crucially for clinical trials, must be able to detect clinically important differences in treatment effect [22, 23]. There are several checklists and recommendations on how to assess the full array of psychometric properties across health-status measurement instruments (e.g., [24–27]). Until recently, there was a paucity of studies that had comprehensively assessed outcome measures used for shoulder disorders or identified gaps in empiric data to guide further research efforts.
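As a rough illustration of two of the properties defined above, the sketch below computes one common index of test-retest reliability, the intraclass correlation coefficient ICC(2,1), and one common index of responsiveness, the standardized response mean. The choice of these two indices, the function names, and the scores are assumptions made for illustration only; they are not taken from the reviews or checklists cited here.

```python
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` holds one row per participant; each row lists that participant's
    score at every administration of the questionnaire (e.g. test and retest).
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of administrations
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def standardized_response_mean(baseline, follow_up):
    """Mean change score divided by the standard deviation of the change scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


if __name__ == "__main__":
    # Hypothetical questionnaire scores, invented for this example.
    test_retest = [[42, 45], [60, 58], [75, 77], [30, 33], [55, 54]]
    print("ICC(2,1):", round(icc_2_1(test_retest), 3))

    before = [40, 55, 62, 35, 50]
    after = [52, 60, 75, 41, 66]
    print("SRM:", round(standardized_response_mean(before, after), 3))
```

An ICC close to 1 indicates that repeated administrations give nearly the same score for each participant, while a larger standardized response mean indicates that the instrument registers change more clearly relative to its variability.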
A systematic review of PRO measures used in studies of rotator cuff disease identified 73 separate citations for 16 distinct PRO measures [19] and performed a comprehensive assessment of their methodological quality (using the consensus-based standards for the selection of health status measurement instruments (COSMIN) checklist) [24, 25], psychometric properties (using criteria proposed by Terwee et al. [26]), and overall evidence using accepted methods [27]. On average, the measures had empiric data supporting only 50% of recommended measurement properties. Tools such as the Western Ontario Rotator Cuff (WORC) Index, Disabilities of the Arm, Shoulder and Hand (DASH) measure, Shoulder Pain and Disability Index (SPADI), and Simple Shoulder Test (SST) had good evidence in support of their measurement properties, while there were concerns about other tools relating to internal consistency, reliability, measurement error, hypothesis testing, and responsiveness.
Another recent systematic review assessed the psychometric properties of shoulder-specific PRO measures using a different tool, Evaluating Measures of Patient Reported Outcomes (EMPRO) [28]. It identified 11 instruments assessed across 112 studies. The American Shoulder and Elbow Surgeons (ASES) shoulder assessment, SST, and Oxford Shoulder Score (OSS) were found to have low administration burden and the best overall scores for validity, reliability, and responsiveness, while the Flexilevel Scale of Shoulder Function, SPADI, and the Dutch Shoulder Disability Questionnaire had some acceptable properties, but several required further evaluation.
A third systematic review used the COSMIN methodological quality checklist to assess questionnaires used to evaluate interventions for rotator cuff disease, including surgery [29]. Sixteen studies evaluating two instruments, the WORC and the Rotator Cuff Quality-of-Life (RC-QOL) measure, were identified. Both tools were found to have adequate construct validity, reliability, responsiveness, internal consistency, and translation, but additional methodological aspects (including measurement error; content, structural, cross-cultural, and criterion validity; and interpretability) needed further evaluation. A fourth paper assessed the psychometric properties (using criteria proposed by Terwee et al. [26]) of four commonly used shoulder outcome instruments (the ASES, the Constant-Murley score, the DASH, and the OSS) and reported that each had limited evidence to support its use in shoulder trauma populations [30]. Finally, a fifth systematic review of the measurement properties of self-administered PRO measures in patients with nonspecific shoulder pain and activity limitations found that none of the seven PRO measures had strong positive evidence for all properties, but that the SPADI performed best and was recommended for use in these patients [31].
The lack of uniformity in outcome measurement across trials limits our ability to compare findings between studies or to pool data for meta-analyses. Selective outcome reporting (i.e., selective reporting of favorable or statistically significant outcomes) can also bias the results of systematic reviews [32]. In an effort to reduce heterogeneity in outcomes measured across clinical trials, the development of core outcome sets (COSs) for specific health conditions has been recommended [33]. A COS is defined as an agreed minimum selection of outcomes that should be measured and reported in all clinical trials for a particular health condition [34]. The expectation is that the core set of outcomes would always be collected and reported, but this would not preclude the use of additional outcomes in a particular trial. A COS would increase the reporting of important outcomes, reduce the risk of selective outcome reporting, and increase the feasibility of conducting meta-analyses [34, 35]. A search of the COMET database identified no existing COS for shoulder disorders.
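The definition of a COS as a minimum set that does not preclude additional outcomes can be sketched as a simple set check. In the sketch below, the contents of the core set, the trial names, and the function names are all hypothetical placeholders: no COS for shoulder disorders had been agreed at the time of writing, and the domains are borrowed from the outcome domains mentioned earlier purely for illustration.

```python
# Hypothetical core set, for illustration only; not an agreed COS.
CORE_OUTCOME_SET = frozenset({"pain", "function"})


def missing_core_outcomes(reported, core=CORE_OUTCOME_SET):
    """Return the core outcomes a trial did not report, in alphabetical order."""
    return sorted(core - set(reported))


def fully_reporting_trials(trials, core=CORE_OUTCOME_SET):
    """Return names of trials that report every core outcome.

    Outcomes beyond the core set are allowed and do not affect the result.
    """
    return sorted(name for name, reported in trials.items()
                  if core <= set(reported))


if __name__ == "__main__":
    # Invented trials and reported outcome domains.
    trials = {
        "trial_a": {"pain", "function", "strength"},
        "trial_b": {"pain", "range of motion"},
    }
    for name, reported in trials.items():
        print(name, "missing:", missing_core_outcomes(reported))
    print("Report the full core set:", fully_reporting_trials(trials))
```

Only trials that report every core outcome contribute to every pooled comparison, which is why a shared minimum set makes meta-analysis more feasible.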
The aim of this project is to develop a COS for clinical trials of shoulder disorders. The Core Outcome Measures in Effectiveness Trials (COMET) [34] and the Outcome Measures in Rheumatology (OMERACT) [36] initiatives provide methodological guidance, including a stepwise approach, for the development of a COS [37, 38]. The long-term goal of this work is to ensure the use of an internationally agreed COS, based upon the best available evidence, for all trials of shoulder disorders. This will greatly improve our ability to interpret and compare the findings of different trials and synthesize the evidence in meta-analyses, and will also address the issue of selective outcome reporting.