Background
Musculoskeletal disorders are the most common form of long-term illness, with neck pain disorders rated as the most frequent complaint in Denmark [1]. It is estimated that 70% of the population will experience neck pain during their lifetime [2], and the one-year prevalence has been reported to be approximately 35% [3].
People with a Whiplash Associated Disorder (WAD), as well as those with idiopathic neck pain, often develop neck symptoms lasting for more than 3 months [2, 4]. People with chronic neck pain often present with a variety of other symptoms with a potentially large impact on function and quality of life [5]. Several treatment modalities have been evaluated for neck pain; however, there is little evidence to suggest that one treatment is superior to others [6, 7]. The reasons for this may be that 1) the investigated treatment modalities have not been effectively targeted to the right patients, 2) the intervention has not focused on relevant functions, and/or 3) the clinical outcome measures used are not sufficiently reliable, valid and responsive to detect a change beyond measurement error.
Clinical tests that assess neuromuscular control and function, such as strength and endurance of the deep neck flexors and extensors, have been developed for people with neck pain [8-11]. Additionally, tests of sensorimotor control such as head repositioning, postural control and head-eye coordination are often included in the clinical examination of patients with chronic neck pain [12-14]. Moreover, since both primary and secondary hyperalgesia are often evident in people with chronic neck pain, tests of pressure pain sensitivity are increasingly used during the clinical examination [15].
For a test to be useful in a clinical setting, it needs to be low cost, safe, easy to use and feasible within the time frame of a clinical assessment. Furthermore, it has to be reliable, valid and responsive to change. In a previous study, clinical tests including the craniocervical flexion test (CCFT), cervical active range of motion (ROM), gaze stability (GS), smooth pursuit neck torsion test (SPNTT), test for the cervical extensors (CE), balance tests using sway measurements (SWAY) on a Wii balance board and Pressure Pain Threshold (PPT) tests all showed satisfactory reliability and construct and discriminative validity in people with chronic neck pain and in asymptomatic controls [16]. However, the responsiveness of these tests, that is, their ability to detect a change, remains unknown. Estimating the responsiveness of a clinical test is highly relevant when evaluating the effect of a given intervention. Three systematic reviews evaluating the clinimetrics of tests of cervical muscle function, ROM and cervical sensorimotor control concluded that the responsiveness of such tests was insufficiently described [17-19]. Only PPT tested over the upper trapezius muscle was reported as having acceptable responsiveness when tested in people with neck pain [20]. Therefore, the objective of the present study was to examine the responsiveness of four clinical tests with continuous variables for people with chronic neck pain, namely CCFT, ROM, CE and PPT, since these tests are commonly used in the clinical setting to evaluate the effect of an intervention.
It was hypothesized that the change scores of the included clinical tests from baseline to 4 months following an active intervention [21] would correlate with the change in Neck Disability Index (NDI) score over the same time period. It was further hypothesized that all clinical test variables would have an acceptable level of responsiveness.
Results
A total of 200 patients were included in the original RCT [21]. Of these, 164 completed the 4-month follow-up and were eligible for the present study. At the 4-month follow-up, according to the described grouping by NDI score, a total of 144 (86%) were classified as unchanged and 20 (14%) as having improved. In the unchanged group, 26 participants completed the 4-month follow-up but did not complete the clinical tests, and therefore 118 were included in the analysis.
The two groups did not differ in their baseline demographic characteristics (Table 1), except for duration of symptoms: on average, the improved group had had symptoms for significantly longer than the unchanged group.
Table 1
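Baseline demographic characteristics of the total sample and of the unchanged and improved groups, presented as mean (SD) or n (%), with p-values for group differences
Characteristic | Total | Unchanged | Improved | p-value |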
Sex – female n (%) | 128 (77) | 111 (77) | 16 (81) | 0.69 |
Age in years | 45.1 (11.7) | 45.2 (11.7) | 44.9 (12.1) | 0.92 |
Height in cm | 170.0 (8.2) | 169.8 (8.2) | 171.2 (8.6) | 0.47 |
Weight in Kg | 77.2 (16.8) | 76.8 (16.4) | 80.0 (19.5) | 0.41 |
Cause – traumatic onset n (%) | 97 (59.0) | 90 (62.3) | 11 (52.7) | 0.75 |
Duration of neck symptoms in months | 108.4 (105.8) | 99.8 (94.1) | 167.2 (155.5) | 0.006* |
Baseline NDI | 21.5 (7.3) | 21.3 (7.2) | 22.9 (7.8) | 0.06 |
No significant differences in mean change score between the unchanged and improved groups were found for any of the clinical tests (Table 2). The difference in neck extension ROM approached statistical significance (6.34° (95% CI −0.29 to 12.96), p = 0.06).
Table 2
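Change scores from baseline to the 4-month follow-up, presented as mean (SD) for each group, with the mean difference between groups (95% Confidence Interval) and p-value
Test | Unchanged | Improved | Mean difference (95% CI) | p-value |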
CCFT (mmHg) | 1.25 (2.23) | 1.80 (2.50) | 0.55 (−0.54 to 1.63) | 0.32 |
CE (s) | 14.20 (43.59) | 24.35 (46.64) | 10.16 (−10.90 to 31.21) | 0.34 |
Flexion (°) | −0.11 (12.20) | 2.00 (12.70) | 3.06 (−2.81 to 8.93) | 0.30 |
Extension (°) | −0.09 (14.02) | 6.25 (12.92) | 6.34 (−0.29 to 12.96) | 0.06 |
Rotation Right (°) | 0.19 (11.51) | 4.5 (11.34) | 4.31 (−1.18 to 9.79) | 0.12 |
Rotation Left (°) | 1.26 (10.76) | 3.00 (11.29) | 1.74 (−3.44 to 6.92) | 0.51 |
Lateral Flexion Right (°) | −0.36 (9.06) | 3.25 (10.10) | 3.60 (−0.79 to 8.00) | 0.11 |
Lateral Flexion Left (°) | −0.06 (7.72) | 0.40 (5.78) | 0.46 (−3.12 to 4.04) | 0.80 |
PPT Tibialis Anterior Right (Kg/f) | −0.36 (1.60) | −0.13 (1.81) | 0.33 (−0.55 to 1.01) | 0.55 |
PPT Tibialis Anterior Left (Kg/f) | −0.22 (1.66) | 0.08 (1.27) | 0.30 (−0.47 to 1.06) | 0.45 |
PPT C5 Right (Kg/f) | 0.51 (1.62) | 0.16 (2.01) | −0.35 (−2.42 to 1.71) | 0.74 |
PPT C5 Left (Kg/f) | −0.05 (1.50) | 0.34 (1.17) | 0.39 (−0.31 to 1.09) | 0.27 |
Correlations between the change in NDI and the change in the clinical tests, estimated using Pearson’s r, ranged from 0.09 to 0.21 and were all below the acceptable level of at least 0.3. Significant correlations were found for ROM in extension and lateral flexion to the right and PPT at C5 left. AUC ranged from 0.50 to 0.62, that is, at or only just above chance-level discriminative ability, and all values were below the recommended acceptable level of at least 0.7 (Table 3).
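To illustrate why an AUC of 0.50 corresponds to chance, the following minimal sketch computes the AUC as the probability that a randomly chosen improved patient has a larger change score than a randomly chosen unchanged patient (the Mann-Whitney formulation). The function name and the change scores are hypothetical and are not taken from the study data; the sketch is not a statement of how the AUC values in this paper were calculated.

```python
def auc_mann_whitney(improved, unchanged):
    """AUC as P(improved score > unchanged score), counting ties as 0.5."""
    wins = 0.0
    for x in improved:
        for y in unchanged:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(improved) * len(unchanged))


# Hypothetical change scores (e.g. degrees of ROM), for illustration only.
improved_changes = [6, 2, 10, -1]
unchanged_changes = [0, 3, -2, 1, 5]

auc = auc_mann_whitney(improved_changes, unchanged_changes)
print(f"AUC = {auc:.2f}")  # 14 of the 20 pairs favour the improved group
```

A value of 0.50 would mean that the change score ranks improved and unchanged patients no better than a coin toss, whereas 1.0 would mean perfect separation.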
Table 3
Correlations (Pearson’s) between change scores of NDI and clinical tests, and AUC with 95% Confidence Interval
Test | Pearson’s r (95% CI) | AUC (95% CI) |
CCFT | 0.19 (0.03 to 0.35) | 0.54 (0.42 to 0.67) |
CE | 0.09 (−0.08 to 0.25) | 0.54 (0.40 to 0.68) |
Flexion | 0.15 (−0.01 to 0.31) | 0.55 (0.41 to 0.69) |
Extension | 0.21 (0.04 to 0.36)a | 0.62 (0.50 to 0.75) |
Rotation Right | 0.09 (−0.08 to 0.25) | 0.61 (0.48 to 0.75) |
Rotation Left | 0.10 (−0.07 to 0.26) | 0.56 (0.42 to 0.71) |
Lateral Flexion Right | 0.20 (0.04 to 0.36)a | 0.57 (0.44 to 0.70) |
Lateral Flexion Left | 0.12 (−0.05 to 0.28) | 0.52 (0.39 to 0.65) |
PPT Tibialis Anterior Right | 0.17 (0.01 to 0.33)a | 0.50 (0.34 to 0.65) |
PPT Tibialis Anterior Left | 0.14 (−0.30 to 0.30) | 0.53 (0.39 to 0.68) |
PPT C5 Right | 0.14 (−0.03 to 0.31) | 0.54 (0.39 to 0.69) |
PPT C5 Left | 0.21 (0.04 to 0.36)a | 0.59 (0.44 to 0.74) |
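As a rough check on how confidence intervals of this kind can be obtained, the sketch below applies the Fisher z-transformation to a Pearson correlation. Two assumptions are made here that the text does not confirm: that the intervals were computed with this method, and that the analysed sample was n = 138 (118 unchanged plus 20 improved participants). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Confidence interval for Pearson's r via the Fisher z-transformation."""
    z = math.atanh(r)                    # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)          # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# r for extension ROM from Table 3; n = 138 is an assumption (118 + 20).
lower, upper = pearson_ci(0.21, 138)
print(f"{lower:.2f} to {upper:.2f}")  # prints "0.04 to 0.36"
```

Under these assumptions the interval agrees with the value reported for extension in Table 3, which also shows how wide the intervals are for correlations of this size in a sample of this size.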
MCID was generally large, and the corresponding sensitivity and specificity were low: sensitivity ranged from 20 to 60% (highest for ROM), while specificity ranged from 54 to 86% (highest for CE and PPT) (Table 4).
Table 4
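Minimum Clinically Important Difference (MCID), Sensitivity and Specificity, Likelihood ratios (LR+ and LR-) and positive and negative predictive values (PPV and NPV)
Test | MCID | Sensitivity (%) | Specificity (%) | LR+ | LR- | PPV (%) | NPV (%) |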
CCFT (mm Hg) | 2.00 | 50.0 | 54.2 | 1.1 | 0.9 | 15.6 | 86.5 |
CE (s) | 73.0 | 30.0 | 85.5 | 2.1 | 0.9 | 26.1 | 87.7 |
Flexion (°) | 6.0 | 40.0 | 74.6 | 1.6 | 0.8 | 21.1 | 88 |
Extension (°) | 4.0 | 55.0 | 61 | 1.4 | 0.7 | 19.3 | 88.9 |
Rotation Right (°) | 10.0 | 40.0 | 79.5 | 2.0 | 0.8 | 25.0 | 88.6 |
Rotation Left (°) | 5.0 | 60.0 | 55.9 | 1.4 | 0.7 | 18.8 | 89.2 |
Lateral Flexion Right (°) | 3.0 | 35.0 | 63.6 | 1.0 | 1.0 | 14 | 85.2 |
Lateral Flexion Left (°) | 5.0 | 20.0 | 75.6 | 0.8 | 1.1 | 12.1 | 84.9 |
PPT Tibialis Anterior Right (Kg/f) | 0.83 | 25.0 | 84.3 | 1.6 | 0.9 | 21.7 | 86.6 |
PPT Tibialis Anterior Left (Kg/f) | 0.9 | 30.0 | 81.4 | 1.61 | 0.86 | 21.4 | 87.4 |
PPT C5 Right (Kg/f) | 0.07 | 45.0 | 53.5 | 0.96 | 1.03 | 14.5 | 84.7 |
PPT C5 Left (Kg/f) | 0.48 | 45.0 | 78.3 | 2.07 | 0.70 | 25.5 | 89.1 |
Further, LR+ (0.8-2.07) and LR- (0.7-1.1) showed low diagnostic value for all variables [36], and PPV and NPV ranged from 12.1 to 26.1 and from 84.7 to 89.2, respectively.
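The likelihood ratios follow directly from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 − specificity) and LR- = (1 − sensitivity) / specificity. The short sketch below applies these definitions to the PPT C5 Left row of Table 4; the function name is hypothetical, and small rounding differences from the tabulated values can be expected for other rows.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# PPT C5 Left in Table 4: sensitivity 45.0%, specificity 78.3%.
lr_pos, lr_neg = likelihood_ratios(0.45, 0.783)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")  # LR+ = 2.07, LR- = 0.70
```

Both ratios lie close to 1, so a change score above or below the MCID shifts the probability of being classified as improved only slightly.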
Discussion
Responsiveness of the clinical tests evaluated in this study was generally poor in people with chronic neck pain when an improvement of at least 7 points on the NDI from baseline to the 4-month follow-up was used as the anchor. AUC was low for all variables; likewise, all variables (CCFT, ROM, CE and PPT) demonstrated unsatisfactory correlations with NDI, and the MCID was large.
To the best of our knowledge, this is the first study to assess the responsiveness of clinical tests for people with neck pain, since only PPT variables have been evaluated previously in non-chronic neck pain patients [20]. The previous study demonstrated satisfactory responsiveness for PPT measured over the upper trapezius (AUC = 0.76; 95% CI: 0.57 to 0.89) but not for PPT measured over the tibialis anterior (AUC = 0.65; 95% CI: 0.46 to 0.84) [20], which is in contrast to the current findings. There could be several reasons for these contrasting results. Firstly, the study populations differ: Walton and colleagues [20] included people with acute or chronic neck pain, whereas the current study included only people with chronic pain, and it is likely that differences in severity, symptoms and pain mechanisms affect the responsiveness of PPT.
Secondly, although both studies used an anchor-based method to measure responsiveness, the anchor differed between studies. Walton et al. used Global Perceived Effectiveness (GPE), in contrast to the current study, which used the NDI [20]. The use of GPE as an anchor for real change, as in some previous studies [29, 31, 37], may introduce bias owing to the subjectiveness of GPE and the questioned reliability and validity of this measure [33, 38]. In the current study, with a 4-month follow-up, recall bias might have been present had GPE been selected, since previous studies have shown GPE to correlate more highly with present than with initial status [38, 39]. Moreover, GPE is a generic health-related outcome, as opposed to the specific tests evaluated in this study, which is why the NDI, with its greater emphasis on self-reported neck function, was selected. However, the correlations between the anchor and the clinical tests were all below the previously set level of acceptance (0.3), indicating that changes in the clinical tests are not sufficiently captured by changes on the NDI. Since the NDI has been criticized for poor sensitivity to longitudinal changes [40], it seems questionable whether large changes on the NDI reflect a longitudinal change in the current study.
The choice of the cut-off (at least 7 change points on the NDI) is important for determining the responsiveness. The current cut-off was selected based on the MCID calculated for the NDI in the previously reported systematic review [27]. Choosing another cut-off, for instance a change of 3 points on the NDI, as in some previous studies [31, 32], and/or different cut-offs for the different tests, may have resulted in different estimates. However, post-hoc analysis with a cut-off of 3 change points did not change the estimates considerably.
The current MCID values were all lower than the previously reported Minimum Detectable Change (MDC) [16], except for CCFT, meaning that the currently calculated MCID could be attributed to measurement error. The current MCID is based on dichotomization of patients as improved and not improved, and does not take into account whether the clinical status in other areas has actually changed. However, PPV and both Likelihood ratios were all below the acceptable levels.
Limitations of this study are the relatively small sample in the improved group (although the total group was large) and the difficulty of identifying an appropriate anchor that measures the same dimensions as the clinical tests. The NDI does not seem to be an optimal anchor, given the small group of responders. Since pain is one of the main complaints in this patient group, a measure of pain intensity (e.g. Visual Analogue Scale) could be suitable for classifying patients as improved or worsened. The appropriateness of alternative new neck instruments as an anchor remains to be studied in the future. In addition, the clinical tests used may only be classified as semi-objective.
A strength of the present study is that it followed a strict and standardised protocol [21] with a detailed description of, and training in, the clinical tests and their interpretation. In addition, the study was performed in a clinical setting using simple and low cost clinical tests, previously shown to have satisfactory reliability [16], aiming for high generalizability.
Conclusion
In conclusion, the responsiveness of the included clinical tests (CCFT, ROM, CE and PPT) was generally low when a change of at least 7 points on the NDI from baseline to the 4-month follow-up was used as the anchor. A major limitation is the use of the NDI as an anchor, and further investigations of responsiveness are warranted, possibly using other anchors, for instance pain measures, which may more closely reflect the dimensions measured by the current clinical tests.