Background
“Manipulable lesion” is a diagnostic term often used by manual therapists [
1]. Static palpation is commonly used by chiropractors and manual therapists to detect manipulable lesions [
3]. In essence, it is used clinically to assess areas of pain and stiffness within the spine that may indicate spinal dysfunction and to identify the location of treatment for manual therapists [
3]. Spinal manipulation is a manual treatment where a vertebral joint is passively moved using a low-amplitude high-velocity thrust [
4]. The appropriate application of spinal manipulation to areas of “spinal dysfunction” is thought to improve segmental function and motion, with reductions in pain and associated symptoms [
5]. Manipulable lesions identified by static palpation have been described as increased stiffness, a decrease in the segmental joint and musculature elasticity or springiness, and increased tenderness [
6‐
9].
Bergmann and Peterson define static palpation as “Palpation of bony landmarks (that) incorporates a scanning assessment of contour, tenderness, and alignment of the spinous processes, transverse processes, rib angles, interspinous space and intercostal space.” [
10]. According to this definition, one element of static palpation is to determine tenderness, but how much tenderness is needed? There has been no previous agreement on the magnitude of tenderness needed to determine whether a manipulable lesion is present or not. This question could be answered through a consensus of experts using a method such as the Delphi technique.
Studies that examined static palpation of the thoracic spine have shown inconsistent evidence, with fair (κ: 0.24) [
11] to substantial (κ: 0.67) [
12] agreement for tenderness but only slight agreement (κ: 0.07) [
6] to (κ: 0.15) [
11] for stiffness. Studies evaluating the reliability of static palpation in the cervical and lumbar spine have also shown mixed results ranging from poor [
8] to almost perfect [
13] (κ range 0.03 to 0.90, respectively). Generally, static palpation alone shows low reliability; however, in combination with palpation of tender segments, higher reliability has been reported [
14‐
17].
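The descriptive labels attached to the κ values above (slight, fair, substantial, almost perfect) can be reproduced with a small lookup. The sketch below assumes the widely used Landis and Koch bands; the source does not state which scale each cited study applied, and the function name `landis_koch_label` is hypothetical. Under these bands κ = 0.03 would read as "slight" rather than "poor", so at least one cited study presumably used different cut-offs.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to a descriptive label.

    Assumption: Landis and Koch bands (<0 poor, 0-0.20 slight, 0.21-0.40 fair,
    0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect).
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# Kappa values quoted in the text for the thoracic spine.
for value in (0.24, 0.67, 0.07, 0.15):
    print(value, landis_koch_label(value))
```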
Questions remain regarding the reliability of static palpation. Previous research has focussed on palpation of the cervical and lumbar spine, although chiropractors commonly treat the thoracic spine [
13]. There is little evidence on the reliability of static palpation to test for a manipulable lesion through tenderness and stiffness within the thoracic spine. It would therefore be beneficial for patient diagnosis, and potentially patient outcome, to determine whether static palpation of the thoracic spine is a reliable measure. Also, there are no known studies that have determined an agreement on the magnitude of tenderness needed to determine whether a manipulable lesion is notionally present or not.
The primary aims of this study were to establish the interrater reliability of static palpation of the thoracic spine for eliciting tenderness and segmental spinal stiffness and determine the effect of standardised training [
10] for examiners on these outcomes. The secondary aim was to explore expert consensus on the level of segmental tenderness required to locate a “manipulable lesion”.
Discussion
Overall, static palpation for segmental tenderness showed a higher level of reliability than palpation for stiffness, which is in accord with previous literature [
14,
15]. There is a higher level of reliability of static palpation within the mid-thoracic spine when assessing for tenderness.
The Delphi study showed that a minimum of 2 out of 10 on the NPRS was required for a finding to be a potential manipulable lesion, suggesting that tenderness should not be just a yes/no question. In a study of this nature, it seems preferable to use the NPRS and to score a potential manipulable lesion as an NPRS score of at least 2 out of 10. This finding should help mitigate the limitation that pain and tenderness are subjective measurements.
There was a good range of participants with a mix of female and male participants, and equal numbers of asymptomatic and symptomatic participants. This mix is consistent with recommendations to assemble a study sample better matching clinical practice [
15,
28]. The mean age of participants, 22.4 years, is younger than in many previous studies [
12,
29] but was similar to that in some other studies [
6,
11,
30].
The low level of reliability for determining segmental stiffness under strict agreement could be due to the high level of possible chance agreement. This gave high prevalence indices, which led to an underestimation of the Kappa coefficient. We used a prevalence-adjusted, bias-adjusted kappa (PABAK) to account for the high prevalence bias. When this bias was adjusted for, there was moderate reliability for most spinal levels. This is also reflected in the Kappa max values. Our results show higher levels of agreement than other studies; however, most previous studies did not account for prevalence bias. When Schneider and others [
20] did account for potential prevalence bias they also found a higher level of reliability.
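The effect of prevalence on Kappa can be illustrated with a worked example. The sketch below uses the standard textbook formulas for Cohen's kappa, PABAK (2 × observed agreement − 1 for two categories), the prevalence and bias indices, and Kappa max for two examiners making binary ratings. The function name `agreement_stats` and the ratings are hypothetical illustration data, not results from this study, and the study's own software and exact computation are not described in this section.

```python
def agreement_stats(rater_a, rater_b):
    """Agreement statistics for two raters with binary (0/1) ratings.

    Returns kappa, PABAK, prevalence index, bias index and kappa max.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    pairs = list(zip(rater_a, rater_b))
    a = sum(1 for x, y in pairs if x and y)        # both positive
    b = sum(1 for x, y in pairs if x and not y)    # A positive, B negative
    c = sum(1 for x, y in pairs if not x and y)    # A negative, B positive
    d = n - a - b - c                              # both negative

    p_obs = (a + d) / n                            # observed agreement
    pa = (a + b) / n                               # proportion positive, rater A
    pb = (a + c) / n                               # proportion positive, rater B
    p_exp = pa * pb + (1 - pa) * (1 - pb)          # chance agreement
    if p_exp == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    # Largest observed agreement attainable given each rater's marginals.
    p_max = min(pa, pb) + min(1 - pa, 1 - pb)
    return {
        "kappa": (p_obs - p_exp) / (1 - p_exp),
        "pabak": 2 * p_obs - 1,
        "prevalence_index": abs(a - d) / n,
        "bias_index": abs(b - c) / n,
        "kappa_max": (p_max - p_exp) / (1 - p_exp),
    }


# Hypothetical data: 20 segments, stiffness rarely rated positive.
examiner_1 = [1, 1] + [0] * 18
examiner_2 = [1, 0, 1] + [0] * 17
stats = agreement_stats(examiner_1, examiner_2)
print(stats)
```

In this made-up example the examiners agree on 18 of 20 segments, yet because nearly all ratings are negative the prevalence index is high and Kappa is much lower than PABAK, which is the pattern described above.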
The findings of other reliability studies of static palpation are similar to our own, i.e. a low level of reliability with static palpation alone for spinal stiffness. Ghoukassian, Nicholls and McLaughlin [
6] examined interexaminer reliability using the Johnston and Friedman method for thoracic spine palpation and found slight interexaminer reliability, with a Kappa of 0.07 [
6]. Potter, McCarthy, and Oldham [
31] examined the intraexaminer reliability of multiple examination procedures, including range of motion, motion palpation, and static palpation, and found an intraclass correlation coefficient in the thoracic spine of 0.70 (95% CI, 0.27–0.90). Cooperstein, Haneline, and Young [
32] considered the examiners’ confidence in their judgements and assessed the most ‘fixated’ level of the thoracic spine. They found a poor intraclass correlation coefficient overall (0.31); however, when both examiners were “very confident” in their findings, analysis of this subgroup (40% of participants) showed an increase in the intraclass correlation coefficient to 0.83 (95% CI, 0.63–0.92).
There was a slight increase in the reliability of static palpation of spinal stiffness with training; however, this was not statistically significant (p = 0.39). Interestingly, there was a higher level of reliability of static palpation within the mid-thoracic spine compared to the upper and lower thoracic spine when assessing for stiffness. We speculate that the mid region of the thoracic spine may be easier to palpate given its flexibility to anterior forces in a prone position.
When comparing our findings on static palpation for tenderness of the thoracic spine we found similar results to Christensen and others [
12], whose population was similar in age to ours. They examined palpation tenderness of thoracic vertebral levels 1–8 and found, with expanded agreement, an intraexaminer reliability Kappa of 0.59 to 0.77 and an interexaminer reliability Kappa of 0.67 to 0.70. Johnston and others [
33] examined the interexaminer reliability of paraspinal soft tissue tension by percussion, finding 70–86% overall agreement. In contrast to our findings, Heiderscheit and Boissonnault [
11] examined static palpation with pain provocation in a population similar in age to ours and found pain provocation intraexaminer reliability with a Kappa of 0.28 to 0.66 and interexaminer reliability with a Kappa of 0.24.
We found that reliability moderately increased with expanded agreement across adjacent vertebrae, both for segmental tenderness and for segmental stiffness; this is understandable, as collapsing levels for analysis inherently increases the potential for agreement.
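Why expanded agreement is inherently higher than strict agreement can be seen in a small sketch. It assumes one plausible definition: a positive finding by one examiner at a given level counts as agreement if the other examiner marked a positive finding at that level or at an immediately adjacent level. The exact rule used in this study is not restated in this section, and the function names and ratings below are hypothetical.

```python
def strict_agreement(a, b):
    """Proportion of levels where both examiners gave the same binary rating."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def expanded_agreement(a, b, window=1):
    """Proportion of levels in agreement when a positive finding may be
    matched by the other examiner's positive within `window` adjacent levels.

    Assumption: this matching rule is illustrative, not the study's stated rule.
    """
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    agree = 0
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        if a[i] == b[i]:
            agree += 1                      # identical rating at this level
        elif a[i] and any(b[lo:hi]):
            agree += 1                      # A positive, B positive nearby
        elif b[i] and any(a[lo:hi]):
            agree += 1                      # B positive, A positive nearby
    return agree / n


# Hypothetical ratings for T1-T12 (1 = finding present).
# Examiner 1 marks T4 and T8; examiner 2 marks T5 and T8.
examiner_1 = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
examiner_2 = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
strict = strict_agreement(examiner_1, examiner_2)
expanded = expanded_agreement(examiner_1, examiner_2)
print(strict, expanded)
```

Here the examiners disagree by one segment (T4 versus T5), which counts as two disagreements under strict agreement but none under the expanded rule, so expanded agreement can never be lower than strict agreement.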
Overall, there was a relatively low level of reliability for static palpation when testing for stiffness, and a higher level of reliability when testing for tenderness. Segmental assessment for stiffness is not sufficiently reliable, but reliability improves when considering a region (multiple vertebral levels). Therefore, in clinical practice chiropractors may need only be concerned with approximate levels, and any more detailed analysis using static palpation could be of limited utility. Also, reliability is better in the mid-thoracic spine than in the lower and upper thoracic spine, which has direct clinical implications for spinal assessment. There was no significant difference in reliability for spinal stiffness and tenderness after a training session, suggesting that the pragmatic approaches used by two experienced chiropractors were equivalent.
Strengths of study
The strengths of this study were that it was fully powered, participants and examiners were blinded, randomization was used before each round, and we attempted to follow best practice recommendations from the literature. We carried out a training session with a consensus method [
9,
34] as per Bergmann and Peterson [
10] and marked thoracic spinous processes [
5,
34]. We explored the reliability of pain provocation assessment [
11,
15,
34] and rated this level of tenderness [
35]. In addition, during data analysis we calculated not only Kappa but also PABAK [
20], and analysed strict agreement and expanded agreement [
12].
Limitations of study
A limitation of the Delphi study was that only 75% consensus was reached. Although we did not reach the ideal 80% consensus, the frequency statistics overwhelmingly indicated 2 out of 10 on the NPRS. Within the Delphi study we defined an expert as having over 3 years’ clinical experience; however, we cannot guarantee that a similar sample of chiropractors would generate similar results. All expert chiropractors were recruited from Murdoch University School of Health Professions; however, they graduated from many different institutions worldwide. Nevertheless, recruiting from one institution may limit external validity. The examiner training for standardization appeared adequate given the knowledge assessment; however, the training was brief and additional training may have further enhanced reliability. Further, as the difference between the pragmatic and standardized approaches was not significant, it could indicate that the training was inadequate or that the examiners had been trained in this or a similar method during their studies. The large number of statistical comparisons may have increased the probability of type I error. Another limitation is the use of a non-clinical population: while there was a mixture of participants symptomatic and asymptomatic for spinal pain, the participants were mostly healthy young students. This may adversely affect external validity. The examiners reported that they were experiencing fatigue towards the end, and this may have influenced the results, leading to a lower level of agreement.
Conclusion
A Delphi study of 10 experienced chiropractors concluded that the minimum level of quantifiable tenderness at a segmental spinal level should be 2 out of 10 on the NPRS to be considered a potential manipulable lesion. Standardized training had no significant impact on reliability for stiffness or tenderness. There is a higher level of reliability of static palpation within the mid-thoracic spine when assessing both stiffness and tenderness. There was overall moderate reliability for static palpation for stiffness and tenderness, with tenderness showing a higher level of reliability. Reliability modestly increased when analysis was expanded to three adjacent vertebral levels, both for spinal segmental stiffness and tenderness. These reliability results should be taken into consideration in clinical practice when assessing the spine, particularly as the validity of static palpation is still unknown.
Future research could consider static palpation reliability in different patient groups such as those with overt patient syndromes. Additionally, when assessing the reliability of segmental stiffness, PABAK is beneficial as issues related to prevalence or bias can result in a lower level of perceived reliability.