Introduction
Low back pain (LBP) is highly prevalent and poses a major health burden for society. Worldwide, it is the number one cause of years lived with disability [
1]. Up to 84% of the population will experience LBP at least once during their lifetime [
2]. In roughly 90% of cases, a specific source for the LBP cannot be identified [
3]. LBP is strongly associated with disability [
1,
4], work absence [
5,
6], and reduced quality of life [
6,
7]. As a result, medical and particularly non-medical costs related to LBP are very high [
8,
9].
Most patients improve substantially in the first six weeks after the onset of LBP [
10]. However, one year after onset, approximately two thirds of patients still experience pain and disability [
10‐
12]. LBP is increasingly viewed as a long-lasting or recurrent condition rather than a series of unrelated episodes [
9,
13]. A review on the long-term course (follow-up ranged from one to 28 years) of LBP in the general population found that most patients experienced a somewhat stable or fluctuating occurrence of LBP over time [
14]. Becoming pain free was never reported as a common finding.
Despite the effects of LBP on physical, psychological, and social well-being, there are few longitudinal studies reporting multiple patient-centered outcomes. Cohort studies with long-term follow-up (> 2 years) are often confined to investigating the presence of pain (yes/no) or the number of days with pain over the past month(s) or year [
13,
14]. Several consensus statements have been published on outcome measures in chronic (back) pain research [
15‐
17]. Most reports specifically provide recommendations for the evaluation of clinical trials, but there is an overall understanding that reporting on pain alone in LBP research is insufficient. Other important outcome domains include measures of physical function, generic measures of health and well-being, quality of life, and work (dis)ability.
At present, it is unclear what evidence is available from long-term studies on chronic non-specific LBP, particularly from studies examining patient-centered outcomes other than pain. We conducted a scoping review with the objective of identifying and mapping the available evidence from studies on chronic LBP with long-term follow-up, examining how these studies are conducted, and addressing potential knowledge gaps. Where systematic reviews typically focus on narrower, well-defined questions with appropriate study designs chosen in advance, a scoping review tends to address broader topics where many different study designs might be applicable [
18]. For the present study, we included experimental and observational studies reporting at least two years of follow-up on disability, quality of life, work participation or health care utilization in patients with chronic non-specific LBP. The results are not intended to provide evidence to inform clinical practice, but rather to gain insight into the scientific literature that is currently available. For studying the feasibility, appropriateness or effectiveness of a certain treatment or practice, a systematic review is a more valid approach [
19].
Discussion
The general purpose of this study was to identify and map the available evidence from long-term studies on chronic non-specific LBP. Our findings confirm the notion that there is little to no information available from natural cohorts when it comes to reporting on patient-centered outcomes other than pain. The majority (> 75%) of papers that were included examined long-term outcomes after invasive treatments. Surgical interventions, specifically lumbar fusion and disc arthroplasty, were most commonly reported. Among studies examining conservative treatments, physical therapy and multidisciplinary programs were most common. Overall, included studies were predominantly of moderate quality and differed in design, patient samples, and methods of data collection. These differences were most profound between studies on invasive and conservative treatments. In general, most studies reported improvements in pain and disability and, when measured, quality of life at long-term follow-up.
This review identifies several knowledge gaps regarding research into long-term outcomes of non-specific chronic LBP. First, there is still little insight into the natural course of LBP regarding outcomes such as disability, quality of life, work, and health care utilization, because no natural cohorts met the inclusion criteria. In a natural cohort, subjects are followed in everyday life, in which numerous situations and interventions may occur; follow-up is not restricted to studying the effect of one or several specified interventions. The studies included in this review examined clinical outcomes of non-specific LBP and concerned patients who were actively seeking health care. Therefore, they might not be representative of people with sub-chronic or chronic LBP in the general population. Second, we noticed that repeated measurements during long-term follow-up were scarce. Only ten studies (11%) took more than one measurement after the two-year mark. These studies reported lasting improvements in symptoms after lumbar fusion [
31,
32,
40,
41,
59,
72], disc arthroplasty [
53,
58,
76,
92], and chiropractic care or primary care by an MD [
102]. Nonetheless, recurrence of LBP is very common, and studies with less than two years of follow-up have also shown that post-treatment trajectories of pain and disability can vary a great deal between patients [
122‐
124]. Third, the present review also affirms the notion that across LBP trials, the primary focus has been on pain and disability as outcome measures [
125], even though other (generic) measures of health and well-being, such as quality of life and work (dis)ability have been recommended in core outcome sets to reflect the multidimensionality of LBP [
15,
126‐
128]. Furthermore, few studies seem to monitor health care utilization during follow-up. These data can be challenging to collect; however, they are an important piece of the puzzle in determining whether outcomes at long-term follow-up might be the result of the original intervention (at baseline) or of other interventions that were provided during follow-up. In conclusion, to understand both the (natural) course of LBP and the results of LBP-related interventions over time, frequent measurements of relevant patient-centered outcomes are needed, as well as the use of complete core outcome sets including quality of life and work disability, and an overview of patients’ health care utilization during follow-up.
Even though the patient-reported outcome measures in this review seem to reflect more positive long-term pain, disability and quality of life status compared to baseline measurements, this should not be misinterpreted as treatment effectiveness. This scoping review was not designed to study long-term effectiveness of interventions. A number of factors might have contributed to the appearance of consistent improvement years after experiencing persistent LBP. First, the reported improvements are based on statistical significance and do not necessarily imply clinical relevance. It is unclear whether patients perceived their improvement on different outcome measures as clinically relevant. Only a select number of studies performed a responder analysis. A previous review on outcome measures also reported that merely 8% of 401 included LBP trials reported a number or proportion of improved patients [
125]. Although most of the studies in the present review that included a responder analysis reported high percentages of patients with clinically relevant improvement, cut-off scores for clinical success varied greatly. For instance, in some studies relative improvements of 25–30% on VAS or ODI were deemed successful, while others aimed for 50% [
35‐
37,
95].
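To illustrate why differing cut-off scores complicate comparison, the following is a minimal sketch of a responder analysis based on relative improvement. The scores are hypothetical and not taken from any included study, and the function names are illustrative assumptions; only the 30% and 50% thresholds echo the range of cut-offs described above.

```python
def relative_improvement(baseline, follow_up):
    """Fractional reduction from baseline on a scale where lower is better
    (for example VAS or ODI)."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - follow_up) / baseline


def responder_rate(score_pairs, threshold):
    """Proportion of patients whose relative improvement meets the threshold."""
    responders = sum(
        1 for baseline, follow_up in score_pairs
        if relative_improvement(baseline, follow_up) >= threshold
    )
    return responders / len(score_pairs)


# Hypothetical (baseline, long-term follow-up) ODI scores for five patients.
hypothetical_scores = [(40, 30), (50, 24), (60, 40), (44, 20), (36, 30)]

rate_at_30 = responder_rate(hypothetical_scores, 0.30)  # 3 of 5 patients
rate_at_50 = responder_rate(hypothetical_scores, 0.50)  # 2 of 5 patients
print(rate_at_30, rate_at_50)
```

In this made-up sample, the same patients yield a 60% success rate under a 30% cut-off and a 40% success rate under a 50% cut-off, which is why responder percentages from studies using different definitions cannot be compared directly.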
Other factors might also have influenced improvement in LBP symptoms. A previous review in patients with non-specific LBP found that response to primary care treatment followed a pattern of rapid early improvement followed by a plateau, regardless of whether active treatment, usual care, or placebo treatment was used [
129]. Natural prognosis could be one explanation [
10,
11,
130]. However, the long-term natural prognosis is largely unknown. People are also more likely to seek health care at a time when their pain and symptoms are at their worst or most debilitating, which could further explain a positive overall course. Regression to the mean could also have played a role in the improvements in symptoms that were found after the start of treatment [
131]. Overall, these factors likely influenced short-term improvements in LBP complaints, but if maintained, could also explain the reported long-term beneficial outcomes. Finally, publication and reporting bias cannot be ruled out. Only one study reported that patients had significantly worsened at long-term follow-up. Future (systematic) reviews on long-term studies on LBP should consider checking their findings against reported study protocols and/or unpublished trial data.
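The combined effect of care seeking at symptom peaks and regression to the mean can be shown with a small simulation. This is a sketch under stated assumptions, not a model of any included study: each simulated person has a stable underlying pain level, measurements fluctuate randomly around it, and people enter the "study" only when their measured score exceeds an arbitrary threshold. All parameter values (mean 50, standard deviation 10, threshold 65) are hypothetical.

```python
import random
import statistics


def simulate_regression_to_mean(n=10000, threshold=65.0, seed=0):
    """Return mean scores at entry and at follow-up for simulated people who
    'seek care' only when their measured pain score is at or above threshold.
    No treatment effect is simulated at any point."""
    rng = random.Random(seed)
    entry_scores, follow_up_scores = [], []
    for _ in range(n):
        stable_level = rng.gauss(50, 10)               # underlying pain level
        at_entry = stable_level + rng.gauss(0, 10)     # fluctuating measurement
        if at_entry >= threshold:                      # care seeking at a peak
            later = stable_level + rng.gauss(0, 10)    # independent later measurement
            entry_scores.append(at_entry)
            follow_up_scores.append(later)
    return statistics.mean(entry_scores), statistics.mean(follow_up_scores)


mean_entry, mean_follow_up = simulate_regression_to_mean()
print(f"entry: {mean_entry:.1f}, follow-up: {mean_follow_up:.1f}")
```

Because inclusion depends on a high measured score, the selected group's average falls at follow-up even though nothing changed in the underlying condition and no intervention was applied.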
Surgical treatments are relatively over-represented in the present review. Safety issues and long-term adverse events are of more concern in surgical trials compared to conservative interventions, which may be why long-term data are collected and analyzed more often from invasive interventions. Also, surgical studies more often seem to utilize data that are retrospectively obtained from patient medical records [
132,
133]. This makes it easier to collect and report long-term follow-up data. In spine surgery, complication incidence is potentially underestimated with retrospective assessments [
134]; however, the present review includes results from PROMs and not occurrence of adverse events.
Studies on invasive and conservative treatments were notably different in their patient inclusion criteria. Invasive studies sought to include patients with disc-related diagnoses or symptoms, whereas conservative studies defined symptom-related criteria more generally (‘low back pain’). Although diagnoses based on lumbar structures (e.g., discogenic pain, facet joint pain) were very common in some settings, diagnostic tests do not reliably identify these structures as a source of LBP. The usefulness of these tests in clinical practice remains unclear [
22,
26,
135] and current guidelines on LBP usually classify these diagnoses as non-specific [
136]. Nevertheless, spine surgeons have claimed that these diagnoses should classify as specific LBP and that better and earlier identification combined with, if indicated, invasive treatment would improve prognosis in these patients [
137]. A Dutch task force charged with developing a guideline for invasive treatment of lumbosacral pain syndromes has proposed to classify diagnoses such as facet joint pain, disc pain and FBSS as ‘degenerative uncomplicated spinal LBP syndromes’ [
138]. In short, LBP diagnoses, as well as the decision to operate or treat conservatively, vary between countries and between medical disciplines. At present, there is no consensus among health care professionals on the classification of specific versus non-specific LBP. Improved consensus on a classification system could lead to more targeted care, reduce the need for expensive diagnostic methods, and facilitate comparison among LBP studies [
17,
139,
140].
In line with worldwide research in the field of back pain, we identified a significant increase in annual publications on long-term outcomes of non-specific LBP [
141]. The majority of selected studies were from Western countries, with the USA being the most productive (26% of studies). Little to no research took place in low- or middle-income countries, while in the past few decades the largest increases in disability due to LBP have occurred there [
9,
142]. The impact of LBP in low- to middle-income countries may involve burdens that differ from those in high-income countries and might therefore not be represented in the present review [
9].
Finally, the methodological quality of studies also seemed to increase over the years. Only prospectively conducted studies (prospective cohorts and RCT/CCTs) received a global ‘strong’ rating with the quality assessment tool that was utilized. Selection bias was often present in retrospectively conducted studies. In these instances, patients were included based on complete availability of follow-up data. Two sensitivity analyses were performed on the scoring method of the quality assessment tool. First, the global quality rating of a study was determined by the number of ‘weak’ ratings scored across all separate domains. This means that studies that scored ‘moderate’ on each separate domain would have received a ‘strong’ global rating. A separate analysis showed that changing the global rating from strong to moderate for these studies would have had no effect on the results, since there were no studies that rated moderate on each domain. Second, prospective cohort studies received a ‘moderate’ rating on the domain study design. It could be argued that prospective cohort studies are a strong design for studying long-term outcomes. However, changing these ratings from moderate to strong on this domain would have also had no effect on the global quality rating.
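The scoring logic behind the first sensitivity analysis can be sketched as follows. The text states only that the global rating depends on the number of ‘weak’ domain ratings; the exact cut-offs used here (no weak ratings gives strong, one gives moderate, two or more give weak) and the domain names are assumptions made for illustration and may differ from the tool that was actually used.

```python
def global_rating(domain_ratings):
    """Global rating from the count of 'weak' domain ratings.
    Cut-offs are assumed: 0 weak -> strong, 1 -> moderate, 2+ -> weak."""
    weak_count = sum(1 for rating in domain_ratings.values() if rating == "weak")
    if weak_count == 0:
        return "strong"
    if weak_count == 1:
        return "moderate"
    return "weak"


def global_rating_sensitivity(domain_ratings):
    """Variant for the sensitivity analysis: a study rated 'moderate' on
    every domain is downgraded from strong to moderate."""
    if all(rating == "moderate" for rating in domain_ratings.values()):
        return "moderate"
    return global_rating(domain_ratings)


# Hypothetical study rated 'moderate' on every (illustratively named) domain.
all_moderate = {
    "selection_bias": "moderate",
    "study_design": "moderate",
    "confounders": "moderate",
    "data_collection": "moderate",
    "withdrawals": "moderate",
}

print(global_rating(all_moderate))              # strong
print(global_rating_sensitivity(all_moderate))  # moderate
```

Under the counting rule, a study with no ‘weak’ ratings is indistinguishable from one rated ‘strong’ throughout, which is the situation the first sensitivity analysis checks for.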
Limitations
As was to be expected, a number of studies on long-term LBP outcomes had to be excluded from this review because they did not meet our inclusion criteria. This occurred most often with studies on samples with non-specific LBP mixed with specific LBP, samples with acute mixed with sub-acute and chronic LBP, and studies that failed to report baseline results of the outcomes measured at long-term follow-up. The latter in particular was common for measures related to health care utilization, since information has to be available, or recalled, from before baseline. Ultimately, only four studies could be included that reported health care use in the period before baseline [
85,
99,
101,
114]. Another limitation is that this review gives limited insight into when the observed improvements took place. We chose to report only results from long-term follow-up (> 2 years), since the focus of this review was on mapping evidence from long-term follow-up studies. The complete course or trajectory of LBP symptoms could be studied in future reviews with a narrower scope. Finally, the heterogeneity in the assessment and reporting of outcomes made it difficult to provide a qualitative synthesis of the results. A wide variety of instruments was used to measure pain, disability, quality of life, and work participation, and a considerable number of studies did not report whether changes in scores between baseline and follow-up were statistically significant.