Open Access 19.01.2022 | Review Article

What can we learn from long-term studies on chronic low back pain? A scoping review

Authors: Alisa L. Dutmer, Remko Soer, André P. Wolff, Michiel F. Reneman, Maarten H. Coppes, Henrica R. Schiphorst Preuper

Published in: European Spine Journal | Issue 4/2022

Abstract

Purpose

A scoping review was conducted with the objective to identify and map the available evidence from long-term studies on chronic non-specific low back pain (LBP), to examine how these studies are conducted, and to address potential knowledge gaps.

Method

We searched MEDLINE and EMBASE up to March 2021, not restricted by date or language. Experimental and observational study types were included. Inclusion criteria were: participants between 18 and 65 years old with non-specific sub-acute or chronic LBP, minimum average follow-up of > 2 years, and studies had to report at least one of the following outcome measures: disability, quality of life, work participation, or health care utilization. Methodological quality was assessed using the Effective Public Health Practice Project quality assessment. Data were extracted, tabulated, and reported thematically.

Results

Ninety studies met the inclusion criteria. Studies examined invasive treatments (72%), conservative (21%), or a comparison of both (7%). No natural cohorts were included. Methodological quality was weak (16% of studies), moderate (63%), or strong (21%) and generally improved after 2010. Disability (92%) and pain (86%) outcomes were most commonly reported, followed by work (25%), quality of life (15%), and health care utilization (4%). Most studies reported significant improvement at long-term follow-up (median 51 months, range 26 months–18 years). Only 10 (11%) studies took more than one measurement > 2 years after baseline.

Conclusion

Patients with persistent non-specific LBP seem to experience improvement in pain, disability and quality of life years after seeking treatment. However, it remains unclear what factors might have influenced these improvements, and whether they are treatment-related. Studies varied greatly in design, patient population, and methods of data collection. There is still little insight into the long-term natural course of LBP. Additionally, few studies perform repeated measurements during long-term follow-up or report on patient-centered outcomes other than pain or disability.
Notes

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1007/s00586-022-07111-3.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Low back pain (LBP) is very common and poses a great health risk for society. Worldwide, it is the number one cause of years lived with disability [1]. Up to 84% of the population will experience LBP at least once during their lifetime [2]. In roughly 90% of cases, a specific source for the LBP cannot be identified [3]. LBP is strongly associated with disability [1, 4], work absence [5, 6], and reduced quality of life [6, 7]. As a result, medical and particularly non-medical costs related to LBP are very high [8, 9].
Most patients improve substantially in the first six weeks after the onset of LBP [10]. However, one year after onset, approximately two thirds of patients still experience pain and disability [10–12]. Currently, LBP is increasingly viewed as a long-lasting or recurrent condition rather than a series of unrelated episodes [9, 13]. A review on the long-term course (follow-up ranged from one to 28 years) of LBP in the general population found that most patients experienced a somewhat stable or fluctuating occurrence of LBP over time [14]. Becoming pain free was never reported as a common finding.
Despite the effects of LBP on physical, psychological, and social well-being, there are few longitudinal studies reporting multiple patient-centered outcomes. Cohort studies with long-term follow-up (> 2 years) are often confined to investigating the presence of pain (yes/no) or the number of days with pain over the past month(s) or year [13, 14]. Several consensus statements have been published on outcome measures in chronic (back) pain research [15–17]. Most reports specifically provide recommendations for the evaluation of clinical trials, but there is an overall understanding that reporting on pain alone in LBP research is insufficient. Other important outcome domains include measures of physical function, generic measures of health and well-being, quality of life, and work (dis)ability.
At present, it is unclear what evidence is available from long-term studies on chronic non-specific LBP, and more specifically from studies examining patient-centered outcomes other than pain. We conducted a scoping review with the objective to identify and map the available evidence from studies on chronic LBP with long-term follow-up, to examine how these studies are conducted, and to address potential knowledge gaps. Where systematic reviews typically focus on more narrow and well-defined questions with appropriate study designs chosen in advance, a scoping review tends to address broader topics where many different study designs might be applicable [18]. For the present study, we included experimental and observational studies reporting at least two-year follow-up on disability, quality of life, work participation or health care utilization in patients with chronic non-specific LBP. The results are not intended to provide evidence to inform clinical practice, but rather to gain insight into the scientific literature that is currently available. For studying the feasibility, appropriateness or effectiveness of a certain treatment or practice, a systematic review is a more valid approach [19].

Methods

The PRISMA Extension for scoping reviews (PRISMA-ScR) was used as a reporting guideline for this review [20]. Although critical appraisal is optional, for the present study we evaluated methodological quality of the included studies with a quality assessment tool in order to be able to address any potential gaps in the literature related to low quality of research [21].

Eligibility criteria

Types of studies

Both experimental and observational studies investigating non-specific LBP with baseline measures and a minimum (mean) follow-up of > 2 years were included. Case reports and review studies were excluded.

Participants

Study participants were adults with sub-acute (6–12 weeks) or chronic (> 12 weeks) non-specific low back pain at study baseline, with or without leg pain. The average age of the study population had to be between 18 and 65 years. Studies that reported on LBP due to a specified physical cause (e.g., infection, tumor, osteoporosis, fracture, structural deformity, inflammatory disorder, radicular syndrome or cauda equina syndrome) were excluded. Studies on patients with LBP due to failed back surgery syndrome (FBSS) and LBP due to degenerative changes such as disk degeneration, osteoarthritis of facet joints, and a grade 1 degenerative spondylolisthesis were included, provided that there were no neurological symptoms. Little to no association has been found between imaging findings of these types of spine degeneration and the presence of LBP [22–26]. We therefore classified these (radiological) diagnoses as non-specific. Studies with mixed LBP groups (specific and non-specific cause for LBP) or mixed pain populations (e.g., neck pain and LBP) were excluded unless subgroup data for baseline and follow-up were presented.

Outcome measures

To be included, studies had to report on at least one of the following outcome measures: disability, quality of life, work participation, or health care utilization. Pain was also an outcome measure, but studies that only reported on pain were not included.

Search methods for identification of studies

Electronic searches in MEDLINE and EMBASE were conducted using indexed terms and free text words. The searches were not restricted by date, language, or place of publication. The search strategy included terms related to LBP, long-term follow-up and outcome measures (Supplementary Digital Content [SDC] 1). The search results for both databases were downloaded into RefWorks and duplicates were removed. An initial literature search was performed, followed by several updates, of which the last took place on March 5 2021. Initially, search terms for spondylolysis and spondylolisthesis were included. However, these were removed in the updated searches. We found that studies that were retrieved with these search terms (and that would not have been retrieved by searching for terms related to low back pain) targeted patients with spondylolisthesis with a higher than grade 1 degree of severity.

Data collection and analysis

Study selection

Three review authors independently screened titles, abstracts, and full text of the studies retrieved from the databases. One author (AD) screened all studies and two authors (RS, RSP) each screened half. The inclusion criteria included type of participants, length of follow up, and outcome measures. To determine interrater agreement, a sample of 200 studies was selected for the three reviewers to screen on title, abstract and full text. Agreement ranged from 98 to 99% between reviewers with kappa scores ranging 0.56–0.98 (moderate or substantial agreement). However, kappa scores are deemed not very reliable for ‘rare findings’ [27] and in this sample of 200 studies ultimately only 3 studies were included after consensus was reached. Any disagreement in the selection of studies was discussed until consensus was reached. If the three reviewers could not reach consensus, the fourth reviewer (MR) was consulted.
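The caveat about kappa for rare findings can be illustrated with a small calculation. The sketch below computes Cohen's kappa for two raters from a 2×2 table of include/exclude decisions. The counts are hypothetical and are not the screening data of this review; they only show how very high raw agreement can coexist with a moderate kappa when few records are included.

```python
def cohens_kappa(both_yes, only_a, only_b, both_no):
    """Return (observed agreement, Cohen's kappa) for two raters."""
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n
    # Chance agreement derived from each rater's marginal inclusion rate.
    a_yes = (both_yes + only_a) / n
    b_yes = (both_yes + only_b) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical sample of 200 screened records with very few inclusions.
agreement, kappa = cohens_kappa(both_yes=2, only_a=1, only_b=2, both_no=195)
print(f"agreement = {agreement:.3f}, kappa = {kappa:.2f}")
```

Because almost all records are excluded by both raters, expected chance agreement is already close to 1, so a handful of disagreements lowers kappa sharply.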

Quality assessment

The Effective Public Health Practice Project Quality Assessment Tool (EPHPP) was used to evaluate methodological quality of the studies [28]. The tool can be used to evaluate a variety of study designs such as RCTs, observational, cross sectional, and before-and-after studies. The EPHPP assesses six domains: (1) selection bias, (2) study design, (3) confounders, (4) blinding, (5) data collection method, and (6) withdrawals and dropouts. Each domain can be rated strong, moderate, or weak resulting in a global rating of strong (no weak ratings), moderate (one weak rating), or weak (two or more weak ratings) for each study. The confounders domain was scored ‘not applicable’ when there was no comparison or control group, since the corresponding question was phrased “Were there important differences between groups prior to the intervention?”. Content and construct validity of the EPHPP have been established and inter-rater reliability is fair for the individual domains (ICC = 0.60) and excellent for the global rating (ICC = 0.77) [28, 29]. Four reviewers assessed methodological quality of the studies. One author (AD) assessed all studies and three authors (RS, RSP, MR) each assessed one third of the studies. Disagreements were resolved between the authors assessing the study or when in doubt were discussed with all four assessing authors to reach consensus.
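The global rating rule described above (no weak domain ratings gives strong, one gives moderate, two or more give weak) can be written as a short function. This is a minimal sketch, not part of the EPHPP tool itself; the function name and the single-letter codes are illustrative, and treating 'n/a' as "not weak" is an assumption that follows from the rule as stated here.

```python
def ephpp_global_rating(domain_ratings):
    """Derive a global rating from six domain ratings.

    Each rating is 'S' (strong), 'M' (moderate), 'W' (weak) or 'n/a'.
    Assumption: 'n/a' domains do not count as weak.
    """
    weak_count = sum(1 for rating in domain_ratings if rating == "W")
    if weak_count == 0:
        return "Strong"
    if weak_count == 1:
        return "Moderate"
    return "Weak"


# Illustrative domain ratings in the order: selection bias, study design,
# confounders, blinding, data collection method, withdrawals and dropouts.
print(ephpp_global_rating(["M", "M", "n/a", "M", "S", "S"]))
print(ephpp_global_rating(["W", "M", "n/a", "M", "S", "S"]))
print(ephpp_global_rating(["W", "M", "n/a", "M", "W", "M"]))
```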

Data extraction and synthesis

The following data were extracted by one author (AD) from each paper and presented in supplementary tables (SDC 2): first author, study setting and country, study design, intervention(s), patient characteristics (diagnoses, age, % female), outcome domain(s), instrument(s), duration of follow-up, and results of measurements taken at baseline and at > 2 years of follow-up. This includes the results of any responder analyses (i.e., the proportion of patients achieving a pre-defined level of improvement) [30]. For randomized controlled trials (RCTs), results from the intention-to-treat analyses were reported. Studies were organized thematically according to intervention type. Study characteristics were also summarized in a narrative format and the overall findings were presented in a summary table. Per outcome, the number of treatment arms that showed a significant (p < 0.05) improvement, decline, or no change compared to baseline was reported. The number of studies that did not report p-values for the change in outcome at follow-up was also reported.

Results

Study selection

Together, the initial and updated searches returned 10,312 articles, of which 90 ultimately met the inclusion criteria (Fig. 1). Follow-up results of one study were presented in two different articles [31, 32]. An overview of study characteristics can be found in SDC 2. Studies (n = 89) were classified according to the type of treatment(s) that was investigated: invasive (72%, n = 64; Table 1, SDC 2), conservative (21%, n = 19; Table 2, SDC 2), or a comparison of invasive and conservative treatments (7%, n = 6; Table 3, SDC 2). By definition, (minimal) invasive procedures require (1) a method of access to the body (incision, natural orifice, or percutaneous access), (2) instrumentation (e.g., endoscopes, catheters, scalpels), and (3) requirement for operator skill [33]. All non-invasive treatments were classified under conservative treatments.

Quality assessment

Global quality rating was weak for 14 (16%), moderate for 56 (63%), and strong for 19 (21%) studies (Table 1). A global weak rating was more common with studies published before 2010, while most studies that rated strong were published in the last decade (Fig. 2). Most common design was either a prospective (44%, n = 39) or retrospective cohort study (31%, n = 28) (both rated ‘moderate’). Twenty-one studies (24%) conducted an RCT and one study was classified as a controlled clinical trial [34] (both rated ‘strong’). Weak ratings were prevalent with the domain ‘selection bias’, while strong ratings were prevalent for ‘data collection method’. Studies rated predominantly moderate (42%, n = 37) or strong (44%, n = 39) on ‘withdrawals and dropouts’. Sixty studies (67%) did not receive a rating on ‘confounders’ due to the absence of a comparison or control group. Twenty-six (29%) retrospective studies received a ‘moderate’ rating for scoring ‘not applicable’ on the item ‘percentage of patients completing the study’.
Table 1
EPHPP quality assessment scores of the included studies

| References | Selection bias | Study design | Confounders | Blinding | Data collection method | Withdrawals and dropouts | Global rating |
|---|---|---|---|---|---|---|---|
| Studies on invasive treatments | | | | | | | |
| Al-Kaisy et al. [35] | M | M | n/a | M | S | S | Strong |
| Amirdelfan et al. [36] | M | S | W | M | S | M | Moderate |
| Aunoble et al. [37] | M | M | n/a | M | S | S | Strong |
| Axelsson et al. [38] | W | M | n/a | M | W | M | Weak |
| Buric and Pulidori [39] | W | M | n/a | M | S | S | Moderate |
| Burkus et al. [40] | M | M | W | M | S | W | Weak |
| Buttermann and Mullin [31] and Buttermann et al. [32] | W | M | S | M | S | S | Moderate |
| Buttermann et al. [41] | M | S | S | S | S | S | Strong |
| Cakir et al. [42] | W | M | n/a | M | S | M | Moderate |
| Cheng et al. [43] | W | M | n/a | M | S | S | Moderate |
| Cheng et al. [44] | M | M | n/a | M | S | M | Strong |
| Chung et al. [45] | W | S | S | M | S | S | Moderate |
| Corenman et al. [46] | W | M | n/a | M | S | M | Moderate |
| Di Silvestre et al. [47] | W | M | S | M | S | M | Moderate |
| Fischgrund et al. [48] | W | M | n/a | M | S | S | Moderate |
| Formica et al. [49] | W | M | n/a | M | S | M | Moderate |
| Geerdes et al. [50] | W | M | n/a | M | S | M | Moderate |
| Gepstein et al. [51] | W | M | n/a | M | S | S | Moderate |
| Gioia et al. [52] | W | M | n/a | M | S | W | Weak |
| Gornet et al. [53] | M | S | S | M | S | M | Strong |
| Guyer et al. [54] | M | S | S | M | S | W | Moderate |
| Hamm-Faber et al. [55] | W | M | n/a | M | S | S | Moderate |
| Houten et al. [56] | M | M | n/a | M | W | W | Weak |
| Kareem and Ulbricht [57] | M | M | n/a | M | S | W | Moderate |
| Katsimihas et al. [58] | W | M | n/a | W | S | Mᵃ | Weak |
| Kuslich et al. [59] | W | M | n/a | M | W | S | Moderate |
| Lee et al. [60] | M | M | n/a | M | S | S | Strong |
| Liang et al. [61] | W | M | n/a | M | S | M | Moderate |
| Lu et al. [62] | W | M | n/a | M | S | S | Moderate |
| Lu et al. [63] | W | M | n/a | M | S | M | Moderate |
| Madan and Boeree [64] | W | M | n/a | M | S | M | Moderate |
| Madan et al. [65] | W | M | S | M | S | M | Moderate |
| Maestretti et al. [66] | W | M | n/a | M | S | S | Moderate |
| Malham and Parker et al. [67] | M | M | n/a | M | S | S | Strong |
| Meir et al. [68] | W | M | n/a | M | S | S | Moderate |
| Niemeyer et al. [69] | W | M | n/a | M | S | M | Moderate |
| Noriega et al. [70] | W | S | W | S | S | S | Weak |
| Nunley et al. [71] | M | M | n/a | M | S | W | Moderate |
| Nystrom et al. [72] | M | M | n/a | M | S | S | Strong |
| Ohtori et al. [73] | W | S | S | M | S | W | Weak |
| Pan et al. [74] | M | M | n/a | M | S | S | Strong |
| Park et al. [75] | W | M | n/a | M | S | M | Moderate |
| Park et al. [76] | W | M | n/a | M | S | M | Moderate |
| Peng et al. [77] | W | M | n/a | M | S | M | Moderate |
| Petilon et al. [78] | W | M | n/a | M | S | M | Moderate |
| Pettine et al. [79] | M | M | n/a | M | S | S | Strong |
| Pihlajamaki et al. [80] | W | M | n/a | M | W | M | Weak |
| Pimenta et al. [81] | W | S | S | M | S | S | Moderate |
| Pimenta et al. [82] | W | M | n/a | M | S | S | Moderate |
| Plais et al. [83] | W | M | n/a | M | S | M | Moderate |
| Pokorny et al. [84] | W | M | n/a | M | S | S | Moderate |
| Putzier et al. [85] | W | S | S | M | S | M | Moderate |
| Raphael et al. [86] | W | M | n/a | M | W | M | Weak |
| Ren et al. [87] | W | M | n/a | M | S | M | Moderate |
| Rouben et al. [88] | W | M | n/a | M | S | M | Moderate |
| Saal and Saal [89] | W | M | n/a | M | S | S | Moderate |
| Schimmel et al. [90] | W | M | n/a | M | S | M | Moderate |
| Schulte et al. [91] | W | M | n/a | M | S | M | Moderate |
| Siepe et al. [92] | W | M | n/a | M | S | S | Moderate |
| Sköld et al. [93] | M | S | W | M | S | S | Moderate |
| Strube et al. [94] | W | M | n/a | M | S | M | Moderate |
| Thalgott et al. [95] | W | M | n/a | M | S | M | Moderate |
| Wuertinger et al. [96] | W | M | n/a | M | S | W | Weak |
| Zeilstra et al. [97] | W | M | n/a | M | S | W | Weak |
| Studies on conservative treatments | | | | | | | |
| Bendix et al. [98] | M | S | M | M | W | M | Moderate |
| Bentsen et al. [99] | W | S | W | M | W | S | Weak |
| Carvalho et al. [100] | W | M | n/a | M | S | S | Moderate |
| Groot et al. [101] | W | M | n/a | M | S | S | Moderate |
| Haas et al. [102] | M | M | S | M | S | W | Moderate |
| Hamre et al. [103] | S | M | n/a | M | S | M | Strong |
| Indahl et al. [34] | M | S | S | M | S | S | Strong |
| Lamb et al. [104] | M | S | S | M | S | W | Moderate |
| Lanes et al. [105] | W | M | n/a | M | W | M | Weak |
| Lankhorst et al. [106] | W | M | n/a | M | S | M | Moderate |
| Patrick et al. [107] | S | M | n/a | M | S | W | Moderate |
| Peng et al. [108] | W | M | n/a | M | S | S | Moderate |
| Raak et al. [109] | W | M | n/a | M | S | S | Moderate |
| Rantonen et al. [110] | M | S | S | M | S | S | Strong |
| Rasmussen-Barr et al. [111] | M | S | S | M | S | M | Strong |
| Rhyne et al. [112] | W | M | n/a | M | S | M | Moderate |
| Udby et al. [113] | M | M | S | M | S | S | Strong |
| Van Hoof et al. [114] | W | M | n/a | M | S | S | Moderate |
| Vibe Fersum et al. [115] | S | S | S | W | S | W | Weak |
| Studies comparing invasive and conservative treatments | | | | | | | |
| Brox et al. [116] | M | S | S | M | S | S | Strong |
| Froholdt et al. [117] | M | S | S | M | S | S | Strong |
| Froholdt et al. [118] | S | S | S | M | S | M | Strong |
| Furunes et al. [119] | M | S | W | M | S | S | Moderate |
| Hedlund et al. [120] | M | S | S | M | S | S | Strong |
| Kleimeyer et al. [121] | W | M | S | M | S | M | Moderate |

EPHPP Effective Public Health Practice Project Quality Assessment Tool, S strong, M moderate, W weak, n/a not applicable
ᵃ ‘Withdrawals and dropouts’ rating varies per time of measurement: strong at 3 years follow-up (< 20% dropouts), moderate at 4 years (< 40% dropouts), and weak at 5, 6, and 7 years follow-up (> 40% dropouts)

Study Characteristics

Year published

Studies were published between 1985 and 2021, with 52 out of 89 studies (58%) published in the last decade (Fig. 2).

Study Setting

The majority of selected studies (83%, n = 74) were from Western countries (SDC 2), specifically from European countries (54%, n = 48), such as Germany (10%, n = 8), Sweden, the UK (both 9%, n = 7), Norway, the Netherlands (both 8%, n = 7), and from the USA (27%, n = 24). Thirteen studies (15%) were from Asian countries, of which seven (8%) were from China. Two studies were from Brazil (2%). There were no studies from African countries, Central America, or Eastern Europe.
Less than half of the selected studies (44%, n = 39) specified the setting in which they took place. Forty-six out of 64 studies on invasive treatments did not report or were unclear in their report on where a specific intervention took place. The 18 remaining studies (20%) specified they took place in (university) hospitals or (out-patient) medical practices. Studies on conservative treatments mostly took place in (university) hospitals, physiotherapy clinics, and chiropractic and general practices. Five out of six studies that compared invasive with conservative treatments took place in university hospitals.

Interventions

Most common types of invasive treatment were lumbar fusion (38% of studies, n = 34) and disc arthroplasty (25%, n = 22), followed by intradiscal therapies (e.g., intradiscal electrothermal therapy or intradiscal bone marrow injection; 11%, n = 10), and implantable therapies (e.g., spinal cord stimulation) [35, 55, 86] (SDC 2). Less common were interspinous process devices [39, 63], dynamic spine stabilization systems [57, 85], and basivertebral nerve ablation [48]. Two studies used sham infiltration as a control for intradiscal bone marrow injection [36, 70].
The most common conservative treatments were multidisciplinary treatment (10% of studies, n = 9), physiotherapy or exercise training (7%, n = 6), cognitive therapies (4%, n = 4), and advice and/or education (4%, n = 4). Other treatments consisted of (non-operative) care as usual [108, 112, 121], chiropractic care or primary care by a medical doctor [102], anthroposophic medicine [103], rehabilitation treatment [109], or open label placebo pills [100].
With the exception of two control groups that were assessed in studies on conservative treatments [98, 110], there were no studies examining long-term outcomes of LBP in people receiving no treatment. Two studies reported examining the natural history of LBP; however, their patient samples completed Swedish Back School [106] or received two months of conservative treatment [108] and were therefore categorized under ‘conservative treatments’ in this review.

Patient characteristics

Selection criteria of this review were set to include only studies on adults with sub-acute or chronic non-specific LBP. This also included patients with LBP due to FBSS, or degenerative changes such as disk degeneration and grade 1 spondylolisthesis, provided that there were no neurological symptoms. One study exclusively included patients with sub-acute LBP [34] and five studies included both patients with sub-acute and CLBP [102–104, 110, 111].
The majority of studies (91%, n = 64) on invasive treatments (with or without conservative treatment as a control) included patients that fit their criteria for either degenerative disc disease (DDD), discogenic pain, internal disc disruption or a combination thereof. Other studies selected patients with Modic type 1 or 2 changes [48], patients with CLBP and radiating pain to the lower limb(s) [52], FBSS [55], either FBSS or mechanical LBP [86], or LBP originating from the endplate [77].
Only two studies investigating conservative treatment options sought to include patients with discogenic pain [108, 112]. One study specifically excluded patients with disk degeneration [100]. Commonly, patients with CLBP (58%, n = 11), sub-acute LBP [34], or both sub-acute and CLBP (29%, n = 5) were eligible for inclusion. Added criteria were: still working [111], permanent employment [110], or sick-leave due to LBP [34, 107]. One study reported results separately for patients with CLBP with or without Modic changes [113].

Outcome measurements

For the selected studies, disability (92%, n = 82) and pain (86%, n = 77) were the most commonly measured outcome domains, followed by work (25%, n = 22), and quality of life (15%, n = 13) (SDC 2). Only four studies (4%) measured health care use [85, 99, 101, 114]. Five out of seven most frequently used outcome measures were patient reported outcome measures (PROMs) of pain and disability (Fig. 3). The Oswestry Disability Index (ODI) and Visual Analogue Scale (VAS) back pain were used in the majority of studies. Less frequently used outcome measures were the SF-36 subscale ‘Bodily Pain’ (6%, n = 5) for measuring pain, the SF-36 subscales ‘Physical Functioning’ (4%, n = 4) and ‘Role Physical’ (3%, n = 3), the General Functioning Score (3%, n = 3) for disability, and ‘work status’ (3%, n = 3) for measuring work participation. A remaining 40 outcome measures, most for measuring pain, were each used by less than three studies.

Follow-up

Follow-up ranged between 26 months and 18 years with a median of 51 months (SDC 2). Forty-three studies (48%) reported an (average) duration of follow-up between 24 and 48 months, 22 (25%) studies between 49 months and six years, and 24 (27%) studies over six years. Only ten studies (11%) took more than one measurement at > 2 years after baseline. Follow-up was available for > 80% of patients in 39 studies, between 60 and 80% in 12 studies, and < 60% in six studies. The percentage was unclear in six studies. The remaining 26 studies were retrospective studies that included patients based on complete availability of follow-up. Furthermore, a total of 36 studies (all 28 retrospective studies, seven prospective studies, and one RCT [98]) reported only baseline results of those patients that completed a minimum length of follow-up.

Responder analyses

Twenty-six out of 89 studies (29%) reported the results of a responder analysis; 23 studies on invasive treatments, two studies on conservative treatments and one study that compared invasive with conservative treatments (SDC 2). An improvement in disability, measured with the ODI, was most commonly used to determine clinical success (85%, n = 22), followed by an improvement in back pain or leg pain (38%, n = 10) measured with VAS or NRS. The cut-off for clinical success varied greatly per instrument; 10 different cut-offs were used for the ODI and seven for the VAS or NRS. One study reported clinical success on pain and disability using an improvement in subscales for pain and functioning of the SF-36 [89]. Other studies analyzed improvement in quality of life (SF-36 Physical Component Scale) [67] or improvement in both pain and disability [44, 48, 60].

Summary of findings at long-term follow-up

Table 2 summarizes the overall findings of the selected studies per treatment type and duration of follow-up. Reported results were not specified for diagnoses or disease characteristics. Per outcome, the number of treatment arms that showed a significant improvement (‘+’), no significant change (‘0’), or a significant decline compared to baseline (‘−’) was reported. Several studies did not report p-values for the change in outcome at follow-up (‘?’). Results on work related outcomes were very rarely reported with a statistical level of significance. However, almost all results without a reported p-value showed some level of improvement between baseline and long-term follow-up. In general, pain, disability, and quality of life were significantly improved after an invasive intervention. Results after conservative treatments varied between significantly improved or unchanged. One study reported that patients had significantly worsened compared to baseline six years after following a rehabilitation program [109]. Since most studies reported significant improvement at follow-up, there was little difference in outcome at the different durations of follow-up.
Table 2
Summary of reported results per treatment type and duration of follow-up (non-specified for diagnosis or disease characteristics)
Each outcome cell gives the number of treatment arms reporting the outcome (n), followed after the colon by the effect counts in the column order +, 0, −, ?, with empty columns omitted.

| | Treatment arms (n) | Pain | Disability | Quality of life | Work | Health care utilization |
|---|---|---|---|---|---|---|
| Treatment type | | | | | | |
| Invasive | | | | | | |
| Lumbar fusion | 41 | 34: 23, 11 | 38: 26, 1, 11 | 8: 4, 1, 3 | 10: 1, 9 | nr |
| Disc arthroplasty | 21 | 19: 15, 4 | 21: 15, 6 | 8: 5, 3 | 2: 2 | nr |
| Intradiscal therapy | 7 | 7: 6, 1 | 7: 6, 1 | 1: 1 | 2: 2 | nr |
| Implantable therapy | 3 | 3: 3 | 3: 3 | 2: 2 | 2: 1, 1 | 1: 1 |
| Other | 5 | 5: 5 | 5: 5 | 1: 1 | nr | nr |
| Conservative | | | | | | |
| Multidisciplinary treatment | 12 | 7: 3, 2, 2 | 10: 5, 2, 3 | 4: 1, 3 | 7: 7 | 1: 1 |
| Physiotherapy or exercise training | 8 | 5: 1, 4 | 5: 1, 4 | nr | 4: 1, 3 | 1: 1 |
| Cognitive treatment | 4 | 4: 1, 3 | 4: 1, 3 | | 1: 1 | 1: 1 |
| Usual (non-surgical) care | 4 | 2: 1, 1 | 2: 1, 1 | nr | 1: 1 | nr |
| Advice and/or education | 4 | 2: 1, 1 | 2: 1, 1 | nr | 1: 1 | nr |
| Other | 5 | 5: 2, 1, 2 | 5: 2, 1, 2 | 1: 1 | nr | nr |
| No intervention | 2 | 1: 1 | 1: 1 | nr | 2: 2 | nr |
| Sham/control for intradiscal therapy | 3 | 3: 1, 2 | 3: 1, 2 | nr | nr | nr |
| Follow-up | | | | | | |
| 24–36 m | 38 | 34: 18, 1, 15 | 36: 22, 1, 13 | 8: 7, 1 | 6: 2, 4 | 2: 1, 1 |
| 37–48 m | 42 | 28: 18, 1, 9 | 32: 20, 3, 9 | 8: 4, 1, 3 | 8: 1, 7 | nr |
| 49–72 m | 38 | 35: 26, 4, 5 | 35: 26, 3, 1, 5 | 12: 8, 1, 3 | 15: 1, 14 | 1: 1 |
| 73–216 m | 33 | 29: 22, 7 | 33: 24, 9 | 8: 5, 3 | 5: 5 | 1: 1 |

n, number of intervention groups; +, significantly improved (p < 0.05); 0, no significant change (p > 0.05); −, significantly worsened (p < 0.05); ?, no p value reported; nr, not reported; m, months
Under ‘Treatment type’, results of the latest follow-up measurement were reported. Under ‘Follow-up’, results for all follow-up measurements were reported
One study reported results separately for ‘improved’ and ‘worsened/same’ subgroups of patients and was therefore not included in this table

Responder analyses

Setting aside the variety in definitions to determine clinical success and irrespective of the type of treatment patients received, we found that response on pain measures at long-term follow-up varied between 20 and 90% (10 studies with 15 treatment arms) and response on disability measures varied between 15 and 91% (22 studies with 32 treatment arms) (SDC 2).
Looking at different treatment types and taking into account the number of patients per treatment arm, clinical success on disability was achieved in 73% of patients that underwent a disc arthroplasty (n = 14 treatment arms), 75% of patients that underwent lumbar fusion (n = 7 treatment arms), 61% of patients that received multidisciplinary treatment or physiotherapy/exercise training (n = 4 treatment arms), and 63% of patients that received intradiscal therapies (n = 3 treatment arms). The only treatment type with > 3 treatment arms reporting response rates on pain measures was intradiscal therapy (n = 5 treatment arms), with 57% of patients achieving clinical success.
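One way to take the number of patients per treatment arm into account is a patient-weighted (pooled) response rate, in which responders and patients are summed across arms before dividing. The sketch below assumes that approach; the review does not spell out the exact calculation, and the arm sizes and responder counts are hypothetical.

```python
def pooled_response_rate(arms):
    """Patient-weighted response rate over (n_patients, n_responders) pairs."""
    total_patients = sum(n for n, _ in arms)
    total_responders = sum(r for _, r in arms)
    return total_responders / total_patients


# Hypothetical treatment arms: (patients with follow-up, responders).
arms = [(100, 80), (20, 10), (50, 30)]

pooled = pooled_response_rate(arms)
# Unweighted mean of the per-arm rates, shown for comparison only.
unweighted = sum(r / n for n, r in arms) / len(arms)
print(f"pooled = {pooled:.1%}, unweighted mean = {unweighted:.1%}")
```

The pooled figure gives large arms more influence, so it can differ noticeably from a simple average of per-arm percentages when arm sizes are unequal.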

Discussion

The general purpose of this study was to identify and map the available evidence from long-term studies on chronic non-specific LBP. Our findings confirm the notion that there is little to no information available from natural cohorts when it comes to reporting on patient-centered outcomes other than pain. The majority (> 75%) of papers that were included examined long-term outcomes after invasive treatments. Surgical interventions, specifically lumbar fusion and disc arthroplasty, were most commonly reported. Among studies examining conservative treatments, physical therapy and multidisciplinary programs were most common. Overall, included studies were predominantly of moderate quality and differed in design, patient samples, and methods of data collection. These differences were most profound between studies on invasive and conservative treatments. In general, most studies reported improvements in pain and disability and, when measured, quality of life at long-term follow-up.
This review identifies several knowledge gaps regarding research into long-term outcomes of non-specific chronic LBP. First, there is still little insight into the natural course of LBP regarding outcomes such as disability, quality of life, work, and health care utilization, because no natural cohorts met the inclusion criteria. In a natural cohort, subjects would be followed in real life, in which numerous situations and interventions may occur; such a cohort is not limited to studying the effect of one or several specified interventions. The studies included in this review examined clinical outcomes of non-specific LBP and concerned patients that were actively seeking health care. Therefore, they might not be representative of people with sub-chronic or chronic LBP in the general population. Secondly, we noticed that repeated measurements during long-term follow-up were scarce. Only ten studies (11%) took more than one measurement after the two-year mark. These studies reported lasting improvements in symptoms after lumbar fusion [31, 32, 40, 41, 59, 72], disc arthroplasty [53, 58, 76, 92], and chiropractic care or primary care by an MD [102]. Nonetheless, recurrence of LBP is very common and studies with less than two years follow-up have also shown that post-treatment trajectories of pain and disability can vary a great deal between patients [122–124]. Third, the present review also affirms the notion that across LBP trials, the primary focus has been on pain and disability as outcome measures [125], even though other (generic) measures of health and well-being, such as quality of life and work (dis)ability, have been recommended in core outcome sets to reflect the multidimensionality of LBP [15, 126–128]. Furthermore, few studies seem to monitor health care utilization during follow-up. These data can be challenging to collect; however, they are an important piece of the puzzle in determining whether outcomes at long-term follow-up might be the result of the original intervention (at baseline) or other interventions that were provided during follow-up. To conclude, in order to really understand both the (natural) course of LBP and results of LBP-related interventions over time, frequent measurements of relevant patient-centered outcomes are needed, as well as the use of complete core outcome sets including quality of life and work disability, and an overview of patients’ health care utilization during follow-up.
Even though the patient-reported outcome measures in this review seem to reflect more positive long-term pain, disability and quality of life status compared to baseline measurements, this should not be misinterpreted as treatment effectiveness. This scoping review was not designed to study long-term effectiveness of interventions. A number of factors might have contributed to the appearance of consistent improvement years after experiencing persistent LBP. First, the reported improvements derive from statistical significance and do not necessarily imply clinical relevance. It is unclear whether patients perceived their improvement on different outcome measures as clinically relevant. Only a select number of studies performed a responder analysis. A previous review on outcome measures also reported that merely 8% of 401 included LBP trials reported a number or proportion of improved patients [125]. Although most of the studies in the present review that included a responder analysis reported high percentages of patients with clinically relevant improvement, cut-off scores for clinical success varied greatly. For instance, in some studies relative improvements of 25–30% on VAS or ODI were deemed successful, while others aimed for 50% [35–37, 95].
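The effect of the chosen cut-off on a responder analysis can be illustrated with a minimal sketch. It assumes that relative improvement is computed as (baseline − follow-up) / baseline on a scale where lower scores are better. The function names and the five ODI scores are hypothetical and serve only as an illustration; they are not data from any included study.

```python
from typing import Sequence


def relative_improvement(baseline: float, follow_up: float) -> float:
    """Fractional improvement from baseline on a scale where lower is better."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - follow_up) / baseline


def responder_rate(
    baselines: Sequence[float],
    follow_ups: Sequence[float],
    threshold: float,
) -> float:
    """Proportion of patients whose relative improvement meets the threshold."""
    if len(baselines) != len(follow_ups) or not baselines:
        raise ValueError("need two non-empty sequences of equal length")
    improvements = [
        relative_improvement(b, f) for b, f in zip(baselines, follow_ups)
    ]
    responders = sum(1 for value in improvements if value >= threshold)
    return responders / len(improvements)


# Hypothetical ODI scores (0-100) for five patients; illustrative only.
baseline_odi = [50.0, 40.0, 60.0, 44.0, 30.0]
follow_up_odi = [30.0, 30.0, 24.0, 33.0, 12.0]

# The same data yield different "success" rates under different cut-offs.
print(f"{responder_rate(baseline_odi, follow_up_odi, 0.30):.0%}")  # 60%
print(f"{responder_rate(baseline_odi, follow_up_odi, 0.50):.0%}")  # 40%
```

In this made-up sample, moving the cut-off from 30% to 50% lowers the proportion of responders from three in five to two in five, which is why responder rates are hard to compare across studies that define clinical success differently.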
Other factors might also have influenced improvement in LBP symptoms. A previous review in patients with non-specific LBP found that response to primary care treatment followed a pattern of rapid early improvement followed by a plateau, regardless of whether active treatment, usual care, or placebo treatment was used [129]. Natural prognosis could be one explanation [10, 11, 130]. However, the long-term natural prognosis is mostly unknown. People are also more likely to seek health care at a time when their pain and symptoms are at their worst or most debilitating, which could further explain a positive overall course. Regression to the mean could also have played a role in the improvements in symptoms that were found after the start of treatment [131]. Overall, these factors likely influenced short-term improvements in LBP complaints, but if maintained, could also explain the reported long-term beneficial outcomes. Finally, publication and reporting bias cannot be ruled out. Only one study reported that patients had significantly worsened at long-term follow-up. Future (systematic) reviews of long-term studies on LBP should consider checking their findings against reported study protocols and/or unpublished trial data.
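Regression to the mean combined with care-seeking at peak symptoms can be made concrete with a small simulation. This is a sketch under stated assumptions, not a model of any included study: each simulated person has a stable usual pain level, every measurement adds independent random fluctuation, nobody receives treatment, and only people whose pain at enrolment reaches a cut-off are "enrolled". All parameter values and the function name are hypothetical, and scores are left unbounded for simplicity.

```python
import random
from statistics import mean
from typing import Tuple


def simulate_regression_to_mean(
    n_people: int = 10_000,
    cutoff: float = 7.0,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return mean pain at enrolment and at follow-up for enrolled people.

    No treatment effect is simulated; any change between the two means
    comes from selecting people at a moment of high pain.
    """
    rng = random.Random(seed)
    enrolment_scores = []
    follow_up_scores = []
    for _ in range(n_people):
        usual_pain = rng.gauss(5.0, 1.5)                 # stable personal level
        at_enrolment = usual_pain + rng.gauss(0.0, 1.5)  # day-to-day fluctuation
        at_follow_up = usual_pain + rng.gauss(0.0, 1.5)  # independent fluctuation
        if at_enrolment >= cutoff:                       # seeks care when pain is high
            enrolment_scores.append(at_enrolment)
            follow_up_scores.append(at_follow_up)
    return mean(enrolment_scores), mean(follow_up_scores)


enrolment_mean, follow_up_mean = simulate_regression_to_mean()
print(f"mean pain at enrolment: {enrolment_mean:.2f}")
print(f"mean pain at follow-up: {follow_up_mean:.2f}")
```

In this sketch the enrolled group shows a lower mean at follow-up than at enrolment although no intervention was modeled, which is the pattern that makes uncontrolled before-after improvements difficult to attribute to treatment.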
Surgical treatments are relatively over-represented in the present review. Safety issues and long-term adverse events are of more concern in surgical trials than in trials of conservative interventions, which may be why long-term data are collected and analyzed more often for invasive interventions. Also, surgical studies more often seem to utilize data that are retrospectively obtained from patient medical records [132, 133]. This makes it easier to collect and report long-term follow-up data. In spine surgery, complication incidence is potentially underestimated with retrospective assessments [134]; however, the present review includes results from PROMs and not the occurrence of adverse events.
Studies on invasive and conservative treatments were notably different in their patient inclusion criteria. Invasive studies sought to include patients with disc-related diagnoses or symptoms, whereas conservative studies defined symptom-related criteria more generally (‘low back pain’). Although diagnoses based on lumbar structures (e.g., discogenic pain, facet joint pain) were very common in some settings, diagnostic tests do not reliably identify these structures as a source of LBP. The usefulness of these tests in clinical practice remains unclear [22, 26, 135] and current guidelines on LBP usually classify these diagnoses as non-specific [136]. Nevertheless, spine surgeons have claimed that these diagnoses should be classified as specific LBP and that better and earlier identification combined with, if indicated, invasive treatment would improve prognosis in these patients [137]. A Dutch task force charged with developing a guideline for invasive treatment of lumbosacral pain syndromes has proposed to classify diagnoses such as facet joint pain, disc pain and FBSS as ‘degenerative uncomplicated spinal LBP syndromes’ [138]. In short, LBP diagnoses, as well as the decision to operate or treat conservatively, vary between countries and between medical disciplines. At present, there is no consensus among health care professionals on the classification of specific versus non-specific LBP. Improved consensus on a classification system could lead to more targeted care, reduce the need for expensive diagnostic methods, and facilitate comparison among LBP studies [17, 139, 140].
In line with worldwide research in the field of back pain, we identified a significant increase in annual publications on long-term outcomes of non-specific LBP [141]. The majority of selected studies were from Western countries, with the USA being the most productive (26% of studies). Little to no research took place in low- or middle-income countries, even though the largest increases in disability due to LBP in the past few decades have occurred there [9, 142]. The impact of LBP in low- to middle-income countries may come with disadvantages that differ from those in high-income countries and might therefore not be represented in the present review [9].
Finally, the methodological quality of studies also seemed to improve over the years. Only prospectively conducted studies (prospective cohorts and RCT/CCTs) received a global ‘strong’ rating with the quality assessment tool that was utilized. Selection bias was often present in retrospectively conducted studies. In these instances, patients were included based on complete availability of follow-up data. Two sensitivity analyses were performed on the scoring method of the quality assessment tool. First, the global quality rating of a study was determined by the number of ‘weak’ ratings that were scored across all separate domains. This means that studies that scored ‘moderate’ on each separate domain would have received a ‘strong’ global rating. A separate analysis showed that changing the global rating from strong to moderate for these studies would have had no effect on the results, since there were no studies that rated moderate on each domain. Second, prospective cohort studies received a ‘moderate’ rating on the domain study design. It could be argued that prospective cohort studies are a strong design for studying long-term outcomes. However, changing these ratings from moderate to strong on this domain would also have had no effect on the global quality rating.
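The scoring rule behind the first sensitivity analysis can be sketched as follows. The thresholds (no ‘weak’ domain ratings gives strong, one gives moderate, two or more give weak) and the six domain names are assumptions taken from the published EPHPP tool rather than from the text above, and the function name is hypothetical.

```python
from typing import Mapping

# Assumed EPHPP component domains that feed into the global rating.
DOMAINS = (
    "selection bias",
    "study design",
    "confounders",
    "blinding",
    "data collection methods",
    "withdrawals and dropouts",
)


def global_rating(domain_ratings: Mapping[str, str]) -> str:
    """Derive a global rating from the number of 'weak' domain ratings.

    Assumed rule: 0 weak -> 'strong', 1 weak -> 'moderate', 2+ weak -> 'weak'.
    """
    allowed = {"strong", "moderate", "weak"}
    for domain, rating in domain_ratings.items():
        if rating not in allowed:
            raise ValueError(f"unknown rating {rating!r} for domain {domain!r}")
    weak_count = sum(1 for rating in domain_ratings.values() if rating == "weak")
    if weak_count == 0:
        return "strong"
    if weak_count == 1:
        return "moderate"
    return "weak"


# A study rated 'moderate' on every domain has no 'weak' ratings,
# so the count-based rule assigns it a 'strong' global rating.
all_moderate = {domain: "moderate" for domain in DOMAINS}
print(global_rating(all_moderate))  # strong
```

Because only ‘weak’ ratings are counted, the rule does not distinguish a study that is strong on every domain from one that is moderate on every domain, which is the property the first sensitivity analysis examined.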

Limitations

As expected, a number of studies on long-term LBP outcomes had to be excluded from this review because they did not meet our inclusion criteria. This occurred most often with studies on samples with non-specific LBP mixed with specific LBP, samples with acute mixed with sub-acute and chronic LBP, and studies that failed to report baseline results of the outcomes measured at long-term follow-up. The latter in particular was common for measures related to health care utilization, since information has to be available, or recalled, from before baseline. Ultimately, only four studies could be included that reported health care use in the period before baseline [85, 99, 101, 114]. Another limitation is that this review gives limited insight into when the improvements that we observed took place. We chose to only report results from long-term follow-up (> 2 years), since the focus of this review was on mapping evidence from long-term follow-up studies. The complete course or trajectory of LBP symptoms could be studied in future reviews with a narrower scope. Finally, the heterogeneity in the assessment and reporting of outcomes rendered it difficult to provide a qualitative synthesis of the results. A wide variety of instruments was used to measure pain, disability, quality of life, and work participation, and a considerable number of studies did not report whether changes in scores between baseline and follow-up were statistically significant.

Conclusion

Patients with persistent non-specific LBP report improvements in pain, disability and quality of life years after seeking treatment. However, it remains unclear what factors might have influenced these improvements, and whether they are treatment-related, in part because there is very little long-term evidence available from natural cohorts. Finally, studies that examined long-term outcomes of LBP symptoms varied greatly in design, quality, patient samples, and methods of data collection, and only a few performed a responder analysis or applied repeated measurements after two years of follow-up.

Declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Supplementary Information

Below is the link to the electronic supplementary material.
References
1. GBD 2017 Disease and Injury Incidence and Prevalence Collaborators (2018) Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: a systematic analysis for the Global Burden of Disease Study. Lancet 392(10159):1789–1858. https://doi.org/10.1016/s0140-6736(18)32279-7
7. Montazeri A, Mousavi SJ (2010) Quality of life and low back pain. In: Preedy VR, Watson RR (eds) Handbook of disease burdens and quality of life measures. Springer, New York, pp 3979–3994
27. Viera AJ, Garrett JM (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37(5):360–363
35.
39. Buric J, Pulidori M (2011) Long-term reduction in pain and disability after surgery with the interspinous device for intervertebral assisted motion (DIAM) spinal stabilization system in patients with low back pain: 4-year follow-up from a longitudinal prospective case series. Eur Spine J 20(8):1204–1211. https://doi.org/10.1007/s00586-011-1697-6
46. Corenman DS, Gillard DM, Dornan GJ et al (2013) Recombinant human bone morphogenetic protein-2-augmented transforaminal lumbar interbody fusion for the treatment of chronic low back pain secondary to the homogeneous diagnosis of discogenic pain syndrome: two-year outcomes. Spine (Phila Pa 1976) 38(20):E1269-77. https://doi.org/10.1097/brs.0b013e31829fc56f
48. Fischgrund JS, Rhyne A, Macadaeg K et al (2020) Long-term outcomes following intraosseous basivertebral nerve ablation for the treatment of chronic low back pain: 5-year treatment arm results from a prospective randomized double-blind sham-controlled multi-center study. Eur Spine J 29(8):1925–1934. https://doi.org/10.1007/s00586-020-06448-x
51. Gepstein R, Werner D, Shabat S et al (2005) Percutaneous posterior lumbar interbody fusion using the B-twin expandable spinal spacer. Minim Invasive Neurosurg 48(6):330–333
55.
58. Katsimihas M, Baily CS, Issa K et al (2010) Prospective clinical and radiographic results of CHARITÉ III artificial total disc arthroplasty at 2- to 7-year follow-up: a Canadian experience. Can J Surg 53(6):408–414
60. Lee MS, Cooper G, Lutz GE et al (2003) Intradiscal electrothermal therapy (IDET) for treatment of chronic lumbar discogenic pain: a minimum 2-year clinical outcome study. Pain Phys 6(4):443–448
81. Pimenta L, Marchi L, Oliveira L et al (2013) A prospective, randomized, controlled trial comparing radiographic and clinical outcomes between stand-alone lateral interbody lumbar fusion with either silicate calcium phosphate or rh-BMP2. J Neurol Surg A Cent Eur Neurosurg 74(6):343–350. https://doi.org/10.1055/s-0032-1333420
97. Zeilstra DJ, Staartjes VE, Schröder ML (2017) Minimally invasive transaxial lumbosacral interbody fusion: a ten year single-centre experience. Int Orthop 41(1):113–119
99.
106. Lankhorst GJ, van de Stadt RJ, van der Korst JK (1985) The natural history of idiopathic low back pain. A three-year follow-up study of spinal motion, pain and functional capacity. Scand J Rehabil Med 17(1):1–4
108. Peng B, Fu X, Pang X et al (2012) Prospective clinical study on natural history of discogenic low back pain at 4 years of follow-up. Pain Phys 15(6):525–532
114.
Metadata
Title
What can we learn from long-term studies on chronic low back pain? A scoping review
Authors
Alisa L. Dutmer
Remko Soer
André P. Wolff
Michiel F. Reneman
Maarten H. Coppes
Henrica R. Schiphorst Preuper
Publication date
19 January 2022
Publisher
Springer Berlin Heidelberg
Published in
European Spine Journal / Issue 4/2022
Print ISSN: 0940-6719
Electronic ISSN: 1432-0932
DOI
https://doi.org/10.1007/s00586-022-07111-3
