Introduction
Assessment of pain, range of motion, function, quality of life, and psychosocial status before and after lumbar spine surgery (LSS) is essential to monitor the success of surgery and rehabilitation [1, 2]. Function is mainly evaluated with physical performance tests or patient-reported outcome measures (PROMs) [3]. PROMs are valuable for capturing patients’ subjective opinions [4]. In particular, questionnaires offer a practical and cost-effective way to evaluate patients’ functional status before and after surgery and their perceived difficulty or ease in activities of daily living [5]. Physical performance tests, however, are used as the gold standard for observing individuals’ objective, performance-based function [6, 7].
Various physical performance tests containing daily life tasks (gait, sit to stand, turns, steps, stair ascent and descent, straight leg raising, squat) have been developed with standardized protocols, and their measurement properties have been established in clinical studies [3, 8]. Because the importance of pain and functional improvement before and after LSS is well recognized, individuals’ functional improvements are evaluated objectively with performance tests [9]. One of the most preferred tests in individuals with LSS is the Timed Up and Go (TUG) test. TUG is a practical assessment tool that includes sit-to-stand, gait, and 180-degree turn tasks and requires no expensive equipment [10].
Patients undergoing LSS are rehabilitated to regain independence in activities of daily living in the post-operative period [11, 12]. Holistic exercise programs, including strengthening, endurance, balance, core stabilization, proprioception and aerobic exercises, provide essential recovery during the post-operative period [13, 14]. Studies have demonstrated improvements in sit-to-stand performance and gait speed in individuals with LSS as lower extremity strength and endurance progress [15, 16]. Patients’ somatosensory parameters, including balance and proprioception, also improve during the turning tasks of walking. Therefore, the TUG test is an important indicator of physical status in patients before and after LSS [10, 17].
In 2016, Gautschi and colleagues demonstrated the reliability of TUG in LSS with a high intraclass correlation coefficient (ICC) (0.95–0.97) [10]. Recent studies have also extensively addressed the validity of the TUG by comparing it with pain, function and quality of life outcomes [3, 10, 18–21]. Furthermore, the responsiveness of TUG before and after surgery has been analyzed with short-, medium- and long-term follow-up results [3, 18–20, 22–25]. In addition, studies have reported minimal clinically important difference (MCID), standard error of measurement (SEM), standardized response mean (SRM) and minimal important change (MIC) values within the scope of the measurement error of TUG [3, 18–20, 24, 25].
Measurement properties are essential for determining whether physical performance tests measure accurately in the relevant patient group [26]. In addition, given the different types of surgery (fusion, decompression, instrumentation), intervention methods (minimally invasive, conventional), patient follow-up durations (immediate, acute, mid-term, chronic) and statistical methods (reliability, validity, responsiveness), it is essential to review whether TUG provides consistent results in individuals with LSS [13, 14, 26]. To our knowledge, no other systematic review has examined the measurement properties of the TUG in LSS. The present systematic review and meta-analysis aimed to investigate the measurement properties of TUG (including criterion validity, responsiveness, measurement error and reliability) in patients with LSS.
Discussion
The TUG test is one of the most commonly used physical performance assessment tools before and after LSS [10, 22]. The present systematic review and meta-analysis aimed to investigate the measurement properties of the TUG in patients with LSS. According to the results, TUG was acceptably responsive (moderate to strong) at the mid-term (6 weeks) follow-up. TUG was primarily associated with COMI (moderate), which evaluates pain, function, symptom-specific well-being, quality of life, and disability. TUG was also moderately related to physical function, pain and quality of life, in that order. In clinical practice, the TUG can be used as a reliable, valid and responsive tool to assess the general status of LSS patients, especially in the mid-term.
Lumbar decompression surgery (with or without fusion) is a safe surgical procedure that has been performed for years to reduce pain and loss of function and to improve patients’ independence in daily living [13, 14]. It is crucial to evaluate individuals’ physical performance before these surgeries with tests that follow standardized protocols, so that the patient’s actual clinical condition is assessed objectively and quantitatively [3, 8]. To our knowledge, no other study has examined the measurement properties of TUG, perhaps the most important of the tests used in clinical practice, in individuals before and after LSS.
The mean age of the samples in the included studies ranged between 46 and 66 years [3, 10, 18–25]. The vast majority of the studies included middle-aged individuals, although some enrolled older adults. Because most studies included middle-aged individuals (median 56.25 years), the decline in physical function due to the physiology of aging can be disregarded. Patients were followed during the immediate, acute and chronic periods. The responsiveness of TUG across these follow-up periods provided essential data for clinical practice [18, 20]. In addition, although most studies included more male subjects, approximately 40% of subjects were female, indicating a reasonably homogeneous gender distribution.
The most notable result of the quality analysis was a negative (−) and “fair to good” score in most studies for criterion validity. The main reasons were sample sizes below 100 and correlation coefficients below 0.70 under COSMIN scoring [26, 30]. In the responsiveness analysis, studies received “fair to good”, “(0) no information”, and “(?) indeterminate” scores because of insufficient sample sizes and statistical analyses. In addition, only 1 study provided measurement and statistical data on reliability. Likewise, the studies scored lower in quality on “measurement error” because of missing statistical analyses and small sample sizes. In this context, future studies could address the test–retest or inter-rater reliability of TUG more comprehensively with specific Shrout and Fleiss ICC models [34]. Responsiveness analyses should also report receiver operating characteristic (ROC) curves and the area under the curve (AUC) with longer-term follow-up to characterize the measurement properties of TUG in individuals with LSS more clearly [35]. Within the scope of criterion validity, TUG still needs to be adequately compared with gold-standard performance tests such as the Five Times Sit to Stand Test, Stair Test, 6-Minute Walk Test (6MWT), and 30 s Chair Sit to Stand Test. Correlations between these tests may exceed 0.70, which might raise the quality and evidence level of validity inferences [26, 30].
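As a minimal sketch of what reporting a specific Shrout and Fleiss model could look like, the Python snippet below computes ICC(2,1) (two-way random effects, absolute agreement, single measurement) from a patients-by-raters table of TUG times. The function name `icc_2_1` and the example times are hypothetical illustrations, not data from the included studies, and the choice of ICC(2,1) is an assumption, since the appropriate model depends on the study design.

```python
def icc_2_1(scores):
    """ICC(2,1) of Shrout and Fleiss for a table with one row per patient
    and one column per rater (or per test session)."""
    n = len(scores)        # number of patients
    k = len(scores[0])     # number of raters or sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares: between patients, between raters, residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical TUG times in seconds: 5 patients, each timed by 2 raters
tug_times = [
    [9.8, 10.1],
    [12.4, 12.0],
    [15.3, 15.9],
    [8.7, 8.9],
    [11.2, 11.0],
]
print(round(icc_2_1(tug_times), 3))
```

A consistency-type model such as ICC(3,1) would drop the rater variance term from the denominator, so naming the model alongside the coefficient makes results comparable across studies.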
Validity indicates how accurately a test measures the parameter it is intended to measure [36]. Validity results showed that TUG was primarily related to COMI. Because COMI, owing to its holistic structure, reflects the general condition, including function, pain, symptoms, and quality of life, it can be argued that TUG provides a comprehensive evaluation in cases with LSS [37]. TUG was secondarily associated with ZCQ-PF, ZCQ-SS, ODI and RMDI. This concordance suggests that TUG secondarily indicates the function of the patients, as expected. It should be noted, however, that TUG reflects general condition rather than function alone. Thirdly, the relationship between pain and TUG was noteworthy. Because an increase in pain is known to increase loss of function, the moderate pooled correlation coefficient with low back and leg pain was not surprising [9]. Among the pooled correlation coefficients, TUG was least associated with quality-of-life scores. Because correlation analyses are usually reported for the preoperative period, the correlation of TUG with SF-12 and EQ-5D after surgical and rehabilitation interventions may yield higher validity coefficients. Also, because changes in quality of life become more perceptible in the chronic period after treatment, it would be important to examine criterion validity after long-term follow-up in future studies [13, 14, 38].
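To illustrate how a pooled correlation coefficient of this kind can be obtained, the sketch below combines study-level correlations with Fisher's z transformation and inverse-variance weights. This is only one common approach, assumed here for illustration (a fixed-effect model; a random-effects model is another option), and it is not necessarily the pooling method used in this review. The function name `pool_correlations` and the (r, n) pairs are hypothetical.

```python
import math


def pool_correlations(studies):
    """Fixed-effect pooled correlation from (r, n) pairs via Fisher's z.

    Returns the pooled r and an approximate 95% confidence interval.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for r, n in studies:
        z = math.atanh(r)   # Fisher z transform of the correlation
        w = n - 3           # inverse of the variance of z, which is 1 / (n - 3)
        weighted_sum += w * z
        total_weight += w

    pooled_z = weighted_sum / total_weight
    se = 1 / math.sqrt(total_weight)

    # Back-transform the estimate and its interval to the correlation scale
    pooled_r = math.tanh(pooled_z)
    ci = (math.tanh(pooled_z - 1.96 * se), math.tanh(pooled_z + 1.96 * se))
    return pooled_r, ci


# Hypothetical correlations between TUG and a questionnaire score: (r, sample size)
example_studies = [(0.45, 60), (0.38, 85), (0.52, 40)]
pooled_r, (ci_low, ci_high) = pool_correlations(example_studies)
print(round(pooled_r, 2), round(ci_low, 2), round(ci_high, 2))
```

Because larger studies receive larger weights, the pooled value sits closer to the correlations reported by the larger samples.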
Responsiveness analysis investigated whether the TUG detects clinical improvement after treatment at different follow-up times. While the TUG showed low responsiveness at the 3-day follow-up, it was more responsive to clinical improvement at the 6-week mid-term follow-up. This outcome suggests that postoperative functional gains usually occur in the mid-term, as rehabilitation usually becomes effective after 1 month in LSS. It would be essential to establish the responsiveness of TUG for long-term monitoring of individuals. Indeed, Jakobsson and colleagues and Master and colleagues, whose studies could not be included in the meta-analysis, confirmed that TUG was responsive in individuals after LSS at 6 and 12 months, respectively [3, 20]. Pooling effect-size data from additional studies may yield results with a higher level of evidence.
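One effect-size index often used for responsiveness is the standardized response mean (SRM), the mean change divided by the standard deviation of the change. The sketch below shows the calculation on paired TUG times. The function name and the example values are hypothetical, and the sign convention (baseline minus follow-up, so that a faster follow-up time gives a positive value) is an assumption made for readability.

```python
import statistics


def standardized_response_mean(baseline, follow_up):
    """SRM = mean of the paired changes / standard deviation of the changes."""
    # Positive change means the patient completed the TUG faster at follow-up
    changes = [before - after for before, after in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)


# Hypothetical TUG times in seconds for the same 5 patients
preoperative = [14.2, 12.8, 16.5, 11.9, 13.4]
six_weeks = [11.0, 11.5, 12.9, 10.8, 11.1]
print(round(standardized_response_mean(preoperative, six_weeks), 2))
```

Computing the same index at each follow-up time makes the early and mid-term responsiveness directly comparable.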
Only 1 study demonstrated test–retest and inter-rater reliability. Reliability indicates whether the test can consistently capture the clinical condition of the same individual under identical clinical conditions [26, 39]. The TUG provided highly reliable results in individuals with LSS. In future studies, presenting reliability with Bland–Altman agreement analysis could reveal the reliability of TUG in individuals with LSS more comprehensively. MCID values represent the smallest clinically significant change, expressed in seconds. Among the included studies, MCID was found to be 3.4 s in the study with a mean age of 46 years and 1.3 s in the study with a mean age of 62 years. In another study with a mean age of 49 years, results ranging between 0.9 and 3 s were noteworthy. Smaller improvements appeared to be more clinically significant in older individuals. These data may provide reference values for treatment improvements in clinical practice.
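A Bland–Altman agreement analysis of the kind suggested above can be sketched as follows: the bias is the mean difference between two measurements of the same patients, and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of the differences. The function name and the test and retest times are hypothetical and are not taken from the included studies.

```python
import statistics


def bland_altman(first, second):
    """Return the bias and the lower and upper 95% limits of agreement."""
    differences = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical TUG times in seconds from a test and a retest session
test_session = [10.2, 12.5, 9.8, 14.1, 11.3]
retest_session = [10.0, 12.9, 9.5, 13.8, 11.6]
bias, lower, upper = bland_altman(test_session, retest_session)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Unlike a single ICC value, the limits are expressed in seconds, so they can be set directly against reported MCID values.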
Limitations
Not all databases were searched in the present systematic review. Some databases, such as CINAHL, could not be accessed through public sources. Secondly, the surgical procedures in the studies were not homogeneous. Because the outcomes and rehabilitation responses of individuals differ between “minimally invasive or conventional surgical” methods and between “decompression or fusion” techniques [13, 14], a more homogeneous pooling should be considered in future studies. Finally, the study was not registered in a systematic review registry (International Prospective Register of Systematic Reviews, PROSPERO). Protocol registration of reviews is essential for the integrity of the methodology.
Conclusions
In conclusion, TUG was acceptably responsive (moderate to strong) at the mid-term (6 weeks) follow-up. TUG was primarily associated with COMI (moderate), which evaluates pain, function, symptom-specific well-being, quality of life, and disability. TUG was also moderately related to physical function, pain and quality of life, in that order. In clinical practice, the TUG can be used as a reliable, valid and responsive tool to assess the general status of LSS patients, especially in the mid-term.