Abstract
Objective:
To evaluate the validity and responsiveness of a modified SF-36 within a spinal cord-injured (SCI) population.
Study Design:
SF-36 scores collected at baseline and on completion of a randomized controlled trial in 305 patients with SCI and neuropathic bladder.
Setting:
New South Wales, Australia.
Methods:
Subjects were administered the standard SF-36 plus three additional questions, in which ‘walk’ was replaced with ‘wheel’ for three of the physical function (PF) questions. Discriminant validity was determined by comparing participants with paraplegia and tetraplegia using the effect size (ES). Responsiveness was assessed in the subset of patients who developed a urinary tract infection (UTI) during the trial using the standardized response mean (SRM).
Results:
Compared with the standard SF-36, the SF-36 walk-wheel modification (SF-36ww) increased the mean PF score from 18 to 39 (P<0.001) and the physical composite score from 33 to 37 (P<0.001). Discriminant validity was similar for both versions (PF paraplegia/tetraplegia: ES 1.09 (SF-36) vs 1.08 (SF-36ww), n=305). Among 138 SCI patients who developed a UTI, the SF-36ww almost doubled PF responsiveness for all neurological levels (SRM increased from 0.36 to 0.68), more so in tetraplegic (SRM, 0.11 vs 0.58; n=77) than paraplegic groups (SRM, 0.77 vs 0.86; n=61).
Conclusion:
The SF-36ww is a simple, pragmatic modification of the SF-36 PF items, which addresses some problems of content validity and floor effect for SCI subjects and greatly improves responsiveness, particularly for those with tetraplegia. Because it comprises a simple addition to the standard SF-36, external comparisons are preserved.
Introduction
The SF-36 is a generic health status measurement tool, which has been widely used for research in spinal cord injury (SCI) as well as for many other disease groups. It comprises eight domains: physical functioning (PF), physical role limitation, emotional role limitation, bodily pain, general health, vitality, social functioning and mental health; physical and mental composite scale scores (PCS, MCS) can also be derived from these domains.1 When used in SCI, the SF-36 has been demonstrated to have sufficient discrimination to compare the health status of those with SCI with that of other health populations2 and to be able to detect disease-state change for urinary tract infection (UTI) within the SCI population.3
The SF-36 is not without problems when used in SCI and other severely disabled populations. Particular problems have been demonstrated with the PF domain, including significant floor effects due to the inability of many patients to perform some of the physical tasks described. This limits responsiveness and creates problems when correlating the PF domain with other SF-36 domains.3 Another major difficulty has been content validity and acceptability within the SCI population. PF questions that relate specifically to walking and stair climbing (SF-36 items 3d, 3e and 3g–i) may be considered insulting or irrelevant by some SCI individuals.3, 4, 5, 6, 7 Tate et al.7 and Meyers and Andresen8 have suggested replacing the words ‘walk’ and ‘climb’ with ‘go’ and ‘go up’, respectively, to address this problem. Tate et al.7 state that construct validity remains adequate with these changes, but to date there has been no published validation study of this type of modification in SCI populations compared with the standard SF-36. In this paper, we aim to validate a modified SF-36 against the standard SF-36 within an SCI sample. We determine whether this modification improves internal SCI discriminant validity, responsiveness to disease-state change and physical floor effects while retaining comparability with other population groups. Supplementary PF, PCS and MCS scores (PFww, PCSww and MCSww, respectively) were thus generated.
Methods
The SF-36 scores were collected as part of the spinal-injured neuropathic bladder antisepsis randomized controlled trial (RCT) in patients with SCI and neuropathic bladder. Between November 2000 and August 2002, 543 eligible patients (mostly community dwelling) were invited to participate in the study, of whom 305 (56%) agreed.9 Subjects completed the standard SF-36 plus three additional questions, replacing the word ‘walk’ with ‘wheel’ for PF questions 9–11 (items 3g–i) of the SF-36. The original questions were also asked, allowing coding to either SF-36 walk-wheel (SF-36ww) or the original SF-36 (Box 1).
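The dual coding described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' scoring code: the item keys (for example `3g_ww`) are hypothetical names, responses are assumed to be coded 1 (limited a lot) to 3 (not limited at all), the 0–100 transformation is the usual rescaling of the ten-item PF raw score, and missing-item handling is omitted.

```python
# Hypothetical item keys for the ten PF items (3a-3j) of the SF-36.
PF_ITEMS = ["3a", "3b", "3c", "3d", "3e", "3f", "3g", "3h", "3i", "3j"]

# Hypothetical keys for the three additional 'wheel' items.
WALK_TO_WHEEL = {"3g": "3g_ww", "3h": "3h_ww", "3i": "3i_ww"}


def pf_score(responses, use_wheel=False):
    """Return a 0-100 PF score from item responses coded 1-3.

    With use_wheel=True the 'wheel' responses replace items 3g-i,
    giving the PFww score; otherwise the standard PF score is returned.
    """
    total = 0
    for item in PF_ITEMS:
        key = WALK_TO_WHEEL.get(item, item) if use_wheel else item
        total += responses[key]
    lowest = len(PF_ITEMS)        # every item answered 1
    highest = 3 * len(PF_ITEMS)   # every item answered 3
    return 100.0 * (total - lowest) / (highest - lowest)


# Illustrative respondent (made-up data): limited a lot on every
# standard item, but little or no limitation when wheeling.
example = {item: 1 for item in PF_ITEMS}
example.update({"3g_ww": 3, "3h_ww": 3, "3i_ww": 2})

standard_pf = pf_score(example)               # standard SF-36 coding
walk_wheel_pf = pf_score(example, True)       # SF-36ww coding
```

Because both sets of responses are stored, the same record can be coded either way, which is what preserves comparison with external populations.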
Measurements were made at baseline, on enrolment in the RCT and then either on development of the first UTI or (if no UTI occurred) at 6-month follow-up. The characteristics and inclusion criteria of this SCI sample have been described previously.2, 9 The SF-36ww was collected with the assistance of a research officer. This included physically assisting completion of the questionnaire where necessary.
Content validity was assessed by a retrospective review of datasheets for unprompted comments or indications of problems recorded by the patient or assistant on the data collection form during the course of administering the SF-36. The number of participants who made a comment on either baseline or follow-up questionnaire (for the walking questions only) was recorded.
Discriminant validity was determined by comparing participants with paraplegia and tetraplegia. Effect sizes were assessed using the formula: effect size (ES)=(m1−m2)/s1, where m1=reference group mean, m2=comparison mean, s1=reference group standard deviation.10 Internal consistency was assessed using Cronbach's α.11, 12
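As an illustration of the two statistics named above, the following Python sketch computes the effect size as defined in the text and Cronbach's α in its usual form. It assumes sample (n−1) standard deviations and variances, which the text does not specify, and the function names and data are hypothetical.

```python
from statistics import mean, stdev, variance


def effect_size(reference, comparison):
    """ES = (m1 - m2) / s1, with s1 the reference group's standard deviation.

    Assumption: s1 is the sample (n-1) standard deviation.
    """
    return (mean(reference) - mean(comparison)) / stdev(reference)


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all of equal length,
    with position j in every list belonging to respondent j.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # per-respondent totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up scores, for illustration only.
es = effect_size([50, 60, 70], [30, 40, 50])
alpha = cronbach_alpha([[1, 2, 3], [1, 2, 3]])
```

Which group serves as the reference determines both the sign of the ES and the standard deviation used, so the choice should be reported alongside the value.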
Responsiveness was assessed in the subset of patients who developed a UTI during the trial. UTI onset is a suitable condition for assessing responsiveness in this patient population because it is common in SCI and has clinically relevant consequences that should produce a change in health state. Responsiveness was analysed by calculating change scores and standardized response means13 (SRM=mean change/standard deviation of change) for those patients (n=138) who developed a UTI during the course of the clinical trial. The number of patients necessary to detect the (paired sample) change in health status associated with developing a UTI was calculated using the derived formula:14, 15
N=(Z1−α/2+Z1−β)²/SRM²
where α=0.05 and β=0.20: Z1−α/2=1.96, Z1−β=0.84 and N is the number of patients who subsequently develop a UTI. The total number of patients required for a study is then obtained by dividing N by the proportion expected to get a UTI.
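The SRM and the sample size calculation can be sketched in Python as follows. This assumes the standard paired-sample form N=(Z1−α/2+Z1−β)²/SRM², which is consistent with the symbols defined above, and uses the Z values stated in the text; the function names and data are hypothetical, and the sample (n−1) standard deviation is an assumption.

```python
from statistics import mean, stdev

Z_ALPHA = 1.96  # Z(1 - alpha/2) for alpha = 0.05, as stated in the text
Z_BETA = 0.84   # Z(1 - beta) for beta = 0.20, as stated in the text


def standardized_response_mean(baseline, follow_up):
    """SRM = mean change / standard deviation of change (paired scores).

    Change is taken as follow-up minus baseline, so a fall in score
    gives a negative SRM; the sign does not affect the sample size.
    """
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def n_events_required(srm, z_alpha=Z_ALPHA, z_beta=Z_BETA):
    """Unrounded number of patients with the event needed to detect the change."""
    return (z_alpha + z_beta) ** 2 / srm ** 2


# Made-up paired scores, for illustration only.
srm = standardized_response_mean([50, 60, 70, 80], [40, 55, 60, 75])

# An SRM of 0.68 gives a value just under 17.
n = n_events_required(0.68)
```

The function returns an unrounded value. Published SRMs are rounded to two decimal places, so sample sizes recomputed from them may differ slightly from figures derived from unrounded data.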
Results
All participants had SCI and neuropathic bladder. The mean age was 44 years, with a mean elapsed time since SCI of 14 years. Participants were mostly male (83%), 55% were tetraplegic and 49% had complete spinal injury (by ASIA Impairment Scale definition19). There were no post-randomization losses.
Content validity
Retrospective review of the SF-36 data entry sheets found that 20 of 305 participants (7%) had marked the SF-36 physical activity question (3g–i) as not applicable or problematic. The SF-36ww modification (3g ww–i ww) was problematic in just 9 of 305 participants (3%). Reasons for problems with the SF-36ww were mainly enforced ‘bed rest’ (six subjects). One participant stated that the use of an electric wheelchair led to problems interpreting the SF-36ww questions, whereas the remaining two responses had no reasons recorded.
Table 1 shows a comparison of the baseline responses to the standard SF-36 physical activity questions (3g–i) and the equivalent SF-36ww items (3g ww–i ww). The floor effects of the standard SF-36 walking questions are clearly demonstrated, with 93–96% of this sample being maximally limited at healthy (non-UTI) baseline. In comparison, SF-36ww scores were more evenly distributed among the response categories, with 10–26% of participants being maximally limited, and 54–80% not being limited at all.
SF-36ww summary statistics at baseline
The simple change of ‘walk’ to ‘wheel’ in the physical activity questions (3g–i) increased the overall mean baseline PF score from 18 (s.d.=18.8) to 39 (s.d.=22.4) and the PCS from 33 (s.d.=7.7) to 37 (s.d.=8.3), whereas the MCS was only slightly altered from 56 (s.d.=12.1) to 54 (s.d.=11.9). Using the paired t-test, these differences were all statistically significant (P<0.001).
Ceiling and floor effects
The overall sample (N=305) had a large floor effect of 29% (that is, subjects recorded ‘1-limited a lot’ for every item in Box 1). With the modification (PFww), this fell to 8%. The tetraplegic subgroup (N=167) accounted for most of the floor effect, which was substantially reduced from 49 to 14% with the modification. The paraplegic subgroup (N=138) contributed little to the floor effect in the standard PF domain scores, but this effect was further reduced from 4.3 to 1% using the PFww.
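The floor effect as defined above (the lowest response on every item) can be computed with a short Python sketch. The item keys and the respondent data are hypothetical, and only three items are used to keep the example small.

```python
def floor_percentage(respondents, items, floor_value=1):
    """Percentage of respondents who gave the floor response on every item."""
    at_floor = sum(
        all(record[item] == floor_value for item in items)
        for record in respondents
    )
    return 100.0 * at_floor / len(respondents)


# Made-up records: two of the four respondents are at floor on all items.
sample = [
    {"3g": 1, "3h": 1, "3i": 1},
    {"3g": 1, "3h": 1, "3i": 1},
    {"3g": 1, "3h": 2, "3i": 1},
    {"3g": 3, "3h": 3, "3i": 3},
]
floor = floor_percentage(sample, ["3g", "3h", "3i"])
```

Passing the standard item keys or the walk-wheel keys to the same function gives the floor effect under each coding.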
Discriminant validity and internal consistency
Table 2 shows that the ability of the SF-36ww to discriminate tetraplegia from paraplegia is similar to that of the standard SF-36. Mean differences and effect sizes in the PF domain and the PCS and MCS composite scores between the groups were similar for the SF-36 and the SF-36ww. Cronbach's α was slightly better for the PFww scores (0.85) than for the standard PF domain scores (0.83), demonstrating good internal consistency for the walk-wheel modified PFww domain.
Responsiveness of the SF-36ww to disease-state change
The scores of the 138 patients who went on to develop a UTI were analysed for responsiveness. Table 3 demonstrates that, when compared with the standard SF-36, the SF-36ww modification almost doubled the SRM (our indicator for responsiveness) in the PF domain for all neurological levels (from SRM=0.36 to 0.68) and increased the PCS responsiveness by 24% (from SRM=0.58 to 0.72). A slight decrease in responsiveness in the modified mental composite score (MCSww) was noted. When the sample was stratified into paraplegic and tetraplegic neurological levels, the least responsive domain was the standard PF domain in the tetraplegic group. With the walk-wheel modification, the responsiveness of this group improved more than fivefold (from SRM=0.11 to 0.58, n=77). In contrast, the responsiveness for the paraplegic group increased by only 12% (from SRM=0.77 to 0.86, n=61).
We used the SRMs to calculate the sample sizes required to detect a change in health status (as reflected by changes in the PF domain and the PCS and MCS scores) associated with UTI. These sample sizes assume that all subjects will develop a UTI. The results in Table 4 demonstrate the difficulty in detecting disease-state change using the PF domain in tetraplegic persons and the marked effect of the SF-36ww modification on sample size (N=611 vs 24). Over all neurological groups, the SF-36ww produced a smaller but still marked reduction in sample size (N=60 vs 17).
To determine sample size estimates for a study in which only some of the subjects will develop UTI, it is necessary to divide the numbers in Table 4 by the proportion expected to get UTI. For example, using the above PF and PFww results (all neurological levels) and our 45% UTI rate gives a total standard SF-36 sample size of 133 (60/0.45) compared to 38 (17/0.45) for the SF-36ww modification.
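The adjustment described above is a single division, sketched here in Python with a hypothetical function name. The function returns the unrounded quotient; the figures quoted in the text correspond to rounding to the nearest whole number, whereas rounding up would be the more conservative choice.

```python
def total_sample_size(n_events, event_proportion):
    """Total enrolment needed so that about n_events participants have the event.

    n_events: number of participants with the event (for example, UTI) required.
    event_proportion: expected proportion of enrolled participants who
    develop the event, between 0 (exclusive) and 1 (inclusive).
    """
    if not 0 < event_proportion <= 1:
        raise ValueError("event_proportion must be in the interval (0, 1]")
    return n_events / event_proportion


# PF and PFww figures for all neurological levels, with the 45% UTI rate.
standard_total = round(total_sample_size(60, 0.45))
walk_wheel_total = round(total_sample_size(17, 0.45))
```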
Discussion
The SF-36 is widely used as a health status measure across many disease groups including SCI.16 This is despite criticism of the content validity and floor effects of the physical domain of the SF-36.7, 17 Our SF-36ww modification differs from that of Meyers and Andresen,8 Andresen and Meyers17 and Tate et al.7 in that it alters only the walking items and leaves the items about climbing stairs unmodified. Our justification for this decision is that the climbing tasks are more likely to be affected by the environment, whereas the locomotion items are more likely to depend on transportable devices; that is, if a wheelchair is the main mode of locomotion, the subject is likely to travel with it. This reduces problematic situations where scores on health status scales may alter simply by being away from a suitable environment, such as when people travel on holidays. The SF-36ww modification is simple, contains only one task type and is quick to perform. We acknowledge that, while pragmatic, our solution is not as complete philosophically as that suggested by Andresen and Meyers and Tate et al. Further studies to review any additional effect of modifying the stair-climbing variables (3d and 3e) should be performed to see if this also improves responsiveness.
We found that asking participants both the standard SF-36 and the modified SF-36ww items in sequence (in essence asking the problematic physical questions twice, with modified wording to maintain broader SF-36 compatibility) was less annoying to participants than asking questions about walking in isolation. Participants using the SF-36ww now have a relevant additional response to all of the questions about walking. While these three additional SF-36ww items appeared to enhance the acceptability of the questionnaire, a weakness of this study is that our retrospective analysis of content validity is likely to have underestimated the actual number of participants who experienced problems during its administration. Respondents had to feel strongly enough about a question to complain as this involved recording a comment on a datasheet, with or without assistance. As a result, additional studies are necessary to clarify content validity issues related to the modification. However, overall such minor modifications are likely to be as acceptable and feasible in application as existing standard versions of the SF-36. The additional three questions did not appreciably increase the completion time of the questionnaire.
On the basis of retrospective review of recorded comments, participants completing the SF-36ww questionnaire should have the following additional information made clear in a preface:
1) That the wheelchair questions are to be completed for the main mode of wheelchair used by the participant (for example, if the patient uses both an electric and a manual wheelchair, they should score the chair they use the most at the time of assessment); and
2) That patients in situations such as complete bed rest should score based on their current restrictions.
Overall, the SF-36ww modification is quick to implement and is attractive in that it goes a good way toward addressing the content validity problems of the standard SF-36 in the SCI population. Including both the standard and the modified versions of the three walking items retains comparability with disease groups external to SCI by the ability to code to standard SF-36 PF, PCS and MCS values as required. This also allows for powerful interpretations of the underlying SF-36 to be maintained such as utility estimates through SF-6D transformation.6, 18
Reassuringly, the discriminant validity between tetraplegic and paraplegic subgroups for the SF-36ww was almost identical to that of published validation assessments using the standard SF-36.3 The slightly better self-reported mental health in the tetraplegic vs paraplegic group has been reported previously and reflects the negative weighting given to the PF domain in calculation of the MCS.1 These problems were not rectified by the SF-36ww modification, so there is no advantage or disadvantage in the area of internal discriminant validity. Likewise, the internal consistency of the PFww domain, as demonstrated by Cronbach's α, remained similar to that of the PF domain in the standard SF-36.
In addition to improved content validity, the major benefit of the SF-36ww modification over its predecessor when used in SCI populations is the impact on the responsiveness of the health status measure to incident disease states. The standard PF domain scores were poorly responsive and most heavily influenced by the floor effect in the tetraplegic group.3 The standard SF-36 PF domain scores should not be expected to detect change in disease states over time, particularly where a significant proportion of a sample are tetraplegic patients who predominantly utilize a wheelchair for locomotion. The SF-36ww (walk-wheel) modification significantly improved the floor effect in the tetraplegic group, thereby enabling it to be a useful tool to detect within-group clinical change over time.
Accordingly, the SF-36ww health status measure is likely to be useful in studies and clinical management of medical conditions associated with profound physical disability where a significant proportion of the sample are likely to utilize a wheelchair for some or all of their locomotion and where disease-state change is of interest. Further validation studies will be required in populations without SCI, such as those with latter stage neuromuscular disorders.
We have provided a guide to the sample size calculations required in clinical trials and practice where health status change over time is of key concern. The SF-36ww modification demonstrates a clear advantage in study power, particularly for the tetraplegic subgroup. To determine actual sample size estimates from our figures it is necessary to divide the figures in Table 4 by the proportion expected to get a UTI (or other condition). Given the differences demonstrated between paraplegic and tetraplegic populations, if the proportion of each is likely to differ from our sample (55% tetraplegic), it would be necessary to find the sample size for paraplegics and tetraplegics separately and calculate the actual sample size required after estimating the proportion of each likely to be enrolled.
Conclusion
The SF-36ww is a simple modification, which substantially addresses the known problems of acceptability, content validity and floor effects of the standard SF-36 physical domains within populations with SCI while retaining discriminant validity and internal consistency. We demonstrated improved responsiveness for disease-state change within a sample with SCI that will enhance the power of future studies to assess the effect of disease progression, treatment and prevention. The application of the SF-36ww to other populations with profound physical disabilities warrants investigation.
References
Ware JE, Snow KK, Kosinski M, Gandek B . SF-36 Health Survey—Manual and Interpretation Guide. The Health Institute, New England Medical Centre Boston: Boston, MA, 1993.
Haran MJ, Lee BB, King MT, Marial O, Stockler MR . Health status rated with the medical outcomes study 36-item short-form health survey after spinal cord injury. Arch Phys Med Rehabil 2005; 86: 2290–2295.
Haran MJ, King MT, Stockler MR, Marial O, Lee BB . Validity of the SF-36 Health Survey as an outcome measure for trials in people with spinal injury. CHERE Working Paper 2007/4, CHERE, Sydney 2007. Available at http://www.chere.uts.edu.au/pdf/wp2007_4.pdf.
Andresen EM, Fouts BS, Romeis JC, Brownson CA . Performance of health-related quality-of-life instruments in a spinal cord injured population. Arch Phys Med Rehabil 1999; 80: 877–884.
Dijkers M . Quality of life of individuals with spinal cord injury: a review of conceptualisation, measurement, and research findings. J Rehabil Res Dev 2005; 42: 87–110.
Lee BB, King MT, Simpson JM, Haran MJ, Stockler MR, Marial O et al. Validity, Responsiveness and Minimal Important Difference for the SF-6D Health Rating Scale in a Spinal Cord Injured Population. Value in Health (accepted 1 August 2007, published online 11 January 2008, doi:10.1111/j.1524-4733.2007.00311.x.).
Tate DG, Kalpakjian CZ, Forchheimer MB . Quality of life issues in individuals with spinal cord injury. Arch Phys Med Rehabil 2002; 83: S18–S25.
Meyers AR, Andresen EM . Enabling our instruments: accommodation, universal design, and access to participation in research. Arch Phys Med Rehabil 2000; 81: S5–S9.
Lee BB, Haran MJ, Hunt LM, Simpson JM, Marial O, Rutkowski SB et al. Spinal-injured neuropathic bladder antisepsis (SINBA) trial. Spinal Cord 2007; 45: 542–550.
Kazis LE, Anderson JJ, Meenan RF . Effect sizes for interpreting changes in health status. Med Care 1989; 27: S178–S189.
Bland JM, Altman DG . Statistics notes: Cronbach's alpha. BMJ 1997; 314: 572.
Cronbach LJ . Coefficient alpha and the internal structure of tests. Psychometrika 1951; 16: 297–334.
Husted JA, Cook RJ, Farewell VT, Gladman DD . Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol 2000; 53: 459–468.
Wittes J . Sample size calculations for randomised controlled trials. Epidemiol Rev 2002; 24: 39–53.
Walters SJ . Sample size and power estimation for studies with health related quality of life outcomes: a comparison of four methods using the SF-36. Health Qual Life Outcomes 2004; 2: 26. Available at http://www.hqlo.com/content/2/1/26.
Wood-Dauphinee S, Exner G, Bostanci B, Exner G, Glass C, Jochheim KA et al. Quality of life in patients with spinal cord injury—basic issues, assessment, and recommendations. Restor Neurol Neurosci 2002; 20: 135–149.
Andresen EM, Meyers AR . Health-related quality of life outcomes measures. Arch Phys Med Rehabil 2000; 81: S30–S45.
Brazier J, Roberts J, Deverill M . The estimation of a preference-based measure of health from the sf-36. J Health Econ 2002; 21: 271–292.
Maynard Jr FM, Bracken MB, Creasey G, Ditunno Jr JF, Donovan WH, Ducker TB et al. International standards for neurological and functional classification of spinal cord injury. American spinal injury association. Spinal Cord 1997; 35: 266–274.
Acknowledgements
This study was sponsored by the New South Wales Motor Accidents Authority (NSW MAA).
Additional information
SF-36 Health Survey (c) 1988, 2002 by Medical Outcomes Trust and QualityMetric Incorporated—All rights reserved. SF-36 is a registered trademark of the Medical Outcomes Trust.
Cite this article
Lee, B., Simpson, J., King, M. et al. The SF-36 walk-wheel: a simple modification of the SF-36 physical domain improves its responsiveness for measuring health status change in spinal cord injury. Spinal Cord 47, 50–55 (2009). https://doi.org/10.1038/sc.2008.65
DOI: https://doi.org/10.1038/sc.2008.65