Background
The value of patient-reported outcome (PRO) measures for improving the care and satisfaction of patients is now well established [
1]. Cosmetic impairment is noteworthy in persons with idiopathic scoliosis (IS). Consequently, perceived body image is an important factor when assessing health-related quality of life (HRQOL) in those individuals [
2]. The perception of body image in IS has been evaluated by various PRO instruments. In chronological order of publication and according to the available published information, the most frequently used are the Quality of Life Profile for Spinal Deformities (QLPSD) [
3], the SRS-22 Patient Questionnaire Self-Image subscale [
4‐
6], the Spinal Appearance Questionnaire (SAQ) [
7‐
9], and the Trunk Appearance Perception Scale (TAPS) [
10].
The QLPSD was developed in Spanish [
3] to assess HRQOL in adolescents with IS. The questionnaire contains 21 items grouped in 5 dimensions: psychosocial functioning, sleep disturbances, back pain, back flexibility, and body image. Body image subscale internal consistency was found to be adequate for clinical research (Cronbach’s alpha = 0.7) as was test-retest reliability (ICC = 0.66). However, there was no significant correlation between the scale score and Cobb angle.
SRS-22 was designed by Asher et al. [
4‐
6] for the outcome assessment of patients with IS. The SRS-22 consists of 22 items belonging to 5 dimensions: Function/Activity, Pain, Self-Image, Mental Health, and Satisfaction with Treatment. The Self-Image subscale showed adequate internal consistency (Cronbach’s alpha = 0.7) and reproducibility (ICC = 0.9), and its correlation with Cobb angle was statistically significant (r = -0.5).
The SAQ is a pictorial scale based on the Walter Reed Visual Assessment Scale (WRVAS) [
11]. The test measures the patients’ perception of their deformity through a scale based on drawings of the body. It has been tested in adolescents with IS. In the first version of the SAQ, designed by Sanders et al. [
7], the WRVAS was refined by adding several drawings and a second scale regarding expectations about body image. This first version consisted of 32 questions in 9 domains. With the use of factor analysis, Carreon et al. [
8] recently found that 14 questions were associated with two factors: 10 were linked to a scale of appearance (SAQ Appearance) and 4 to a scale of expectations (SAQ Expectations). The reported internal consistency for the total score was 0.88 and the test-retest reliability was 0.89. Nevertheless, correlation between the total score and the Cobb angle was only 0.32.
Finally, TAPS was originally designed in Spanish in order to assess patient perception of trunk deformity in individuals with IS [
10]. Cronbach's alpha coefficient was 0.89 and the ICC for the mean sum score to assess test-retest reliability was 0.92, whereas correlation between TAPS mean score and Cobb angle was -0.55.
These four instruments have been evaluated separately under disparate conditions, such as different age groups, treatments, or curve magnitudes, which could explain the differences noted above. The final goal of using these instruments is to evaluate the effect of different treatment modalities on patients’ body image perception, in addition to the radiological (Cobb angle) and HRQOL evaluation (the SRS-22 Patient Questionnaire is the standard instrument used for this purpose).
As clinicians, we want to know which of these instruments is best suited to evaluating patients in daily practice. We are especially interested in the relationship between the instrument scores and curve magnitude, because the Cobb angle is generally recognized as the gold standard measure of disease severity. We also want to determine the relationship between these four instruments and the other HRQOL dimensions, such as pain, mental health, and function.
The aim of this study is to compare the psychometric properties (internal consistency and construct validity) of these four instruments in a single group of patients with IS. In addition, we present the cross-cultural adaptation of the SAQ into Spanish.
Methods
This is a cross-sectional study, approved by the Clinical Research and Ethics Committee of our hospital. The inclusion criteria were patients with IS, 10 to 40 years old, who had not received previous surgical treatment and who agreed to participate in the study. For each patient, posterior-anterior full-length radiographs were obtained one week before participation. An orthopedic surgeon (AM) performed all angle measurements using Surgimap Spine Software (Nemaris Inc, New York, NY). For the analysis, the magnitude of the curve with the largest Cobb angle (MLC) of all the patient’s curves was used. Only patients with an MLC ≥ 25° in the coronal plane were included. This threshold was chosen because it is generally accepted that curves below 25° do not need any treatment [
12].
The sample was stratified according to MLC into two groups: Group < 45° and Group ≥ 45°. This cut-off value of 45° was chosen because at this magnitude, surgical treatment is usually recommended [
12]. We calculated that each group should comprise 40 patients in order to detect a significant between-group difference in the TAPS score, according to previously reported data [
10]. Patients were recruited consecutively until the required number for each group was obtained.
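The inclusion and stratification rules described above can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not part of the study's software; only the 25° and 45° thresholds come from the text.

```python
from typing import Optional, Sequence

INCLUSION_MIN_DEG = 25.0    # patients with an MLC below this were not included
SURGICAL_CUTOFF_DEG = 45.0  # cut-off separating the two MLC groups


def largest_cobb(cobb_angles: Sequence[float]) -> float:
    """Return the magnitude of the largest curve (MLC) among a patient's curves."""
    return max(cobb_angles)


def mlc_group(cobb_angles: Sequence[float]) -> Optional[str]:
    """Return '<45' or '>=45', or None when the patient does not meet MLC >= 25."""
    mlc = largest_cobb(cobb_angles)
    if mlc < INCLUSION_MIN_DEG:
        return None
    return "<45" if mlc < SURGICAL_CUTOFF_DEG else ">=45"
```

A patient with curves of 30° and 20° would fall in the "<45" group, whereas a patient whose largest curve is 24° would not be included.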
All patients completed the SRS-22, QLPSD Body Image Scale, SAQ and TAPS questionnaires on the day of the visit. The questionnaires were paper-based and were completed by the patients themselves before the consultation, without any assistance from the attending physician or the patients’ parents. The researcher who measured the radiographs was unaware of the questionnaire scores.
Outcome instruments
Quality of life profile for spinal deformities body image scale (QLPSD-bi)
The QLPSD-bi evaluates body image in adolescents with IS and includes 4 items. Patients had to rate their agreement or disagreement with each statement on the questionnaire using a five-point Likert scale. The total score of the domain ranges from 4 (best perception) to 20 (worst perception). In this study, we used the original Spanish version of the instrument [
3].
Scoliosis research society-22
The SRS-22 consists of 22 items belonging to 5 dimensions: Function/Activity, Pain, Self-Image, Mental Health, and Satisfaction with Treatment. Each domain has five items, with the exception of Satisfaction with Treatment, which has two items. The two satisfaction items were not included in the final analysis. Each question is answered using a five-point Likert scale ranging from 1 (worst) to 5 (best). Results are presented as the mean of each scale (sum of 5 questions/5) and the mean subtotal score (sum of 20 questions/20); hence, scores range from 1 to 5. In this study, we used the validated Spanish version of the instrument [
13].
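The SRS-22 scoring rule stated above (domain mean = sum of 5 questions/5; subtotal = sum of 20 questions/20, satisfaction items excluded) can be sketched as follows. The domain keys and the function name are illustrative assumptions, not an official scoring tool.

```python
from typing import Dict, List

# The four five-item domains analyzed in this study (Satisfaction excluded).
DOMAINS = ("function", "pain", "self_image", "mental_health")


def srs22_scores(responses: Dict[str, List[int]]) -> Dict[str, float]:
    """Compute each domain mean and the mean subtotal from 1-5 item scores."""
    scores: Dict[str, float] = {}
    grand_sum = 0
    for domain in DOMAINS:
        items = responses[domain]
        if len(items) != 5 or any(not 1 <= x <= 5 for x in items):
            raise ValueError(f"{domain}: expected five item scores between 1 and 5")
        scores[domain] = sum(items) / 5
        grand_sum += sum(items)
    scores["subtotal"] = grand_sum / 20
    return scores
```

Because every score is a mean of items rated 1 to 5, both the domain scores and the subtotal stay within the 1 to 5 range.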
Spinal appearance questionnaire (SAQ)
The SAQ consists of 14 questions in two parts: 10 belong to a scale of appearance (SAQ Appearance), which measures the patient’s perception of the appearance of the spinal deformity, and 4 to a scale of expectations (SAQ Expectations), which measures expectations about self-image.
The SAQ has a total possible score ranging from 14 (best score) to 70 (worst score). The SAQ Appearance domain is based on 10 drawings, each scored from 1 (best score) to 5 (worst score), for a possible range of 10 to 50. The SAQ Expectations domain comprises 4 items on a five-point Likert scale, with a total sum ranging from 4 (lower expectations) to 20 (higher expectations) [
8,
9].
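As an illustration of the score ranges just described, the following sketch sums the two SAQ domains. It assumes the 14-item version with each item scored 1 to 5; the function name is hypothetical.

```python
from typing import Dict, Sequence


def saq_scores(appearance: Sequence[int], expectations: Sequence[int]) -> Dict[str, int]:
    """Sum the 10 Appearance items and the 4 Expectations items (each 1-5)."""
    if len(appearance) != 10 or len(expectations) != 4:
        raise ValueError("expected 10 appearance items and 4 expectations items")
    if any(not 1 <= x <= 5 for x in list(appearance) + list(expectations)):
        raise ValueError("item scores must be between 1 and 5")
    appearance_sum = sum(appearance)      # possible range 10-50
    expectations_sum = sum(expectations)  # possible range 4-20
    return {
        "appearance": appearance_sum,
        "expectations": expectations_sum,
        "total": appearance_sum + expectations_sum,  # possible range 14-70
    }
```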
For the present study, we first performed a cross-cultural adaptation of the SAQ items from the original English into Spanish, following the guidelines of the International Quality of Life Assessment (IQOLA) Project [
14,
15]. Starting with the original English version, two independent translators each produced a translation into Spanish. Two other independent translators then translated the SAQ back into English. The first two of the translators were native English speakers and the last two were native Spanish speakers. An expert committee comprising the translators, one spine surgeon, one specialist in physical medicine, and one psychologist specializing in spine deformities assessed the translations. A final version was developed by consensus of the entire working group (Additional file
1).
Trunk appearance perception scale (TAPS)
The TAPS includes 3 sets of drawings, corresponding to 3 viewpoints of the trunk: looking towards the back, looking towards the head with the patient bending over, and looking towards the front. The last drawing has two sets, one for women and one for men. Each drawing is scored from 1 (greatest deformity) to 5 (least deformity), and a mean total is then obtained, with results ranging from 1 to 5. On this scale, patients have to choose the drawings that are most similar to their perception of their body image. The original Spanish version of the test was used for the current study [
10].
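The TAPS score described above is simply the mean of the three drawing scores. A minimal sketch, with a hypothetical function name:

```python
from typing import Sequence


def taps_score(drawing_scores: Sequence[int]) -> float:
    """Mean of the three TAPS drawings, each 1 (greatest) to 5 (least deformity)."""
    if len(drawing_scores) != 3:
        raise ValueError("TAPS has exactly three sets of drawings")
    if any(not 1 <= x <= 5 for x in drawing_scores):
        raise ValueError("each drawing is scored from 1 to 5")
    return sum(drawing_scores) / 3
```

Note that the direction is the opposite of the SAQ: a higher TAPS score corresponds to a smaller perceived deformity.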
Analysis
SPSS 17.0 software was used for the statistical analyses. All patients were included in the analysis, as no missing data were found upon final review. In the descriptive analysis, the mean and standard deviation (SD) were calculated for all variables. Data were analyzed separately according to the age groups. Mean differences were assessed with a Student t-test. Reliability of the outcome instruments was estimated as internal consistency, using Cronbach's alpha coefficient. We considered values of Cronbach's alpha ranging from 0.7 to 0.95 to be acceptable (Tavakol and Dennick, 2011) [
16]. Reliability was assessed both for the entire sample and for each age group (younger and older than 18 years old).
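For readers unfamiliar with the coefficient, Cronbach's alpha for a scale of k items is k/(k - 1) × (1 - sum of item variances / variance of the total score). The sketch below implements this standard formula with the Python standard library. It is not the SPSS routine used in the study, and the function names are illustrative; the 0.7 to 0.95 acceptability range is the one stated above.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """rows[i][j] is the score of respondent i on item j."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


def is_acceptable(alpha: float) -> bool:
    """Acceptability range used in this study: 0.7 to 0.95."""
    return 0.7 <= alpha <= 0.95
```

The same function can be applied to the whole sample or to each age group separately by passing the corresponding subset of rows.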
We hypothesized that the PRO instrument scores would correlate with the magnitude of the curve. Consequently, the mean PRO instrument scores should differ between the two groups of different curve magnitude. To test this hypothesis, we first calculated Pearson’s correlation coefficient between the MLC and the PRO instrument scores. We then conducted a Student’s t-test to analyze the mean difference between MLC groups. Secondly, we hypothesized that the scales evaluating body image would correlate strongly with one another (i.e., correlation coefficient > 0.6) but would not correlate (i.e., correlation coefficient < 0.3) with other dimensions such as mental health, pain, or function. To test these hypotheses, we determined the inter-correlations by calculating Pearson’s correlation coefficient between the image scales (QLPSD-bi, SRS-22 Self-Image, SAQ, and TAPS) and the correlations between these scales and the mental health, pain, and function SRS-22 scales. In addition, data were analyzed separately for the two age groups. Statistical significance was set at p < 0.05.
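The two statistics named above can be sketched with the standard library alone. This is an illustration of the textbook formulas, not the SPSS procedures used in the study. The helper names and the label "intermediate" are assumptions; only the 0.6 and 0.3 thresholds come from the hypotheses stated above. The sketch returns the t statistic only and does not compute a p value.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two paired samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def classify(r: float) -> str:
    """Apply the study's thresholds to the absolute value of r."""
    size = abs(r)
    if size > 0.6:
        return "strong"
    if size < 0.3:
        return "weak"
    return "intermediate"  # label assumed here for values between 0.3 and 0.6


def student_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Student t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
```

The sign of r depends on the scoring direction of each instrument, so the classification uses the absolute value.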
Discussion
Overall, the four scales have good psychometric properties, including adequate internal consistency, fair correlation with scoliosis magnitude, and significant inter-correlation between the four scales. These instruments also showed a significant correlation with the non-image dimensions of pain, daily function, and mental health. Consequently, our hypotheses regarding the divergent validity of the instruments were not supported by the results. In particular, all of the tests showed satisfactory internal consistency (> 0.7), especially the pictorial scales: SAQ Appearance (α = 0.89) and TAPS (α = 0.87). To analyze the construct validity of the instruments, we assessed convergent and divergent validity. Convergent validity was analyzed in two ways. First, the correlation between the instrument score and the MLC was determined. The highest correlation coefficients were between the MLC and the pictorial scales (TAPS r = 0.62, SAQ Appearance r = 0.61); the textual scales showed significant but moderate correlation with the MLC (SRS-22 Self-Image scale r = -0.41, QLPSD-bi score r = 0.36), whereas the weakest coefficient was obtained for SAQ Expectations (r = 0.24). To confirm this relationship, we also determined the instrument mean score differences between groups of curves above and below 45°. Patients in the group with larger curves (≥ 45°) had worse scores on all instruments except the SAQ Expectations. Our data support the findings of previous research: worse scores with larger curves have been reported for SAQ [
7], TAPS [
10] and SRS-22 [
17].
Secondly, correlations among the four instruments were calculated. All scales were significantly correlated. The highest correlations were found between TAPS and SAQ Appearance (r = -0.8), as well as between QLPSD-bi and SRS-22 Self-Image (r = -0.75). These data indicate that the four scales explore the same dimension. Nevertheless, the pictorial scales correlated more strongly with each other than the textual scales did. This finding suggests either that pictorial and textual scales assess slightly different constructs within the same body image dimension, or that some of the association is due to differences in the scale format (textual versus pictorial).
To test divergent validity, we hypothesized that body image perception instruments would not correlate (i.e., would show only low correlations) with instruments measuring other dimensions, such as pain, daily function, and mental health. We evaluated these dimensions using the SRS-22 subscales. However, the correlations were significant and ranged from r = -0.80 to r = 0.68 in the expected directions (Table
3). They were the highest for the SRS-22 Self-Image subscale, but some correlations over 0.5 were also observed for both the TAPS and the SAQ. These data confirm that perceived body image is a prominent constituent in HRQOL of patients with scoliosis. The results also showed that the body image scales have modest divergent validity, with pictorial scales having a lower correlation with the non-body image dimensions.
Analysis by age groups was also performed. We chose 18 years as the cut-off value because it is usually the age required to include patients in “adult” scoliosis registries. Internal consistency was similar in both groups. However, the mean instrument scores were significantly worse in the older group than in the younger group. Our data are consistent with similar findings previously reported for TAPS [
10] and SRS-22 [
18]. The correlation between the MLC and instrument scores was similar in both age groups for the pictorial scales, but it was remarkably different when using the textual scales. In the younger group, there was a lack of correlation between the textual scales score and MLC. This finding calls into question the validity of the textual body image scales when used with younger patients. Parent et al. [
18] have mentioned similar limitations with using the SRS-22 questionnaire in this age group, in which the ceiling effect is also notable. Nevertheless, a deeper analysis is warranted because we have not considered other covariates that may influence body image perception in younger patients.
In this study, we used the Spanish versions of the various assessment tools. The QLPSD-bi [
3] and TAPS scale [
10] were originally created in Spanish, and a properly validated Spanish version of the SRS-22 is available [
13,
19]. However, when the study was designed, there was no Spanish version of the SAQ. Therefore, we first performed a cross-cultural adaptation of the instrument, using previously recommended methods [
14,
15]. Comparisons of the psychometric properties of the various instruments calculated in our study with those of the original versions are shown in Table
1. The internal consistency values of the two sets of data are very similar [
3,
7,
10,
13].
The SAQ Expectations domain is a novel, unique scale that evaluates patients’ expectations regarding scoliosis surgery. Although its internal consistency is satisfactory, it has very low correlation with MLC. When the Expectations scale is added to the Appearance scale, a paradoxical effect occurs: the correlation of the full scale with MLC is lower than that of the Appearance scale alone. A patient’s expectation is a complex concept that is difficult to define, measure, and analyze. There is no unanimous agreement on the suitability of an instrument to assess patients’ expectations [
20,
21]. The SAQ Expectations scale assesses the desire to improve several cosmetic aspects related to the condition. However, some patients who undergo surgery mention other expectations, such as decreasing pain or maintaining satisfactory physical function, in addition to improving body image [
22]. A significant relationship has not been found between patient expectations and the actual change in symptoms or the overall satisfaction with treatment outcomes [
20]. These considerations make us doubt the advisability of adding an expectations scale to one of the body image perception scales.
The SAQ has some limitations that should be considered. Several different versions are available [
7,
8]. The first one comprised 20 items: eight pictorial items related to deformity and 12 questions on the patient’s expectations regarding treatment. A second version (SAQ v 1.1) [
7] was then created containing 33 items: 11 pictorial items and 22 questions on the expectations regarding treatment. However, factor analysis [
8] demonstrated that only 14 items aggregated in two factors: 10 items in an “appearance” factor and 4 items in an “expectation” factor. The final instrument shows satisfactory internal consistency and test-retest reliability. However, the above-mentioned paper [
8] contains several mistakes, especially with regard to the scoring of the two subscales. These errors were corrected in a subsequent paper [
9]. Nevertheless, it is still unclear whether the 33-item version or the 14-item version is the one recommended by the authors. For our research we decided to use the 14-item version based on its better factorial structure. The internal consistency and divergent validity of SAQ Appearance and TAPS are very similar. As the SAQ Appearance scale is longer and adolescents may have some difficulty with understanding the drawings [
23], we suggest that the TAPS may be more usable in daily practice. It is a very short form, with only three pictorial items, and it is quick and easy to complete. SRS-22 Self-Image and QLPSD-bi have similar properties. Nevertheless, only the Spanish version of the QLPSD has been validated, whereas SRS-22 has been translated into several languages.
In this research, we have only evaluated how age and scoliosis magnitude influence body image perception scales. We have not examined the influence on the body image scales of other factors, such as the type of treatment or surface disfigurement measurements, which have been identified as influencing one’s body image perception [
19,
24].
Finally, when a PRO instrument is used for evaluative purposes, an important property to examine is its responsiveness to the changes associated with a therapeutic intervention. Responsiveness after surgical treatment of scoliosis has been reported separately for SRS-22 [
5,
25,
26], SAQ [
7], TAPS [
27], and QLPSD [
28].
Nonetheless, this analysis was not an objective of the current study. In the future, it would be interesting to compare the responsiveness of the four instruments head-to-head in the same group of patients and across different treatment modalities, before making a clinical recommendation for longitudinal studies.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
AM and JB contributed to the conception and design of the study. AM and ED participated in the acquisition of data. AM, JB, and ED contributed to the literature review and to the analysis and interpretation of data. JB and ED participated in writing the article. FP revised the final version. All authors read and approved the final manuscript.