Introduction
There is widespread recognition that assessment of patient outcome following total hip and total knee arthroplasty (THA and TKA, respectively) should employ patient-reported outcome (PRO) measures. These tools allow a more patient-centred view in treatment evaluation [1–3], and advocates suggest that they provide a remarkably sophisticated evaluation of whether a treatment has worked in the (important) sense of whether or not the patient feels better, and how much better [4]. Consequently, a number of disease- and joint-specific PRO assessment instruments have been developed for use with orthopaedic conditions [5–8]. These outcome questionnaires focus mainly on the patient’s function in typical activities of daily living (ADLs), pain intensity or joint stiffness. They are often employed in tandem with more generic health outcome instruments such as the SF-36, which in addition to assessing physical health incorporates questions on psycho-social aspects of general health. Some generic tools such as the SF-12 have separate summary scores for physical and mental health. Tools such as these have been shown to display good divergent validity [9] in that there is very little interaction between physical and mental component questions and thus overall scores. Interestingly, in disease-specific scores that do not have specific mental health components, significant correlation of psychological variables and disease-specific variables has been demonstrated [10–12]. This interaction is somewhat expected, as poor physical outcome and pain after THA/TKA can cause psychological distress and reduce quality of life, or alternatively, poor psychological status can result in worse physical outcome by interfering with the patient’s compliance with treatment [13] and by affecting pain coping strategies [14]. Such causal dependency is probably bidirectional, with the directions difficult to separate. An alternative explanation for the overlap in mental and physical health parameters in these assessment tools is a failure of the patient-reported outcome measure to discriminate the overlapping constructs, and thus poor divergent validity [15, 16]. A lack of divergent validity means that interpretability of such scales is limited since the resulting scores blend different constructs. Poor outcome scores can then reflect poor physical outcome, poor psychological status, or both. It is clearly desirable to use a diagnostic tool that separates physical from psychological variables as well as possible if one wishes to assess physical function in isolation.
Thresholds for correlations as indicators of divergent validity are rarely explicitly stated in the literature. However, some studies suggest that correlations below 0.30 indicate divergent validity [17, 18], whereas correlations above 0.40 are considered as indicating convergent validity [19].
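The thresholds quoted above can be applied mechanically. The following is a minimal sketch rather than part of the study's method: the function name `classify_correlation`, the use of the absolute value of the coefficient, and the "indeterminate" label for values between the two thresholds are all assumptions made for illustration.

```python
def classify_correlation(r, divergent_below=0.30, convergent_above=0.40):
    """Label a correlation coefficient using the thresholds cited in the text.

    Assumption: the sign is ignored, because the scales run in opposite
    directions (a high score is good on some instruments and poor on others).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude < divergent_below:
        return "divergent"
    if magnitude > convergent_above:
        return "convergent"
    # Values from 0.30 to 0.40 are not covered by either cited threshold.
    return "indeterminate"
```

Under these assumptions, a coefficient of 0.79 between an orthopaedic score and a psychological score would be labelled "convergent", which is the opposite of what divergent validity requires.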
The aim of this study was to evaluate the divergent validity of the WOMAC score and the Forgotten Joint Score, and to investigate correlations with psychological variables after joint arthroplasty.
Patients and methods
Sample
All patients who underwent THA or TKA at our institution within the last five years were considered for enrolment in this study and approached for study participation at their follow-up visits in 2008.
Inclusion criteria were: unilateral THA (cemented Stuemer-Weber hip stem, uncemented Fitmore cup, Zimmer) or unilateral TKA (cemented LCS complete, DePuy), primary arthroplasty surgery, no previous THA or TKA surgery.
Sociodemographic and clinical data including sex, age, education, type and location of implant and time since surgery were collected. Patients were sent the questionnaires and an informed consent form via mail. A reminder call was made to those patients who did not send back the questionnaires within eight weeks. If there was no response for another four weeks they were excluded. Reasons for not participating in the study were recorded.
Ethical approval for this study was obtained from the ethics committee of the canton of St Gallen, Switzerland.
Assessment instruments
Forgotten joint score-12
The Forgotten Joint Score-12 (FJS-12) is a recently published PRO measure to assess joint awareness in hips and knees during various activities of daily living [6]. It consists of 12 questions and is scored using a 5-point Likert response format with the raw scores transformed onto a 0–100 point scale. High scores indicate good outcome. The FJS has been shown to have a low ceiling effect and discriminates well between good, very good and excellent outcome after THA and TKA. It has shown high internal consistency (Cronbach’s Alpha 0.95) and discriminates well in known group comparisons [6].
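The transformation of raw FJS-12 responses onto the 0–100 scale can be sketched as follows. The text does not state the formula, so this is an illustration under explicit assumptions: items are coded 0 (never aware of the joint) to 4 (mostly aware), all 12 items are answered, and the rescaling is linear and reversed so that high scores indicate good outcome. The function name `fjs12_score` is hypothetical.

```python
def fjs12_score(responses):
    """Rescale 12 Likert responses to 0-100, where 100 is the best outcome.

    Assumptions (not taken from the text): item coding 0..4, no missing
    items, and a linear reversal of the mean item score.
    """
    if len(responses) != 12:
        raise ValueError("the FJS-12 has 12 items")
    if any(r not in range(5) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    mean_item = sum(responses) / 12
    # The mean item score spans 0..4; multiplying by 25 maps it to 0..100,
    # and subtracting from 100 makes low joint awareness a high score.
    return 100.0 - mean_item * 25.0
```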
Western Ontario and McMaster Universities Osteoarthritis Index
The Western Ontario and McMaster Universities (WOMAC) Osteoarthritis Index is a widely used outcome measure in patients with lower limb osteoarthritis (OA) [5]. It consists of 24 questions covering three dimensions: pain (five questions), stiffness (two questions) and function (17 questions). Scale scores are derived from adding up the item scores. High scores indicate poor outcome. The WOMAC OA index has been extensively tested for validity, reliability, feasibility and responsiveness for measuring changes after different OA interventions [5, 20–22].
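The summation scoring described above can be sketched in a few lines. The subscale sizes (5, 2 and 17 items) come from the text; the ordering of items within the response list (pain first, then stiffness, then function) and the names `WOMAC_LAYOUT` and `womac_scores` are assumptions for illustration.

```python
# Number of items per subscale, as stated in the text. The order of the
# subscales within the response list is an assumption.
WOMAC_LAYOUT = {"pain": 5, "stiffness": 2, "function": 17}


def womac_scores(responses):
    """Sum item scores into the three subscales and a total (high = poor)."""
    expected = sum(WOMAC_LAYOUT.values())
    if len(responses) != expected:
        raise ValueError(f"expected {expected} item responses")
    scores = {}
    start = 0
    for name, n_items in WOMAC_LAYOUT.items():
        scores[name] = sum(responses[start:start + n_items])
        start += n_items
    # The total is the sum of the three subscale scores.
    scores["total"] = sum(scores.values())
    return scores
```

Because the subscales contain different numbers of items, raw sums are not directly comparable across subscales without rescaling.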
Brief symptom inventory
The Brief Symptom Inventory (BSI) [23] is a psychological self-report symptom scale developed as a short-form version of the SCL-90-R [24]. It is widely used in various medical fields to assess current psychological status and distress. The 53 items are grouped in nine symptom scales (somatisation, obsessive-compulsive behaviour, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation, and psychoticism) and three global indices: the Global Severity Index (GSI) as a global distress measure, the Positive Symptom Distress Index (PSDI), and the Positive Symptom Total (PST). Scale scores are derived from mean item scores. High scores indicate high psychological symptom burden.
Catastrophising scale
The catastrophising scale is part of the Coping Strategies Questionnaire developed by Rosenstiel and Keefe [25]. It comprises six items assessing catastrophising as a pain-related coping strategy characterised by a feeling of being overstrained and a pessimistic future perspective. The scale scores are derived from adding up the items. A high score indicates poor coping.
Statistical analysis
Sample characteristics are presented as percentages or as means with standard deviations and ranges. For determining associations between the administered scales (WOMAC score, FJS-12, BSI, Catastrophising scale), Pearson correlation coefficients were calculated. Two multiple linear regression models were used to investigate the impact of sociodemographic and clinical variables and of the psychological scales (BSI and Catastrophising scale), separately for the WOMAC and for the FJS-12 score. In these models, adjusted R-squared (R²) indicates the proportion of variance explained by the independent variables (predictors) in the model. Variables with a significant association (p < 0.05) with the WOMAC or the FJS-12 in univariate analysis were considered for inclusion in the multivariate regression model. In a first block of predictors, the patient characteristics sex, education, and location were included. In a second block of predictors, the psychological scales (BSI scales and the Catastrophising scale) were included using a forward selection procedure.
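The two summary statistics used in this analysis can be written out explicitly. The sketch below uses only the Python standard library and the textbook definitions of the Pearson coefficient and of adjusted R²; the function names are hypothetical, and the software actually used by the authors is not stated in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof
```

In a blockwise analysis of this kind, the variance attributed to the second block is commonly read as the difference between the R² of the model containing both blocks and the R² of the model containing the first block only; whether the authors computed it exactly this way is an assumption.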
Discussion
This study investigated the associations between psychological parameters and physical outcome assessed by two PRO instruments, the WOMAC score and the FJS-12. We found high correlations between disease-specific outcome measures and several of the assessed psychological domains. Multivariate regression showed that catastrophising, psychological distress and somatisation explained almost 60% of the variance in the WOMAC score beyond the known covariates of sex, implant location and education. We found the same predictor set for the FJS-12; however, psychological parameters accounted for only half the proportion of variance that they explained in the WOMAC score.
Our findings indicate a significant lack of divergent validity of the WOMAC score and, to a lesser extent, of the FJS-12. The variance proportions estimated with the help of the regression models suggest a substantial overlap between the orthopaedic and psychological scales. The lack of divergent validity becomes even more evident when the high correlations between the WOMAC subscales themselves (above 0.80) are set against the correlations of the WOMAC total score with the psychological scores (up to 0.79).
This significant overlap with psychological status is not reflected in the WOMAC scales’ names (pain, stiffness, function), which somewhat misleadingly suggest that the scales measure only physical, joint-related characteristics. This is also true for the FJS-12, which refers to joint awareness. However, the term joint awareness seems more closely related to psychological aspects.
We also found that location of joint arthroplasty (hip or knee) explained less than 5% of the variance in both the FJS-12 and the WOMAC score. This is interesting, as it is well accepted that outcome differs between total hip and total knee arthroplasty populations [26, 27]. In contrast, the psychological scales exceeded these proportions by a factor of 10 (for both FJS-12 and WOMAC). Thus, our data indicate a stronger association between psychological factors and joint-related outcomes than that between outcome and the type of joint replaced.
Our findings compare well with other results from the literature. Escobar et al. [15] investigated the association between WOMAC scores and the different subscales of the SF-36. They showed that both psycho-social and physical SF-36 scales correlated with the WOMAC score in a similar way. The WOMAC function subscale demonstrated the same correlation with both SF-36 social and physical function scores. WOMAC stiffness was equally correlated with SF-36 role-physical function score and mental health score. Similarly, Wolfe [16] highlighted that divergent validity of the WOMAC may be compromised by factors such as fatigue, symptom counts, depression, and low back pain.
The strong correlation between physical and psychological scales found here and in other studies [28–30] may partially be explained by causal interdependencies that have been suggested by several longitudinal studies.
Sharma et al. [31] demonstrated that mental health measured with the SF-36 predicted subsequent improvement in physical function in TKA, results in line with Brander et al. [32], who showed that preoperative depression substantially influences Knee Society Rating Scale function scores five years post-operatively. In contrast, Lingard et al. [33] found (in a large prospective observational study) that although psychological distress decreased post-operatively, pre-operative levels of distress were not related to post-operative improvement (change in pain and function).
Lopez-Olivo et al. [12] found a strong correlation between pre-operative psychological status and post-operative physical function at 6 months. Education, coping style and locus of control over health at baseline explained 22% of variance in WOMAC pain at follow-up. A similar predictor-set explained 19% of the WOMAC function scale and 36% of the total score of the Knee Society Rating Scale.
Our study was based on a cross-sectional design, which is reasonable for the investigation of divergent validity. However, it does not allow for causal interpretation of the associations between orthopaedic outcomes and psychological variables. One limitation is the small number of predictors in our model, which left a large proportion of variance unexplained. Further predictors that may be of interest for future research include patient activity level, social support, cognitive function, range of motion and joint stability.
A particular strength of this study is the use of a comprehensive and detailed assessment of psychological status (BSI and the Catastrophising Scale from the Coping Strategies Questionnaire). These scales are more differentiated and comprehensive than other tools such as the SF-36 which has previously been employed to assess psychosocial characteristics of arthroplasty populations.
Conclusion
We found a substantial overlap between physical and psychological patient-reported symptoms in an arthroplasty population, i.e. orthopaedic PRO measures were strongly associated with psychological PRO measures, indicating poor divergent validity. Although this may also reflect existing causal dependencies, it impairs valid measurement of orthopaedic outcome. Divergent validity is an important psychometric characteristic of PRO instruments that is required to guarantee accurate assessment of specific orthopaedic outcomes.
Problematically, the category names of the orthopaedic outcome scales suggest measurement of specific constructs such as pain, stiffness, function or joint awareness but they appear to be strongly associated with patients’ psychological status. Our findings suggest that the names of certain orthopaedic scales do not adequately reflect the constructs assessed with these scales.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
KG, MSK and JMG conceived the study objective. All authors participated in the study design. KG and HB coordinated data collection. JMG and KG performed the statistical analysis, interpreted the results and drafted the manuscript. All authors read and approved the final manuscript.