Original Article
The SF-36 summary scales were valid, reliable, and equivalent in a Chinese population
Introduction
Health-related quality of life (HRQoL), defined by Bullinger et al. [1] as “the impact of perceived health on an individual's ability to live a fulfilling life,” is becoming an important outcome measure in health services and clinical trials. The MOS 36-item Short-Form Health Survey (SF-36) is a popular HRQoL measure that has been translated and validated for Chinese adults in Hong Kong (HK) [2], [3], [4], [5]. The SF-36 has eight scales measuring eight domains of HRQoL: physical functioning (PF); role–physical (RP), or limitation in daily role functioning due to physical problems; role–emotional (RE), or limitation in daily role functioning due to emotional problems; bodily pain (BP); general health perception (GH); vitality (VT); social functioning (SF); and mental health perception (MH). Each scale consists of 2 to 10 items, and each item is rated on a two- to six-point Likert scale. The scale score is calculated by summation of all the scores of items belonging to the same scale. A profile of eight scale scores, although informative, can be difficult to interpret as an outcome measure in clinical trials [6]. Ware et al. [6], [7], [8] hypothesized that there are two principal factors, namely the physical and the mental components, underlying the eight SF-36 scales. This two-factor structure was demonstrated in the general population in the United States (U.S. standard): the physical health summary (PCS) and mental health summary (MCS) components explained 60% of the total variance of the SF-36 scale scores [6], [7], [8]. The physical component correlated strongly (r ≥ .7) with the physical functioning (PF), role–physical (RP), and bodily pain (BP) scales but weakly (r ≤ .3) with the mental health (MH), role–emotional (RE), and social functioning (SF) scales. The mental component correlated strongly with the MH, RE, and SF scales but weakly with the PF, RP, and BP scales. 
The general health (GH) and vitality (VT) were bipolar scales, loading moderately (.3 < r < .7) on both physical and mental components [6], [7], [8].
The PCS and MCS scales summarize the eight SF-36 scale scores into two summary scores that give an overall assessment of quality of life related to physical and mental health, respectively. The PCS and MCS scores are easier to interpret and simpler to analyze statistically in clinical trials and longitudinal studies [6], [7]. Because different SF-36 scales correlate with each of the two factors differently, they are weighted by the appropriate physical or mental factor coefficients before aggregation to form the two summary scores. Norm-based scoring with z-score transformation, calculated as (observed score – population mean)/population standard deviation, and standardization of the population mean and standard deviation (SD) to 50 and 10, respectively, are recommended for easier interpretation [6]. The SF-36 PCS and MCS scoring algorithm is summarized below:
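The three steps described in the paragraph above (z-score transformation of each scale, aggregation with factor coefficients, and rescaling to a mean of 50 and SD of 10) can be sketched in Python as follows. This is only an illustrative sketch: the population means, SDs, and factor coefficients below are hypothetical placeholders, not the published U.S. or H.K. values, and the names `POP_MEAN`, `POP_SD`, `PHYSICAL_COEF`, `MENTAL_COEF`, and `pcs_mcs` are assumptions introduced for the example.

```python
# Sketch of norm-based PCS/MCS scoring. All numeric constants are
# HYPOTHETICAL placeholders, not the published U.S. or H.K. norms.

SCALES = ("PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH")

# Placeholder population norms (same value for every scale, for clarity).
POP_MEAN = {s: 75.0 for s in SCALES}
POP_SD = {s: 20.0 for s in SCALES}

# Placeholder factor score coefficients for the two components.
PHYSICAL_COEF = {"PF": 0.4, "RP": 0.35, "BP": 0.3, "GH": 0.25,
                 "VT": 0.0, "SF": 0.0, "RE": -0.2, "MH": -0.2}
MENTAL_COEF = {"PF": -0.2, "RP": -0.1, "BP": -0.1, "GH": 0.0,
               "VT": 0.25, "SF": 0.25, "RE": 0.45, "MH": 0.5}


def z_scores(scores):
    """Step 1: (observed score - population mean) / population SD."""
    return {s: (scores[s] - POP_MEAN[s]) / POP_SD[s] for s in SCALES}


def summary_score(scores, coef):
    """Steps 2 and 3: weight the z-scores, sum them, rescale to 50/10."""
    z = z_scores(scores)
    aggregate = sum(coef[s] * z[s] for s in SCALES)
    return 50.0 + 10.0 * aggregate


def pcs_mcs(scores):
    """Return (PCS, MCS) for one respondent's eight scale scores."""
    return (summary_score(scores, PHYSICAL_COEF),
            summary_score(scores, MENTAL_COEF))


# A respondent at the population mean on every scale scores 50 on both.
average = {s: 75.0 for s in SCALES}
print(pcs_mcs(average))
```

Swapping in a different set of means, SDs, and coefficients is all that distinguishes a country-specific algorithm from the standard one in this sketch; the three steps themselves stay the same.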
The standard scoring algorithm for the SF-36 PCS and MCS scales uses the population means, SDs, and factor coefficients derived from the U.S. general population [6]. A multinational study showed similar factor structures and equivalent population mean PCS and MCS scores between the United States and nine European countries [8], [9]. Ware et al. [8] recommended that the U.S. standard SF-36 PCS and MCS scales and scoring algorithm be used in these countries instead of country-specific approaches. Data from the Japanese general population and from several Chinese populations, however, showed a two-factor structure and loadings of the SF-36 scales that differed from those found in the U.S. population [10], [11], [12], [13]. These studies found that the role–emotional scale loaded more strongly on the physical component (r = .62–.82) than on the mental component (r = .19–.49), the reverse of the pattern in the U.S. data (physical: r = .17, mental: r = .78). The vitality scale loaded strongly (r = .79–.88) on the mental component but only weakly (r = .21–.37) on the physical component in these populations, instead of the moderate correlations with both components found in the U.S. data (physical: r = .47, mental: r = .64). This raised the concern that the standard PCS and MCS scales may not be applicable to Asian populations, whose cultures may differ from that of the United States more than European cultures do.
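To make the loading categories concrete, the short sketch below applies the thresholds stated in the Introduction (strong: r ≥ .7; weak: r ≤ .3; moderate otherwise) to the U.S. loadings quoted above for the role–emotional and vitality scales. The function name `classify_loading` and the use of the absolute value of r are assumptions made for illustration.

```python
# Classify factor loadings with the thresholds given in the text:
# strong if r >= .7, weak if r <= .3, moderate in between.

def classify_loading(r):
    """Return 'strong', 'moderate', or 'weak' for a loading r.

    Taking abs(r) is an assumption; the text only quotes positive loadings.
    """
    magnitude = abs(r)
    if magnitude >= 0.7:
        return "strong"
    if magnitude <= 0.3:
        return "weak"
    return "moderate"


# U.S. loadings quoted in the text for two scales.
US_LOADINGS = {
    "RE": {"physical": 0.17, "mental": 0.78},
    "VT": {"physical": 0.47, "mental": 0.64},
}

for scale, loadings in US_LOADINGS.items():
    for component, r in loadings.items():
        print(scale, component, classify_loading(r))
# RE physical weak
# RE mental strong
# VT physical moderate
# VT mental moderate
```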
Our objective was to determine whether the SF-36 PCS and MCS scales are valid, reliable, and equivalent for the H.K. Chinese adult population. We also wanted to determine whether an HK-specific scoring algorithm using factor coefficients derived from the H.K. general population would give results equivalent to those of the standard algorithm. Evidence on validity and reliability would support the use of the SF-36 PCS and MCS scales in HK. Equivalence in results between the U.S. and H.K. Chinese populations would imply that the standard SF-36 PCS and MCS scales can be used as a cross-cultural HRQoL measure in international studies and global drug trials [14].
Methods
Data were drawn from a cross-sectional norming study of the Chinese (Hong Kong) SF-36 Health Survey conducted in 1998, which surveyed 2,410 Chinese adults randomly selected from the general population in HK. The detailed sampling and data collection methods have been described elsewhere [3], [5]. The sociodemographic characteristics of the subjects are compared with those of the H.K. general adult population in Table 1.
The data were tested against the following hypotheses.
- 1. Two principal component
The Hong Kong–specific SF-36 PCS and MCS scales
Two principal component factors were extracted from the eight SF-36 scale scores and the eigenvalues were 3.4968 and 1.1118 for the first two components, respectively. The two principal factor structure and factor loadings, after varimax rotation, of the SF-36 scale scores of the H.K. Chinese adult population are given in Table 2. The physical (first) component correlated more strongly with the physical functioning (PF), role–physical (RP), bodily pain (BP), and general health (GH) than with
Construct validity and reliability of the SF-36 physical and mental health summary scales
The hypothesized two principal factor structure of the SF-36 scales was replicated in the general Chinese population in HK, and the factor loadings were similar to those found in the U.S. population [6], [7]. The physical factor loading in the general health (GH) scale was relatively stronger than hypothesized, but similar to that found in the U.S. population. This confirms the construct validity of the internal factor structure of the SF-36 PCS and MCS scales for the H.K. Chinese population.
Conclusions
The hypothesized two-factor structure of the SF-36 scales was replicated from the SF-36 data of the H.K. Chinese general population, and the two factors explained 57.6% of the total variance of the SF-36 scale scores and 63%–88% of the reliable variance of each scale. The SF-36 PCS and MCS scores showed the expected difference between known chronic disease groups, further supporting their construct validity.
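The 57.6% figure can be checked against the two eigenvalues reported earlier (3.4968 and 1.1118). The sketch below assumes the components were extracted from the correlation matrix of the eight scales, so that the total variance equals the number of scales; that assumption, and the name `proportion_explained`, are not stated in the text.

```python
# Proportion of total variance explained by the first two components.
# Assumes extraction from the correlation matrix of 8 standardized scales,
# so the total variance equals the number of scales (an assumption).

EIGENVALUES = (3.4968, 1.1118)  # reported for the first two components
N_SCALES = 8


def proportion_explained(eigenvalues, n_variables):
    """Sum of the retained eigenvalues divided by the total variance."""
    return sum(eigenvalues) / n_variables


share = proportion_explained(EIGENVALUES, N_SCALES)
print(f"{share:.1%}")  # 57.6%
```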
The mean standard PCS and MCS scores of the H.K. Chinese general population differed
Acknowledgments
The general population norming survey of the Chinese (Hong Kong) SF-36 was funded by the Health Services Research Grant, the Government of Hong Kong SAR (HSRC no. 711026). Thanks go to Alex Chan, Willis Ho, Joanna Shing, Ka-Lai Chan, Wai-Hung Yu, June Chan, Chi-Kwan Wong, Wing-Yee Lai, Yick-Lok Chan, and Hing-Wai Tsang for their help in data collection and analysis. Parts of this work have been submitted to the University of Hong Kong toward the award of the Doctor of Medicine degree.
References (27)
- et al. Tests of scaling assumptions and construct validity of the Chinese (HK) version of the SF-36 Health Survey. J Clin Epidemiol (1998)
- et al. The effect of health-related quality of life (HRQOL) on health service utilisation of a Chinese population. Soc Sci Med (2002)
- et al. The factor structure of the SF-36 Health Survey in 10 countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol (1998)
- et al. The equivalence of SF-36 summary health scores estimated using standard and country-specific algorithms in 10 countries: results from the IQOLA Project. International Quality of Life Assessment. J Clin Epidemiol (1998)
- et al. Psychometric and clinical tests of validity of the Japanese SF-36 Health Survey. J Clin Epidemiol (1998)
- et al. Developing and evaluating cross-cultural instruments from minimum requirements to optimal models. Qual Life Res (1993)
- et al. Population based norming of the Chinese (HK) version of the SF-36 Health Survey. Hong Kong Practitioner (1999)
- Reliability and construct validity of the Chinese (Hong Kong) SF-36 for patients in primary care. Hong Kong Practitioner (2003)
- et al. SF-36 physical & mental health summary scales: a manual for users of Version 1 (2001)
- et al. The MOS 36-item Short Form Health Survey (SF-36). II: Psychometric and clinical tests of validity in measuring physical and mental health constructs. Med Care (1993)
- Psychometric evaluation of a Chinese (Taiwanese) version of the SF-36 Health Survey amongst middle-aged women from a rural community. Qual Life Res
- A community-based study of scaling assumptions and construct validity of the English (UK) and Chinese (HK) SF-36 in Singapore. Qual Life Res
- Psychometric and clinical evaluation of a Chinese version of the SF-36 Health Survey among cancer patients in China. Qual Life Newsl