Original articles
Methods for assessing responsiveness: a critical review and recommendations

https://doi.org/10.1016/S0895-4356(99)00206-1

Abstract

A review of the literature suggests there are two major aspects of responsiveness. We define the first as “internal responsiveness,” which characterizes the ability of a measure to change over a prespecified time frame, and the second as “external responsiveness,” which reflects the extent to which change in a measure relates to corresponding change in a reference measure of clinical or health status. The properties and interpretation of commonly used internal and external responsiveness statistics are examined. External responsiveness statistics are considered particularly attractive from the point of view of interpretation. The usefulness of regression models for assessing external responsiveness is also highlighted.

Introduction

It is widely argued that outcome measures in clinical trials should be reliable, valid, and responsive 1, 2, 3. A reliable measure is one that tends to produce the same results when administered on two or more occasions under identical conditions 4, 5, 6. Reliability is typically assessed in test–retest studies by analyses based on the kappa statistic or the intraclass correlation. A valid measure is one that measures what it was intended to measure 4, 5, 6 and is assessed by estimation of sensitivity and specificity, ROC curve analyses, correlation analyses, or regression models.

There does not, however, appear to be a consensus in the literature on what constitutes a responsive measure nor, correspondingly, how responsiveness should be quantified. A review of the literature suggests there are two major aspects of responsiveness, each having its own definition and strategies for assessment. We define the first as “internal responsiveness,” which characterizes the ability of a measure to change over a particular prespecified time frame. One widely used method of assessing internal responsiveness is to evaluate the change in a measure within the context of a randomized clinical trial involving a treatment that has previously been shown to be efficacious 7, 8, 9, 10, 11, 12, 13. Any observed change in the measure is typically attributed to clinically relevant changes in health. Alternatively, change in a measure has been assessed using a single group repeated measures design, where patients are assessed before and after a known efficacious treatment (e.g., total hip arthroplasty, back surgery, physiotherapy). This strategy has frequently been employed to compare change in various health status measures 14, 15, 16, 17, 18, 19. The internal responsiveness of a measure, evaluated by either of these methods, will depend upon both the particular treatment and the particular outcomes used to determine treatment efficacy.

We use the term “external responsiveness” to define the second aspect of responsiveness. External responsiveness reflects the extent to which changes in a measure over a specified time frame relate to corresponding changes in a reference measure of health status. In this context, in contrast to internal responsiveness, the measure is not in and of itself of primary interest. Rather, it is the relationship between change in the measure and change in the external standard. One motivation for this is that if the relationship is strong (i.e., the measure is shown to adequately capture changes in the standard), the measure may be used instead of the reference measure as an outcome in future clinical trials. Another motivation is more general and is not based on the assumption that the measure under study should be a replacement for a standard measure. Rather, change in the standard is viewed as an accepted indication of a change in the condition of a patient. By accepted, we mean change that would be widely regarded by clinicians as meaningful and important change in clinical status. If the standard changes then it follows that some change in the measure under investigation would also be expected. Note that, unlike internal responsiveness, the external responsiveness of a measure will depend only on the choice of the external standard and not on the treatments under investigation. This implies that external responsiveness is a property of a measure and therefore it has meaning in a wider range of settings than the more context specific concept of internal responsiveness.
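As a minimal sketch of how external responsiveness might be examined with a regression model, the Python below regresses change in the measure on change in the reference measure. The function names, the choice of a simple least-squares line, and the toy numbers are assumptions made for illustration; they are not taken from the article, which may use a different model.

```python
from statistics import mean


def change_scores(first, second):
    """Return per-patient change (second assessment minus first)."""
    if len(first) != len(second):
        raise ValueError("both assessments must cover the same patients")
    return [b - a for a, b in zip(first, second)]


def regress_change(dx, dy):
    """Least-squares regression of dx (change in the measure) on dy
    (change in the reference measure).

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    if len(dx) != len(dy) or len(dx) < 2:
        raise ValueError("need paired change scores for at least two patients")
    mx, my = mean(dx), mean(dy)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dx, dy))
    sxx = sum((x - mx) ** 2 for x in dx)
    syy = sum((y - my) ** 2 for y in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("change scores must vary across patients")
    slope = sxy / syy
    intercept = mx - slope * my
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


if __name__ == "__main__":
    # Hypothetical change scores for four patients (illustration only).
    dy = [0, 1, 2, 3]   # change in the reference measure
    dx = [1, 3, 5, 7]   # change in the measure under study
    print(regress_change(dx, dy))
```

Under these assumptions, the slope describes how much the measure changes per unit change in the external standard, and r describes how closely the two sets of change scores track one another.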

The lack of consensus on what “responsive” actually means, and how one should assess it, has led to a proliferation of responsiveness statistics, with investigators often reporting several within one study. This makes comparisons of measures across and within studies difficult or impossible 2, 3, 18, 19, 20, 21, 22. Beaton et al. [20] write: “there is no gold ‘standard’ for summarizing responsiveness, although some consensus is needed … the literature demonstrates inconsistency in the methods used for calculating responsiveness statistics, and readers must be cautioned to examine the formulae amid adaptations made to the different statistics.” Thus, the most appropriate responsiveness statistic remains a matter of debate and, indeed, if there are different aspects of responsiveness that are of interest, more than one statistic may be reported. However, there seem to be several statistics in use that purport to reflect the same thing. This has motivated our current investigation.

The purpose of this article is to examine the property of responsiveness from a foundational standpoint. Many of the issues that we discuss have been explicitly or implicitly raised by others 2, 3, 17, 20, 21. Particularly relevant references in this regard are 7, 18. Our intentions in renewing discussion are to: 1) highlight the distinction between internal and external responsiveness; 2) clarify both the properties and interpretation of frequently used responsiveness statistics; 3) recommend the use of regression models to assess external responsiveness; and 4) provide directions for future research. Our illustrative example is drawn from the rheumatological literature, although the general principles we highlight apply to all disciplines in which responsiveness is important.

Section snippets

Notation

Here we define some notation that we will use subsequently to present the various responsiveness statistics. We assume research participants are assessed at two timepoints and let X1 and X2 denote their responses on the measure at the first and second assessments respectively. We let Dx = X2 − X1 represent the change in the response on the measure over time, with positive (negative) values for Dx representing increase (decrease) in the response over time. We let the expected mean change between

Internal responsiveness

The most frequently used responsiveness statistics fall into this group.
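The snippet above does not list the statistics in question. As a hedged illustration, the sketch below computes two statistics commonly reported for change within a single group: an effect size (mean change divided by the standard deviation of baseline scores) and the standardized response mean (mean change divided by the standard deviation of the change scores). The function names and the toy data are assumptions for illustration only.

```python
from statistics import mean, stdev


def _changes(x1, x2):
    """Per-patient change D = X2 - X1."""
    if len(x1) != len(x2) or len(x1) < 2:
        raise ValueError("need paired scores for at least two patients")
    return [b - a for a, b in zip(x1, x2)]


def effect_size(x1, x2):
    """Mean change divided by the sample SD of the baseline scores."""
    return mean(_changes(x1, x2)) / stdev(x1)


def standardized_response_mean(x1, x2):
    """Mean change divided by the sample SD of the change scores."""
    d = _changes(x1, x2)
    return mean(d) / stdev(d)


if __name__ == "__main__":
    # Hypothetical scores for five patients at two assessments.
    baseline = [1, 2, 3, 4, 5]
    follow_up = [2, 4, 4, 6, 6]
    print(effect_size(baseline, follow_up))
    print(standardized_response_mean(baseline, follow_up))
```

The two statistics share a numerator and differ only in the denominator, so they can rank the same set of measures differently when baseline variability and variability of change diverge.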

Receiver operating characteristic method

Deyo and Centor [17] were among the first to propose the assessment of responsiveness using receiver operating characteristic curves (ROCs) in rheumatology. In this context responsiveness is described in terms of sensitivity (probability of the measure correctly classifying patients who demonstrate change on an external criterion of clinical change) and specificity (probability of the measure correctly classifying patients who do not demonstrate change on the external criterion) 17, 18, 21. In
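The sensitivity and specificity described above can be sketched in Python as follows. This is an illustration under stated assumptions: larger change scores are taken to indicate change, a patient is classified as changed when the change score meets a chosen cutoff, and the area under the ROC curve is computed as the proportion of (changed, unchanged) patient pairs ranked correctly, counting ties as one half. The function names and data are hypothetical.

```python
def sensitivity_specificity(changes, changed, cutoff):
    """Classify a patient as changed when the change score >= cutoff and
    compare with the external criterion (changed: list of bool)."""
    tp = sum(1 for d, c in zip(changes, changed) if c and d >= cutoff)
    fn = sum(1 for d, c in zip(changes, changed) if c and d < cutoff)
    tn = sum(1 for d, c in zip(changes, changed) if not c and d < cutoff)
    fp = sum(1 for d, c in zip(changes, changed) if not c and d >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both changed and unchanged patients")
    return tp / (tp + fn), tn / (tn + fp)


def roc_area(changes, changed):
    """Area under the ROC curve via pairwise comparison of change scores."""
    pos = [d for d, c in zip(changes, changed) if c]
    neg = [d for d, c in zip(changes, changed) if not c]
    if not pos or not neg:
        raise ValueError("need both changed and unchanged patients")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical change scores and external criterion for six patients.
    changes = [5, 3, 1, 2, 0, 4]
    changed = [True, True, True, False, False, False]
    print(sensitivity_specificity(changes, changed, cutoff=2))
    print(roc_area(changes, changed))
```

Sweeping the cutoff over the observed change scores and plotting sensitivity against one minus specificity traces out the ROC curve; the pairwise area summarizes that curve in a single number.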

Responsiveness in psoriatic arthritis

The data originated from the University of Toronto psoriatic arthritis out-patient clinic [37]. Between 1994 and 1996, 70 patients (27 women and 43 men) completed three health status measures—the HAQ [38], AIMS2 [39], and SF-36 [40]—on two occasions, approximately 12–18 months apart [41].

Here we compute responsiveness statistics for the physical functioning dimension of the HAQ, the AIMS2, and the SF-36 in this sample of 70 patients. For the external responsiveness statistics, a health

Discussion

Further discussion of responsiveness is warranted. We have attempted to provide a structured framework within which such discussion can take place. Here we offer some preliminary thoughts based on our review of the literature.

The distinction between internal and external responsiveness is important. Stucki et al. [18] make a distinction in their work by referring to internal responsiveness simply as responsiveness and external responsiveness as discriminative ability. Kirshner and Guyatt [42]

Acknowledgements

Supported by the Medical Research Council of Canada. The authors would like to thank Dr. Gordon Guyatt and an anonymous referee for helpful comments.

References (43)

  • J.M. Last, A Dictionary of Epidemiology (1983)
  • D.L. Streiner et al., Health Measurement Scales. A Practical Guide to their Development and Use (1989)
  • R.H. Fletcher et al., Clinical Epidemiology—The Essentials (1988)
  • J.J. Anderson et al., Which traditional measures should be used in rheumatoid arthritis clinical trials? Arthritis Rheum (1989)
  • J.J. Anderson et al., Sensitivity to change of rheumatoid arthritis clinical trial outcome measures. J Rheumatol (1993)
  • L.E. Kazis et al., Effect sizes for interpreting changes in health status. Med Care (1989)
  • R. Buchbinder et al., Which outcome measures should be used in rheumatoid arthritis clinical trials? Arthritis Rheum (1995)
  • R.F. Meenan et al., Outcome assessment in clinical trials. Evidence for the sensitivity of a health status measure. Arthritis Rheum (1984)
  • C. Bombardier et al., A comparison of health-related quality-of-life measures for rheumatoid arthritis research. Controlled Clin Trials (1991)
  • M.H. Liang et al., Comparative measurement efficiency and sensitivity of five health status instruments for arthritis research. Arthritis Rheum (1985)
  • J.N. Katz et al., Comparative measurement sensitivity of short and longer health status instruments. Med Care (1992)