Reliability of Clinician-Based (GRBAS and CAPE-V) and Patient-Based (V-RQOL and IPVI) Documentation of Voice Disorders
Introduction
Perceptual assessment is the foundation of voice assessment and fundamental to studies of treatment outcomes for surgical and behavioral approaches to management of voice disorders.1, 2, 3, 4, 5, 6, 7, 8, 9, 10 Approaches to documenting perceived voice qualities have evolved from descriptive approaches to more concise coding systems.11, 12 Many are designed for well-trained, experienced voice professionals while others are intended for untrained patient use.
A system for voice professionals based on a multidimensional analysis of voice qualities evolved from the work of several researchers.13, 14 Known as GRBAS (Grade, Roughness, Breathiness, Asthenia, Strain), this ordinal system was popularized after being described by Hirano.14 Concerns regarding the reliability of such systems resulted in considerable discussion. Bassich and Ludlow15 reported that four inexperienced raters required 8 hours of training before reaching 80% agreement in their ratings of normal and pathological voices. De Bodt et al16 reported that test-retest reliability of the Grade (G) parameter ranged from fair to good while that of the other parameters ranged from moderate to fair, based on kappa statistics. They also found that experience had an important impact on ratings. Gerratt et al17 suggested that explicit anchors are needed to maximize reliability of perceptual assessment of voice quality.
Kreiman et al12 further suggested that scaling systems that rely primarily on ordinal or equal-appearing interval scales may have limited reliability potential. They suggested that a visual analog scaling procedure could address several limitations of other approaches. This perspective was incorporated into a new scaling tool produced by a group of clinical speech-language pathologists and voice scientists specializing in perceptual assessment of voice at the Consensus Conference for Perceptual Measure of Voice Quality sponsored by the American Speech-Language-Hearing Association Special Interest Division #3 for Voice and Voice Disorders, June 10–11, 2002. The tool was called CAPE-V (Consensus Auditory Perceptual Evaluation of Voice) and used a type of visual analog scaling supplemented by various other descriptors. Instructions for using CAPE-V and a rating form are available online through the American Speech-Language-Hearing Association's Division 3 for Voice and Voice Disorders at http://www.asha.org/about/membership-certification/divs/div_3.htm.
Some controversy clearly remains. The concerns expressed by Kreiman et al12 motivated Wuyts et al18 to compare the original GRBAS scale with a visual analog version of the GRBAS scale they designed. They asked 29 raters to evaluate the pathologic voices of 14 individuals. The authors reported that, contrary to the findings of Kreiman et al,12 the original 4-point GRBAS scale yielded higher interrater agreement than did the visual analog version.
Another approach to documenting voice disorders arose from concerns of healthcare professionals and insurance providers that characterization of the presence and severity of disease processes requires the input of the individuals affected by the disease. In the area of voice, Smith et al19 found that patients' ratings of impairment due to voice disorders were similar in range and severity to those that were due to more severe medical diseases. Several “patient-perception” techniques have been described in the literature. Jacobson et al20 described a 30-item version of an original 80-item questionnaire they called the “Voice Handicap Index.” Hogikyan and Sethuraman21 reported that their 10-item “Voice-Related Quality of Life” or V-RQOL instrument was a valid index of quality of life impairments due to voice disorders.
Although these approaches have been used in clinics and described in the literature, little is known about how they compare and relate to each other and how reliable they are. The purpose of this research was to examine the reliability of the two clinician-based rating systems when they were used simultaneously. The scales were used simultaneously to test the possibility that the structure of the two rating systems (4-point GRBAS vs 100-point CAPE-V) might affect reliability, as has been suggested in the literature. If the structure of the two scales has little impact, we would expect reliability for both scales to be very similar. However, rating variability of 1 point on a 4-point scale represents a difference of 25%, while rating variability of 1 point on a 100-point scale represents only a 1% difference. For this reason alone, it is possible that reliability of the two scales could be substantially different, even when used simultaneously.
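The scale-resolution arithmetic in this paragraph can be made concrete with a minimal sketch. It follows the text's own convention of dividing one step by the number of scale points; the function name `step_as_percent` is a hypothetical helper introduced only for illustration and is not part of either rating tool.

```python
def step_as_percent(num_points: int) -> float:
    """Size of a 1-point rating change, as a percentage of the scale.

    Assumption: the percentage is taken relative to the number of
    scale points, which is the convention used in the text above.
    """
    if num_points <= 0:
        raise ValueError("a scale needs at least one point")
    return 100.0 / num_points


# GRBAS is treated as a 4-point scale, CAPE-V as a 100-point scale.
grbas_step = step_as_percent(4)     # one step on the 4-point scale
cape_v_step = step_as_percent(100)  # one step on the 100-point scale

print(f"GRBAS:  1 point = {grbas_step:.0f}% of the scale")
print(f"CAPE-V: 1 point = {cape_v_step:.0f}% of the scale")
```

Under this convention, a single-step disagreement between two ratings is 25 times larger on the 4-point scale than on the 100-point scale, which is why the two tools could plausibly differ in measured reliability even when applied to the same samples.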
An additional purpose was to compare and contrast two clinician-based and two patient-based approaches to documentation of voice quality and the effects of voice disorders on patients' quality of life. If clinicians' perceptions of voice quality resemble patients' perceptions, it may be assumed that ratings of voice quality are not influenced by the personal experience of producing the voice. However, if the experience of production colors the patient's percept, we may expect the clinician's ratings of quality to differ from those of the patient. Also, if the two patient-based rating scales, which attempt to capture how voice quality affects the patient's life, yield similar results, we may choose the one that is shorter, simpler, and less time consuming.
Methods
This research was reviewed and approved by the Institutional Review Board of the University of Iowa. Four tools for documenting perceptual judgments of dysphonia were used in the clinical assessment of voice disorders. Two of these (GRBAS and CAPE-V) were clinician-based approaches to characterizing perceptual aspects of voice disorders. Two others (V-RQOL and IPVI) were patient-based approaches to characterizing the patient's perception of the presence, severity, and impact of voice disorders
Intrarater reliability
Spearman's correlation coefficients calculated to estimate reliability of the clinician-based severity of dysphonia comparisons (GRBAS Grade and CAPE-V Severity) are presented in Appendix A. As stated previously, the voice sample set was selected to ensure a balanced representation of samples based on the original examining clinician's “Grade” ratings of dysphonia. Reliability of the ratings of the other clinician-based perceptual parameters (roughness, breathiness, etc.) was less meaningful
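As a rough illustration of the statistic named above, the sketch below computes Spearman's rank correlation between two sets of ratings of the same voice samples. It assumes intrarater reliability is estimated by correlating one clinician's first and second ratings; the rating values are hypothetical and are not data from this study, and the helper names `average_ranks` and `spearman_rho` are introduced only for this example.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 ratings")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical GRBAS Grade ratings (0-3) of eight samples, rated twice
# by the same clinician. Illustrative values only.
first_pass = [0, 1, 1, 2, 3, 3, 2, 0]
second_pass = [0, 1, 2, 2, 3, 2, 2, 0]

rho = spearman_rho(first_pass, second_pass)
print(f"Intrarater Spearman rho = {rho:.2f}")
```

Because the statistic works on ranks, the same function applies unchanged to 100-point CAPE-V Severity ratings. A rank-based coefficient is a natural fit for ordinal ratings such as GRBAS Grade, where the distance between adjacent categories is not assumed to be equal.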
Discussion
As patients with disordered voices have received growing and more careful attention from speech-language pathologists and otolaryngologists, so have the tools that are designed to assist in characterizing the nature of those disorders. The importance of considering both the clinician's and the patient's perceptions has only recently been recognized. When a new tool becomes available, it is the clinician's responsibility to understand its strengths and weaknesses relative to tried and true
Acknowledgments
The authors thank Gail Kempster, PhD, for her insights and suggestions during development of this manuscript.
References (21)
- et al. Performance effects on the voices of 10 choral tenors: acoustic and perceptual findings. J Voice (1996)
- et al. Perceptual evaluation of voice quality and its correlation with acoustic measurements. J Voice (2004)
- Reliability in perceptual analysis of voice quality. J Voice (2005)
- et al. Test-retest study of GRBAS scale: influence of experience and professional background on perceptual rating of voice quality. J Voice (1997)
- et al. Is the reliability of a visual analog scale higher than an ordinal scale? An experiment with the GRBAS scale for the perceptual evaluation of dysphonia. J Voice (1999)
- et al. Validation of an instrument to measure voice-related quality of life (V-RQOL). J Voice (1999)
- Some acoustic and perceptual factors in acute-laryngitic hoarseness. J Speech Hear Dis (1965)
- et al. Differential diagnosis patterns of dysarthria. J Speech Hear Res (1969)
- et al. Some perceptual dimensions and acoustical correlates of pathologic voices. Acta Otolaryngol Suppl (1976)
- et al. Perceptual and acoustic correlates of abnormal voice qualities. Acta Otolaryngol (1980)
Presented at the 34th Annual Meeting of the Voice Foundation of America, June 2005.
Supported by the Department of Otolaryngology-Head and Neck Surgery and the Department of Speech Pathology & Audiology, University of Iowa.