Clinical Study
Interrater and intrarater agreements of magnetic resonance imaging findings in the lumbar spine: significant variability across degenerative conditions
Introduction
Degenerative conditions of the lumbar spine are ubiquitous in modern society [1]. When conservative management fails, magnetic resonance imaging (MRI), a noninvasive and radiation-free modality, is frequently considered for this population. The speed and image quality of MRI have continued to improve, but limitations remain.
The interpretation of MRI studies is subject to variability. This may be because of variations in the nomenclature [2], [3]. As in clinical medicine, there is no single established, validated grading scheme for many radiographic findings. However, there are also variations inherent to the assessment of the resultant images. A study interpreted as “severe” stenosis by one reviewer may be read as “moderate” or perhaps “mild” by another [4]. Because much of the clinical practice of spine surgery is based on the correlation of clinical symptomatology and imaging findings, the importance of these variabilities in MRI interpretation and nomenclature cannot be ignored.
Most studies evaluating the interpretation of lumbar MRI pathologies have focused on specific grading scales. For example, studies have examined the diagnostic characteristics of MRI with regard to conditions such as spinal cord compression in acute traumatic injury [5], disc abnormalities [6], [7], [8], [9], [10], end-plate signal (Modic) changes [11], [12], lumbar spinal stenosis [4], [13], and disc herniation [14], [15]. Several studies have examined a handful of spinal conditions simultaneously [16], [17], [18].
Considering the reported variability in assessing specific lumbar conditions by MRI, it can be expected that this variation would exist between different pathologies in a standardized comparison. Nonetheless, we believe physicians and patients may underappreciate these inherent variabilities in MRI interpretation despite the widespread use of this imaging modality [4], [16], [18]. The purpose of our study was to examine the interrater and intrarater agreements of MRI in the evaluation of 10 degenerative conditions of the lumbar spine, with a panel of orthopedic spine surgeons and musculoskeletal radiologists.
Patient sample
The patient population for this study was drawn from our institution's radiology database of patients who underwent lumbar spine MRI in 2010 by our Department of Musculoskeletal Radiology. Exclusion criteria included prior lumbar instrumentation or fusion. There were no changes in imaging equipment or technique over the study period. The patients were sorted in chronological order based on the imaging study date, and the first 75 patients were included in our study based on a priori power
Results
The study population consisted of 75 patients, with 36 males (48%) and 39 females (52%). The mean age was 50.2 (range, 14–82) years. Each study was evaluated for 52 data points by each of the 4 reviewers, with the first 10 subjects evaluated twice.
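As a quick consistency check on the figures above, the following minimal sketch recomputes the sex percentages and the number of individual ratings implied by the design. It assumes that every reviewer graded every data point on every study, and that the repeat reading of the first 10 subjects covered all 52 data points; the variable names are illustrative only and do not come from the study.

```python
# Figures taken from the Results text; variable names are illustrative.
n_patients = 75
n_males, n_females = 36, 39
n_data_points = 52      # data points evaluated per study
n_reviewers = 4
n_repeat_subjects = 10  # first 10 subjects were evaluated twice

# Sex distribution as whole-number percentages.
pct_male = round(100 * n_males / n_patients)
pct_female = round(100 * n_females / n_patients)

# Assumption: every reviewer graded every data point on every study.
first_pass_ratings = n_patients * n_data_points * n_reviewers

# Assumption: the repeat reading covered all data points for the 10 subjects.
repeat_ratings = n_repeat_subjects * n_data_points * n_reviewers

print(pct_male, pct_female, first_pass_ratings, repeat_ratings)
```

Under these assumptions, the first pass supplies the ratings for interrater comparisons, and the repeat readings supply the paired ratings for intrarater comparisons.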
Overall interrater absolute agreement was 76.9% (95% confidence interval [CI], 72.7–81.0). When stratified by pathology (Fig. 1), interrater absolute agreement ranged from 65.1% to 92.0%. This absolute interrater agreement is the percentage of
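The excerpt above is cut off before the definition of absolute agreement is complete, so the following is only a minimal sketch of one common way to compute such a percentage: the share of reviewer pairs, pooled over all rated items, that assigned the identical grade. This pairwise definition, the function name `pairwise_percent_agreement`, and the toy grades are assumptions made for illustration; they are not the study's method or data.

```python
from itertools import combinations


def pairwise_percent_agreement(ratings):
    """Percentage of rater pairs, pooled over all items, with identical ratings.

    ratings: list of items, each a list of categorical grades (one per rater).
    NOTE: this pairwise definition is an assumption, not the study's own.
    """
    matches = 0
    total = 0
    for item in ratings:
        # Compare every unordered pair of raters on this item.
        for a, b in combinations(item, 2):
            total += 1
            if a == b:
                matches += 1
    if total == 0:
        raise ValueError("need at least one item rated by two or more raters")
    return 100.0 * matches / total


# Hypothetical toy data: 3 findings, each graded by 4 reviewers.
toy_ratings = [
    ["mild", "mild", "mild", "mild"],            # 6 of 6 pairs agree
    ["mild", "moderate", "moderate", "severe"],  # 1 of 6 pairs agree
    ["none", "none", "none", "mild"],            # 3 of 6 pairs agree
]
agreement = pairwise_percent_agreement(toy_ratings)
print(f"{agreement:.1f}%")
```

Stratifying by pathology, as in the text, would amount to calling the same function separately on the items belonging to each condition. Percent agreement of this kind does not correct for agreement expected by chance, which is why reliability studies often report a chance-corrected statistic alongside it.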
Discussion
Observer performance is an important source of inconsistency in imaging-based diagnoses. In the lumbar spine, where the differential for symptoms includes many possible pathologic conditions, there has been a paucity of rigorous studies on the agreement of MRI across multiple pathologies. Our study is an attempt at characterizing the interrater and intrarater agreements of MRI in assessing 10 common conditions of the lumbar spine, using a panel of orthopedic spine surgeons and musculoskeletal
References (24)
Degenerative disc disease and back pain. Magn Reson Imaging Clin N Am (1999)
et al. Intra- and inter-observer reliability of MRI examination of intervertebral disc abnormalities in patients with cervical myelopathy. Eur J Radiol (2008)
et al. Observer variability based on the strength of MR scanners in the assessment of lumbar degenerative disc disease. Eur J Radiol (2004)
et al. Reliability of a modified Modic classification of bone marrow changes in lumbar spine MRI. Joint Bone Spine (2009)
et al. Interobserver reliability in the interpretation of diagnostic lumbar MRI and nuclear imaging. Spine J (2006)
et al. Reader variability in reporting breast imaging according to BI-RADS assessment categories (the Florence experience). Breast (2006)
Lumbar disc disorders and low-back pain: socioeconomic factors and consequences. J Bone Joint Surg Am (2006)
The proper terminology for reporting lumbar intervertebral disk disorders. AJNR Am J Neuroradiol (1997)
et al. Observer variability in assessing lumbar spinal stenosis severity on magnetic resonance imaging and its relation to cross-sectional spinal canal area. Spine (2002)
et al. Interobserver and intraobserver reliability of maximum canal compromise and spinal cord compression for evaluation of acute traumatic cervical spinal cord injury. Spine (2006)
Interobserver and intraobserver variability in interpretation of lumbar disc abnormalities. A comparison of two nomenclatures. Spine
Magnetic resonance classification of lumbar intervertebral disc degeneration. Spine
Cited by (43)
Modeling annotator preference and stochastic annotation error for medical image segmentation
2024, Medical Image Analysis
Standardized Classification of Lumbar Spine Degeneration on Magnetic Resonance Imaging Reduces Intra- and Inter-subspecialty Variability
2022, Current Problems in Diagnostic Radiology
Citation Excerpt: The efficacy of this tool was also not validated with a group of neuroradiologists, and no comparison was made between MSK and NR groups. Fu et al5 also developed a standardized classification of degenerative change, demonstrating improvement in variability, a group of MSK radiologists and orthopedic surgeons. Again, direct comparison with the current study is limited due to different statistical methods.
Underreporting of spinal epidural lipomatosis: A retrospective analysis of lumbosacral MRI examinations from different radiological settings
2022, Diagnostic and Interventional Imaging
Citation Excerpt: In fact, in our series, the reporting rate of SEL as the sole pathologic finding on lumbosacral MRI examinations was 33.3% and dropped to 8% in the whole cohort of patients with SEL, and 5.8% in patients in whom SEL was associated with other pathological findings, albeit this difference was not statistically significant (P = 0.0698). In the literature, there is a well-documented variability among radiologists in the interpretation of imaging examinations of the spine [20–23]. SEL seems to be commonly misdiagnosed, according to our data.
Automatic semantic segmentation and detection of vertebras and intervertebral discs by neural networks
2022, Computer Methods and Programs in Biomedicine Update
FDA device/drug status: Not applicable.
Author disclosures: MCF: Nothing to disclose. RAB: Nothing to disclose. WDL: Nothing to disclose. DJB: Nothing to disclose. AWL: Nothing to disclose. AHH: Consulting: Shire HGT (B), Pfizer (B). JNG: Consulting: Affinergy (D), Alphatec (E), Bioventus, Depuy (C), Harvard Clinical Research Institute (E), Powered Research (A), Stryker (E), Transgenomic, Smith and Nephew (D), Medtronic (B); Grants: Smith and Nephew (Genetic tests done at no charge, but not funds exchanged for a study, Paid directly to institution).
The disclosure key can be found on the Table of Contents and at www.TheSpineJournalOnline.com.
There were no sources of funding or conflicts of interest related to this study.