Inter- and intra-observer reliability of contrast-enhanced magnetic resonance imaging parameters in children with suspected juvenile idiopathic arthritis of the hip
Authors:
Francesca M. Porter-Young, Amaka C. Offiah, Penny Broadley, Isla Lang, Anne-Marie McMahon, Philippa Howsley, Daniel P. Hawley
Previous work at our institution demonstrated discrepancies between radiologists in interpretation of contrast-enhanced magnetic resonance imaging (MRI) in suspected hip arthritis.
Objective
To assess inter- and intra-observer reliability of selected MRI parameters (effusion, marrow oedema and synovial thickness and enhancement) used in the diagnosis of juvenile idiopathic arthritis.
Materials and methods
A retrospective cohort study was conducted of patients with confirmed or suspected juvenile idiopathic arthritis who underwent hip contrast-enhanced MRI between January 2011 and September 2014. Three pediatric musculoskeletal radiologists independently assessed all scans for effusion, marrow oedema, measurement of synovial thickness, synovial enhancement and subjective assessment of synovium. Categorical variables were analysed using the Cohen κ, and continuous measurements using Bland-Altman plots.
Results
Eighty patients were included. Interobserver reliability was moderate for effusion (κ=0.5–0.7), marrow oedema (κ=0.6), subjective synovial assessment (κ=0.4–0.5) and synovial enhancement (κ=0.1–0.5). Intra-observer reliability was highest for marrow oedema (κ=0.6–0.8) and lowest for effusion (κ=0.4–0.7). Intra-observer reliability for synovial enhancement (κ=−0.1–0.8) and subjective synovial assessment (κ=0.4–1.0) ranged from poor to excellent. For synovial thickness, intra- and interobserver Bland-Altman plots were well clustered around the mean, suggesting good agreement.
Conclusion
There were large differences across variables and only moderate agreement between observers. The most reliable parameters were presence of joint effusion and bone marrow oedema and subjective assessment of synovium.
Juvenile idiopathic arthritis is the most common chronic rheumatic condition of childhood, characterised by chronic joint inflammation and associated with progressive joint damage leading to functional disability [1]. Prominent clinical features include joint pain, stiffness and swelling. It is estimated that up to 50% of patients with juvenile idiopathic arthritis have ongoing disease in adulthood [2] and delayed care has been shown to significantly impact function and pain [3, 4]. Longer disease activity increases the risk of joint destruction [5]. Early and aggressive treatment of juvenile idiopathic arthritis with corticosteroids, disease-modifying anti-rheumatic drugs and newer biological therapies has been linked to disease remission and improved long-term outcomes; this understanding of a potential “window of opportunity” to alter the natural course of the disease makes early diagnosis of arthritis of key importance [6, 7].
Contrast-enhanced magnetic resonance imaging (MRI) provides a highly sensitive method of assessment for active arthritis, regarded by many as the gold standard, and is an important aid in early detection [8, 9]. However, with improved treatments for arthritis being used at an earlier stage in disease, obvious radiologic changes associated with arthritis are seen less frequently and radiologists are increasingly required to differentiate normal scans from those showing subtle signs of early inflammation.
The deep anatomical location of the hip joint makes it difficult to reliably ascertain the presence or absence of inflammation from clinical examination or from ultrasound (US) assessment. For these reasons, contrast-enhanced MRI is frequently used as an adjunct to clinical examination of the hip and to guide treatment choices when clinical signs are subtle or equivocal with uncertainty as to the presence of active hip arthritis. Clinician treatment decisions are informed by the radiologist’s reporting of contrast-enhanced MRI scans and it is therefore noteworthy that previous work at our institution demonstrated discrepancies in hip contrast-enhanced MRI scan interpretation between experienced musculoskeletal radiologists [10]. Therefore, work to further define which contrast-enhanced MRI parameters are most reliable, in terms of describing active hip synovitis, is highly desirable and will further enhance the use of contrast-enhanced MRI as an early diagnostic tool in juvenile idiopathic arthritis (both at presentation and during acute flares). This study aims to assess the reliability of various hip contrast-enhanced MRI parameters when assessing for early signs of hip inflammation.
Materials and methods
The study was an institutionally classified service development project and was conducted in collaboration with the University of Sheffield. All included scans were independently reviewed and reported by three experienced pediatric musculoskeletal radiologists (ACO 15 years; PSB 20 years; IML 20 years) masked for patients’ names, clinical findings and diagnoses. We obtained approval from the hospital trust for the service evaluation and registered it with the trust’s quality and standards department.
Patient selection
We conducted a retrospective cohort study of all patients (n=86) who underwent hip contrast-enhanced MRI requested by the rheumatology team at Sheffield Children’s Hospital between January 2011 and September 2014. The inclusion criteria were as follows: (1) suspected diagnosis of active hip arthritis; (2) scan date between January 2011 and September 2014; (3) scan requested by a consultant rheumatologist; (4) contrast-enhanced MRI of the hip joint according to local standard rheumatology hip contrast-enhanced MRI protocol (see below); and (5) full image set available. We included patients with and without confirmed diagnosis of juvenile idiopathic arthritis and recorded the diagnosis (as documented in the patient record) both before and after the scan, cognisant that contrast-enhanced MRI scan results can be an important factor in confirming an inflammatory diagnosis such as juvenile idiopathic arthritis.
Only one scan was included per patient. The earliest scan in the recruitment period was selected to minimise bias toward children requiring close disease surveillance. All patient-identifiable information was removed before analysis. Before the study, a consensus training read between the three observers was conducted using 20 scans not included in the study.
Magnetic resonance imaging variables
A literature review was undertaken to identify the most relevant MRI parameters. EMBASE and MEDLINE were searched. Our primary search focused on articles matched to juvenile idiopathic arthritis, MRI and scoring or assessment and a secondary search looked at all articles involving juvenile idiopathic arthritis, MRI and hip. Based on this review, we selected joint effusion, marrow oedema, synovial enhancement, synovial thickening and subjective synovial assessment. The definition for each of these parameters is summarised in Table 1 and illustrated in Figs. 1–4 [11‐15]. The radiologists interpreted the scans independently, unaware of the clinical diagnoses (Figs. 5 and 6).
Table 1
Assessment criteria for hip synovitis on contrast-enhanced magnetic resonance imaging

| Criterion | Definition | Score |
|---|---|---|
| Effusion | Pocket of fluid anterior to the femoral neck or substantial axial effusion on coronal T2-weighted fat-saturated images (Fig. 1) [11]. | Yes/No |
| Marrow oedema | High signal intensity on T2-weighted images and/or enhancement on T1-weighted fat-saturated contrast-enhanced images (Fig. 1) [12]. | Yes/No |
| Synovial enhancement | Subjective increase in synovial enhancement compared to non-contrast images (Figs. 3 and 4) [13]. | Yes/No |
| Synovial thickness | Using magnified images, measuring the white not the surrounding halo (Fig. 2). Axial T1-weighted fat-saturated contrast-enhanced images: level of round femoral head at 10 o’clock on right and 2 o’clock on left [14, 15]. Coronal T1-weighted fat-saturated contrast-enhanced images: lateral to femoral head at the level of the physis. | Recorded in mm |
| Subjective synovial assessment | Subjective visual assessment of synovial thickness and enhancement combined. | Abnormal/Normal |
Magnetic resonance imaging protocol
The local MRI protocol for suspected hip arthritis is coronal T1-weighted and T2-weighted fat-saturated images followed by T1-weighted fat-saturated images, in the axial and coronal planes pre- and post-contrast medium administration (Table 2). The scanner was a GE Signa HDX 1.5 T (GE Healthcare, Waukesha, WI) and the contrast medium was gadoteric acid (Dotarem 280 mg/ml; Guerbet, Paris, France), dose 0.2 ml/kg body weight to a maximum of 10 ml. Images were viewed on IMPAX (Agfa-Gevaert N.V, Mortsel, Belgium) software under standard reporting conditions.
Table 2
Further details of magnetic resonance imaging scanning sequences
The whole pelvis was scanned in each case using an 8-channel GE body array coil, with the patient supine and the legs in neutral and parallel. This enables detailed assessment of both effusions and synovial inflammation [9]. The hips were positioned for comfort to reduce movement artefact.
Statistics
IBM SPSS version 22.0 for PC was used for statistical comparisons. The significance level was set at P<0.05 for all statistical analyses. Missing data were omitted from tests by pairwise deletion. Reliability tests used the Cohen κ for categorical variables (effusion, marrow oedema, synovial enhancement and subjective synovial assessment) and Bland-Altman plots for the continuous variable (synovial thickness). Kappa values were interpreted according to Fleiss, with understanding of the limitations imposed by subjective interpretation [16].
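As an illustration of the categorical reliability statistic described above, the following minimal Python sketch computes the Cohen κ for two observers scoring the same scans. It is not the SPSS procedure used in the study; the function name `cohen_kappa` and the yes/no ratings are hypothetical and chosen only for demonstration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e the agreement expected by chance.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)

    # Observed agreement: fraction of items given the same category.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each rater's marginal proportions,
    # summed over all categories used by either rater.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = count_a.keys() | count_b.keys()
    p_e = sum(count_a[c] * count_b[c] for c in categories) / n**2

    if p_e == 1.0:
        # Both raters used one identical category throughout, so kappa
        # is undefined (division by zero).
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical effusion scores (not study data) for ten scans.
observer_1 = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "yes", "no"]
observer_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the observers agree on 8 of 10 scans (p_o = 0.80) while chance agreement is 0.52, so κ is approximately 0.58 even though raw agreement is 80%.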
Clinical diagnosis
Data regarding diagnosis after scanning were taken retrospectively from the patient medical record. Diagnoses of juvenile idiopathic arthritis were made according to the International League of Associations for Rheumatology classification [17].
Results
Of the 86 patients identified, the images of 79 fully complied with the MRI protocol and were therefore included in the study; 20 of these were randomly selected for duplication to allow calculation of intra-observer reliability, resulting in 99 contrast-enhanced MRI hip scans in the final set. Of the 79 patients included, 43% (n=34) were female. Mean age at referral for scanning was 13 years (range: 3–17 years). Both age and gender were consistent across diagnostic categories. Diagnoses prior to contrast-enhanced MRI comprised 22 cases of confirmed juvenile idiopathic arthritis (27.8%) and 57 (72.2%) other diagnoses. Of the patients with confirmed juvenile idiopathic arthritis before contrast-enhanced MRI (n=22), 22.7% (n=5) had oligoarticular juvenile idiopathic arthritis, 13.6% (n=3) had extended oligoarticular juvenile idiopathic arthritis, 22.7% (n=5) had polyarticular juvenile idiopathic arthritis, 4.5% (n=1) had enthesitis-related juvenile idiopathic arthritis and 9.1% (n=2) had systemic juvenile idiopathic arthritis. Six cases of arthritis had no specified subtype. Of the other diagnoses, 4 (7%) had inflammatory conditions that were not juvenile idiopathic arthritis, including sacroiliitis related to ulcerative colitis, uveitis and Takayasu arteritis. Retrospective data were not adequate to report clinical disease activity scores at the time of contrast-enhanced MRI. Missing data were minimal (2 to 4 cases per variable) for all domains except synovial thickening; of the 200 measurements per observer (100 axial, 100 coronal), recordings were not possible for 49 (25%), 97 (49%) and 32 (16%) in the axial plane and 40 (20%), 60 (30%) and 33 (17%) in the coronal plane for observers 1, 2 and 3, respectively. Median synovial thickness was 1.7 mm (axial and coronal) for observers 1 and 2, and 1.3 mm and 1.5 mm for axial and coronal measurements, respectively, for observer 3.
Tables 3 and 4 summarise the inter- and intra-observer comparisons. Intra-observer reliability for joint effusion ranged from poor to moderate (κ=0.4–0.7). In comparison, intra-observer reliability for synovial enhancement (κ=−0.1–0.8) and visual assessment of synovium (κ=0.4–1.0) ranged from poor to excellent. Finally, reliability for marrow oedema was moderate for both intra- and interobserver comparisons (κ=0.6). Interobserver reliability was consistently moderate for effusion and marrow oedema but variable in other parameters. Overall, intra-observer reliability was variable across both observers and parameters, whereas interobserver reliability was stable across observers but variable across parameters. The inter- and intra-observer Bland-Altman plots for synovial thickness showed good agreement, with points clustered around the mean difference. Notably, the spread of points for coronal images was more likely to extend beyond the limits of agreement than for axial images for both intra- and interobserver plots (Figs. 7 and 8).
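For readers less familiar with Bland-Altman analysis, the sketch below shows how the mean difference (bias) and the 95% limits of agreement are conventionally derived from paired measurements. It is a minimal illustration rather than the analysis pipeline used in this study: the function name `bland_altman`, the use of `None` to mark an unmeasurable scan and the thickness values are all hypothetical. Pairs with a missing value are dropped, mirroring the pairwise deletion described in the Statistics section.

```python
from statistics import mean, stdev


def bland_altman(measure_a, measure_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Pairs in which either value is None are dropped (pairwise deletion).
    Limits of agreement use the conventional bias +/- 1.96 * SD of the
    paired differences.
    """
    pairs = [
        (a, b)
        for a, b in zip(measure_a, measure_b)
        if a is not None and b is not None
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two complete pairs")

    diffs = [a - b for a, b in pairs]
    bias = mean(diffs)        # mean difference between the two readings
    sd = stdev(diffs)         # sample standard deviation of differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical synovial thickness readings in mm (not study data);
# None marks a scan on which a measurement was not possible.
reader_a = [1.7, 1.5, 2.0, None, 1.3, 1.8]
reader_b = [1.5, 1.6, 1.7, 1.9, 1.4, 1.6]
bias, lower, upper = bland_altman(reader_a, reader_b)
print(f"bias = {bias:.2f} mm, limits of agreement = {lower:.2f} to {upper:.2f} mm")
```

On a Bland-Altman plot each pair is drawn as its difference against its mean; points falling outside the lower and upper limits correspond to the outlying coronal measurements described above.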
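Table 3
Interobserver reliability for categorical variables described using the Cohen κ, comparing patients with juvenile idiopathic arthritis to those with other diagnoses

| Parameter | Patient group | Observer 1 vs. Observer 2 (κ) | n | Observer 1 vs. Observer 3 (κ) | n | Observer 2 vs. Observer 3 (κ) | n |
|---|---|---|---|---|---|---|---|
| Effusion | All patients | 0.57 | 79 | 0.57 | 79 | 0.74 | 79 |
| Effusion | Patients with arthritis | 0.57 | 22 | 0.57 | 22 | 0.80 | 22 |
| Effusion | Patients without arthritis | 0.56 | 57 | 0.56 | 57 | 0.70 | 57 |
| Marrow oedema | All patients | 0.58 | 79 | 0.64 | 79 | 0.64 | 79 |
| Marrow oedema | Patients with arthritis | 0.69 | 22 | 0.62 | 22 | 0.62 | 22 |
| Marrow oedema | Patients without arthritis | 0.46 | 57 | 0.65 | 57 | 0.65 | 57 |
| Synovial enhancement | All patients | 0.05 | 79 | 0.04 | 79 | 0.42 | 79 |
| Synovial enhancement | Patients with arthritis | 0.06 | 22 | 0.06 | 22 | 0.43 | 22 |
| Synovial enhancement | Patients without arthritis | 0.05 | 57 | 0.03 | 57 | 0.37 | 57 |
| Subjective synovial assessment | All patients | 0.31 | 79 | 0.36 | 79 | 0.51 | 79 |
| Subjective synovial assessment | Patients with arthritis | 0.36 | 22 | 0.27 | 22 | 0.50 | 22 |
| Subjective synovial assessment | Patients without arthritis | 0.19 | 57 | 0.35 | 57 | 0.40 | 57 |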
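Table 4
Intra-observer reliability of categorical magnetic resonance imaging parameters using the Cohen κ, comparing patients with juvenile idiopathic arthritis to those with other diagnoses

| Parameter | Patient group | Observer 1 (κ) | n | Observer 2 (κ) | n | Observer 3 (κ) | n |
|---|---|---|---|---|---|---|---|
| Effusion | All patients | 0.40 | 20 | 0.74 | 20 | 0.57 | 20 |
| Effusion | Patients with arthritis | 0.62 | 5 | 1.00 | 5 | 0.55 | 5 |
| Effusion | Patients without arthritis | 0.30 | 15 | 0.60 | 15 | 0.60 | 15 |
| Bone marrow oedema | All patients | 0.83 | 20 | 0.83 | 20 | 0.64 | 20 |
| Bone marrow oedema | Patients with arthritis | 1.00 | 5 | 1.00 | 5 | 1.00 | 5 |
| Bone marrow oedema | Patients without arthritis | 0.63 | 15 | 0.63 | 15 | – | – |
| Synovial enhancement | All patients | −0.07 | 20 | 0.66 | 20 | 0.83 | 20 |
| Synovial enhancement | Patients with arthritis | 0.00 | 5 | 0.55 | 5 | 1.00 | 5 |
| Synovial enhancement | Patients without arthritis | – | – | 0.71 | 15 | 0.76 | 15 |
| Subjective synovial assessment | All patients | 0.44 | 20 | 1.00 | 20 | 0.69 | 20 |
| Subjective synovial assessment | Patients with arthritis | 0.62 | 5 | 1.00 | 5 | 0.55 | 5 |
| Subjective synovial assessment | Patients without arthritis | 0.33 | 15 | 1.00 | 15 | 0.76 | 15 |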
We also carried out a subgroup analysis comparing juvenile idiopathic arthritis and other diagnoses. The two groups showed similar reliability overall, although for some comparisons the juvenile idiopathic arthritis group had higher reliability (Tables 3 and 4). Some intra-observer calculations were not possible due to small subgroup sample sizes.
Discussion
It is well established that the early detection of juvenile idiopathic arthritis is fundamental to improving long-term outcomes [6, 7]. As clinicians attempt to detect arthritis at ever earlier stages, radiologists are faced with the increasing challenge of discriminating subtle signs of early inflammation from normal examination findings. This study therefore assessed the inter- and intra-observer reliability of experienced paediatric musculoskeletal radiologists’ reporting of specific MRI parameters used to detect inflammation in children aged 1–16 years with suspected hip arthritis. We conducted this study in the real-world context of patients being reviewed in paediatric rheumatology outpatient clinics, including both patients with confirmed juvenile idiopathic arthritis and those with suspected arthritis without a confirmed diagnosis of juvenile idiopathic arthritis. This is important as it reflects the clinical context in which contrast-enhanced MRI hip scanning is most useful and adds most value: aiding differentiation between an inflamed and non-inflamed joint in patients presenting with hip symptoms suggestive of inflammation.
Understanding which parameters are most reliable as indicators of synovitis, across different observers, is likely to help guide how we utilise MRI reports in clinical decision-making and inform the construction of grading tools in the future. Most previous studies assessing the reliability of MRI interpretation have linked this to use of a proposed grading tool to assess disease severity. These tools often lack flexibility (which limits their use in juvenile idiopathic arthritis, a condition with marked heterogeneity) and have little evidence to support them. Furthermore, studies investigating use of grading tools in the context of hip MRI scanning have been limited by small sample sizes: two reported studies included 28 and 7 patients, respectively [15, 18]. Whilst these studies helped establish contrast-enhanced MRI as a useful diagnostic tool in juvenile idiopathic arthritis, they did not assess the reliability between observers.
Given that radiologic investigations are used increasingly early in the course of inflammatory arthritis, we chose not to study joint erosion as this change usually occurs later in established arthritis. Instead, we included joint effusion, marrow oedema, synovial enhancement, synovial assessment and synovial thickness based on review of existing literature.
Although abduction and extension of the hip have been found useful in the US detection of effusions, this is not practical in an MRI coil. No evidence for optimisation of hip joint effusion assessment on MRI was found, but in any prospective studies, a T2 fat-saturated axial sequence could be included to make interpretation easier. Our scan parameters were selected to provide the best images within a reasonably short scan time for reasons of compliance.
The presence of red marrow makes assessing marrow oedema around the hips in this age group more difficult. For this reason, the combination of T2-weighted fat-saturated and contrast-enhanced T1-weighted fat-saturated sequences was principally used.
Scanning this cohort of patients requires an optimal patient experience to maximise compliance and minimise movement artefact, which results in non-diagnostic imaging. At this institute, there is no access to sedation for MRI scans. Children younger than 5 years had a general anaesthetic and those older than 5 years had a digital video disc (DVD) of their choice for distraction.
There are no similar studies with which we could compare the reliability of individual MRI parameters; existing grading tools provide an overall score in comparison with our assessment of individual MRI parameters. Two grading tools focusing on juvenile idiopathic arthritis, the Juvenile Arthritis MRI Scoring system (JAMRIS) and paediatric MRI, recorded interobserver reliability [19, 20] and use similar parameters to those we assessed to generate a composite score. JAMRIS studied knees whereas paediatric MRI studied wrists. The observer reliability of our parameters is lower than that of JAMRIS (interobserver reliability, intraclass correlation coefficient [ICC]=0.89 to 1) [19] and paediatric MRI (interobserver ICC>0.9) [20]. In contrast to these studies, our work focused specifically on the hip joint in a mixed cohort of children presenting to specialist paediatric rheumatology clinics including individuals with suspected, but not confirmed, juvenile idiopathic arthritis – a real-world cohort. The inclusion of patients without confirmed inflammatory disease in our cohort biases towards a greater proportion of normal scans and the subsequent lack of significant synovial thickening on scans may have contributed to low reliability scores. The hip may also be regarded as more challenging to study than wrists and knees, due in part to its deep anatomical position, which can obstruct imaging methods such as US and make clinical examination of swelling and effusion much harder. Unsurprisingly, the hip joint has been the focus of less work in comparison to the wrist and knee joints. Furthermore, there is a relative paucity of literature assessing the heterogeneous real-world population in which MRI is currently used to aid initial diagnosis.
Notably, intra-observer comparisons had slightly higher reliability than interobserver comparisons (Tables 3 and 4). Variation in intra-reader reliability was largest comparing observers 1 and 3. Observer 1 recorded abnormalities most often. The variation between observers was most apparent for synovial enhancement. Between observers 2 and 3, all values but synovial enhancement had moderate agreement. Observer 1 was far more likely to record synovial enhancement, citing it 183 times compared to 53 and 35 for observers 2 and 3, respectively (Table 5). Observer 1 also had the lowest intra-observer reliability (poor compared to moderate and excellent for observers 2 and 3) for synovial enhancement. This suggests that observers had a different threshold to define enhancement. Care, therefore, needs to be taken in communication between clinicians reporting scans and clinicians making treatment decisions based on scan reports, in order to avoid over- or underinterpretation of scan findings and the alterations in management that may follow. At our institution, we have established weekly multidisciplinary meetings between clinicians requesting scans and radiologists interpreting scans to discuss radiologic abnormalities in clinical context.
Table 5
The frequency with which parameters were recorded as abnormal/present for each observer

| Parameter | Observer 1 | Observer 2 | Observer 3 |
|---|---|---|---|
| Effusion | 62 (31.6%) | 44 (22.2%) | 38 (19.2%) |
| Bone marrow oedema | 15 (7.7%) | 15 (7.6%) | 8 (4%) |
| Synovial enhancement | 183 (92.4%) | 53 (26.8%) | 35 (17.7%) |
| Subjective synovial assessment | 48 (24.4%) | 26 (13.1%) | 22 (11.2%) |
The Bland-Altman plots revealed that there was overall good inter- and intra-observer agreement for synovial thickness measurements. Notably, there seemed to be more variability in the coronal plots than the axial plots for both inter- and intra-observer scans. In particular, the intra-observer coronal plots indicate that significant disagreement can occur within the same reader and scan. Together, the plots suggest that readers may be more consistent when reading axial thickness than coronal thickness. The largest discrepancy between observers was 1.2 mm between observers 2 and 3 for axial thickness and 6.7 mm between observers 2 and 3 for coronal (2.7 mm, excluding anomalies). Overall, the difference in measurements was greater for coronal readings than axial and highest between observers 2 and 3. The greatest intra-observer coronal difference was 3.95 mm for observer 3 (excluding the likely error of 8 mm) and 0.7 mm for the axial joint between observers 2 and 3. Observer 3 had the largest range in measurements for axial and coronal (0.7 mm and 3.95 mm). Interestingly, each reader had two to four scans with only one recorded reading due to lack of confidence. However, the technology used in this study may not be adequate for precise (sub-millimetre) measurements, as the measuring tool available on workstations is only accurate to 0.3 mm. Delineation of the edge of the synovium is dependent to a small degree on the windowing, further diminishing the accuracy. These practical difficulties in measuring synovium resulted in significant numbers (up to 49% for observer 2 on axial scans) being unmeasured, and likely contributed to the marked variability observed in this domain. The median measurements were less than or equal to 1.7 mm across observers, suggesting that children may have comparable synovium to the adult reference range of less than 2 mm [21].
To the best of our knowledge, this is the first recorded attempt to measure the synovium of children with suspected hip arthritis and data describing normal synovial thickness in children are lacking.
In the subgroup analysis comparing juvenile idiopathic arthritis and non-juvenile idiopathic arthritis groups, the results were largely consistent with the total group and each other. Intra-observer analysis showed marginally higher kappa scores in the juvenile idiopathic arthritis group compared to those in the non-juvenile idiopathic arthritis group. For example, κ for marrow oedema was 1.0 in the juvenile idiopathic arthritis group compared to 0.6 in the non-juvenile idiopathic arthritis group. This could suggest that observers are more consistently recognising parameters in confirmed juvenile idiopathic arthritis than in those less likely to have inflammatory changes.
The retrospective methodology employed in this study limited the diagnostic detail, as only what was recorded in patients’ medical records was available. For example, a diagnosis was usually clearly stated but sometimes the clinical suspicion of inflammation was not made clear.
This study has highlighted the challenges faced in assessing early hip inflammation in the pediatric population. Possible areas for future development include the use of subtraction MRI to look at the enhancement profile of the synovium. Synovial thickness measurement could potentially be improved by the use of higher resolution images. This could be achieved by using a smaller field of view to image just one hip at a time in an appropriate group of patients. The use of 3 T may enable improved signal-to-noise ratio without incurring the movement artefact expected with increased scan time. A prospective study would give a better understanding of the cohort of individuals being scanned. To ensure the development of clinical tools is applicable to the real-world situation, it is important that further validation work uses a population representative of patients presenting to paediatric rheumatology services, rather than only studying those with established inflammatory diagnoses. Normative data would provide a helpful comparison in the assessment of MRI parameters and help to discriminate subtle changes of early inflammation; however, we recognise the challenges in obtaining normal contrast-enhanced MRI hip data.
Conclusion
The parameters with highest inter- and intra-observer reliability in our study were joint effusion, marrow oedema and subjective assessment of synovium. Objective measurement of synovial thickness in early disease is technically difficult but has good agreement between and within observers. Radiologist reporting was inconsistent in this population showing significant differences in intra-observer reliability across parameters and only moderate agreement in comparisons between readers for most parameters.