Background
The value of sensitive imaging methods such as musculoskeletal ultrasound (US) and magnetic resonance imaging (MRI) for disease monitoring in rheumatoid arthritis (RA) is currently being discussed [1]. The diagnostic value of US and MRI in very early disease phases of RA is also being investigated, and there appears to be agreement that these modalities add value to the diagnostic process [1]. The EULAR imaging taskforce also recommended the use of US and MRI for this purpose without distinguishing between the two modalities [2]. Both modalities have advantages and disadvantages. MRI is generally considered the most valid method, yielding reproducible results in a three-dimensional view, and it has the advantage that it depicts bone marrow oedema. Its use is limited by insufficient availability in several centres and by higher costs. A disadvantage of US is its machine and operator dependency. Currently available data obtained in patients at risk for RA revealed that US-detected synovitis or tenosynovitis scores (greyscale (GS) or power Doppler (PD)) and MRI-detected synovitis or tenosynovitis scores were predictive of RA development [3–10]. These studies generally used only one modality and did not directly compare the findings of the two modalities.
Presently, there is limited knowledge on whether US and MRI identify the same lesions in the earliest phase of RA. One study compared MRI and US on the joint/tendon level in patients with early classified RA; the data suggested that MRI is more sensitive than US [11]. The existing studies in early arthritis or arthralgia that performed both MRI and US did not make comparisons on the joint or tendon level, did not include the feet, or used low-field MRI [12–15]. In addition, only a few studies included tenosynovitis [11–13], and none of them used standardised scoring methods such as the recently published EULAR-OMERACT method for US scoring [16].
Therefore, we aimed to evaluate to what extent both modalities can be used interchangeably in patients at risk for RA. We conducted a cross-sectional study in patients presenting with early inflammatory arthritis (IA) or clinically suspect arthralgia (CSA) and investigated on joint and tendon levels whether US and MRI detected the same inflammatory lesions (synovitis and tenosynovitis).
Discussion
This large cross-sectional study compared US and MRI findings of synovitis and tenosynovitis on the joint and tendon levels, respectively, in patients newly presenting with early IA and CSA. These are the populations where imaging modalities can have a specific role in the diagnostic process. The newly developed EULAR-OMERACT-scoring method for GS-detected synovitis for US was used. Our data showed that US findings were highly specific and rarely ‘false-positive’, but also less sensitive compared to MRI, resulting in ‘false-negative results’. This suggests that MRI cannot be replaced by US while maintaining its sensitivity on the level of joints and tendons. How this affects the predictive accuracy needs to be investigated further in longitudinal studies.
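To make the terms ‘false-positive’ and ‘false-negative’ concrete, the following is a minimal sketch of how joint-level test characteristics of US can be derived with MRI as the reference. The function name, the cut-offs of ≥ 1 for both modalities and the scores are illustrative assumptions; the scores are hypothetical and are not study data.

```python
def compare_to_reference(us_scores, mri_scores, us_cutoff=1, mri_cutoff=1):
    """Cross-tabulate dichotomised US scores against MRI as the reference.

    Both inputs are paired per-joint (or per-tendon) scores on a 0-3 scale.
    The cut-offs are assumptions chosen for illustration.
    """
    tp = fn = fp = tn = 0
    for us, mri in zip(us_scores, mri_scores):
        us_pos = us >= us_cutoff
        mri_pos = mri >= mri_cutoff
        if mri_pos and us_pos:
            tp += 1          # both modalities positive
        elif mri_pos:
            fn += 1          # MRI positive, US negative ('false-negative')
        elif us_pos:
            fp += 1          # MRI negative, US positive ('false-positive')
        else:
            tn += 1          # both modalities negative
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical paired scores for ten joints (NOT data from this study).
us_scores = [0, 0, 1, 2, 0, 0, 3, 0, 1, 0]
mri_scores = [0, 1, 1, 2, 1, 0, 3, 0, 2, 0]

result = compare_to_reference(us_scores, mri_scores)
print(result)
```

In this toy example, US is never positive where MRI is negative (high specificity) but misses two MRI-positive joints (lower sensitivity), which is the pattern described above.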
Two different scoring methods for GS-detected synovitis were applied: the EULAR-OMERACT method and the modified Szkudlarek method, which combines synovial effusion and hypertrophy [16, 21]. Direct comparison of the two scoring methods for GS synovitis showed that the modified Szkudlarek method yielded higher scores. In line with this, and with MRI as the reference, the modified Szkudlarek method had more false positives, which resulted in a higher sensitivity but a lower specificity than the EULAR-OMERACT method. The false-positive results (MRI score 0, GSUS > 0) obtained by the modified Szkudlarek method might be explained by the fact that it evaluates a combination of synovial effusion and hypertrophy, whereas the recent EULAR-OMERACT definition evaluates hypertrophy regardless of the presence of synovial effusion [16], and by the fact that contrast-enhanced MRI also does not visualise joint effusion. Thus, although this study did not primarily aim to compare the ‘old’ and ‘new’ GS synovitis scores, the present data also showed the relationship between the two GS scoring methods and revealed that the EULAR-OMERACT synovitis score for US was more concordant with the OMERACT-RAMRIS method for MRI.
Unfortunately, the EULAR-OMERACT definition of GS synovitis was published after this study had already started [16]. Consequently, synovitis had already been scored according to the modified Szkudlarek method. Therefore, static US images were rescored according to the EULAR-OMERACT method, which is a potential limitation, as scoring of static images can be challenging. Two independent readers assessed the static images, and their scores showed excellent agreement, which supports the reliability of these data.
Since the role of synovial effusion in the pathologic process of RA and other types of IA is not yet fully understood, synovial effusion was not explicitly taken into account, except within the modified Szkudlarek method [21]. Synovial effusion has often been detected by US in healthy persons, especially in the feet [24]. Unfortunately, age-related normal values for US-detected pathologies such as synovial effusion, synovial hypertrophy, tenosynovitis and erosions are still unknown and should be the subject of future studies. Furthermore, it would be interesting to see how findings in healthy symptom-free individuals affect the definition of positivity for US. This is also a subject for future research.
Importantly, there were differences between the scoring methods for US and MRI. All scoring methods consisted of semi-quantitative scales ranging from 0 to 3. However, the requirements for each grade differed between US and MRI (Additional file 1: Table S1). These differing definitions hamper direct comparison of the individual grades, although, as shown in Figs. 1 and 2, increased US scores generally coincided with increased MRI scores. To assess whether this was similar in patients with CSA and IA, we repeated the analyses for both populations separately. In both populations, higher US scores were present in patients with higher MRI scores (Additional file 1: Figures S1–S4). However, the test characteristics were not completely similar. Although the specificity of US was similar in both populations, the sensitivity was lower in patients with CSA than in those with IA. Patients with CSA have less severe inflammation than patients with IA, and the current data imply that, in this setting of subclinical inflammation, US is less sensitive than MRI.
Another issue is the cut-off used for dichotomization. Our US cut-offs are frequently used in the literature. For GS, we observed that increasing the cut-off from ≥ 1 to ≥ 2 resulted in an increased specificity and a notably decreased sensitivity. This trade-off is often observed when changing cut-offs. Based on AUCs, a cut-off of ≥ 1 could be considered more favourable than ≥ 2. The cut-off for MRI positivity was also explored. In addition to using a cut-off of mean ≥ 1, we applied a cut-off based on healthy volunteers [23]. This caused only minor improvements in the test characteristics for US compared to MRI.
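The trade-off between sensitivity and specificity when shifting the US cut-off can be illustrated with a short sketch. It assumes that MRI positivity has already been dichotomised and that the AUC of a single dichotomised test is taken as (sensitivity + specificity) / 2, which is the area under a ROC curve with one operating point; whether the AUCs in this study were derived in this way is an assumption. Function names and scores are hypothetical and are not study data.

```python
def sens_spec(us_scores, mri_positive, cutoff):
    """Sensitivity and specificity of US at a given cut-off, MRI as reference.

    Assumes the data contain at least one MRI-positive and one
    MRI-negative joint, so neither denominator is zero.
    """
    pairs = list(zip(us_scores, mri_positive))
    tp = sum(1 for us, ref in pairs if ref and us >= cutoff)
    fn = sum(1 for us, ref in pairs if ref and us < cutoff)
    tn = sum(1 for us, ref in pairs if not ref and us < cutoff)
    fp = sum(1 for us, ref in pairs if not ref and us >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def single_cutoff_auc(sensitivity, specificity):
    """Area under a ROC curve that has only one operating point."""
    return (sensitivity + specificity) / 2


# Hypothetical GS scores (0-3) and MRI positivity for twelve joints.
us_scores = [0, 1, 1, 2, 2, 3, 0, 1, 0, 0, 0, 0]
mri_positive = [False, True, True, True, True, True, True,
                False, False, False, False, False]

for cutoff in (1, 2):
    sens, spec = sens_spec(us_scores, mri_positive, cutoff)
    auc = single_cutoff_auc(sens, spec)
    print(f"cut-off >= {cutoff}: sensitivity={sens:.2f}, "
          f"specificity={spec:.2f}, AUC={auc:.2f}")
```

With these toy scores, raising the cut-off from ≥ 1 to ≥ 2 increases specificity and lowers sensitivity, and the lower cut-off has the higher single-point AUC.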
A strength of this study was that tenosynovitis was evaluated in addition to synovitis. This imaging feature is studied less often than synovitis, yet it is important, as tenosynovitis in IA and CSA has been shown to be predictive of RA development, both in studies that used MRI [9, 25] and in studies that used US [7]. Furthermore, this study examined patients at risk for RA and applied the new EULAR-OMERACT score for GS-detected synovitis. We examined not only the wrist and MCP joints but also the MTP joints. In contrast, a recent meta-analysis compared the accuracy of US-detected synovitis versus MRI in wrist, MCP, PIP and knee joints, but not MTP joints, in patients with classified RA [26]. The included studies were also not scored according to the EULAR-OMERACT method. Despite these differences, the sensitivity and specificity for GS/PD-detected synovitis observed in that meta-analysis are roughly similar to those in our data. GS tenosynovitis was also previously studied by Wakefield et al. in the MCP joints of patients with classified RA; their results were comparable to ours from patients in earlier disease phases, showing a high specificity and moderate sensitivity [11].
In our data on tenosynovitis, the sensitivity was particularly low for the FDS/FDP tendon. A possible explanation is that this tendon is located below the retinaculum flexorum, deeper in the wrist tissue than other tendons. In addition, PDUS tenosynovitis had only a low to moderate sensitivity, despite the use of a high-end US machine with sensitive power Doppler. PD-detected tenosynovitis had little or no additive value to GS tenosynovitis, particularly for the MCP-flexor tendons. A possible reason is that PD performs better from the dorsal side of the joint than from the palmar side [16, 27]. Although replication in other studies is needed, the current data, with MRI as reference, suggest that PDUS-detected tenosynovitis had no clear additive value to GSUS, which is in contrast to findings for synovitis.
This cross-sectional study is the first to examine the concordance between synovitis detected by US and MRI in the feet of patients with (suspected imminent) early RA. Interestingly, GS synovitis had a higher sensitivity in the feet than in the hand joints, which came at the cost of a lower specificity (implying a higher frequency of false-positive signals in MTP joints).
MRI was the reference in this cross-sectional study on the joint/tendon level, and US showed false-negative findings for synovitis and tenosynovitis. For clinical purposes, analyses on the patient level are also relevant, as patients often have more than one affected joint, and at least one joint with subclinical inflammation might be considered sufficient to indicate disease. Analyses on the patient level showed that, compared to MRI, US missed only 1/44 patients (GS) and 14/44 patients (PD) (cut-offs ≥ 1, data not shown). Hence, there is less discordance on the patient level than on the joint/tendon level. Whether US and MRI are comparable in accurately predicting RA development remains an outstanding question, for which longitudinal studies with RA development as the outcome are needed.
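The difference between joint-level and patient-level discordance can be sketched as follows. The sketch assumes that a patient counts as positive when at least one joint scores at or above the cut-off; the function names, patient identifiers and scores are hypothetical and are not study data.

```python
def patient_positive(joint_scores, cutoff=1):
    """A patient is positive if at least one joint reaches the cut-off."""
    return any(score >= cutoff for score in joint_scores)


def patients_missed_by_us(patients, cutoff=1):
    """IDs of patients who are positive on MRI but negative on US."""
    return [
        pid for pid, scores in patients.items()
        if patient_positive(scores["mri"], cutoff)
        and not patient_positive(scores["us"], cutoff)
    ]


def joints_missed_by_us(patients, cutoff=1):
    """Number of individual joints positive on MRI but negative on US."""
    return sum(
        1
        for scores in patients.values()
        for us, mri in zip(scores["us"], scores["mri"])
        if mri >= cutoff and us < cutoff
    )


# Hypothetical per-joint scores (0-3) for three patients.
patients = {
    "P1": {"us": [0, 1, 0], "mri": [1, 1, 0]},  # one joint missed, patient still US-positive
    "P2": {"us": [0, 0, 0], "mri": [0, 1, 0]},  # only MRI-positive joint missed
    "P3": {"us": [0, 0, 0], "mri": [0, 0, 0]},  # negative on both modalities
}

print(patients_missed_by_us(patients))  # patient-level misses
print(joints_missed_by_us(patients))    # joint-level misses
```

In this toy example, two joints are missed by US but only one patient is, because a patient with several affected joints remains positive as long as one of them is detected.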
In conclusion, this is the first study that used the recently developed EULAR-OMERACT method for US in comparison to MRI in patients consecutively presenting with early IA and CSA. These are the populations in which these imaging modalities can be used to detect (imminent) RA. US had good specificity but was less sensitive than MRI on the local tendon and joint level. However, US is more readily available, less time-consuming and less costly than MRI. Longitudinal studies in ‘at-risk’ populations are needed to directly compare the predictive accuracy of MRI and US while using up-to-date scoring methods.