Introduction
The acetabular labrum is a fibrocartilaginous structure that lines the majority of the acetabular socket. The hip labrum has many functions, including shock absorption, joint lubrication, pressure distribution, and aiding in stability. Acetabular labral tears (ALT) were observed in 62% of individuals with hip or groin pain and 54% of asymptomatic individuals [1]. Five categories of ALT have been described based on etiology: traumatic, congenital, degenerative, capsular laxity, and idiopathic [2]. Femoroacetabular impingement (FAI) is one of the primary predisposing factors for ALT.
Diagnosis of ALT is based on a dedicated examination of patient history, pertinent objective findings, special clinical tests, and supportive imaging findings. It is generally believed that early surgical intervention for acetabular labral tears may delay the development of osteoarthritis. Thus, diagnostic information on ALT has a significant effect on an orthopedic surgeon’s clinical decision-making when surgical intervention is being considered. MRI has gained popularity as a non-invasive, fast, and convenient test for diagnosing ALT. However, in some joints (e.g., the shoulder, the wrist, the hip), evaluation of the joint space may be suboptimal [3]. To address this issue, contrast material may be injected into the hip joint space to distend the joint, a technique known as MR arthrography (MRA).
To be useful to clinicians, a diagnostic test should possess high sensitivity (Se), so that a negative result helps rule out a condition, and high specificity (Sp), so that a positive result helps rule it in. However, it has been shown that conventional MRI and MRA at field strengths of 1.5-3.0 T achieve varying Se and Sp in detecting hip labral tears when compared with either arthroscopic or open surgical findings [4-25]. The diagnostic accuracy of 1.5 T and 3.0 T MRI for detecting labral tears also differs, and there is no consensus on which field strength should be recommended. Thus, whether high-field MRI has the potential to substitute for MRA in clinical practice deserves thorough discussion.
The purpose of this study was to determine (1) the diagnostic accuracy of MRI and MRA for the detection of ALT and (2) whether both 1.5 T and 3.0 T field strengths are acceptable, by conducting a meta-analysis of the literature on the diagnostic performance of MRI/MRA.
Discussion
This meta-analysis demonstrates that MRA performs better overall than MRI for detecting ALT, with a pooled Se of 0.89 vs. 0.80, a Sp of 0.69 vs. 0.77, and an AUC of 0.87 vs. 0.80. These findings are consistent with the three previous systematic reviews [30-32]. However, those reviews did not include a sufficient number of eligible studies [30-32]. Another interesting finding of our study is that the Se of 3.0 T MRI was very close to that of MRA, and the Sp of 3.0 T MRI (the ability to correctly identify a patient who does not have a labral tear) was greater than that of MRA. The summary of post-test probabilities also shows that, compared with MRA, MRI can help confirm suspected ALT. Given that 3.0 T MRI provides a non-invasive, fast, and convenient method for recognizing suspected cases, 3.0 T MRI is recommended over MRA. In clinical practice, therefore, clinicians can identify patients with a symptomatic ALT by combining conventional diagnostic information, namely anterior groin pain, a mechanical hip symptom (clicking, locking, catching, giving way, or instability), and a positive physical test (such as the anterior hip impingement test), with a positive finding on 3.0 T MRI.
The diagnosis of ALT is a complicated problem for every clinician. No imaging finding, reported symptom, or physical examination finding is ‘stand-alone’ in its ability to diagnose ALT [33]. Sonography is a relatively inexpensive, quick, and non-invasive procedure for evaluating ALT, but it is relatively subjective and relies primarily on the experience of the operator. Several studies have assessed sonographic examination for the diagnosis of acetabular labral tears, but validation of this test has been inconsistent [34-41]. Sonographic examination has lower diagnostic ability than CTA or MRI/MRA; thus, it is of limited use in clinical practice [34, 35, 41]. CTA is another method for evaluating labral tears in patients with claustrophobia, electronic devices, or metallic foreign bodies. The diagnostic value of CTA has improved since the advent of multi-detector computed tomography with submillimeter spatial resolution [42, 43]. However, only limited data are available on the efficacy of CTA for assessing hip labral pathology [14, 35, 42-44], and CT also exposes the pelvis of young female patients to high levels of radiation. Patient history and physical findings are important to explore alongside diagnostic imaging in patients with suspected ALT. A number of physical tests are used to assess ALT, such as the flexion-adduction-internal rotation test and the flexion-internal rotation test. To date, four systematic reviews have sought to establish the clinical utility of these physical tests. They reached similar conclusions: the available physical examination studies were largely heterogeneous, generally of low quality, and did not appear to provide the clinician with any significant value in altering the probability of disease [33, 45-47]. Although CTA, MRI, MRA, and US hold great promise when complemented by physical examination findings, a gold standard imaging modality for the diagnosis of ALT has not yet been established [30]. MRI is widely used in clinical practice for its excellent soft-tissue contrast. MRI findings are also specific factors affecting surgical decision-making [48]. Therefore, it is necessary and meaningful to clarify whether both 1.5 T and 3.0 T are acceptable for the detection of ALT.
When appraising evidence about diagnostic tests, clinicians should consider a key concept: how much will different results of the diagnostic test raise or lower the pre-test probability of disease? We therefore calculated post-test probabilities to understand the clinical utility of MRI/MRA for detecting ALT. Assuming a pre-test probability of 50%, a positive MRI could increase the post-test probability to 78% and a negative MRI could decrease it to 21%, whereas a positive MRA could increase the post-test probability to 74% and a negative MRA could decrease it to 14%. This means that MRI may help confirm suspected ALT, and MRA may help rule out ALT.
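The post-test probabilities quoted above can be approximately reproduced from the pooled Se and Sp reported in this Discussion by applying Bayes' rule in odds form. The following is a minimal sketch, not the authors' actual procedure: the function and variable names are hypothetical, and it assumes the likelihood ratios are derived directly from the pooled Se and Sp, whereas the meta-analysis may have pooled likelihood ratios through its own statistical model.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive):
    """Post-test probability of disease after a positive or negative test.

    Uses likelihood ratios (Bayes' rule in odds form):
      LR+ = Se / (1 - Sp),  LR- = (1 - Se) / Sp
    """
    if positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Pooled (Se, Sp) values as quoted in the Discussion.
POOLED = {"MRI": (0.80, 0.77), "MRA": (0.89, 0.69)}


def summarize(pre_test=0.50):
    """Return {test: (post-test % if positive, post-test % if negative)}."""
    summary = {}
    for name, (se, sp) in POOLED.items():
        after_positive = post_test_probability(pre_test, se, sp, True)
        after_negative = post_test_probability(pre_test, se, sp, False)
        summary[name] = (round(100 * after_positive), round(100 * after_negative))
    return summary


if __name__ == "__main__":
    for name, (pos, neg) in summarize().items():
        print(f"{name}: positive -> {pos}%, negative -> {neg}%")
```

With a pre-test probability of 50%, this sketch gives 78% and 21% for MRI and 74% and 14% for MRA, matching the values reported above. Changing `pre_test` shows how the same test shifts the probability in populations with a different prevalence of ALT.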
Meta-regression analysis revealed that MR field strength and type of reference standard were significant factors influencing study heterogeneity. Notably, the Se of 3.0 T MRI was very close to that of MRA (0.87 vs. 0.89), and the Sp of 3.0 T MRI was superior to that of MRA (0.77 vs. 0.69). However, there were insufficient data to summarize the diagnostic value of 3.0 T MRA in the subgroup meta-analysis. Intra-articular injection of contrast material distends the joint, which may greatly help the radiologist interpret the images. However, unlike MRI, MRA is an invasive procedure and carries a risk of joint infection [3]. On the other hand, a high-field-strength magnet increases the signal-to-noise ratio and thus helps in a detailed assessment of the acetabular labrum [32]. This meta-analysis, the first to comprehensively evaluate the diagnostic accuracy of 3.0 T MRI, demonstrated that its ability to detect ALT is similar to that of MRA.
Notably, MRA studies using both arthroscopic and open surgery as the reference standard showed higher Se and Sp than those using arthroscopic surgery alone. This higher diagnostic accuracy might be explained by blind spots in arthroscopic surgery and additional labral injuries in open surgery. Because the reference standard issue was not discussed in the three previous systematic reviews [31, 32, 48], further studies are still needed to fully assess it.
Other possible sources of study heterogeneity were MR sequences (coronal, axial, sagittal, oblique coronal, or oblique sagittal planes), blinding of the reference test, the interval between MR and surgery, and MR reviewers (a single musculoskeletal radiologist, multiple musculoskeletal radiologists, or general radiologists). Unfortunately, there were insufficient data to analyze whether these four variables were significant factors influencing study heterogeneity. Furthermore, the combined variability of imaging planes, sequences, slice thicknesses, matrix sizes, resolution, and types of receiver coils was too complex to analyze in a subgroup meta-analysis. Park SY et al. compared the diagnostic accuracy of a three-dimensional intermediate-weighted fast spin-echo sequence and two-dimensional fast spin-echo sequences for the diagnosis of acetabular labral tears, and they found that Se and Sp were 0.74 and 0.89 for the two-dimensional fast spin-echo sequences and 0.78 and 0.92 for the three-dimensional intermediate-weighted fast spin-echo sequence, respectively [49]. Of the included studies, 81.8% stated that image interpretation was conducted by musculoskeletal (MSK) radiologists. It is generally believed that the accuracy of radiological reporting of hip pathology depends on the training level of the reporting radiologist. McGuire et al. showed that accuracy rates for labral lesions were 85% for MSK radiologists and 70% for community radiologists [50]. Eight of the included studies reported interobserver reliability; the kappa value was interpreted as above moderate in all but one study [19]. Freedman BA et al. reported almost perfect intraobserver reliability [5]. Individual assessor variability may have some influence on the diagnostic accuracy of MRI/MRA interpretation. Half of the included studies reported the interval between MR and surgery, which varied from 18 days [13] to > 6 months [5]. A long interval may increase the possibility that the patient's labral condition changes between the index and reference tests. Blinding of the reference test means that the findings of the index test are unknown to the surgeons. However, such blinding is impractical in clinical practice. Only one study reported that the surgeons were unaware of the imaging findings [21].
Kwee RM et al. demonstrated that a sublabral sulcus can be found at any anatomical location on MRI and that its prevalence is at least 5% in symptomatic patients [51]. Therefore, MSK radiologists cannot be too cautious about the sublabral sulcus, which is commonly misdiagnosed as ALT. Surgeons should also carefully check for acetabular cartilage injury during surgery, as labral tears have been implicated as a contributing cause of cartilage injury [52, 53]. Well-defined diagnostic criteria help radiologists diagnose accurately. Blankenbaker DG et al. demonstrated that the Lage arthroscopic classification system does not correlate well with the Czerny MRA classification or an MRA modification of the Lage classification [54]. Constructing uniform MR imaging criteria to accurately localize a labral tear and define its extent is a vital topic for future research. Tiegs-Heiden CA et al. drew the interesting conclusion that gadolinium-based contrast agents may be eliminated from the direct MRA injection without compromising diagnostic accuracy in the hip [55].
Limitations of this study
Our meta-analysis has several potential limitations. First, substantial heterogeneity was noted among the included studies; although we performed a meta-regression analysis, we could not fully explain it. Additionally, because of the small number of studies or insufficient data, other potential sources of heterogeneity were not included in the meta-regression analysis. Second, Deeks’ funnel plot for MRA showed significant asymmetry, so publication bias may affect the reliability of the meta-analysis. Third, several subgroup analyses in our investigation were performed on a small number of studies, and we could not conduct a subgroup analysis of 3.0 T MRA because of the limited number of studies. Fourth, there was methodological variability among the studies, including reference standards, blinding of the reference test, imaging reviewers, and the interval between MR and surgery. These limitations weaken the generalizability of this meta-analysis's findings to wider clinical practice.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.