Introduction
Before the initiation of medical or surgical therapy for symptomatic Crohn’s disease (CD), it is crucial to assess whether inflammatory activity is present, because even when CD is in remission, symptoms of coexisting irritable bowel syndrome (IBS) may mimic active disease. It is also important to distinguish bowel obstruction due to inflammation from obstruction due to residual fibrotic stenosis, as these warrant medical and surgical therapy, respectively. Furthermore, if inflammatory activity is present, it is important to distinguish between mild, moderate and severe disease, as medical management differs among the disease stages [1, 2].
The reference standard for diagnosing active CD and staging disease activity is endoscopy [3]. However, standard endoscopic techniques visualize only part of the bowel, and low patient acceptance is another drawback of this technique.
Many studies have advocated the use of computed tomography (CT) for abdominal evaluation in patients with CD, as it is an accurate and patient-friendly technique [4–8]. However, during an abdominal CT examination patients are exposed to considerable radiation doses (the mean cumulative effective dose is 36.1 mSv, but more than 75 mSv can be reached) [9].
Because disease activity often has to be assessed repeatedly, the excess lifetime cancer mortality risk attributable to radiation exposure will increase when abdominal CT is used for CD evaluation. It has been estimated that about 1.5 to 2.0% of all cancers in the US may be attributable to the radiation from CT studies [10]. In contrast, magnetic resonance imaging (MRI) does not require the use of ionizing radiation. As it is also a non-invasive technique, MRI is increasingly used for abdominal evaluation in patients with CD [11–13]. However, while MRI has been shown to be accurate in diagnosing active CD [14, 15], its accuracy in staging disease activity is less clear. As MRI is inferior to colonoscopy in the detection of subtle mucosal detail, MRI might provide false-negative results in patients with mild, superficial CD. This hypothesis is supported by findings from several studies in which false-negative MRI results were seen in patients with active, mostly mild CD [16–19]. However, in other studies disease activity was overestimated on MRI [20–22].
Thus, the purpose of our study was to systematically review the accuracy of MRI in staging disease activity in CD by performing a meta-analysis.
Materials and methods
Search strategy and study eligibility
A computer-assisted search was performed of the MEDLINE, EMBASE, CINAHL and Cochrane databases to identify papers reporting the accuracy of MRI in staging CD activity. In MEDLINE and EMBASE, we used “Crohn disease (MeSH)” and “Magnetic resonance imaging (MeSH)” as search terms. For searching the CINAHL and Cochrane databases, we used “Crohn disease” and “Magnetic resonance imaging” as free text words. The search period was restricted to 1990 through April 2007. No age limits or language restrictions were applied.
Titles and/or abstracts of all retrieved papers were checked by one observer (KH) to determine eligibility for inclusion. Reference lists of review articles and eligible studies were checked manually to identify other relevant papers. Hand searching of major journals was not performed. Only data that were presented as full-text articles were eligible for inclusion. As field strength of most MRI systems currently used in clinical practice is ≥1.0 T, we decided to exclude papers in which MRI field strength was ≤0.5 T. All eligible articles were retrieved as full-text articles.
Study selection
Two reviewers (KH and SB) independently assessed all retrieved articles to determine whether they satisfied the following criteria: (1) they provided data on disease activity of CD; (2) MRI was used to evaluate CD; (3) findings at histopathology, colonoscopy and/or intra-operative findings were used as the reference standard; (4) positive criteria were defined for MRI (i.e., criteria described to stage disease activity); (5) data were available to fill out cross-tabs (for calculation of agreement in staging disease).
If all criteria were met, the article was included in the study. Disagreement between the two reviewers regarding inclusion was resolved by consensus. The authors of the primary research were approached for additional information, if necessary.
Study characteristics
Both reviewers independently assessed study characteristics of the included studies and extracted relevant data, described in detail below, by using a standardized form. No blinding of authors’ information, authors’ affiliation or journal title was performed. Inconsistencies in assessment of the included studies were resolved by consensus.
Patient characteristics
The following patient characteristics were recorded: (1) number of patients; (2) sex distribution; (3) mean age (range); (4) part of the gastrointestinal tract examined.
Study quality assessment
To assess study quality characteristics, the QUADAS tool was used as a guideline. The QUADAS tool was developed for reviewers to evaluate the quality of studies of diagnostic accuracy [23, 24]. The following characteristics were assessed:
(1) Whether the spectrum of patients was representative of the patients who will receive MRI in practice;
(2) Whether the selection criteria were clearly described;
(3) Whether the time period between the MRI and the reference standard was short enough to be reasonably sure that the condition did not change between the two tests;
(4) Whether all patients received verification using a reference standard;
(5) Whether the execution of the MRI was described in sufficient detail to permit its replication (we considered the MRI description sufficient if information was provided about the following imaging features: magnetic field strength, type of coil used, bowel preparation used, sequences used for evaluation, and the use of intravenous and/or luminal contrast medium);
(6) Whether the execution of the reference standard was described in sufficient detail to permit its replication (we considered the description of the reference standard sufficient if the criteria used for diagnosing the different disease stages were defined);
(7) Whether the MRI results and the reference test results were evaluated independently;
(8) Whether interpretation of the MRI results was independent of clinical information.
Imaging features
The following imaging features were recorded for MRI, if available: (1) magnetic field strength; (2) coil used (body or surface); (3) bowel preparation and type of bowel preparation (bowel cleansing, fasting and/or diet, use of spasmolytic medication); (4) amount and type of intravenous and/or luminal contrast medium (enteroclysis, oral and/or rectal contrast medium), if administered; (5) sequences used for disease evaluation.
Imaging criteria used for staging disease activity
For each study the imaging criteria that were used to stage CD on MRI (e.g., pathological bowel wall thickening, pathological bowel wall enhancement and stenosis) were noted.
Reference standard
The verification method used (surgery, histopathology and/or colonoscopy) was recorded for each study.
For each study, 3 × 3 (remission, mild, frank) or 4 × 4 (remission, mild, moderate, severe) contingency tables were extracted from the articles, depending on the way of reporting.
Data analysis
An overall analysis was performed for the 3 × 3 data. For this approach, 4 × 4 tables were reconstructed to 3 × 3 tables by grouping moderate and severe disease together as frank disease. The 3 × 3 data were analyzed with a multivariate random-effects approach [25], fitted by using a Bayesian algorithm [26] in the WinBUGS program. Summary estimates were calculated. If studies reported data for multiple independent observers, we used the data leading to the lowest Akaike information criterion (AIC) value to calculate summary estimates; a lower AIC value indicates a better fit of the data [27].
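For reference, the AIC is conventionally defined as shown below, where k is the number of estimated model parameters and \hat{L} is the maximized value of the likelihood. This is the standard textbook definition and an assumption about the form used here; the text above does not spell it out.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

Under this definition, adding parameters raises the AIC unless the likelihood improves enough to compensate, which is why the lower value is taken to indicate the better fit.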
Analysis of the 4 × 4 tables could not be performed because of the limited amount of data per stage. The results of the individual studies are described.
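The regrouping of 4 × 4 tables into 3 × 3 tables can be illustrated with a minimal Python sketch. The function name `collapse_to_3x3`, the table orientation (rows for the reference standard, columns for MRI) and the counts are all assumptions made for illustration; they are not data from the included studies.

```python
def collapse_to_3x3(table_4x4):
    """Merge the 'moderate' and 'severe' stages (indices 2 and 3) into 'frank'.

    Assumed layout (hypothetical): rows = reference standard, columns = MRI,
    both ordered remission, mild, moderate, severe.
    """
    if len(table_4x4) != 4 or any(len(row) != 4 for row in table_4x4):
        raise ValueError("expected a 4 x 4 table")
    # First merge the last two columns (MRI moderate + MRI severe).
    merged_cols = [[row[0], row[1], row[2] + row[3]] for row in table_4x4]
    # Then merge the last two rows (reference moderate + reference severe).
    frank_row = [a + b for a, b in zip(merged_cols[2], merged_cols[3])]
    return [merged_cols[0], merged_cols[1], frank_row]


# Hypothetical counts, for illustration only.
example_4x4 = [
    [5, 2, 0, 0],
    [1, 6, 3, 0],
    [0, 2, 8, 4],
    [0, 0, 3, 9],
]
collapsed = collapse_to_3x3(example_4x4)
for row in collapsed:
    print(row)
```

The total number of patients is unchanged by the regrouping, but any disagreement between moderate and severe disease is counted as agreement on frank disease after the merge.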
Discussion
MRI was highly accurate for diagnosing patients with frank disease. MRI more often overstaged than understaged disease activity in CD, but in most of these patients radiological staging and staging by the reference standard differed by only one grade.
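How overstaging, understaging and one-grade differences can be read from a staging table is sketched below in Python. The function `staging_summary`, the table orientation (rows for the reference standard, columns for MRI) and the counts are hypothetical and serve only as an illustration; they do not reproduce the pooled results.

```python
def staging_summary(table):
    """Count correct, overstaged and understaged cases in a square staging table.

    Assumed layout (hypothetical): rows = reference standard, columns = MRI,
    stages ordered from least to most severe.
    """
    total = correct = over = under = off_by_one = 0
    for ref, row in enumerate(table):
        for mri, count in enumerate(row):
            total += count
            if mri == ref:
                correct += count   # MRI stage equals reference stage
            elif mri > ref:
                over += count      # MRI stage higher than reference
            else:
                under += count     # MRI stage lower than reference
            if abs(mri - ref) == 1:
                off_by_one += count  # misstaged by exactly one grade
    return {
        "total": total,
        "correct": correct,
        "overstaged": over,
        "understaged": under,
        "off_by_one": off_by_one,
    }


# Hypothetical 3 x 3 counts (remission, mild, frank), for illustration only.
example_3x3 = [
    [5, 2, 0],
    [1, 6, 3],
    [0, 2, 24],
]
summary = staging_summary(example_3x3)
print(summary)
```

In this made-up table every misstaged patient is off by one grade, so the number of one-grade differences equals the sum of the overstaged and understaged counts.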
One explanation for the inaccuracy of MRI, compared with the reference standard, in staging patients with mild disease and patients in remission is the relative inexperience with the evaluation of abdominal MRI for CD. Although bowel wall enhancement and bowel wall thickening are recognized as important parameters that indicate CD, no strict cutoff points have yet been defined to differentiate between the different stages of disease. This is reflected by the variation in definitions used in the different studies. In all included studies the subjective evaluation of the observers was very important for staging. Even in the studies in which cutoff points were clearly described to differentiate among the different stages of disease, the radiologist had to decide subjectively which bowel loop to use for assessment of enhancement and thickening.
Also, more patients were included with frank disease than with mild disease, while patients in remission were least often included. Frank disease is often easier to diagnose than mild disease or remission, as in this disease stage the parameters indicative of disease are most pronounced.
Another explanation for the inaccuracy of MRI in staging is that MRI and the reference standard are essentially different methods. With ileocolonoscopy only the lumen and the inner surface of the bowel wall can be assessed, while tissue sampling for histopathological examination only provides mucosal specimens. In contrast, on MRI the entire bowel wall with all its layers and the extraintestinal abdomen (e.g., the mesenteric vessels, mesenteric lymph nodes, mesenteric fat) are evaluated. As CD is a transmural disease, the extent of inflammatory or fibrotic changes might be better assessed on MRI than by inspection of the mucosal surface. A good next step would therefore be to compare MRI results with surgical pathology, as in this manner all bowel wall layers can be examined.
We only determined the ability of MRI to grade disease activity for the colon and terminal ileum, while CD can also be localized in the small bowel. We decided to limit our meta-analysis to findings in the colon and terminal ileum, as no reference standard was available for grading disease activity of the small bowel. The investigation that has often been used for evaluation of small bowel CD in the past (i.e., small bowel barium examination) is increasingly considered to be an imperfect reference standard. Comparative studies of MRI with established superior reference tests for the small bowel, such as double-balloon endoscopy (DBE) or video capsule endoscopy (VCE), are very scarce [33], as these endoscopic techniques were not commercially available until very recently and still have limited availability. Also, for VCE or DBE the assessment of the severity of CD of the small bowel is not standardized yet.
A limitation of our analysis is the fact that we grouped moderate and severe disease together as frank disease. Information about the ability of MRI to differentiate between moderate and severe disease is discarded in this manner. However, we decided to put these data together in order to provide a more robust statement regarding the accuracy of MRI for disease activity, as only a limited amount of data was available. We provided these data to show the limited number of studies and the extreme heterogeneity in results between studies.
Another limitation is that although we accepted only colonoscopic, histopathological and/or surgical results as reference standard, the criteria for determination of disease activity on the reference standard were not identical between studies. Therefore, activity assessment on the reference standard might not have been consistent between studies. This might have influenced pooled accuracy estimates of MRI for staging disease activity. However, all three reference methods are reliable and are often used for assessment.
We decided not to perform subgroup analysis on the differences in technique, MR imaging criteria used or reference methods used, as conclusions from subgroup analysis would not be very reliable because of the limited amount of data available. Therefore, we cannot draw conclusions on the influence of the aforementioned differences on staging disease.
Before MRI can be implemented in routine clinical practice for the evaluation of CD, more research should be done on the reproducibility of MRI of the small bowel and colon. In our meta-analysis only two studies looked at interobserver agreement, and both reported moderate kappa values [31, 32]. As an imaging technique should be both accurate and reproducible, more studies are required to determine the role of MRI in clinical practice.
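As a reminder of how a kappa value summarizes interobserver agreement, the following Python sketch computes unweighted Cohen's kappa from a table of two observers' stagings. It is an assumption that the unweighted form applies; the cited studies may have used a weighted variant. The function name `cohen_kappa` and the counts are hypothetical.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square agreement table.

    Assumed layout (hypothetical): rows = observer A, columns = observer B.
    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    n = len(table)
    if any(len(row) != n for row in table):
        raise ValueError("expected a square table")
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("table contains no observations")
    row_sums = [sum(row) for row in table]
    col_sums = [sum(table[i][j] for i in range(n)) for j in range(n)]
    # Observed agreement: proportion of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(n)) / total
    # Agreement expected by chance, from the marginal totals.
    p_expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical stagings by two observers (remission, mild, frank).
observers = [
    [10, 2, 0],
    [3, 8, 2],
    [0, 1, 14],
]
kappa = cohen_kappa(observers)
print(round(kappa, 3))
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance alone, so two observers can agree in a large share of cases and still obtain only a moderate value when one stage dominates the sample.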
Also, before MRI can be used as a valid alternative for colonoscopy in the assessment of CD activity, it should become clear which imaging criteria are consistent with the different stages of CD. If standardized criteria were available internationally, larger trials would be possible, while comparison among studies would also be simplified. For that purpose, a more standardized technical imaging approach would be advisable as well. Future research should therefore focus on standardization of preparation, imaging technique and more uniform imaging criteria used for diagnosis of disease, in addition to including larger numbers of patients.
It would be interesting to see how other imaging techniques commonly used for evaluation of CD (i.e., computed tomography, ultrasonography) would perform in staging disease activity. Data on staging disease activity in CD are lacking for these techniques; by using the same inclusion criteria as we described above, only one article on power Doppler sonography [34] would be eligible for analysis (data not shown).
In conclusion, MRI can be used for staging disease activity in CD, as most patients with frank disease are correctly diagnosed with MRI. However, in patients with mild disease or disease in remission, the accuracy of staging is limited.