Several types of MRI LIC measurement have been described in the literature. Straightforward in–out phase gradient echo (GRE) shows signal loss at the later echo time (TE) but is only qualitative and easily confounded by the presence of hepatic steatosis. Quantitative approaches include (i) signal intensity ratio (SIR) measurement (e.g., the Gandon method) and (ii) MR-relaxometry. The Gandon method (henceforth referred to as “SIR”) utilizes the liver-to-muscle SIR on differently weighted MRI-scans [6]. This method allows easy and free calculation of LIC_SIR by entering region-of-interest (ROI) values in an online tool [7]. Hence, assuming the acquisition and the ROI placement are performed correctly, the method is robust to observer influences. A major limitation is its upper limit of detection of 350 µmol/g (approximately 20 mg/g): changes above that threshold cannot be measured.
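The two units used for LIC in this paper are related through the molar mass of iron. As a minimal sketch of that arithmetic (the function names are illustrative and not part of any published tool; the molar mass of 55.845 g/mol is the standard value for iron):

```python
# Molar mass of iron in g/mol (standard atomic weight, rounded).
FE_MOLAR_MASS_G_PER_MOL = 55.845


def umol_per_g_to_mg_per_g(lic_umol_per_g: float) -> float:
    """Convert a liver iron concentration from µmol Fe/g to mg Fe/g."""
    # µmol -> µg via the molar mass, then µg -> mg via the factor 1000.
    return lic_umol_per_g * FE_MOLAR_MASS_G_PER_MOL / 1000.0


def mg_per_g_to_umol_per_g(lic_mg_per_g: float) -> float:
    """Convert a liver iron concentration from mg Fe/g to µmol Fe/g."""
    return lic_mg_per_g * 1000.0 / FE_MOLAR_MASS_G_PER_MOL


if __name__ == "__main__":
    # Upper detection limit of the SIR method quoted in the text.
    print(round(umol_per_g_to_mg_per_g(350.0), 1))  # 19.5
```

The result of about 19.5 mg/g is consistent with the rounded value of 20 mg/g quoted for the SIR detection limit.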
MR-relaxometry relies on the calculation of tissue relaxation rates (R2 and R2*, the inverse of relaxation times T2 and T2*), which increase as iron accumulates and are sensitive to changes in LIC values well above the SIR-threshold. One commercialized R2 approach using single-echo spin-echo (SE) MRI is the FDA-approved St. Pierre method (FerriScan®), performed in 10 min in free-breathing [8]. The per-scan analysis price is ~$300, on top of the costs of the MRI-scan itself. Alternative free-of-charge approaches are available for R2 using free-breathing or respiratory triggered SE-MRI and for R2* using single breath-hold GRE MRI [9].
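To make the notion of a relaxation rate concrete, the following is a minimal sketch of the simplest possible estimate: a straight line fitted to the logarithm of the signal against TE, assuming a purely mono-exponential, noise-free decay. It is not the analysis used in this study; the function names and the example echo times are hypothetical.

```python
import math


def fit_r2star_loglinear(te_ms, signal):
    """Fit ln(signal) = ln(S0) - R2* * TE by ordinary least squares.

    te_ms  : echo times in milliseconds
    signal : positive magnitude signal at each echo time
    Returns (S0, R2* in Hz, i.e. 1/s).
    """
    xs = [te / 1000.0 for te in te_ms]       # ms -> s
    ys = [math.log(s) for s in signal]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                        # equals -R2*
    intercept = mean_y - slope * mean_x      # equals ln(S0)
    return math.exp(intercept), -slope


def r2star_to_t2star_ms(r2star_hz):
    """Relaxation time in ms is the inverse of the relaxation rate in Hz."""
    return 1000.0 / r2star_hz


if __name__ == "__main__":
    # Hypothetical echo times and a synthetic decay with R2* = 200 Hz.
    te = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    sig = [1000.0 * math.exp(-200.0 * t / 1000.0) for t in te]
    s0, r2star = fit_r2star_loglinear(te, sig)
    print(s0, r2star, r2star_to_t2star_ms(r2star))
```

In practice the log-linear fit is biased once the signal approaches the noise floor, which is why truncation or a noise-aware model is needed at high iron concentrations.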
A comparative study of LIC_SIR, R2, and R2* in 94 patients with β-thalassemia reported high correlations [14]. However, success rates, interobserver agreement, and applicability for diseases other than β-thalassemia were not investigated, nor were serum markers assessed. The latter may be useful to screen for elevated LIC (i.e., >36 µmol/g), saving expensive and limited MRI time. We hypothesize that R2* is preferable to SIR and R2 in terms of success rate, acquisition time, and range of detection, and preferable to serum values in terms of accuracy in detecting elevated LIC.
In our center, the clinical LIC protocol has included SIR, R2, and R2* since 2005, with regular weekly clinical referrals since 2008. The SIR measurement is recommended by the national guideline for hemochromatosis [15]. It is supplemented by R2 and R2* measurements to fill the gap caused by the SIR method’s hard cut-off at 350 µmol/g. To investigate our hypothesis, we (i) assessed SIR, R2, and R2* LIC measurements and their success rates and interobserver agreement; and (ii) compared the diagnostic accuracies of LIC_SIR, R2, and surrogate serum markers for correctly predicting elevated LIC based on increased R2*.
Discussion
This study shows that for routine clinical MRI-based LIC measurements SIR and R2* are more often successful than R2. Interobserver agreement was near perfect (ICC > 0.9) for all methods. R2 and R2* methods provided relaxation rates when the SIR-threshold (>350 µmol/g) was already exceeded. This gives them an advantage over SIR in subjects with transfusional hemosiderosis (at least 55% of our population), when LIC values can easily surpass 350 µmol/g. The combination of high success rate, high interobserver agreement, ability to detect changes in LIC over a wide range of LIC values, and single breath-hold acquisition favors the R2* method for LIC measurement.
In our study, the relationship between R2* and LIC_SIR was quadratic and remained quadratic when R2* was expressed as a LIC value using a previously published (biopsy-proven) conversion formula. Other authors report linear relationships. Given the physics of the R2*–iron relationship, which is basically linear [25], this discrepancy arises either from our R2* acquisition and analysis or from the reference standard. To rule out the former, we compared three fit routines. The exponential + Rician noise factor fit gave results identical to those of the established, widely applied but labor-intensive method of manual truncation before exponential fitting, in a fraction of the time.
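The idea behind a noise-aware fit can be illustrated as follows. The exact noise model used in this study is not specified in this section, so the sketch below rests on an assumption: it uses the second-moment property of Rician-distributed magnitude data, where the expected squared magnitude equals the squared true signal plus a constant noise term, giving S(TE)² = S0²·exp(−2·R2*·TE) + C. For any trial R2* this model is linear in S0² and C, so a one-dimensional search over R2* suffices. All names and default search bounds are hypothetical.

```python
import math


def _fit_for_rate(rate_hz, t_s, y2):
    """For a fixed R2*, least-squares solve y2 = a*exp(-2*R2**t) + b.

    Returns (sum of squared errors, a, b)."""
    u = [math.exp(-2.0 * rate_hz * t) for t in t_s]
    n = len(t_s)
    su, suu = sum(u), sum(v * v for v in u)
    sy, suy = sum(y2), sum(v * y for v, y in zip(u, y2))
    det = n * suu - su * su
    if abs(det) < 1e-300:
        return float("inf"), 0.0, 0.0
    a = (n * suy - su * sy) / det
    b = (suu * sy - su * suy) / det
    sse = sum((a * v + b - y) ** 2 for v, y in zip(u, y2))
    return sse, a, b


def fit_r2star_with_noise_floor(te_ms, magnitude, r_min=1.0, r_max=3000.0):
    """Estimate R2* (Hz), S0 and a noise floor from magnitude data.

    Assumed model: magnitude^2 = S0^2 * exp(-2*R2**TE) + C.
    A coarse grid over R2* is refined repeatedly around the best value."""
    t_s = [te / 1000.0 for te in te_ms]      # ms -> s
    y2 = [m * m for m in magnitude]
    lo, hi = r_min, r_max
    best = None
    for _ in range(6):                       # successive grid refinements
        step = (hi - lo) / 200.0
        trials = [lo + k * step for k in range(201)]
        best = min(((r,) + _fit_for_rate(r, t_s, y2) for r in trials),
                   key=lambda item: item[1])
        lo = max(r_min, best[0] - step)
        hi = min(r_max, best[0] + step)
    rate, _, a, b = best
    return {
        "r2star_hz": rate,
        "s0": math.sqrt(max(a, 0.0)),
        "noise_floor": math.sqrt(max(b, 0.0)),
    }


if __name__ == "__main__":
    # Synthetic heavily iron-loaded liver: R2* = 800 Hz, noise floor = 20.
    te = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    mags = [math.sqrt(1e6 * math.exp(-1.6 * t) + 400.0) for t in te]
    print(fit_r2star_with_noise_floor(te, mags))
```

Because the noise term is estimated together with the decay, no echoes need to be discarded by hand, which is the step that makes manual truncation labor-intensive.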
With respect to the reference standard, St. Pierre et al. [8], Wood et al. [9], Hankins et al. [19], Garbowski et al. [20], and Anderson et al. [21] all used biopsy-determined LIC_BIOPSY as reference standard, whereas we and Christoforidis et al. [14] used the LIC_SIR according to Gandon. Given the similarity of our MRI protocols, it is unsurprising that Christoforidis’ and our data points show considerable overlap. Arguably, their linear relation between LIC_SIR and R2* could also be described by a quadratic polynomial.
Apart from the linear relationship, the other authors report a much steeper increase of R2* as LIC increases [9, 19–21]. Anderson et al.’s very steep increase could be due to a long TE1 of 2.2 ms compared to all other studies (range of TE1: 0.8–0.99 ms), which hampers the ability to accurately estimate high R2* values. The fact that the control values of R2* in subjects without iron overload hover around 40 Hz, both in those studies and in this paper, is a further argument that the observed difference in the LIC–R2* relationship does not arise from the R2* acquisition or analysis but from the reference standard.
Hence, the most likely cause of the deviating quadratic relation between R2* and estimated LIC is the piecewise sampling of the LIC range with five differently weighted GRE-sequences for LIC_SIR. This has artificially imposed a quadratic behavior on the actually linear relationship between R2* and true LIC_BIOPSY. If one looks at the fundamental GRE signal equation (Eq. 9), where PD is proton density and α is flip angle, and applies this to the liver-to-muscle signal intensity ratio, the PD and sin(α) terms drop out. By taking the natural logarithm, we find Eqs. 10 and 11. The latter shows that the relationship between R2* and SIR is logarithmic. Indeed, plotting Fig. 3 with a log-scale for the signal intensity ratio on the y-axis linearized the relationship (data not shown).
$$ S(\text{TE}) = \frac{\text{PD} \cdot \sin(\alpha) \cdot \left(1 - e^{-\text{TR}/T_{1}}\right)}{1 - \cos(\alpha) \cdot e^{-\text{TR}/T_{1}}} \cdot e^{-R_{2}^{*} \cdot \text{TE}} \tag{9} $$
$$ \ln\left(\frac{S_{\text{LIVER}}}{S_{\text{MUSCLE}}}\right) = f(\text{TR}, \alpha, T_{1}) - \text{TE} \cdot \left(R_{2,\text{LIVER}}^{*} - R_{2,\text{MUSCLE}}^{*}\right) \tag{10} $$
$$ R_{2,\text{LIVER}}^{*} = \frac{f(\text{TR}, \alpha, T_{1}) - \ln\left(\frac{S_{\text{LIVER}}}{S_{\text{MUSCLE}}}\right)}{\text{TE}} + R_{2,\text{MUSCLE}}^{*} \tag{11} $$
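The logarithmic dependence can be checked numerically. The sketch below evaluates Eq. 9 for liver and muscle and confirms that the logarithm of the signal intensity ratio varies linearly with liver R2*, with a slope whose magnitude equals TE. All tissue and sequence values (TR, flip angle, T1, muscle R2*, equal proton densities) are illustrative assumptions, not the protocol of this study.

```python
import math


def gre_signal(pd, alpha_deg, tr_ms, t1_ms, r2star_hz, te_ms):
    """Spoiled gradient-echo magnitude signal following Eq. 9."""
    alpha = math.radians(alpha_deg)
    e1 = math.exp(-tr_ms / t1_ms)
    steady_state = pd * math.sin(alpha) * (1.0 - e1) / (1.0 - math.cos(alpha) * e1)
    return steady_state * math.exp(-r2star_hz * te_ms / 1000.0)


def log_sir(r2star_liver_hz, te_ms=4.0, tr_ms=120.0, alpha_deg=20.0):
    """ln(S_liver / S_muscle) for assumed, illustrative tissue parameters."""
    liver = gre_signal(1.0, alpha_deg, tr_ms, 600.0, r2star_liver_hz, te_ms)
    muscle = gre_signal(1.0, alpha_deg, tr_ms, 1000.0, 35.0, te_ms)
    return math.log(liver / muscle)


def slope_per_hz(r_low_hz, r_high_hz, te_ms=4.0):
    """Change in ln(SIR) per 1 Hz change in liver R2* between two rates."""
    return (log_sir(r_high_hz, te_ms) - log_sir(r_low_hz, te_ms)) / (r_high_hz - r_low_hz)


if __name__ == "__main__":
    # The slope magnitude is the same in any R2* interval and equals TE in seconds.
    print(abs(slope_per_hz(50.0, 100.0)), abs(slope_per_hz(400.0, 800.0)))
```

Since the slope is constant in ln(SIR), the SIR itself compresses rapidly at high R2*, which is consistent with the limited upper detection range of the SIR method.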
For R2, single- and multiecho SE acquisitions are possible: multiecho SE decreases R2 due to residual signal of stimulated echoes at a given TE. Single-echo SE increases R2 because long TEs cause increased sensitivity to diffusion, hence increased signal loss at a given TE. Reported single-echo SE R2 values [8, 9] were concordantly higher for the same estimated LIC than the multiecho SE results in this study and in [14]. In terms of R2 data fitting, we, like many others, applied a biexponential model; we did not assess non-exponential decay models such as the one proposed by Jensen et al. [26].
The main limitation of our study is the lack of biopsy confirmation. In our center, liver biopsy for iron determination is seldom performed. The national, European, and American guidelines all advise restraint in performing biopsy and underline the high sensitivity of MRI [15, 27, 28]. Moreover, differing processing steps to obtain LIC_BIOPSY are reported, compromising generalizability. In Gandon’s method, paraffin-embedded liver biopsy specimens are dewaxed using a protocol with a triple xylene wash to remove lipid solids from the sample. This approach was shown to have an elevating effect on the dry weight liver iron calculation compared to processing fresh tissue samples [29]. Another limitation is that we did not perform multipeak fat-correction on complex data [10]. This was not feasible with only magnitude data available. Comparison to other literature is further hampered by the use of different image acquisition and postprocessing protocols, which directly influence the calibration curves between the reference standard and the index test. We have opted to compare our findings to calibration curves obtained with similar postprocessing protocols.
ROC-analyses showed that R2 and ferritin have the highest diagnostic accuracy to identify increased R2* (≥44 Hz). Both ferritin (≥524 µg/L) and R2 (≥18.3 Hz) had positive predictive values of 100%, but the wide distribution of ferritin levels for R2* ≥ 44 Hz indicates that it cannot be used confidently to follow up treatment or to accurately determine the LIC. In contrast, R2 shows a different picture with a close distribution around the regression line. In addition, ferritin lacks the spatial information that MRI provides, which allows segmental LIC measurement and follow-up.
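How a cut-off on a surrogate marker translates into predictive values can be sketched as follows. The subject values below are invented solely for illustration and are not data from this study; only the two thresholds (ferritin ≥524 µg/L, R2* ≥44 Hz) are taken from the text, and the function name is hypothetical.

```python
def diagnostic_metrics(marker_values, reference_positive, cutoff):
    """Sensitivity, specificity, PPV and NPV of `marker >= cutoff`
    against a boolean reference standard."""
    tp = fp = tn = fn = 0
    for value, is_positive in zip(marker_values, reference_positive):
        predicted = value >= cutoff
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif not predicted and is_positive:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Invented example values for eight hypothetical subjects.
    ferritin_ug_per_l = [120, 300, 524, 900, 2500, 450, 80, 1500]
    r2star_hz = [30, 38, 50, 70, 200, 60, 35, 120]
    elevated = [r >= 44 for r in r2star_hz]   # reference: increased R2*
    print(diagnostic_metrics(ferritin_ug_per_l, elevated, cutoff=524))
```

In this invented example the cut-off yields no false positives, so the positive predictive value is 100% while one subject with increased R2* is still missed; a perfect positive predictive value therefore says nothing about how tightly the marker tracks LIC.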
R2 datasets were missing (i.e., not scanned) in 42/114 (37%) subjects. As R2 is part of our routine scan protocol, this illustrates that the long and artifact-prone R2 series is the first to be skipped by the radiographer. This makes the R2 series less suited as first choice for LIC measurement.
Our results favor the use of R2* measurements for daily clinical practice with the use of an exponential + Rician noise fit method to save time in analysis. The recommendation to (only) use R2* comes with cautions. It requires careful consideration of scan parameters, which should be kept equal for all measurements. Ideally, routine quality control with phantom testing should be performed.
In conclusion, R2* can be obtained in a single breath-hold with excellent success rates, high interobserver agreement, and the ability to detect changes over a wide range of LIC values, and it is available from all major vendors without additional per-scan costs. It is therefore our first choice for LIC measurement.