Comparison of the PI-RADS 2.1 scoring system to PI-RADS 2.0: Impact on diagnostic accuracy and inter-reader agreement

  • Andreas M. Hötker,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    Andreas.Hoetker@usz.ch

    Affiliation Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zurich, Switzerland

  • Christian Blüthgen,

    Roles Data curation, Writing – original draft, Writing – review & editing

    Affiliation Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zurich, Switzerland

  • Niels J. Rupp,

    Roles Conceptualization, Data curation, Writing – original draft, Writing – review & editing

    Affiliation Department of Pathology and Molecular Pathology, University Hospital Zurich, Zurich, Switzerland

  • Aurelia F. Schneider,

    Roles Conceptualization, Writing – original draft, Writing – review & editing

    Affiliation Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zurich, Switzerland

  • Daniel Eberli,

    Roles Conceptualization, Writing – original draft, Writing – review & editing

    Affiliation Department of Urology, University Hospital Zurich, Zurich, Switzerland

  • Olivio F. Donati

    Roles Conceptualization, Data curation, Formal analysis, Project administration, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zurich, Switzerland

Abstract

Purpose

To assess the value of the PI-RADS 2.1 scoring system in the detection of prostate cancer on multiparametric MRI in comparison to the standard PI-RADS 2.0 system and to assess its inter-reader variability.

Materials and methods

This IRB-approved study included 229 patients undergoing multiparametric prostate MRI prior to MRI-guided TRUS-based biopsy, who were retrospectively recruited from our prospectively maintained institutional database. Two readers with high (reader 1, 6 years) and low (reader 2, 2 years) levels of expertise identified the lesion with the highest PI-RADS score for both version 2.0 and 2.1 in each patient. Inter-reader agreement was estimated, and diagnostic accuracy analysis was performed.

Results

Inter-reader agreement on PI-RADS scores was fair for both version 2.0 (kappa: 0.57) and 2.1 (kappa: 0.51). Detection rates for prostate cancer (PCa) and clinically significant prostate cancer (csPCa) were almost identical for both PI-RADS versions and higher for the more experienced reader (AUC, Reader 1: PCa, 0.881–0.887, csPCa, 0.874–0.879; Reader 2: PCa, 0.765, csPCa, 0.746–0.747; both p > 0.05), when using a PI-RADS score of either ≥ 4 or ≥ 3 as the indicator of positivity for cancer.

Conclusions

The new PI-RADS 2.1 scoring system showed diagnostic performance and inter-reader variability comparable to those of version 2.0. The changes introduced in version 2.1 seem to take effect in only a very small number of patients.

Introduction

Multiparametric prostate MRI is now part of the standard clinical work-up for patients with elevated PSA at many institutions, as it has been shown to improve detection rates of clinically significant prostate cancer in patients who subsequently undergo targeted biopsy, with fewer biopsy cores necessary [1–6]. In clinical routine, the PI-RADS version 2.0 scoring system [7] has been the most common approach to identifying and scoring suspicious lesions, as it offers much-needed standardization of reports and a structured way of assessing lesions, and has been broadly validated [8, 9].

Recently, an updated version of the PI-RADS guidelines, version 2.1, has been published [10], which addresses various inconsistencies and issues that have been identified in studies and through increased experience over the years of use [11]. In addition to clarifying technical aspects, the revised guidelines introduce subtle changes to the scoring of indeterminate lesions in the transitional zone (TZ) and an update to the scoring of lesions on diffusion-weighted imaging (DWI), seeking to reduce the number of lesions scored “indeterminate” and thus further increase the diagnostic accuracy of prostate MRI. This is of particular importance, as indeterminate lesions pose a clinical challenge regarding patient management and further course of action (e.g. whether or not a biopsy is required in these patients). A higher precision of the PI-RADS 2.1 guidelines could therefore lead to a reduction in unnecessary biopsies. However, missing clinically significant cancers that could affect patient outcome must be avoided.

The purpose of this retrospective analysis was therefore to assess the value of the new PI-RADS 2.1 scoring system in the detection of prostate cancer and to compare it to version 2.0.

Materials and methods

Patients and reference standard

This study was approved by the institutional review board (Cantonal Ethics Commission Zurich), and the requirement for study-specific informed consent was waived. A retrospective search was performed on our prospectively maintained institutional database from 01/2015–12/2017 for consecutive patients undergoing multiparametric prostate MRI followed by transperineal template saturation biopsy. This initial search yielded 267 patients. Of those, patients who had not signed a general consent to share their data for any research question or who had withdrawn consent to participate in the study (n = 27), or whose scans demonstrated severe motion or susceptibility artifacts (n = 11), were excluded. The final patient cohort therefore consisted of 229 men (mean age: 63.1, range: 46–79 years), with a mean PSA of 8.2 μg/L (range: 0.81–100 μg/L). The mean time between MRI and biopsy was 42.3 days (range: 0–208 days). All clinical information was collected from our hospital information system. Pre-biopsy PSA values were not available in 4 patients.

Transperineal template saturation biopsy served as the reference standard and was carried out by board-certified urologists. Cores were taken every 5 mm throughout the prostate up to a total of 40 cores. If a lesion suspicious for tumor (PI-RADS score ≥ 3) had been identified on prior mpMRI, three additional targeted biopsies were taken from this area. All histopathological specimens were evaluated by dedicated genitourinary pathologists. Clinically significant prostate cancer was defined as a Gleason score of ≥ 3 + 4.

Of note, the patients included in this study have been part of earlier investigations [12]. However, these studies did not investigate the value of the PI-RADS scoring system version 2.1.

MRI and image analysis

All MRI scans were acquired on scanners manufactured by Siemens (Siemens Skyra, Siemens Healthineers, Erlangen, Germany) at a field strength of 3 Tesla using an 18-channel phased-array receiver coil. In 68 patients, an additional balloon-covered expandable endorectal coil (Medrad, Warrendale, USA) was used. The MRI protocol consisted of T2-weighted turbo spin-echo sequences covering the prostate and the seminal vesicles (transverse, sagittal and coronal orientation) and a transverse diffusion-weighted sequence with three b-values (100, 600 and 1000 s/mm²). A high b-value image (b = 1400 s/mm²) was calculated. Dynamic contrast-enhanced MR images were obtained in transverse orientation with a temporal resolution of ≤ 8 s. Gadoterate meglumine (Dotarem, Guerbet, Darmstadt, Germany) was used as the contrast agent at a dose of 0.1 mmol/kg of body weight. The MR protocol was in accordance with the general recommendations published in the PI-RADS guidelines [7].

Two readers, a board-certified radiologist (initials blinded for review) with > 5 years of experience in prostate MRI and a radiology resident with 2 years of experience (initials blinded for review), separately reviewed all scans while blinded to all clinical and histopathological information. Each reader identified the lesion with the highest PI-RADS score on a per-patient basis for the PI-RADS 2.0 [7] and 2.1 [10] scoring systems individually. No wash-out period was introduced between the PI-RADS 2.1 and 2.0 readings, to avoid introducing intra-reader variability as a potential bias.

Statistical analysis

All statistical analyses were performed in SPSS (IBM Inc., Armonk, USA) and R version 2.13 (The R Foundation for Statistical Computing). Continuous variables were expressed as medians and ranges. Categorical variables were expressed as counts and percentages. Inter- and intra-reader agreement was assessed using weighted Cohen’s kappa and was interpreted as follows: excellent agreement > 0.75, good agreement 0.59–0.75, fair agreement 0.40–0.58, poor agreement < 0.40. Diagnostic accuracy was assessed by the area under the curve (AUC) of a receiver operating characteristic (ROC) analysis for the detection of both prostate cancer and clinically significant prostate cancer (defined as prostate cancer with a highest Gleason score ≥ 3 + 4). ROC curves were compared according to the methodology laid out by DeLong et al. to test for statistical significance [13]. A test result with a p-value < 0.05 was considered statistically significant.
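The agreement measure described above can be illustrated with a minimal sketch. It assumes linear disagreement weights (the text does not state whether linear or quadratic weights were used), and the function names `weighted_kappa` and `interpret_kappa` are hypothetical. Values falling in the unspecified gap between 0.58 and 0.59 are assigned to “fair” here, which is likewise an assumption.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    `categories` lists the ordinal levels in ascending order, e.g. [1, 2, 3, 4, 5].
    Linear weights are an assumption; the study does not name the scheme.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    n = len(ratings_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    observed = Counter(zip(ratings_a, ratings_b))
    marg_a = Counter(ratings_a)
    marg_b = Counter(ratings_b)
    observed_dis = 0.0  # observed weighted disagreement
    expected_dis = 0.0  # weighted disagreement expected by chance
    for a in categories:
        for b in categories:
            w = abs(index[a] - index[b]) / (k - 1)
            observed_dis += w * observed[(a, b)] / n
            expected_dis += w * (marg_a[a] / n) * (marg_b[b] / n)
    if expected_dis == 0.0:
        return float("nan")  # undefined when both raters use one identical level
    return 1.0 - observed_dis / expected_dis


def interpret_kappa(kappa):
    """Map kappa to the agreement bands quoted in the Statistical analysis section."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.59:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"
```

Applied to the reported values, `interpret_kappa(0.57)` and `interpret_kappa(0.51)` both fall into the “fair” band.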

Analyses were performed using PI-RADS scores of both ≥ 4 and ≥ 3 as thresholds to indicate positivity for cancer.
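How a score threshold turns ordinal PI-RADS scores into sensitivity and specificity can be sketched as follows. The function name `sensitivity_specificity` and the eight example patients are hypothetical and are not study data.

```python
def sensitivity_specificity(scores, has_cancer, threshold):
    """Sensitivity and specificity when `score >= threshold` counts as positive."""
    pairs = list(zip(scores, has_cancer))
    tp = sum(1 for s, c in pairs if c and s >= threshold)
    fn = sum(1 for s, c in pairs if c and s < threshold)
    tn = sum(1 for s, c in pairs if not c and s < threshold)
    fp = sum(1 for s, c in pairs if not c and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without cancer")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical per-patient index-lesion scores and biopsy outcomes (not study data).
example_scores = [5, 4, 3, 2, 4, 3, 2, 1]
example_cancer = [True, True, True, False, False, False, False, False]

for threshold in (4, 3):
    sens, spec = sensitivity_specificity(example_scores, example_cancer, threshold)
    print(f"PI-RADS >= {threshold}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Lowering the threshold from 4 to 3 can only keep or raise sensitivity and keep or lower specificity, which is why both cut-offs are reported.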

Results

Patient and tumor characteristics

The numbers of patients by highest Gleason score found on histopathological examination of the biopsy cores were as follows: 26 with 3+3 (11.4%), 68 with 3+4 (29.7%), 31 with 4+3 (13.5%), 11 with 4+4 (4.8%), 10 with 4+5 (4.4%), 1 with 5+4 (0.4%) and 1 with 5+5 (0.4%).
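As a cross-check of the percentages above, the following sketch recomputes them from the reported counts and the cohort size of 229. The totals for any cancer and for clinically significant cancer (Gleason score ≥ 3 + 4) are derived here and are not stated in the text; the variable and function names are hypothetical.

```python
COHORT_SIZE = 229

# Reported counts keyed by (primary, secondary) Gleason pattern.
gleason_counts = {
    (3, 3): 26,
    (3, 4): 68,
    (4, 3): 31,
    (4, 4): 11,
    (4, 5): 10,
    (5, 4): 1,
    (5, 5): 1,
}


def percent_of_cohort(count, cohort_size=COHORT_SIZE):
    """Percentage of the full cohort, rounded to one decimal as in the text."""
    return round(100.0 * count / cohort_size, 1)


# Derived totals: any cancer, and Gleason sum of 7 or more (i.e. >= 3 + 4).
total_pca = sum(gleason_counts.values())
total_cspca = sum(n for (p, s), n in gleason_counts.items() if p + s >= 7)

for (p, s), n in gleason_counts.items():
    print(f"Gleason {p}+{s}: {n} patients ({percent_of_cohort(n)}%)")
print(f"Any PCa: {total_pca}, csPCa: {total_cspca}")
```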

Inter-reader agreement

Inter-reader agreement between reader 1 and 2 for PI-RADS 2.0 scores was found to be fair (kappa: 0.57, 95% CI: 0.49–0.66) and slightly higher than the agreement between reader 1 and 2 on PI-RADS 2.1 scores (kappa: 0.51, 95% CI: 0.44–0.59).

Detection of prostate cancer and clinically significant prostate cancer

Detailed information on the distribution of PI-RADS scores for both version 2.0 and 2.1, the detected prostate cancers (PCa) and clinically significant prostate cancers (csPCa), and the associated sensitivity and specificity are given in Tables 1 and 2 and Fig 1.

Fig 1. Receiver operating characteristic (ROC) analysis with PI-RADS version 2.0 and 2.1 for both readers. (A) Detection of prostate cancer. (B) Detection of clinically significant prostate cancer (Gleason score ≥ 3 + 4).

Table 1. PI-RADS scores version 2.0 and 2.1 for both readers and the number of detected prostate cancers (PCa) and clinically significant prostate cancers (csPCa).

https://doi.org/10.1371/journal.pone.0239975.t001

Table 2. Sensitivity and Specificity of PI-RADS 2.0 and PI-RADS 2.1 scores for both readers in the detection of prostate cancer and clinically significant prostate cancer (Gleason score ≥ 3+4).

A PI-RADS score ≥ 4 was deemed to indicate positivity for cancer.

https://doi.org/10.1371/journal.pone.0239975.t002

The PI-RADS 2.1 scoring system performed almost identically to version 2.0 at both thresholds for indicating positivity for prostate cancer (PI-RADS score of 4–5 or 3–5). AUCs were marginally higher for PI-RADS 2.1 (PCa: reader 1: 0.887, reader 2: 0.765; csPCa: reader 1: 0.879, reader 2: 0.747) than for 2.0 (PCa: reader 1: 0.881, reader 2: 0.765; csPCa: reader 1: 0.874, reader 2: 0.746, see Table 2), but the difference between PI-RADS 2.1 and 2.0 was not statistically significant for either reader (PCa: reader 1: p = 0.34, reader 2: p = 0.86; csPCa: reader 1: p = 0.17, reader 2: p = 0.82). No lesion in our study demonstrated the imaging features newly described in the recent update of PI-RADS that would warrant a higher overall score (e.g. marked hypointensity on ADC or marked hyperintensity on high b-value DWI, but not both, or a lesion with a TZ score of 2 and a DWI score of ≥ 4).
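For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes the area under the ROC curve for ordinal scores as the probability that a randomly chosen patient with cancer has a higher score than one without, with ties counted as one half. It does not implement the DeLong comparison used in the study; the function name `auc_from_scores` and the example data are hypothetical.

```python
def auc_from_scores(scores, has_cancer):
    """AUC as the Mann-Whitney probability P(score_pos > score_neg), ties = 0.5."""
    positives = [s for s, c in zip(scores, has_cancer) if c]
    negatives = [s for s, c in zip(scores, has_cancer) if not c]
    if not positives or not negatives:
        raise ValueError("need at least one patient with and one without cancer")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores and biopsy outcomes (not study data).
example_scores = [5, 4, 3, 2, 4, 3, 2, 1]
example_cancer = [True, True, True, False, False, False, False, False]
print(f"AUC: {auc_from_scores(example_scores, example_cancer):.3f}")
```

Because both PI-RADS versions are scored on the same patients, the two AUCs per reader are correlated, which is why a paired method such as that of DeLong et al. is needed for the significance test.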

Discussion

Multiparametric prostate MRI is part of the clinical pathway of patients with elevated PSA in many centers, as its value in the detection and classification of prostate cancer is supported by a large body of evidence [1, 2, 4, 12, 14]. Despite some minor limitations and inconsistencies becoming apparent after implementation, the PI-RADS 2.0 scoring system has been broadly adopted in the radiological and urological communities and has been extensively validated to allow for the reliable identification of csPCa [8].

The recently published PI-RADS 2.1 guidelines [10] try to remedy some of the limitations identified [11], for example by clarifying technical aspects of prostate MRI and the reporting of central zone lesions or lesions arising from the anterior fibromuscular stroma. However, the new guidelines also introduce subtle changes to the scoring of both transitional zone tumors and lesions on DWI in general, which are hoped to improve the system’s accuracy and reliability. The criteria for DWI scores 2 and 3 have been revised, with a score of 2 being assigned to lesions that are “linear/wedge-shaped hypointense on ADC and/or linear/wedge-shaped hyperintense on high b-value DWI”, whereas a score of 3 requires “focal hypointense on ADC and/or focal hyperintense on high b-value DWI” and a lesion may be “markedly hypointense on ADC or markedly hyperintense on high b-value DWI, but not both”. In TZ tumors, a lesion with a newly defined T2 score of 2 and a DWI score of 4 or higher would now be assigned an overall score of 3 (instead of 2). However, lesions fulfilling these criteria seem to be rare in clinical routine and we did not see any in our study cohort: both readers scored lesions nearly identically when using PI-RADS version 2.0 and 2.1, and the small increase in AUC seen in both readers is probably not clinically relevant. Research data comparing PI-RADS 2.0 and 2.1 are still sparse, with a few reports indicating a slight improvement in the detection of cancer in the transitional zone [15, 16]. However, a recent study by Moreira et al. aligns with our results and did not see “significant changes in the number of positive and negative MRI results” and “expected low influence in clinical management” [17]. We did see an effect of reader experience, with the experienced reader reaching higher levels of sensitivity and specificity than the less experienced reader, even when using the same PI-RADS criteria for the scoring of lesions. This indicates that even when descriptive terms are defined more precisely, their interpretation remains subjective to a certain extent and differs among radiologists. A possible means of further reducing differing interpretations of defined descriptions of imaging features may be to introduce quantitative measures.
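The one transitional zone rule change described in this paragraph can be written down as a small sketch. It covers only the case stated in the text (a TZ lesion with a T2 score of 2) and is not a full implementation of either PI-RADS version; the function name `tz_overall_for_t2_score_2` is hypothetical.

```python
def tz_overall_for_t2_score_2(dwi_score, version):
    """Overall score for a transitional zone lesion whose T2 score is 2.

    Encodes only the change described in the text: under version 2.1 a
    DWI score of 4 or higher raises the overall score from 2 to 3.
    """
    if version not in ("2.0", "2.1"):
        raise ValueError("version must be '2.0' or '2.1'")
    if dwi_score not in (1, 2, 3, 4, 5):
        raise ValueError("DWI score must be an integer from 1 to 5")
    if version == "2.1" and dwi_score >= 4:
        return 3
    return 2


for dwi in (3, 4, 5):
    old = tz_overall_for_t2_score_2(dwi, "2.0")
    new = tz_overall_for_t2_score_2(dwi, "2.1")
    print(f"T2 = 2, DWI = {dwi}: v2.0 -> {old}, v2.1 -> {new}")
```

Only lesions combining a T2 score of 2 with a DWI score of 4 or 5 change category, which is consistent with the observation that the update affected very few patients.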

Another aim of the new guidelines is the “improvement of inter-reader variability”, as reproducibility of findings and scores represents a crucial requirement for any scoring system in clinical routine. However, we did not see an increase in agreement between the two readers when moving from 2.0 to 2.1, but rather a small decrease, which is most likely not clinically significant. This decrease may be due to the readers being less familiar with the new scoring system than with PI-RADS 2.0; in any case, we could not demonstrate an improvement in inter-reader agreement or reproducibility.

The recent changes introduced to the PI-RADS scoring system [10] certainly clarify certain technical matters and aspects of reporting and may help in scoring a few atypical lesions, but their influence on the majority of “typical” suspicious lesions encountered in clinical routine seems to be small. Nevertheless, the performance of PI-RADS in the detection of clinically significant cancer is good and improves with experience, which highlights the importance of training and structured education in prostate MRI [18]. To further improve detection rates, the use of quantitative imaging parameters may be an option [5, 19, 20].

Our study has limitations. First, we scored only one lesion per patient (the “index lesion”). This reduces the number of indeterminate lesions (whenever another lesion with a higher score is present), so the results may differ when every lesion in a patient is scored, although clinical management is most commonly based on the Gleason score of the dominant lesion. Ideally, the use of pathological maps would allow for direct radiological-pathological correlation in future studies. Second, this study was limited by its retrospective design and, despite including a relatively high number of patients, may still be limited by the size of the patient cohort, since the changes introduced in PI-RADS 2.1 affect only a very small number of lesions.

In conclusion, we demonstrated that the performance of PI-RADS 2.1 is comparable to that of version 2.0 in the detection of prostate cancer and clinically significant prostate cancer, and we could not show an improvement in inter-reader agreement. Future revisions of the PI-RADS guidelines may need to take quantitative measurements into account in order to increase the reproducibility of PI-RADS scores.

References

  1. Kasivisvanathan V, Rannikko AS, Borghi M, Panebianco V, Mynderse LA, Vaarala MH, et al. MRI-Targeted or Standard Biopsy for Prostate-Cancer Diagnosis. N Engl J Med. 2018; 378: 1767–1777. pmid:29552975
  2. Mehralivand S, Bednarova S, Shih JH, Mertan FV, Gaur S, Merino MJ, et al. Prospective Evaluation of PI-RADS Version 2 Using the International Society of Urological Pathology Prostate Cancer Grade Group System. J Urol. 2017; 198: 583–590. pmid:28373133
  3. Siddiqui MM, Rais-Bahrami S, Turkbey B, George AK, Rothwax J, Shakir N, et al. Comparison of MR/ultrasound fusion-guided biopsy with ultrasound-guided biopsy for the diagnosis of prostate cancer. JAMA. 2015; 313: 390–397. pmid:25626035
  4. Ahmed HU, El-Shater Bosaily A, Brown LC, Gabe R, Kaplan R, Parmar MK, et al. Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet. 2017; 389: 815–822. pmid:28110982
  5. Panda A, O'Connor G, Lo WC, Jiang Y, Margevicius S, Schluchter M, et al. Targeted Biopsy Validation of Peripheral Zone Prostate Cancer Characterization With Magnetic Resonance Fingerprinting and Diffusion Mapping. Invest Radiol. 2019; 54: 485–493. pmid:30985480
  6. Weiss J, Martirosian P, Notohamiprodjo M, Kaufmann S, Othman AE, Grosse U, et al. Implementation of a 5-Minute Magnetic Resonance Imaging Screening Protocol for Prostate Cancer in Men With Elevated Prostate-Specific Antigen Before Biopsy. Invest Radiol. 2018; 53: 186–190. pmid:29077588
  7. Weinreb JC, Barentsz JO, Choyke PL, Cornud F, Haider MA, Macura KJ, et al. PI-RADS Prostate Imaging—Reporting and Data System: 2015, Version 2. Eur Urol. 2016; 69: 16–40. pmid:26427566
  8. Vargas HA, Hotker AM, Goldman DA, Moskowitz CS, Gondo T, Matsumoto K, et al. Updated prostate imaging reporting and data system (PIRADS v2) recommendations for the detection of clinically significant prostate cancer using multiparametric MRI: critical evaluation using whole-mount pathology as standard of reference. Eur Radiol. 2016; 26: 1606–1612. pmid:26396111
  9. Hansen NL, Barrett T, Kesch C, Pepdjonovic L, Bonekamp D, O'Sullivan R, et al. Multicentre evaluation of magnetic resonance imaging supported transperineal prostate biopsy in biopsy-naive men with suspicion of prostate cancer. BJU Int. 2018; 122: 40–49. pmid:29024425
  10. Turkbey B, Rosenkrantz AB, Haider MA, Padhani AR, Villeirs G, Macura KJ, et al. Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. Eur Urol. 2019. pmid:30898406
  11. Padhani AR, Weinreb J, Rosenkrantz AB, Villeirs G, Turkbey B, Barentsz J. Prostate Imaging-Reporting and Data System Steering Committee: PI-RADS v2 Status Update and Future Directions. Eur Urol. 2019; 75: 385–396. pmid:29908876
  12. Mortezavi A, Marzendorfer O, Donati OF, Rizzi G, Rupp NJ, Wettstein MS, et al. Diagnostic Accuracy of Multiparametric Magnetic Resonance Imaging and Fusion Guided Targeted Biopsy Evaluated by Transperineal Template Saturation Prostate Biopsy for the Detection and Characterization of Prostate Cancer. J Urol. 2018; 200: 309–318. pmid:29474846
  13. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988; 44: 837–845. pmid:3203132
  14. Scheenen TWJ, Rosenkrantz AB, Haider MA, Futterer JJ. Multiparametric Magnetic Resonance Imaging in Prostate Cancer Management: Current Status and Future Perspectives. Invest Radiol. 2015; 50: 594–600. pmid:25974203
  15. Lim CS, Abreu-Gomez J, Carrion I, Schieda N. Prevalence of prostate cancer in PI-RADS version 2.1 transition zone 'atypical nodules' upgraded by abnormal diffusion weighted imaging: correlation with MRI-directed TRUS-guided targeted biopsy. AJR Am J Roentgenol. 2020. pmid:32755208
  16. Wei C-G, Zhang Y-Y, Pan P, Chen T, Yu H-C, Dai G-C, et al. Diagnostic Accuracy and Inter-observer Agreement of PI-RADS Version 2 and Version 2.1 for the Detection of Transition Zone Prostate Cancers. AJR Am J Roentgenol. 2020. pmid:32755220
  17. Linhares Moreira AS, Visschere P de, van Praet C, Villeirs G. How does PI-RADS v2.1 impact patient classification? A head-to-head comparison between PI-RADS v2.0 and v2.1. Acta Radiol. 2020: 284185120941831. pmid:32702998
  18. Greer MD, Brown AM, Shih JH, Summers RM, Marko J, Law YM, et al. Accuracy and agreement of PIRADSv2 for prostate cancer mpMRI: A multireader study. J Magn Reson Imaging. 2017; 45: 579–585. pmid:27391860
  19. Maas MC, Litjens GJS, Wright AJ, Attenberger UI, Haider MA, Helbich TH, et al. A Single-Arm, Multicenter Validation Study of Prostate Cancer Localization and Aggressiveness With a Quantitative Multiparametric Magnetic Resonance Imaging Approach. Invest Radiol. 2019; 54: 437–447. pmid:30946180
  20. Vos EK, Kobus T, Litjens GJS, Hambrock T, Hulsbergen-van de Kaa CA, Barentsz JO, et al. Multiparametric Magnetic Resonance Imaging for Discriminating Low-Grade From High-Grade Prostate Cancer. Invest Radiol. 2015; 50: 490–497. pmid:25867656