Published in: European Radiology 3/2019

7 September 2018 | Computed Tomography

Inter-observer variability of manual contour delineation of structures in CT

Authors: Leo Joskowicz, D. Cohen, N. Caplan, J. Sosna

Abstract

Purpose

To quantify the inter-observer variability of manual delineation of lesions and organ contours in CT to establish a reference standard for volumetric measurements for clinical decision making and for the evaluation of automatic segmentation algorithms.

Materials and methods

Eleven radiologists manually delineated 3193 contours of liver tumours (896), lung tumours (1085), kidney contours (434) and brain hematomas (497) on 490 slices of clinical CT scans. A comparative analysis of the delineations was then performed to quantify the inter-observer delineation variability with standard volume metrics and with new group-wise metrics for delineations produced by groups of observers.
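
The abstract does not define its volume metrics, so the following is only a minimal sketch of how a pairwise overlap difference between observers could be computed. It assumes each delineation is a 2D binary mask and uses the share of the union not covered by both observers (one minus the Jaccard index) as the metric; the function names `voxels`, `overlap_difference` and `mean_pairwise_difference` are hypothetical and not taken from the study.

```python
from itertools import combinations


def voxels(mask):
    """Return the set of (row, col) positions marked 1 in a 2D binary mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def overlap_difference(a, b):
    """Percentage of the union of two delineations that is not shared by both.

    Assumed metric (1 - Jaccard index); the study's exact definition may differ.
    """
    union = a | b
    if not union:
        return 0.0  # two empty delineations agree trivially
    return 100.0 * len(union - (a & b)) / len(union)


def mean_pairwise_difference(delineations):
    """Average the overlap difference over every pair of observers."""
    pairs = list(combinations(delineations, 2))
    if not pairs:
        raise ValueError("need at least two delineations")
    return sum(overlap_difference(a, b) for a, b in pairs) / len(pairs)


# Toy example: three observers outlining the same structure on one slice.
obs1 = voxels([[0, 1, 1, 0],
               [0, 1, 1, 0]])
obs2 = voxels([[0, 1, 1, 1],
               [0, 1, 1, 0]])
obs3 = voxels([[0, 0, 1, 0],
               [0, 1, 1, 0]])

print(overlap_difference(obs1, obs2))                 # 20.0
print(mean_pairwise_difference([obs1, obs2, obs3]))
```

In this toy case the three pairs differ by 20%, 25% and 40%, which illustrates how a single mean can hide a wide pairwise range.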

Results

The mean volume overlap variability values and ranges (in %) between the delineations of two observers were: liver tumours 17.8 [-5.8,+7.2]%, lung tumours 20.8 [-8.8,+10.2]%, kidney contours 8.8 [-0.8,+1.2]% and brain hematomas 18 [-6.0,+6.0]%. For any two randomly selected observers, the mean delineation volume overlap variability was 5–57%. The mean variability captured by groups of two, three and five observers was 37%, 53% and 72%, respectively; eight observers accounted for 75–94% of the total variability. For all cases, 38.5% of the delineation non-agreement was due to parts of the delineation of a single observer disagreeing with the others. No statistically significant difference in delineation variability was found between observers based on their expertise.
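
The group-wise figures above can be illustrated with a small sketch. It assumes, as a working definition not stated in the abstract, that the variability of a group is the region marked by some but not all of its observers (union minus intersection), and that the variability "captured" by k observers is the average share of the full group's region recovered by subsets of size k. The names `variability_region`, `captured_fraction` and `single_observer_share` are hypothetical.

```python
from itertools import combinations


def variability_region(group):
    """Pixels included by at least one, but not all, observers in the group."""
    return set.union(*group) - set.intersection(*group)


def captured_fraction(delineations, k):
    """Mean share of the total variability region seen by subsets of k observers."""
    total = variability_region(delineations)
    if not total:
        return 1.0  # no disagreement at all, so nothing is missed
    shares = [len(variability_region(sub)) / len(total)
              for sub in combinations(delineations, k)]
    return sum(shares) / len(shares)


def single_observer_share(delineations):
    """Share of disagreement pixels where exactly one observer differs from the rest."""
    total = variability_region(delineations)
    if not total:
        return 0.0
    n = len(delineations)
    lone = 0
    for pixel in total:
        votes = sum(pixel in d for d in delineations)
        # One observer alone includes the pixel, or one observer alone omits it.
        if votes == 1 or votes == n - 1:
            lone += 1
    return lone / len(total)


# Toy example: four observers, pixels given directly as (row, col) sets.
observers = [
    {(0, 0), (0, 1), (0, 2)},
    {(0, 0), (0, 1), (0, 2), (0, 3)},
    {(0, 0), (0, 1)},
    {(0, 0), (0, 1), (0, 2), (0, 3)},
]

for k in (2, 3, 4):
    print(k, round(captured_fraction(observers, k), 3))
print(single_observer_share(observers))  # 0.5
```

Under these assumptions, small groups recover only part of the disagreement region, which mirrors the pattern reported above; the numbers in the toy example are illustrative and unrelated to the study data.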

Conclusion

The variability in manual delineations for different structures and observers is large and spans a wide range across a variety of structures and pathologies. Two and even three observers may not be sufficient to establish the full range of inter-observer variability.

Key Points

This study quantifies the inter-observer variability of manual delineation of lesions and organ contours in CT.
The variability of manual delineations between two observers can be significant. Two and even three observers capture only a fraction of the full range of inter-observer variability observed in common practice.
Quantifying inter-observer manual delineation variability is necessary to establish a reference standard for radiologist training and evaluation and for the evaluation of automatic segmentation algorithms.
Metadata
Title: Inter-observer variability of manual contour delineation of structures in CT
Authors: Leo Joskowicz, D. Cohen, N. Caplan, J. Sosna
Publication date: 7 September 2018
Publisher: Springer Berlin Heidelberg
Published in: European Radiology, Issue 3/2019
Print ISSN: 0938-7994
Electronic ISSN: 1432-1084
DOI: https://doi.org/10.1007/s00330-018-5695-5