Published in: Journal of Digital Imaging 4/2022

03.03.2022

Effect of Training Data Volume on Performance of Convolutional Neural Network Pneumothorax Classifiers

Authors: Yee Liang Thian, Dian Wen Ng, James Thomas Patrick Decourcy Hallinan, Pooja Jagmohan, Soon Yiew Sia, Jalila Sayed Adnan Mohamed, Swee Tian Quek, Mengling Feng

Published in: Journal of Imaging Informatics in Medicine | Issue 4/2022


Abstract

Large datasets with high-quality labels, which are required to train deep neural networks, are challenging to obtain in the radiology domain. This work investigates the effect of training dataset size on the performance of deep learning classifiers, focusing on chest radiograph pneumothorax detection as a proxy visual task in the radiology domain. Two open-source datasets (ChestX-ray14 and CheXpert) comprising 291,454 images were merged, and convolutional neural networks were trained with stepwise increases in training dataset size. Model iterations at each dataset volume were evaluated on an external test set of 525 emergency department chest radiographs. Learning curve analysis was performed to fit the observed AUCs for all models generated. For all three network architectures tested, model AUCs and accuracy increased rapidly from 2 × 10³ to 20 × 10³ training samples, with a more gradual increase until the maximum training dataset size of 291 × 10³ images. AUCs for models trained with the maximum tested dataset size of 291 × 10³ images were significantly higher than those for models trained with 20 × 10³ images: ResNet-50: AUC20k = 0.86, AUC291k = 0.95, p < 0.001; DenseNet-121: AUC20k = 0.85, AUC291k = 0.93, p < 0.001; EfficientNet: AUC20k = 0.92, AUC291k = 0.98, p < 0.001. Our study established learning curves describing the relationship between training dataset size and model performance for deep learning convolutional neural networks applied to a typical radiology binary classification task. These curves suggest a point of diminishing performance returns for increasing training data volumes, which algorithm developers should consider given the high costs of obtaining and labelling radiology data.
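The AUC values reported above can be put on a common scale with a small calculation. The sketch below is an illustration only and is not the learning curve analysis performed in the study: it takes the two reported AUCs per architecture (at 20 × 10³ and 291 × 10³ images) and computes the average AUC gain per doubling of the training set between those two sizes. The function names and the per-doubling measure are assumptions introduced here for illustration.

```python
import math

# AUCs reported in the abstract at 20 x 10^3 and 291 x 10^3 training images.
REPORTED_AUC = {
    "ResNet-50": (0.86, 0.95),
    "DenseNet-121": (0.85, 0.93),
    "EfficientNet": (0.92, 0.98),
}

N_SMALL = 20_000   # smaller training set size from the abstract
N_LARGE = 291_000  # maximum training set size (rounded as in the abstract)


def gain_per_doubling(auc_small, auc_large, n_small=N_SMALL, n_large=N_LARGE):
    """Average AUC gain per doubling of training set size (hypothetical measure)."""
    doublings = math.log2(n_large / n_small)
    return (auc_large - auc_small) / doublings


def summarize(aucs=REPORTED_AUC):
    """Return {architecture: (absolute AUC gain, AUC gain per doubling)}."""
    return {
        name: (round(large - small, 2), round(gain_per_doubling(small, large), 3))
        for name, (small, large) in aucs.items()
    }


if __name__ == "__main__":
    for name, (gain, per_doubling) in summarize().items():
        print(f"{name}: +{gain:.2f} AUC overall, about +{per_doubling:.3f} per doubling")
```

Going from 20 × 10³ to 291 × 10³ images is a little under four doublings, so each doubling in this range adds on the order of 0.02 AUC for the three architectures. This is one way to read the diminishing returns the abstract describes, under the assumption that averaging over a log scale is a fair summary.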
Metadata
Title
Effect of Training Data Volume on Performance of Convolutional Neural Network Pneumothorax Classifiers
Authors
Yee Liang Thian
Dian Wen Ng
James Thomas Patrick Decourcy Hallinan
Pooja Jagmohan
Soon Yiew Sia
Jalila Sayed Adnan Mohamed
Swee Tian Quek
Mengling Feng
Publication date
03.03.2022
Publisher
Springer International Publishing
Published in
Journal of Imaging Informatics in Medicine / Issue 4/2022
Print ISSN: 2948-2925
Electronic ISSN: 2948-2933
DOI
https://doi.org/10.1007/s10278-022-00594-y
