11 August 2023 | Neuro

Natural language processing to predict isocitrate dehydrogenase genotype in diffuse glioma using MR radiology reports

Authors: Minjae Kim, Kai Tzu-iunn Ong, Seonah Choi, Jinyoung Yeo, Sooyon Kim, Kyunghwa Han, Ji Eun Park, Ho Sung Kim, Yoon Seong Choi, Sung Soo Ahn, Jinna Kim, Seung-Koo Lee, Beomseok Sohn

Published in: European Radiology | Issue 11/2023

Abstract

Objectives

To evaluate the performance of natural language processing (NLP) models to predict isocitrate dehydrogenase (IDH) mutation status in diffuse glioma using routine MR radiology reports.

Materials and methods

This retrospective, multi-center study included consecutive patients with diffuse glioma with known IDH mutation status from May 2009 to November 2021 whose initial MR radiology report was available prior to pathologic diagnosis. Five NLP models (long short-term memory [LSTM], bidirectional LSTM, bidirectional encoder representations from transformers [BERT], BERT graph convolutional network [GCN], BioBERT) were trained, and area under the receiver operating characteristic curve (AUC) was assessed to validate prediction of IDH mutation status in the internal and external validation sets. The performance of the best performing NLP model was compared with that of the human readers.
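As an illustration of the evaluation metric named above, the following is a minimal sketch of how an AUC can be computed from model scores. The abstract does not say which software or estimator the authors used, so the function name `roc_auc`, the label coding (1 = IDH-mutant, 0 = IDH-wildtype), and the toy data are assumptions made only for this example. The sketch uses the pairwise definition of AUC: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study:
# 1 = IDH-mutant, 0 = IDH-wildtype; scores are predicted probabilities.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, 8 of the 9 positive/negative pairs are ranked correctly. The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of this size; a rank-based formulation would give the same value more efficiently.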

Results

A total of 1427 patients (mean age ± standard deviation, 54 ± 15; 779 men, 54.6%) were included: 720 in the training set, 180 in the internal validation set, and 527 in the external validation set. In the external validation set, BERT GCN showed the highest performance in predicting IDH mutation status (AUC 0.85, 95% CI 0.81–0.89), which was higher than that of LSTM (AUC 0.77, 95% CI 0.72–0.81; p = .003) and BioBERT (AUC 0.81, 95% CI 0.76–0.85; p = .03). The performance of BERT GCN was also higher than that of a neuroradiologist (AUC 0.80, 95% CI 0.76–0.84; p = .005) and a neurosurgeon (AUC 0.79, 95% CI 0.76–0.84; p = .04).

Conclusion

BERT GCN was externally validated to predict IDH mutation status in patients with diffuse glioma using routine MR radiology reports, with performance superior or at least comparable to that of human readers.

Clinical relevance statement

Natural language processing may be used to extract relevant information from routine radiology reports to predict cancer genotype and provide prognostic information that may aid in guiding treatment strategy and enabling personalized medicine.

Key Points

• A transformer-based natural language processing (NLP) model predicted isocitrate dehydrogenase mutation status in diffuse glioma with an AUC of 0.85 in the external validation set.
• The best NLP models were superior or at least comparable to human readers in both internal and external validation sets.
• Transformer-based models showed higher performance than conventional NLP models such as long short-term memory.
Metadata
Publication date
11 August 2023
Publisher
Springer Berlin Heidelberg
Published in
European Radiology / Issue 11/2023
Print ISSN: 0938-7994
Electronic ISSN: 1432-1084
DOI
https://doi.org/10.1007/s00330-023-10061-z
