Published in: Journal of Medical Systems 1/2024

1 December 2024 | Original Paper

Leveraging Large Language Models for Clinical Abbreviation Disambiguation

Authors: Manda Hosseini, Mandana Hosseini, Reza Javidan

Abstract

Clinical abbreviation disambiguation is a crucial task in the biomedical domain, as the accurate identification of the intended meanings or expansions of abbreviations in clinical texts is vital for medical information retrieval and analysis. Existing approaches have shown promising results, but challenges such as limited instances and ambiguous interpretations persist. In this paper, we propose an approach to address these challenges and enhance the performance of clinical abbreviation disambiguation. Our objective is to leverage the power of Large Language Models (LLMs) and employ a Generative Model (GM) to augment the dataset with contextually relevant instances, enabling more accurate disambiguation across diverse clinical contexts. We integrate the contextual understanding of LLMs, represented by BlueBERT and Transformers, with data augmentation using a Generative Model, called Biomedical Generative Pre-trained Transformer (BIOGPT), that is pretrained on an extensive corpus of biomedical literature to capture the intricacies of medical terminology and context. By providing BIOGPT with relevant medical terms and sense information, we generate diverse instances of clinical text that accurately represent the intended meanings of abbreviations. We evaluate our approach on the widely recognized CASI dataset, carefully partitioned into training, validation, and test sets. The incorporation of data augmentation with the GM improves the model’s performance, particularly for senses with limited instances, effectively addressing dataset imbalance and challenges posed by similar concepts. The results demonstrate the efficacy of our proposed method, showcasing the significance of LLMs and generative techniques in clinical abbreviation disambiguation. Our model achieves good accuracy on the test set, outperforming previous methods.
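
The augmentation idea in the abstract (generating extra instances for senses with few examples) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the prompt template, the balancing rule (top up every sense to the size of the largest one), and the toy labels are all assumptions made for illustration. The call to the generative model is left out because the abstract does not specify it.

```python
from collections import Counter


def augmentation_plan(sense_labels, target=None):
    """Count how many synthetic instances each sense would need.

    Assumed balancing rule: raise every sense to `target` instances,
    defaulting to the size of the most frequent sense.
    """
    counts = Counter(sense_labels)
    if target is None:
        target = max(counts.values())
    return {sense: max(0, target - n) for sense, n in counts.items()}


def build_prompt(abbreviation, sense):
    """Build a seed prompt for a generative model.

    The template is hypothetical; the abstract only says the model is
    given relevant medical terms and sense information.
    """
    return f"In this clinical note, {abbreviation} means {sense}. Note:"


# Toy labels for one abbreviation (illustrative, not taken from CASI).
labels = ["rheumatoid arthritis"] * 5 + ["right atrium"] * 2 + ["room air"]

plan = augmentation_plan(labels)
prompts = [
    build_prompt("RA", sense)
    for sense, needed in plan.items()
    for _ in range(needed)
]
```

Each prompt in `prompts` would then be passed to the generative model, and the generated text added to the training split only, so that the validation and test sets stay free of synthetic instances.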
Metadata
Title
Leveraging Large Language Models for Clinical Abbreviation Disambiguation
Authors
Manda Hosseini
Mandana Hosseini
Reza Javidan
Publication date
1 December 2024
Publisher
Springer US
Published in
Journal of Medical Systems / Issue 1/2024
Print ISSN: 0148-5598
Electronic ISSN: 1573-689X
DOI
https://doi.org/10.1007/s10916-024-02049-z