European Archives of Oto-Rhino-Laryngology

22 April 2024 | Short Communication

ChatGPT’s adherence to otolaryngology clinical practice guidelines

Authors: Idit Tessler, Amit Wolfovitz, Eran E. Alon, Nir A. Gecel, Nir Livneh, Eyal Zimlichman, Eyal Klang

Published in: European Archives of Oto-Rhino-Laryngology

Abstract

Objectives

Large language models, including ChatGPT, have the potential to transform the way we approach medical knowledge, yet accuracy in clinical topics is critical. Here we assessed ChatGPT’s performance in adhering to the American Academy of Otolaryngology-Head and Neck Surgery guidelines.

Methods

We presented ChatGPT with 24 clinical otolaryngology questions based on the guidelines of the American Academy of Otolaryngology-Head and Neck Surgery. Each question was submitted three times (N = 72 responses) to test the model’s consistency. Two otolaryngologists evaluated the responses for accuracy and relevance to the guidelines. Cohen’s Kappa was used to measure evaluator agreement, and Cronbach’s alpha was used to assess the consistency of ChatGPT’s responses.
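
As a rough illustration of the two statistics named above, the following Python sketch computes Cohen's kappa for two raters and Cronbach's alpha across repeated runs. The ratings, the 0 to 2 scoring scale, and the function names are hypothetical; the abstract does not state how scores were coded or which software the authors used, and treating each run as an "item" for Cronbach's alpha is an assumption here.

```python
from collections import Counter
from statistics import variance


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same responses."""
    n = len(rater_a)
    # Observed agreement: share of responses given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal frequencies per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)


def cronbach_alpha(runs):
    """Cronbach's alpha, treating each repeated run as one 'item'.

    `runs` is a list of k lists, each holding one score per question.
    """
    k = len(runs)
    # Sum of the sample variances of each run across questions.
    item_variance_sum = sum(variance(run) for run in runs)
    # Variance of the per-question totals across all runs.
    totals = [sum(scores) for scores in zip(*runs)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores (0 = contradicts guideline, 1 = partial, 2 = accurate).
rater_1 = [2, 2, 1, 0, 2, 1]
rater_2 = [2, 1, 1, 0, 2, 1]
print("kappa:", cohens_kappa(rater_1, rater_2))

# Hypothetical scores for the same six questions over three runs.
three_runs = [
    [2, 2, 1, 0, 2, 1],
    [2, 1, 1, 0, 2, 1],
    [2, 2, 1, 1, 2, 1],
]
print("alpha:", cronbach_alpha(three_runs))
```

Kappa corrects raw agreement for the agreement expected by chance, while alpha rises toward 1 as the runs rank the questions in the same way.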

Results

The study revealed mixed results: 59.7% (43/72) of ChatGPT’s responses were highly accurate, while only 2.8% (2/72) directly contradicted the guidelines. The model showed 100% accuracy in Head and Neck, but lower accuracy in Rhinology and Otology/Neurotology (66%), Laryngology (50%), and Pediatrics (8%). The model’s responses were consistent for 17 of the 24 questions (70.8%), with a Cronbach’s alpha of 0.87, indicating reasonable consistency across the three runs.
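
The proportions reported above can be checked with a few lines of Python. This is only an arithmetic sketch: the counts come from the abstract, while the `percent` helper and the dictionary labels are illustrative names, not part of the study.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as stated in the abstract: (numerator, denominator).
reported_counts = {
    "highly accurate responses": (43, 72),
    "responses contradicting guidelines": (2, 72),
    "questions answered consistently": (17, 24),
}

for label, (count, total) in reported_counts.items():
    print(f"{label}: {count}/{total} = {percent(count, total)}%")
```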

Conclusions

Using a guideline-based set of structured questions, ChatGPT demonstrates consistency but variable accuracy in otolaryngology. Its lower performance in some areas, especially Pediatrics, suggests that further rigorous evaluation is needed before considering real-world clinical use.

Title: ChatGPT’s adherence to otolaryngology clinical practice guidelines
Authors: Idit Tessler, Amit Wolfovitz, Eran E. Alon, Nir A. Gecel, Nir Livneh, Eyal Zimlichman, Eyal Klang
Publication date: 22 April 2024
Publisher: Springer Berlin Heidelberg
Published in: European Archives of Oto-Rhino-Laryngology
Print ISSN: 0937-4477
Electronic ISSN: 1434-4726
DOI: https://doi.org/10.1007/s00405-024-08634-9
