
Accuracy of ChatGPT-3.5 and -4 in providing scientific references in otolaryngology–head and neck surgery

Authors: Jerome R. Lechien, Giovanni Briganti, Luigi A. Vaira

Published in: European Archives of Oto-Rhino-Laryngology | Issue 4/2024

11 January 2024 | Miscellaneous


Abstract

Introduction

ChatGPT (Chat Generative Pre-trained Transformer) is a recent artificial intelligence-powered chatbot, built on a large language model, that may help otolaryngologists in practice and research. We investigated the accuracy of ChatGPT-3.5 and ChatGPT-4 in providing references to manuscripts published in otolaryngology.

Methods

ChatGPT-3.5 and ChatGPT-4 were asked to provide the references of the 30 most cited papers in otolaryngology from the past 40 years, including clinical guidelines and key studies that changed practice. The responses were regenerated three times to assess the accuracy and stability of ChatGPT. The two versions were compared for reference accuracy and potential mistakes.
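The scoring of regenerated responses can be sketched in code. This is a minimal sketch, not the authors' method: it assumes each regenerated response is scored as 1 (reference accurate) or 0 (not accurate), and it uses Fleiss' kappa as the agreement statistic, which is an assumption because the abstract does not name the statistic used. The function names and the toy scores are hypothetical.

```python
from collections import Counter


def run_accuracy(scores):
    """Share of accurate references in one run (scores are 0 or 1)."""
    return sum(scores) / len(scores)


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each rated the same number of times.

    ratings: list of lists, one inner list of category labels per item.
    """
    n_items = len(ratings)
    n_ratings = len(ratings[0])
    totals = Counter()      # how often each category was assigned overall
    item_agreement = []     # observed agreement per item
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        pairs_agreeing = sum(c * (c - 1) for c in counts.values())
        item_agreement.append(pairs_agreeing / (n_ratings * (n_ratings - 1)))
    p_observed = sum(item_agreement) / n_items
    p_expected = sum((t / (n_items * n_ratings)) ** 2 for t in totals.values())
    if p_expected == 1:
        # Kappa is undefined when only one category ever occurs;
        # this sketch treats that case as perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: 4 papers, 3 regenerations each (1 = accurate).
toy = [[1, 1, 1], [1, 0, 1], [0, 0, 0], [0, 1, 0]]
runs = list(zip(*toy))  # regroup scores by regeneration
print([run_accuracy(r) for r in runs])
print(round(fleiss_kappa(toy), 3))
```

Per-run accuracy gives the range across regenerations, while the kappa value summarizes how consistently the same paper receives the same score from one regeneration to the next.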

Results

The accuracy of ChatGPT-3.5 and ChatGPT-4 ranged from 47% to 60% and from 73% to 87%, respectively (p < 0.005). Across the regenerated answers, ChatGPT-3.5 provided 19 inaccurate references and invented 2 references, whereas ChatGPT-4 provided 13 inaccurate references and invented only 1. The stability of responses across regenerated answers was mild (k = 0.238) for ChatGPT-3.5 and moderate (k = 0.408) for ChatGPT-4.
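As a rough cross-check, the reported percentages can be mapped back to counts, assuming each run was scored over all 30 papers. The abstract does not give the counts, so this is an inference, and the helper name below is hypothetical.

```python
N_PAPERS = 30  # top-30 most cited papers, per the Methods


def counts_for_percentage(pct, n=N_PAPERS):
    """Return every count k in 0..n whose accuracy k/n rounds to pct percent."""
    return [k for k in range(n + 1) if round(100 * k / n) == pct]


# Reported accuracy bounds for ChatGPT-3.5 (47, 60) and ChatGPT-4 (73, 87).
for pct in (47, 60, 73, 87):
    print(pct, counts_for_percentage(pct))
```

Under that assumption, the ranges correspond to 14 to 18 accurate references per run for ChatGPT-3.5 and 22 to 26 for ChatGPT-4.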

Conclusions

ChatGPT-4 achieved higher accuracy than the free-access version (ChatGPT-3.5). False references were detected in both versions. Practitioners should be careful when using ChatGPT to retrieve key references while writing a report.



Title: Accuracy of ChatGPT-3.5 and -4 in providing scientific references in otolaryngology–head and neck surgery
Authors: Jerome R. Lechien, Giovanni Briganti, Luigi A. Vaira
Publication date: 11 January 2024
Publisher: Springer Berlin Heidelberg
Published in: European Archives of Oto-Rhino-Laryngology, Issue 4/2024
Print ISSN: 0937-4477
Electronic ISSN: 1434-4726
DOI: https://doi.org/10.1007/s00405-023-08441-8

