Assessing unknown potential : quality and limitations of different large language models in the field of otorhinolaryngology

dc.contributor.author: Buhr, Christoph R.
dc.contributor.author: Smith, Harry
dc.contributor.author: Huppertz, Tilman
dc.contributor.author: Bahr-Hamm, Katharina
dc.contributor.author: Matthias, Christoph
dc.contributor.author: Cuny, Clemens
dc.contributor.author: Snijders, Jan Phillipp
dc.contributor.author: Ernst, Benjamin Philipp
dc.contributor.author: Blaikie, Andrew
dc.contributor.author: Kelsey, Tom
dc.contributor.author: Kuhn, Sebastian
dc.contributor.author: Eckrich, Jonas
dc.date.accessioned: 2024-12-03T12:57:51Z
dc.date.available: 2024-12-03T12:57:51Z
dc.date.issued: 2024
dc.description.abstract: Background: Large Language Models (LLMs) might offer a solution for the lack of trained health personnel, particularly in low- and middle-income countries. However, their strengths and weaknesses remain unclear. Aims/objectives: Here we benchmark different LLMs (Bard 2023.07.13, Claude 2, ChatGPT 4) against six consultants in otorhinolaryngology (ORL). Material and methods: Case-based questions were extracted from literature and German state examinations. Answers from Bard 2023.07.13, Claude 2, ChatGPT 4, and six ORL consultants were rated blindly on a 6-point Likert scale for medical adequacy, comprehensibility, coherence, and conciseness. Given answers were compared to validated answers and evaluated for hazards. A modified Turing test was performed and character counts were compared. Results: LLM answers ranked inferior to consultants in all categories. Yet, the difference between consultants and LLMs was marginal, with the clearest disparity in conciseness and the smallest in comprehensibility. Among LL [en_GB]
dc.identifier.doi: http://doi.org/10.25358/openscience-11041
dc.identifier.uri: https://openscience.ub.uni-mainz.de/handle/20.500.12030/11060
dc.language.iso: eng [de]
dc.rights: CC-BY-4.0 [*]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [*]
dc.subject.ddc: 610 Medizin [de_DE]
dc.subject.ddc: 610 Medical sciences [en_GB]
dc.title: Assessing unknown potential : quality and limitations of different large language models in the field of otorhinolaryngology [en_GB]
dc.type: Zeitschriftenaufsatz [de]
jgu.journal.issue: 3 [de]
jgu.journal.title: Acta oto-laryngologica [de]
jgu.journal.volume: 44 [de]
jgu.organisation.department: FB 04 Medizin [de]
jgu.organisation.name: Johannes Gutenberg-Universität Mainz
jgu.organisation.number: 2700
jgu.organisation.place: Mainz
jgu.organisation.ror: https://ror.org/023b0x485
jgu.pages.end: 242 [de]
jgu.pages.start: 237 [de]
jgu.publisher.doi: 10.1080/00016489.2024.2352843 [de]
jgu.publisher.issn: 1651-2251 [de]
jgu.publisher.name: Taylor & Francis [de]
jgu.publisher.place: Stockholm [de]
jgu.publisher.year: 2024
jgu.rights.accessrights: openAccess
jgu.subject.ddccode: 610 [de]
jgu.subject.dfg: Lebenswissenschaften [de]
jgu.type.dinitype: Article [en_GB]
jgu.type.resource: Text [de]
jgu.type.version: Published version [de]

Files

Original bundle

Name: assessing_unknown_potential__-20241203135652757.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.57 KB
Format: Item-specific license agreed upon to submission
