ESMO Congress 2023 | OncologyPRO

Topics

Cancer Intelligence (eHealth, Telehealth Technology, BIG Data); Cancer Diagnostics

Author affiliations

¹ Clinical Oncology, Beneficencia Portuguesa, 01323001 - Sao Paulo/BR
² Clinical Oncology Dept., FMABC - Centro Universitário Saúde ABC - Faculdade de Medicina, 09060-650 - Santo Andre/BR
³ Pneumology, Beneficencia Portuguesa de Sao Paulo, 01323-001 - Sao Paulo/BR
⁴ Oncology, USP - Universidade de Sao Paulo, 05508-220 - Sao Paulo/BR
⁵ Oncology Center, MD Anderson, Houston/US

Abstract 1255P

Background

Advances in artificial intelligence (AI) and natural language processing (NLP) have produced sophisticated tools such as GPT-4, prompting clinicians to explore their utility as healthcare management support tools. Our study aimed to assess GPT-4's ability to suggest the definitive diagnosis and the most appropriate work-up while minimizing unnecessary procedures.

Methods

We conducted a retrospective comparative analysis, extracting relevant clinical data from 10 cases published in NEJM after 2022 and entering them into GPT-4 to generate diagnostic and work-up recommendations. The primary endpoint was the ability to correctly identify the final diagnosis. The secondary endpoints were the ability to list the definitive diagnosis among the five most likely differential diagnoses and to determine an adequate work-up.

Results

The AI failed to identify the definitive diagnosis in 2 of the 10 cases (20%). Among the 8 cases in which it identified the diagnosis, 5 (63%) had the definitive diagnosis at the top of the differential diagnosis list. The AI requested tests that would not aid the final diagnosis in 2 cases, representing 40% of the 5 patients whose final diagnosis was either missed or not listed as the top differential. Moreover, the AI could not suggest adequate treatment in 7 cases (70%): it suggested inappropriate management in 2, and in the remaining 5 it gave incomplete or non-specific advice, such as recommending chemotherapy without specifying the best regimen.
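The reported percentages can be re-derived from the case counts. The short Python sketch below is illustrative only and is not part of the study. The counts come from the abstract, while the variable names and the `pct` helper are hypothetical. The denominator of 5 for the unhelpful-test rate is an assumption inferred from the Conclusions (2 missed diagnoses plus 3 identified but not ranked first).

```python
from fractions import Fraction


def pct(numerator: int, denominator: int) -> int:
    """Return a proportion as a whole-number percentage, rounding halves up."""
    return int(Fraction(numerator, denominator) * 100 + Fraction(1, 2))


# Counts taken from the abstract.
total_cases = 10
missed = 2                          # definitive diagnosis not identified
identified = total_cases - missed   # 8 cases
ranked_first = 5                    # definitive diagnosis listed first
unhelpful_tests = 2                 # cases with tests that did not aid diagnosis
inadequate_treatment = 7            # cases without adequate treatment advice

# Assumption: the 40% figure uses the cases that were missed or not ranked first.
missed_or_not_first = total_cases - ranked_first  # 2 missed + 3 not ranked first

summary = {
    "diagnosis missed": pct(missed, total_cases),
    "ranked first among identified": pct(ranked_first, identified),
    "unhelpful tests": pct(unhelpful_tests, missed_or_not_first),
    "inadequate treatment": pct(inadequate_treatment, total_cases),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

With only 10 cases, each case shifts a whole-sample percentage by 10 points, so these figures are best read as descriptive counts rather than precise rates.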

Conclusions

Our study demonstrated GPT-4's potential as an academic support tool, although it failed to identify the final diagnosis in 20% of cases. The management it suggested was also limited. In cases where the definitive diagnosis was missed or not listed as the top differential diagnosis, the AI requested unnecessary additional diagnostic tests for 40% of patients. Future research should evaluate GPT-4 in a larger and more diverse sample, incorporate prospective assessments, and investigate its ability to streamline diagnostic and therapeutic procedures and thereby optimize healthcare utilization.

Poster session 14

1255P - Evaluating GPT-4 as an academic support tool for clinicians: A comparative analysis of case records from the literature
