Abstract 158P
Background
Lung cancer (LC) is the leading cause of cancer death, largely because it is often diagnosed at a late stage, which results in poor prognosis. New strategies for the early detection of LC are therefore of utmost importance. Artificial intelligence has shown promising results in health science during the last decade, using pattern recognition to predict outcomes. Several risk models have been presented to refine LC screening criteria, but most are based on unrepresentative populations or rely on data that are challenging to obtain from different sources. This study presents a risk model based on standard blood sample analyses and smoking history from a population at risk.
Methods
All patients examined due to a risk of LC in the Region of Southern Denmark between 2008 and 2019 were included. Patients were excluded if information on smoking status was missing or if results were available from fewer than 17 of 20 selected blood sample analyses taken at the time of examination. Several models were tested on a subset of patients with complete results. To obtain a gold standard for comparison, five LC specialists provided their diagnoses on 200 samples.
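The inclusion rule described above can be illustrated with a minimal sketch. It assumes each patient record holds a smoking status and a list of 20 blood results in which a missing analysis is stored as `None`; the function name `is_eligible` and this data layout are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence

N_ANALYSES = 20   # number of selected blood sample analyses (from the abstract)
MIN_RESULTS = 17  # minimum number of available results required (from the abstract)


def is_eligible(smoking_status: Optional[str],
                blood_results: Sequence[Optional[float]]) -> bool:
    """Return True if a patient passes both criteria stated in the Methods.

    Assumed layout: `blood_results` has one slot per selected analysis,
    with None marking a missing result.
    """
    if len(blood_results) != N_ANALYSES:
        raise ValueError(f"expected {N_ANALYSES} result slots")
    # Criterion 1: smoking status must be known.
    if smoking_status is None:
        return False
    # Criterion 2: at least 17 of the 20 analyses must have a result.
    available = sum(r is not None for r in blood_results)
    return available >= MIN_RESULTS
```

For example, a patient with a recorded smoking status and three missing analyses would be kept, while a patient with four missing analyses would be excluded.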
Results
Among 38,944 patients, data on smoking and blood sample results from at least 17 analyses were available for 9,940 patients. These included 2,505 (25%) LC patients and 7,435 (75%) non-LC patients. The best performance was obtained using a light gradient-boosting machine, with an accuracy of 72% and a ROC-AUC of 80%. The model performed better than the LC specialists, with a sensitivity of 76% compared to 67% for the specialists at a matched specificity of 70%. The most important predictors of LC were active/former smoking status, high age, and elevated neutrophils, LDH and calcium, in that order.
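The comparison at a matched specificity can be sketched as follows. The abstract does not state how the matching was carried out, so this is only one plausible procedure: pick the lowest score threshold at which the model reaches the target specificity, then read off the sensitivity at that threshold. The function name and the convention that a higher score means a higher LC risk are assumptions.

```python
from typing import Sequence, Tuple


def sensitivity_at_specificity(labels: Sequence[int],
                               scores: Sequence[float],
                               target_specificity: float
                               ) -> Tuple[float, float, float]:
    """Return (threshold, specificity, sensitivity).

    labels: 1 for LC, 0 for non-LC. A patient is predicted LC when
    the score is strictly greater than the threshold.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one LC and one non-LC case")
    if not 0.0 <= target_specificity <= 1.0:
        raise ValueError("target_specificity must be in [0, 1]")

    # Try candidate thresholds in increasing order and stop at the first
    # one that classifies enough non-LC cases as negative.
    for threshold in sorted(set(negatives)):
        specificity = sum(s <= threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            break

    sensitivity = sum(s > threshold for s in positives) / len(positives)
    return threshold, specificity, sensitivity
```

With a threshold chosen this way, the model's sensitivity can be set beside the specialists' sensitivity at the same specificity, which is the kind of comparison reported above.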
Conclusions
This study presents a risk model based on smoking status and regular blood sample analyses, developed on a relevant population at risk. The model demonstrates moderate performance and outperforms LC specialists presented with the same information. This emphasizes the relevance of considering both clinical and laboratory data in future risk assessment models. A high-performing risk model able to provide decision support to the general practitioner would be of great value to the patient, facilitating earlier referral of potential LC patients.
Legal entity responsible for the study
M.B. Henriksen.
Funding
The Region of Southern Denmark, University of Southern Denmark, The Regional Research Board, The Danish Cancer Society, Dagmar Marshalls Foundation, Beckett Foundation, Lily and Herbert Hans Foundation, Familien Hede Nielsens Foundation.
Disclosure
All authors have declared no conflicts of interest.