Abstract 95P
Background
Non-small cell lung cancer (NSCLC) is the most common subtype of lung cancer. Driver mutations in epidermal growth factor receptor (EGFR), which occur in ∼10-15% of NSCLC, can be targeted by specific therapies. Real-world data can provide valuable information regarding the prevalence of these mutations, including their subtypes. However, despite comprehensive data availability in the Dutch Pathology Registry (Palga), manual extraction of EGFR mutation status from narrative pathology reports is time-consuming. Therefore, we used machine learning and natural language processing (NLP) to identify pathology reports that state the presence of an EGFR mutation.
Methods
The NLP algorithm was trained and validated on manually curated datasets of semi-structured pathology reports from the Palga archive to generate a structured OMOP CDM database. Subsequently, pathology reports of patients with metastatic, non-squamous NSCLC diagnosed in 2019-2020 were requested from the Palga registry. The output of the algorithm was compared with the results of manual extraction.
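As a rough illustration of what flagging EGFR mentions in report text involves, the sketch below uses simple rule-based pattern matching. It is not the trained NLP algorithm described above: the patterns, the category names, and the `classify_report` helper are hypothetical, and the example assumes English-language report text.

```python
import re

# Hypothetical patterns for illustration only; they are not taken from the study.
EGFR_GENE = re.compile(r"\bEGFR\b", re.IGNORECASE)
EGFR_PATTERNS = {
    "exon19_deletion": re.compile(r"exon\s*19\s*(?:deletion|del)", re.IGNORECASE),
    "L858R": re.compile(r"p\.\(?Leu858Arg\)?|L858R", re.IGNORECASE),
    "exon20_insertion": re.compile(r"exon\s*20\s*(?:insertion|ins)", re.IGNORECASE),
}


def classify_report(text):
    """Return the EGFR mutation categories mentioned in a report (may be empty)."""
    # Only look for mutation subtypes when the gene itself is mentioned.
    if not EGFR_GENE.search(text):
        return []
    return [name for name, pattern in EGFR_PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    print(classify_report("EGFR: exon 19 deletion detected"))
    print(classify_report("KRAS p.(Gly12Cys) detected"))
```

A rule-based matcher like this does not handle negation (for example, a report stating that no EGFR mutation was found) or free-text variation, which is one reason a trained model may be preferred for narrative reports.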
Results
The algorithm identified 839 reports (10.9%) that mention an EGFR alteration. Manual analysis identified 875 such reports, resulting in a data extraction accuracy of 95.9% (839/875; 95% CI 92.7-99.2). The 36 reports (4.1%; 36/875) that were not identified by the algorithm were all listed as variants of unknown significance (VUS) by the reader. In the EGFR-mutated patient group, 73.0% (639/875) had a common EGFR mutation, i.e., an exon 19 deletion (41.4%; 362/875) or a p.(Leu858Arg) mutation (31.7%; 277/875). Exon 20 insertions were detected in 8.1% (71/875) of patients. Automated data processing was 48 times faster than fully manual extraction.
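The percentages above follow directly from the reported counts. The short sketch below recomputes them; the variable names and the `pct` helper are illustrative, and only the counts are taken from the text. The confidence interval is not recomputed because the abstract does not state how it was derived.

```python
# Counts as reported in the Results paragraph.
manual_total = 875      # reports with an EGFR alteration per manual analysis
algorithm_hits = 839    # reports identified by the algorithm
exon19_del = 362        # exon 19 deletions
l858r = 277             # p.(Leu858Arg) mutations
exon20_ins = 71         # exon 20 insertions


def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


missed = manual_total - algorithm_hits   # reports the algorithm did not flag
common = exon19_del + l858r              # "common" EGFR mutations combined

summary = {
    "identified_by_algorithm": pct(algorithm_hits, manual_total),
    "missed_by_algorithm": pct(missed, manual_total),
    "common_mutations": pct(common, manual_total),
    "exon19_deletion": pct(exon19_del, manual_total),
    "L858R": pct(l858r, manual_total),
    "exon20_insertion": pct(exon20_ins, manual_total),
}

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value}%")
```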
Conclusions
NLP algorithms allow rapid data extraction from pathology reports, thereby offering a time-efficient and cost-effective alternative to manual data processing. In turn, this approach enables rapid insight into current biomarker testing rates and the prevalence of (actionable) mutations.
Legal entity responsible for the study
LynxCare Inc.
Funding
LynxCare Inc.
Disclosure
All authors have declared no conflicts of interest.