Abstract 95P
Background
Non-small cell lung cancer (NSCLC) is the most common subtype of lung cancer. Driver mutations in epidermal growth factor receptor (EGFR), which occur in ∼10-15% of NSCLC, can be targeted by specific therapies. Real-world data can provide valuable information regarding the prevalence of these mutations, including their subtypes. However, despite comprehensive data availability in the Dutch Pathology Registry (Palga), manual extraction of EGFR mutation status from narrative pathology reports is time-consuming. Therefore, we used machine learning and natural language processing (NLP) to identify pathology reports that state the presence of an EGFR mutation.
Methods
The NLP algorithm was trained and validated on manually curated datasets of semi-structured pathology reports from the Palga archive to generate a structured OMOP CDM database. Afterwards, pathology reports of patients with metastatic, non-squamous NSCLC in 2019-2020 were requested from the Palga registry. The output of the algorithm was compared with the results of manual extraction.
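The comparison step can be illustrated with a minimal sketch. It assumes each report has a unique identifier and that both the algorithm and the manual reader produce a list of identifiers for reports stating an EGFR mutation. The function name `compare_extractions` and the toy identifiers are hypothetical and are not taken from the study.

```python
def compare_extractions(algorithm_ids, manual_ids):
    """Compare report IDs flagged by an algorithm with a manual reference set.

    Hypothetical helper for illustration; not the study's actual code.
    """
    algorithm_ids = set(algorithm_ids)
    manual_ids = set(manual_ids)

    matched = algorithm_ids & manual_ids   # flagged by both
    missed = manual_ids - algorithm_ids    # found manually, not by the algorithm
    extra = algorithm_ids - manual_ids     # flagged by the algorithm only

    # Fraction of manually identified reports that the algorithm also flagged.
    agreement = len(matched) / len(manual_ids) if manual_ids else float("nan")

    return {
        "matched": sorted(matched),
        "missed": sorted(missed),
        "extra": sorted(extra),
        "agreement": agreement,
    }


# Toy identifiers (made up), only to show the shape of the output.
result = compare_extractions(["R1", "R2", "R4"], ["R1", "R2", "R3", "R4"])
print(result)
```

Listing the missed reports separately makes it possible to review them individually, for example to check how the reader classified them.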
Results
The algorithm identified 839 (10.9%) reports that mention an EGFR alteration. Manual analysis identified 875 such reports, resulting in a data extraction accuracy of 95.9% (95% CI 92.7-99.2). The 36 of 875 (4.1%) reports that were not identified by the algorithm were all listed as variants of unknown significance (VUS) by the reader. In the EGFR-mutated patient group, 73.0% (639/875) had a common EGFR mutation: an exon 19 deletion (41.4%; 362/875) or a p.(Leu858Arg) mutation (31.7%; 277/875). Exon 20 insertions were detected in 8.1% (71/875) of patients. Automatic data processing was 48 times faster than complete manual extraction.
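As an arithmetic check, the reported percentages can be recomputed from the counts given above. The sketch below uses only those counts; the variable names are illustrative. The confidence interval is not recomputed because the abstract does not state how it was derived.

```python
# Counts as reported in the Results section.
MANUAL_TOTAL = 875          # reports identified by manual analysis
ALGORITHM_IDENTIFIED = 839  # reports identified by the algorithm

counts = {
    "identified by algorithm": ALGORITHM_IDENTIFIED,
    "missed by algorithm": MANUAL_TOTAL - ALGORITHM_IDENTIFIED,
    "common EGFR mutation": 639,
    "exon 19 deletion": 362,
    "p.(Leu858Arg)": 277,
    "exon 20 insertion": 71,
}


def pct(n, total=MANUAL_TOTAL):
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * n / total, 1)


percentages = {label: pct(n) for label, n in counts.items()}

for label, value in percentages.items():
    print(f"{label}: {counts[label]}/{MANUAL_TOTAL} = {value}%")
```

The two common-mutation subtypes sum to the common-mutation total (362 + 277 = 639), and the missed reports equal the difference between the manual and algorithm counts (875 - 839 = 36).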
Conclusions
NLP algorithms allow rapid data extraction from pathology reports, thereby offering a time-efficient and cost-effective alternative to manual data processing. In turn, this approach enables rapid insight into current biomarker testing rates and the prevalence of (actionable) mutations.
Legal entity responsible for the study
LynxCare Inc.
Funding
LynxCare Inc.
Disclosure
All authors have declared no conflicts of interest.
Resources from the same session
158P - Enhancing the communication of genomic results: Understanding patient and clinician perspectives
Presenter: Eleanor Johnston
Session: Cocktail & Poster Display session
159P - Precision medicine-based platform to guide the treatment of EML4-ALK fusion lung cancers and other NSCLC
Presenter: Nathan Merrill
Session: Cocktail & Poster Display session
160P - CICLADES-CE study: Genomic signatures detected in DNA from FFPE samples of patients with advanced or metastatic breast cancers treated with anti-aromatase and CDK4/6 inhibitors
Presenter: Margaux BETZ
Session: Cocktail & Poster Display session
161P - PDL1 expression and its relation to EML4-ALK gene variants in metastatic lung adenocarcinoma: A single-center real-world experience
Presenter: Santhosh Meedimale
Session: Cocktail & Poster Display session
163P - Impact of genomic sequencing data on treatment decisions in advanced breast cancer (ABC)
Presenter: Sviatoslav Chekhun
Session: Cocktail & Poster Display session
164P - Clinical utility of comprehensive molecular profiling tests for advanced gastrointestinal tumors
Presenter: Alexandra Lebedeva
Session: Cocktail & Poster Display session
165P - Evaluation of the mutational profile in patients with metastatic non-small cell lung carcinoma diagnosed in a tertiary hospital which was performed by NGS method
Presenter: Kankan Deka
Session: Cocktail & Poster Display session
166P - Targeted next-generation sequencing as reliable detection of genetic profile for cancer treatment to guide oncologists in Pakistan
Presenter: Zeeshan Ahmed
Session: Cocktail & Poster Display session
167P - Molecular analysis of gastrointestinal stromal tumor (GIST): The experience of Regina Elena National Cancer Institute
Presenter: Andrea Torchia
Session: Cocktail & Poster Display session
168P - Usefulness of liquid biopsy in the management of patients with advanced or metastatic pancreatic cancer
Presenter: Jorge Iranzo Barreira
Session: Cocktail & Poster Display session