Abstract 1186P
Background
Early cancer screening using circulating tumor DNA (ctDNA) is challenging because ctDNA is present at low abundance and yields a low signal-to-noise ratio. We aimed to develop a robust screening model that overcomes these limitations.
Methods
Low-pass whole-genome bisulfite sequencing (low-pass WGBS) was performed with the high-efficiency WATCHMaker (7K0101-096) library preparation kit to optimize cell-free DNA (cfDNA) sample processing, minimizing sample loss and improving molecular conversion efficiency. Thirteen cancer-specific differentially methylated regions (DMRs), including those related to lung and liver cancers, were targeted in the analysis. We developed SmartCS-LPLLM, a single-molecule multimodal early cancer screening model based on large language models. The model identifies cancer signals by analyzing cfDNA features, including methylation scores, sequence length, terminal motif characteristics, and sequence linguistic features.
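The read-level features listed above can be illustrated with a minimal sketch. This is not the SmartCS-LPLLM implementation, which the abstract does not describe in detail. It assumes each read comes with a Bismark-style methylation call string (the `XM` tag, where `Z` marks a methylated and `z` an unmethylated cytosine in CpG context). The function name `read_features` and the 4-base motif length are illustrative assumptions.

```python
def read_features(seq, xm_tag, motif_len=4):
    """Summarize one cfDNA read as simple per-molecule features.

    seq       : read sequence as sequenced (bisulfite-converted)
    xm_tag    : Bismark-style methylation call string, same length as seq
                ('Z' = methylated CpG, 'z' = unmethylated CpG)
    motif_len : number of 5' bases used as the end motif (assumed value)
    """
    if len(seq) != len(xm_tag):
        raise ValueError("seq and xm_tag must have the same length")

    methylated = xm_tag.count("Z")
    unmethylated = xm_tag.count("z")
    cpg_count = methylated + unmethylated

    return {
        "length": len(seq),
        "end_motif": seq[:motif_len].upper(),
        "cpg_count": cpg_count,
        # None when the read covers no CpG site, to avoid dividing by zero
        "meth_fraction": methylated / cpg_count if cpg_count else None,
    }


if __name__ == "__main__":
    # Toy read with two CpG sites, one methylated and one unmethylated
    print(read_features("TTTATTCGGATG", "hhh...Z...z."))
```

In a real pipeline the end motif would more likely be taken from the reference genome at the fragment start, since bisulfite conversion alters the sequenced bases; the sketch uses the read itself only to stay self-contained.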
Results
Reanalysis of public data from BMC Medicine (CRA001537) showed that the SmartCS-LPLLM model improved discrimination of hepatocellular carcinoma (HCC) from non-HCC samples, with the AUC increasing to 0.967. In a blind test of 12 cfDNA samples, the model correctly classified all 5 liver cancer samples. Notably, the model identified ctDNA at concentrations as low as 0.05%. During model construction, the highest accuracy was achieved when the DMR region was 120M, and the single-molecule read-level model achieved an 85% accuracy rate in distinguishing tumor from healthy reads.
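For readers less familiar with the AUC metric reported above, the following sketch shows one standard way to compute it: as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy scores are illustrative assumptions and are not data from this study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : model scores, higher meaning more likely cancer
    labels : 1 for cancer (e.g. HCC), 0 for non-cancer
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 3 positives, 2 negatives
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
    toy_labels = [1, 1, 0, 1, 0]
    print(round(roc_auc(toy_scores, toy_labels), 3))
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size; a rank-based formulation gives the same value more efficiently for large datasets.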
Conclusions
The SmartCS-LPLLM model, which integrates biological features such as methylation and copy number variations (CNVs), provides a precise clinical strategy for early cancer screening. Its performance in blind tests supports its robustness and its suitability for identifying low-abundance ctDNA samples, indicating significant clinical relevance.
Clinical trial identification
Editorial acknowledgement
Legal entity responsible for the study
The authors.
Funding
Has not received any funding.
Disclosure
All authors have declared no conflicts of interest.