Abstract 1186P
Background
Early cancer screening using circulating tumor DNA (ctDNA) is challenging because ctDNA is present at low abundance and yields a low signal-to-noise ratio. We aimed to develop a robust screening model that overcomes these limitations.
Methods
Low-pass whole-genome bisulfite sequencing (low-pass WGBS) was performed with the high-efficiency WATCHMaker (7K0101-096) library preparation kit to optimize cell-free DNA (cfDNA) sample processing, minimizing sample loss and improving molecular conversion efficiency. The analysis targeted thirteen cancer-specific differentially methylated regions (DMRs), including those related to lung and liver cancers. We developed SmartCS-LPLLM, a single-molecule, multimodal early cancer screening model based on large language models. The model identifies cancer signals by analyzing cfDNA features, including methylation scoring, sequence length, terminal motif characteristics, and sequence linguistic features.
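To make the read-level features named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bisulfite-converted read that is already aligned gap-free to a same-length reference segment on the same strand. The names `ReadFeatures` and `read_features`, the 4-base motif length, and the choice to take the end motif from the reference (so bisulfite conversion does not alter it) are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadFeatures:
    length: int                            # fragment (sequence) length in bases
    end_motif: str                         # first bases at the 5' end of the fragment
    cpg_sites: int                         # CpG sites with an informative C or T call
    methylation_fraction: Optional[float]  # methylated calls / informative calls


def read_features(read: str, reference: str, motif_len: int = 4) -> ReadFeatures:
    """Compute simple per-read features (hypothetical helper, for illustration).

    At each reference CpG, a retained C in the read is counted as methylated
    and a T (converted C) as unmethylated; other bases are ignored.
    """
    read = read.upper()
    reference = reference.upper()
    if len(read) != len(reference):
        raise ValueError("read and reference must have the same aligned length")

    methylated = 0
    unmethylated = 0
    for i in range(len(reference) - 1):
        if reference[i] == "C" and reference[i + 1] == "G":
            if read[i] == "C":
                methylated += 1
            elif read[i] == "T":
                unmethylated += 1

    informative = methylated + unmethylated
    fraction = methylated / informative if informative else None
    return ReadFeatures(
        length=len(read),
        end_motif=reference[:motif_len],
        cpg_sites=informative,
        methylation_fraction=fraction,
    )
```

A model such as the one described would consume features like these alongside learned sequence representations; how SmartCS-LPLLM actually encodes and combines them is not specified in the abstract.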
Results
Reanalysis of public data from BMC Medicine (CRA001537) showed that the SmartCS-LPLLM model significantly improved the differentiation of hepatocellular carcinoma (HCC) from non-HCC samples, with the AUC increasing to 0.967. In a blind test of 12 cfDNA samples, the model correctly classified all 5 liver cancer samples. Notably, the model was enhanced to identify ctDNA at concentrations as low as 0.05%. Furthermore, during model construction, the highest accuracy was achieved when the DMR region was 120M, with the single-molecule read-level model achieving an 85% accuracy rate in distinguishing tumor from healthy reads.
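For readers unfamiliar with the AUC metric reported above, the sketch below shows one standard way to compute it: as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as half. The function name `auc` and the example scores are illustrative assumptions, not values from the study.

```python
from typing import Sequence


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")

    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up scores for illustration only (not study data).
print(auc([0.9, 0.4], [0.5, 0.1]))  # 0.75
```

An AUC of 1.0 means every positive sample outscores every negative one, and 0.5 corresponds to chance-level ranking.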
Conclusions
The SmartCS-LPLLM model, which integrates biological features such as methylation and copy number variations (CNVs), provides a precise clinical strategy for early cancer screening. Its performance in blind tests confirms its robustness and its suitability for identifying low-abundance ctDNA samples, indicating significant clinical relevance.
Clinical trial identification
Editorial acknowledgement
Legal entity responsible for the study
The authors.
Funding
Has not received any funding.
Disclosure
All authors have declared no conflicts of interest.