Abstract 1186P
Background
Early cancer screening using circulating tumor DNA (ctDNA) is challenging because ctDNA is present at low abundance and yields a low signal-to-noise ratio. We aimed to develop a robust screening model that overcomes these limitations.
Methods
We performed low-pass whole-genome bisulfite sequencing (low-pass WGBS) with the high-efficiency WATCHMaker (7K0101-096) library preparation kit to optimize cell-free DNA (cfDNA) sample processing, minimizing sample loss and improving molecular conversion efficiency. The analysis targeted thirteen cancer-specific differentially methylated regions (DMRs), including those related to lung and liver cancers. We developed SmartCS-LPLLM, a single-molecule, multimodal early cancer screening model based on large language models. The model identifies cancer signals by analyzing cfDNA features, including methylation scoring, sequence length, terminal motif characteristics, and sequence linguistic features.
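As a rough illustration of what per-read (single-molecule) features of this kind can look like, the following minimal sketch computes a methylation fraction, a fragment length, and a 5' end motif for one read. It is not the SmartCS-LPLLM implementation: the names `ReadFeatures` and `read_features`, the 4-base motif length, and the simplifying assumptions (forward-strand read, gap-free alignment to a reference of equal length, end motif taken from the reference because bisulfite conversion alters the read's own bases) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadFeatures:
    length: int                            # fragment length in bases
    end_motif: str                         # 5' end motif (from the reference)
    cpg_count: int                         # informative CpG sites covered
    methylation_fraction: Optional[float]  # None if no informative CpG


def read_features(read_seq: str, ref_seq: str, motif_len: int = 4) -> ReadFeatures:
    """Toy per-read features for a bisulfite-converted, forward-strand read.

    Assumption: read and reference are aligned base-for-base (no indels).
    After bisulfite conversion, an unmethylated C reads as T, while a
    methylated C still reads as C.
    """
    if len(read_seq) != len(ref_seq):
        raise ValueError("read and reference must have equal length")
    read_seq, ref_seq = read_seq.upper(), ref_seq.upper()

    methylated = unmethylated = 0
    for i in range(len(ref_seq) - 1):
        if ref_seq[i:i + 2] == "CG":      # CpG site in the reference
            if read_seq[i] == "C":
                methylated += 1
            elif read_seq[i] == "T":
                unmethylated += 1
            # any other base is treated as uninformative and skipped

    informative = methylated + unmethylated
    fraction = methylated / informative if informative else None
    return ReadFeatures(
        length=len(read_seq),
        end_motif=ref_seq[:motif_len],
        cpg_count=informative,
        methylation_fraction=fraction,
    )


# Toy example: three CpG sites, two read as methylated, one as unmethylated.
features = read_features("TCGATTGTACGA", "CCGATCGTACGA")
```

In a pipeline of the kind described, features like these would be computed per read and passed to a downstream classifier; how SmartCS-LPLLM encodes or combines them is not specified in the abstract.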
Results
Reanalysis of public data from BMC Medicine (CRA001537) showed that the SmartCS-LPLLM model significantly improved differentiation of hepatocellular carcinoma (HCC) from non-HCC samples, with the AUC increasing to 0.967. In a blind test of 12 cfDNA samples, the model correctly classified all 5 liver cancer samples. Notably, the model can accurately identify ctDNA at a concentration as low as 0.05%. Furthermore, during model construction, the highest accuracy was achieved when the DMR region was 120M, and the single-molecule read-level model achieved an 85% accuracy rate in distinguishing tumor from healthy reads.
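For readers less familiar with the AUC metric reported here, the sketch below shows one standard way to compute it: the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `auc` and the scores are made up for illustration; they are not data from this study, and the abstract does not state how its AUC was calculated.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    (positive, negative) pairs. Ties contribute 0.5.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores (higher = more cancer-like), not study data.
hcc_scores = [0.9, 0.8, 0.4]
non_hcc_scores = [0.7, 0.3, 0.2, 0.1]
toy_auc = auc(hcc_scores, non_hcc_scores)  # 11 of 12 pairs ranked correctly
```

An AUC of 1.0 means every positive sample outscores every negative one, while 0.5 corresponds to chance-level ranking.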
Conclusions
The SmartCS-LPLLM model, which integrates biological features such as methylation and copy number variations (CNVs), provides a precise clinical strategy for early cancer screening. Its performance in blind tests supports its robustness and its suitability for identifying low-abundance ctDNA samples, indicating significant clinical relevance.
Clinical trial identification
Editorial acknowledgement
Legal entity responsible for the study
The authors.
Funding
Has not received any funding.
Disclosure
All authors have declared no conflicts of interest.