Poster session 09

1186P - SmartCS-LPLLM: Enhancing early cancer detection through ctDNA methylation analysis leveraging large language models

Date

14 Sep 2024

Session

Poster session 09

Topics

Secondary Prevention/Screening;  Cancer Research

Tumour Site

Presenters

Li Chao

Citation

Annals of Oncology (2024) 35 (suppl_2): S762-S774. 10.1016/annonc/annonc1599

Authors

L. Chao1, H. Wang1, M. Chao1, P. xiaobao1, W. lin1, Z. Jiawei1, S. yu1, L. Jing1, L. yixue2

Author affiliations

  • 1 R&D Center, Smartquerier Gene Technology (Shanghai) Co., Ltd., 200000 - Shanghai/CN
  • 2 Lab, Guangzhou Laboratory, 510000 - Guangzhou/CN

Abstract 1186P

Background

Early cancer screening using circulating tumor DNA (ctDNA) is challenging because ctDNA is present at low abundance and the signal-to-noise ratio is low. We aimed to develop a robust screening model that overcomes these limitations.

Methods

We performed low-pass whole-genome bisulfite sequencing (low-pass WGBS), preparing libraries with the high-efficiency WATCHMaker (7K0101-096) kit to optimize cell-free DNA (cfDNA) sample processing, minimize sample loss, and improve molecular conversion efficiency. The analysis targeted thirteen cancer-specific differentially methylated regions (DMRs), including regions related to lung and liver cancers. We developed SmartCS-LPLLM, a single-molecule multimodal early cancer screening model based on large language models. The model identifies cancer signals by analyzing cfDNA features, including methylation scoring, sequence length, terminal motif characteristics, and sequence linguistic features.
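
As an illustration of the kind of per-read features named above (methylation, sequence length, terminal motif), the following minimal Python sketch derives them from a single bisulfite-converted read. It is not the SmartCS-LPLLM implementation, which the abstract does not describe. The function `read_features`, the `ReadFeatures` record, the 4-base motif length, and the simplifying assumptions (forward-strand read, gapless alignment to a reference of equal length, motif taken from the reference 5' end) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadFeatures:
    length: int                            # fragment length in bases
    end_motif: str                         # 5' terminal motif, read from the reference
    cpg_sites: int                         # CpG sites with an informative call
    methylation_fraction: Optional[float]  # None when the read covers no CpG


def read_features(read: str, reference: str, motif_len: int = 4) -> ReadFeatures:
    """Summarize one bisulfite-converted read (illustrative assumptions only).

    Assumes a forward-strand read aligned without gaps to `reference`,
    so both strings have the same length.
    """
    if len(read) != len(reference):
        raise ValueError("read and reference must have equal length")
    read, reference = read.upper(), reference.upper()

    methylated = unmethylated = 0
    for i in range(len(reference) - 1):
        # Only cytosines in a CpG context are scored.
        if reference[i] == "C" and reference[i + 1] == "G":
            if read[i] == "C":      # protected from conversion: methylated
                methylated += 1
            elif read[i] == "T":    # converted by bisulfite: unmethylated
                unmethylated += 1

    total = methylated + unmethylated
    fraction = methylated / total if total else None
    # The motif comes from the reference because bisulfite conversion
    # alters cytosines in the read itself.
    return ReadFeatures(len(read), reference[:motif_len], total, fraction)


# Made-up sequences, not study data.
example = read_features("ACGTTTGACGA", "ACGTTCGACGA")
print(example)
```

In this toy read, two of the three CpG cytosines remain C after conversion, so the methylation fraction is 2/3. A real pipeline would take these values from aligned reads and handle both strands.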

Results

Reanalysis of public data from BMC Medicine (CRA001537) showed that the SmartCS-LPLLM model significantly improved the differentiation of hepatocellular carcinoma (HCC) from non-HCC samples, raising the AUC to 0.967. In a blind test of 12 cfDNA samples, the model correctly classified all 5 liver cancer samples. The enhanced model also identified ctDNA at concentrations as low as 0.05%. During model construction, accuracy was highest when the DMR region was 120M, and the single-molecule read-level model distinguished tumor reads from healthy reads with 85% accuracy.
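
For readers unfamiliar with the AUC metric reported above, the sketch below computes it from its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical and purely illustrative. They are not the study's data and do not reproduce its results.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    `labels` holds 1 for positive (e.g. cancer) and 0 for negative samples.
    """
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up model scores for five samples, not study data.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
print(round(roc_auc(labels, scores), 3))  # 5 of 6 pairs ordered correctly -> 0.833
```

The pairwise form is quadratic in the number of samples, which is fine for small cohorts. Larger evaluations would normally use a sorted-rank implementation.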

Conclusions

The SmartCS-LPLLM model, integrating biological features like methylation and copy number variations (CNVs), provides a precise clinical strategy for early cancer screening. Its performance in blind tests confirms its robustness and suitability for identifying low-abundance ctDNA samples, indicating significant clinical relevance.

Clinical trial identification

Editorial acknowledgement

Legal entity responsible for the study

The authors.

Funding

Has not received any funding.

Disclosure

All authors have declared no conflicts of interest.
