Abstract 1233P
Background
Circulating cell-free DNA (cfDNA) is a promising biomarker for early cancer detection, and its fragmentomics features have been used successfully to detect cancer signals in blood. However, the ability of these features to predict the tissue of origin (TOO) of a cancer remains to be evaluated. Such prediction is highly desirable for differentiating the most common types of gastrointestinal (GI) cancer: colorectal (CC), esophageal (EC), gastric (GC), liver (LC), and pancreatic (PC) cancer.
Methods
Whole-genome sequencing was performed on the cfDNA of 769 cancer patients (149 CCs, 137 ECs, 149 GCs, 272 LCs, and 62 PCs) to calculate the coverage at repetitive genomic regions (RepeatsCov), the depth and the cleavage diversity around transcription start sites (TSSDepth and TSSClvDiv), and the microbiome abundance (MicrobeAb). These features were combined with classical fragmentomics features, namely copy number variation (CNV), end motif diversity (EDM), fragment size ratio (FSR), and promoter fragmentation entropy (PFE). A stacked ensemble machine learning classifier was then trained and tested, with samples split between training and testing at a ratio of 1:1, to predict the TOO of the GI cancers.
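As an illustration of the 1:1 train/test split described above, the following is a minimal sketch using only the cohort sizes reported in this abstract. The abstract does not state how samples were assigned to the two sets, so the per-cancer (stratified) assignment, the random seed, and the function name `stratified_half_split` are assumptions made for this example, not the authors' procedure.

```python
import random
from collections import defaultdict

# Cohort sizes as reported in the abstract (769 patients in total).
COHORT = {"CC": 149, "EC": 137, "GC": 149, "LC": 272, "PC": 62}


def stratified_half_split(labels, seed=0):
    """Split sample indices 1:1 into train and test within each class.

    Assumption: the split is stratified by cancer type; the abstract
    only reports that the train:test sample ratio was 1:1.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Classes with an odd size give their extra sample to the training set.
        half = (len(members) + 1) // 2
        train.extend(members[:half])
        test.extend(members[half:])
    return train, test


# One label per patient, built from the reported cohort sizes.
labels = [cancer for cancer, n in COHORT.items() for _ in range(n)]
train_idx, test_idx = stratified_half_split(labels)
```

Splitting within each cancer type keeps the smallest class (PC, 62 patients) represented in both sets, which matters when per-class accuracy is reported.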
Results
The performance of each single feature was evaluated first: the FSR model had the highest accuracy (67.1%) and the RepeatsCov model the lowest (53.9%). The ensemble of all features reached an accuracy of 67.6%. Interestingly, a model combining MicrobeAb, RepeatsCov, and FSR achieved the highest overall accuracy of 69.4% (CC: 63.8%, EC&GC: 63.3%, LC: 83.6%, PC: 43.8%). Its accuracy rose to 87.8% when a prediction was counted as correct if the true TOO was among the two most likely predicted TOOs. We also trained and tested a previously reported multi-feature model on our data; our classifier achieved higher accuracy (69.4% vs. 60.6%).
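The top-two accuracy reported above can be computed as sketched below: a prediction counts as a hit when the true TOO is among the k classes with the highest predicted probability. The function name `top_k_accuracy` and the four toy probability records are hypothetical and purely illustrative; they are not data from the study.

```python
def top_k_accuracy(probabilities, true_labels, k=2):
    """Fraction of samples whose true label is among the k most probable classes.

    probabilities: list of dicts mapping class name -> predicted probability.
    true_labels:   list of true class names, in the same order.
    """
    hits = 0
    for probs, truth in zip(probabilities, true_labels):
        # Rank class names from most to least probable.
        ranked = sorted(probs, key=probs.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Toy predictions (made up for illustration) over the four reported groups.
toy_probs = [
    {"CC": 0.60, "EC&GC": 0.20, "LC": 0.10, "PC": 0.10},
    {"CC": 0.40, "EC&GC": 0.35, "LC": 0.15, "PC": 0.10},
    {"CC": 0.10, "EC&GC": 0.20, "LC": 0.60, "PC": 0.10},
    {"CC": 0.30, "EC&GC": 0.40, "LC": 0.20, "PC": 0.10},
]
toy_truth = ["CC", "EC&GC", "LC", "PC"]

top1 = top_k_accuracy(toy_probs, toy_truth, k=1)  # 0.5 on this toy data
top2 = top_k_accuracy(toy_probs, toy_truth, k=2)  # 0.75 on this toy data
```

With k=1 the metric reduces to ordinary accuracy, so top-two accuracy can never be lower than the single-prediction figure.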
Conclusions
We comprehensively evaluated classical and newly developed cfDNA fragmentomics features for predicting the TOO of cancer signals, and showed that combining features including MicrobeAb, RepeatsCov, and FSR maximized the accuracy of TOO prediction for GI cancers. However, the ensemble of all features (67.6%) was less accurate than this three-feature model (69.4%), which indicates that features should be selected carefully to avoid multicollinearity or other negative effects.
Legal entity responsible for the study
The authors.
Funding
National Key Research and Development Program of China.
Disclosure
R. Fu, K. Xie, Y. Liu, H. Chen, M. Su, Q. He, Z. Su: Financial Interests, Personal, Full or part-time Employment: Singlera Genomics Inc. R. Liu: Financial Interests, Personal, Officer: Singlera Genomics Inc. All other authors have declared no conflicts of interest.