The goal of early lung cancer detection is to identify the malignancy at a stage where surgical cure is possible, outcomes are superior, and treatment is less morbid. 5-Hydroxymethylcytosine (5hmC) signatures in circulating cell-free DNA (cfDNA) have been examined as diagnostic biomarkers in different types of cancer. However, little is known about their value for early lung cancer detection.
We used the well-established 5hmC-Seal method (5 ml plasma per patient) to map 5hmC profiles in cfDNA from a cohort of 100 patients with newly diagnosed early-stage lung adenocarcinoma (LUAD) and 90 healthy individuals. Differentially hydroxymethylated regions (DhMRs) and genes with differential 5hmC levels were identified, and functional enrichment analysis was then performed for genes with upregulated and downregulated 5hmC levels. Using multiple deep learning methods, we split the samples into two groups (95 training samples and 95 validation samples) for classifier training and evaluation; the trained classifier was used to assign a methylation score to the withheld samples. This process was repeated with 100 randomly selected training and validation sets.
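The repeated random-split evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, the function names `auc` and `repeated_split_auc` are hypothetical, and a simple difference-of-class-means scorer stands in for the classifiers actually used. Only the cohort sizes (100 cases, 90 controls), the 95/95 split, and the 100 repeats are taken from the text.

```python
import random
from statistics import mean


def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (case, control) pairs in which the case
    scores higher than the control; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def repeated_split_auc(features, labels, n_train=95, n_repeats=100, seed=0):
    """Mean validation AUC over repeated random training/validation splits."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    n_feat = len(features[0])
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, valid = indices[:n_train], indices[n_train:]
        pos = [i for i in train if labels[i] == 1]
        neg = [i for i in train if labels[i] == 0]
        # Stand-in model (assumption): weight each feature by the difference
        # between its class means, estimated on the training samples only.
        weights = [
            mean(features[i][j] for i in pos) - mean(features[i][j] for i in neg)
            for j in range(n_feat)
        ]
        # Score the withheld samples with the weights learned above.
        valid_scores = {
            i: sum(w * x for w, x in zip(weights, features[i])) for i in valid
        }
        v_pos = [s for i, s in valid_scores.items() if labels[i] == 1]
        v_neg = [s for i, s in valid_scores.items() if labels[i] == 0]
        aucs.append(auc(v_pos, v_neg))
    return mean(aucs)


# Synthetic data (assumption): 100 cases, 90 controls, 9 features per sample.
data_rng = random.Random(1)
labels = [1] * 100 + [0] * 90
features = [[data_rng.gauss(0.5 * y, 1.0) for _ in range(9)] for y in labels]
mean_auc = repeated_split_auc(features, labels)
print(f"mean validation AUC over 100 splits: {mean_auc:.3f}")
```

Averaging over many random splits reduces the dependence of the reported AUC on any single partition of a small cohort. The splits here are not stratified by class, since the text does not say whether stratification was used.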
We identified 1,315 DhMRs in the LUAD group compared with the control group, comprising upregulated (n = 99) and downregulated (n = 1,216) regions. Applying the variable selection procedure of the Elastic Net algorithm, we identified a nine-gene model. To evaluate performance, model training was repeated 100 times, yielding an average AUC of 0.969 (95% CI: 0.935-1.000) on the training set and 0.936 (95% CI: 0.890-0.983) on the validation set, with high sensitivity (86.0%) and specificity (91.1%).
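For readers less familiar with the reported metrics, the following sketch shows how sensitivity and specificity are computed from classifier scores at a fixed cut-off. The scores, labels, threshold, and the function name `sensitivity_specificity` are all hypothetical and chosen only for illustration; they are not data from this study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when a sample is called positive
    if its score is at or above the threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    return sensitivity, specificity


# Toy example (assumption): four cases (label 1) and four controls (label 0).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Unlike the AUC, both values depend on the chosen threshold, so a reported sensitivity and specificity pair corresponds to one operating point on the ROC curve.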
We report, for the first time, 5hmC-based biomarkers in circulating cfDNA for early-stage LUAD. These findings provide a foundation for effective future liquid-biopsy-based lung cancer screening.
Legal entity responsible for the study
Peking University Cancer Hospital and Institute.
Funding
National Key R&D Program of China.
Disclosure
All authors have declared no conflicts of interest.