Lung adenocarcinoma accounts for more than 40% of lung cancer cases, so markers of early-stage disease are urgently needed. This study investigated how well CpG methylation predicts lung adenocarcinoma.
In total, 1,170 patients with lung adenocarcinoma from four independent databases and one medical center were analyzed in three phases. In the discovery phase, 338 lung adenocarcinoma and nonmalignant samples were collected from Gene Expression Omnibus (GEO) datasets that had been profiled with the Illumina Infinium HumanMethylation27K BeadChip. The K-means clustering algorithm was used to select significant CpGs. In the training phase, recursive feature elimination was performed to evaluate the importance of the selected CpGs to the classification model. In the validation phase, four candidate CpGs were validated in two cohorts (n = 832 and n = 10). To explore the potential biological function of the selected CpGs, Gene Ontology (GO) enrichment analysis was performed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) version 6.8.
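The recursive feature elimination step of the training phase can be illustrated with a minimal sketch. The abstract does not state which classifier, data layout, or stopping rule was used, so everything below is an assumption made for illustration: the beta values are simulated, the model is a plain logistic regression fitted by gradient descent, and the names make_toy_data, fit_logistic, and recursive_feature_elimination are hypothetical. It is not the authors' implementation.

```python
import math
import random


def make_toy_data(n_per_class=40, n_cpgs=8, informative=(0, 1, 2, 3), seed=0):
    """Simulate methylation beta values (assumed toy data, not study data).

    Informative CpGs are shifted upward (hypermethylated) in tumor samples.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):  # 0 = nonmalignant, 1 = adenocarcinoma
        for _ in range(n_per_class):
            row = []
            for j in range(n_cpgs):
                shift = 0.3 if (label == 1 and j in informative) else 0.0
                beta = rng.gauss(0.3 + shift, 0.05)
                row.append(min(1.0, max(0.0, beta)))  # keep beta in [0, 1]
            X.append(row)
            y.append(label)
    return X, y


def fit_logistic(X, y, cols, lr=1.0, epochs=200):
    """Fit logistic regression on mean-centered columns by batch gradient descent."""
    n = len(X)
    means = {j: sum(row[j] for row in X) / n for j in cols}
    w = {j: 0.0 for j in cols}
    b = 0.0
    for _ in range(epochs):
        grad_w = {j: 0.0 for j in cols}
        grad_b = 0.0
        for row, target in zip(X, y):
            z = b + sum(w[j] * (row[j] - means[j]) for j in cols)
            err = 1.0 / (1.0 + math.exp(-z)) - target
            grad_b += err
            for j in cols:
                grad_w[j] += err * (row[j] - means[j])
        b -= lr * grad_b / n
        for j in cols:
            w[j] -= lr * grad_w[j] / n
    return w


def recursive_feature_elimination(X, y, n_keep):
    """Refit the model and drop the smallest-|weight| CpG until n_keep remain."""
    cols = list(range(len(X[0])))
    while len(cols) > n_keep:
        w = fit_logistic(X, y, cols)
        weakest = min(cols, key=lambda j: abs(w[j]))
        cols.remove(weakest)
    return sorted(cols)


X, y = make_toy_data()
selected = recursive_feature_elimination(X, y, n_keep=4)
print("CpG columns kept:", selected)
```

The model is refitted after every removal, which is what separates recursive elimination from a single ranking of the CpGs by their weights. On real data, the number of CpGs to keep would normally be chosen by cross-validation rather than fixed in advance.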
After selection by the K-means clustering algorithm, 62 CpGs showed markedly different methylation profiles between lung adenocarcinomas and adjacent nonmalignant lung tissue (p < 0.05). Among these selected CpGs, 95.16% were hypermethylated in the malignant samples, whereas only 4.84% were hypomethylated. Recursive feature elimination highlighted four CpGs, corresponding to HOXA9, KRTAP8-1, CCND1, and TULP2, as candidate predictors in the training phase. The performance of these four candidate CpGs was confirmed in the two validation cohorts (p < 0.01). The differentially hypermethylated genes were significantly enriched in GO biological processes including negative regulation of transcription from RNA polymerase II promoter and DNA-templated transcription, while the hypomethylated gene was enriched for terms including adenylate cyclase-activating G-protein coupled receptor signaling pathway. The direction of methylation did not affect the enrichments for Out-CpG sites.
A four-CpG signature comprising HOXA9, KRTAP8-1, CCND1, and TULP2 is useful for predicting lung adenocarcinoma.
Clinical trial identification
Legal entity responsible for the study
Xinhua Hospital, School of Medicine, Shanghai Jiao Tong University.
Has not received any funding.
All authors have declared no conflicts of interest.