Abstract 2354P
Background
Epithelial-mesenchymal transition (EMT) is a process through which epithelial cells acquire mesenchymal-like traits, such as increased migratory capacity, which facilitate cancer cell metastasis. Here we propose an interpretable, scalable, machine learning (ML)-based method to assess EMT status directly from hematoxylin and eosin (H&E)-stained images. We extracted human-interpretable features (HIFs) quantifying cancer cell nuclear morphology and the tumor microenvironment from breast cancer H&E images and confirmed concordance with a known ground-truth EMT expression signature.
Methods
Using a breast cancer-specific EMT signature gene set, we calculated quantile-normalized EMT scores for each patient in TCGA-BRCA with RNA-seq data (n = 751). Convolutional neural network models trained on H&E-stained whole slide images were used to extract a set of HIFs across available slides (n = 775). For each HIF, an ordinary least squares regression was run against EMT score with tumor purity as a covariate. False discovery rate (FDR) correction was applied to all p-values.
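The per-HIF regression and multiple-testing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say which variable is the response, which FDR procedure was used, or which software was used. The sketch assumes a model with an intercept, one predictor, and one covariate, uses a normal approximation for the two-sided p-value, and applies the Benjamini-Hochberg procedure. The function names `solve_linear`, `ols_with_covariate`, and `benjamini_hochberg` are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols_with_covariate(response, predictor, covariate):
    """Fit response ~ intercept + predictor + covariate.

    Returns the predictor's coefficient, its standard error, and a two-sided
    p-value from a normal approximation (an assumption; reasonable for large n).
    """
    n, k = len(response), 3
    X = [[1.0, x, c] for x, c in zip(predictor, covariate)]
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, response)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    resid = [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(X, response)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    # Column 1 of (X'X)^-1 gives the variance factor for the predictor.
    inv_col = solve_linear(xtx, [0.0, 1.0, 0.0])
    se = math.sqrt(sigma2 * inv_col[1])
    p = math.erfc(abs(beta[1] / se) / math.sqrt(2))
    return beta[1], se, p

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy values, invented purely for illustration (not study data).
toy_pvalues = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(toy_pvalues))
```

In practice each HIF would be passed through `ols_with_covariate` in turn, and the resulting p-values collected into one list before adjustment, so that the correction accounts for every HIF tested rather than only those reported.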
Results
Lower EMT scores were associated with epithelial attributes, such as high nuclei circularity, while high EMT scores were associated with mesenchymal attributes, such as high nuclei eccentricity and greater variation in nuclei perimeter. Epithelial-type tumors were associated with increased immature stromal content and cancer cell proportion. Mesenchymal-type tumors were associated with increased mature stromal content and stromal cell proportion.
Table: 2354P
HIF | Coefficient | Standard error | p-value | R² | FDR-corrected p-value |
Mean cancer cell nuclei circularity | -0.047 | 0.0087 | 8.60E-08 | 0.65 | 2.80E-07 |
Mean cancer cell nuclei eccentricity | 0.042 | 0.0089 | 2.30E-06 | 0.65 | 5.80E-06 |
Standard deviation of cancer cell nuclei perimeter | 0.073 | 0.0089 | 1.20E-15 | 0.67 | 8.30E-15 |
Area proportion immature stroma over all stroma | -0.035 | 0.0083 | 2.50E-05 | 0.65 | 5.20E-05 |
Area proportion mature stroma over all stroma | 0.033 | 0.0082 | 8.40E-05 | 0.65 | 0.00016 |
Count proportion cancer cells over all cells | -0.12 | 0.0075 | 6.50E-51 | 0.75 | 9.80E-49 |
Count proportion stromal cells over all cells | 0.098 | 0.0078 | 1.20E-32 | 0.71 | 3.60E-31 |
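As a reading aid for the table above, the ratio of each coefficient to its standard error can be recomputed from the reported values, together with an approximate 95% interval (coefficient ± 1.96 standard errors). This is a rough sketch only: the reported figures are rounded, the 1.96 multiplier assumes a normal approximation, and the helper name `summarize` is hypothetical.

```python
# (HIF, coefficient, standard error) copied from the table above.
rows = [
    ("Mean cancer cell nuclei circularity", -0.047, 0.0087),
    ("Mean cancer cell nuclei eccentricity", 0.042, 0.0089),
    ("Standard deviation of cancer cell nuclei perimeter", 0.073, 0.0089),
    ("Area proportion immature stroma over all stroma", -0.035, 0.0083),
    ("Area proportion mature stroma over all stroma", 0.033, 0.0082),
    ("Count proportion cancer cells over all cells", -0.12, 0.0075),
    ("Count proportion stromal cells over all cells", 0.098, 0.0078),
]

def summarize(coef, se, z=1.96):
    """Return coefficient/SE and an approximate 95% interval (normal approx.)."""
    ratio = coef / se
    return ratio, (coef - z * se, coef + z * se)

for name, coef, se in rows:
    ratio, (low, high) = summarize(coef, se)
    print(f"{name}: coef/SE = {ratio:.1f}, approx. 95% interval = ({low:.3f}, {high:.3f})")
```

The sign of each ratio matches the direction of association described in the Results: negative for the epithelial-type features and positive for the mesenchymal-type features.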
Conclusions
ML model-generated HIFs effectively capture cancer cell morphology reflective of EMT status, including nuclei eccentricity and circularity. The relationships observed in these results are concordant with previous studies showing increased EMT in tumors with greater extracellular matrix stiffness and large numbers of cancer-associated fibroblasts. Overall, these results reflect significant agreement between EMT scores and HIFs, indicating potential for quantification of EMT states directly from H&E images.
Legal entity responsible for the study
PathAI, Inc.
Funding
PathAI, Inc.
Disclosure
C. Kirkup, S. Vilchez, S. Srinivasan, M. Lin, J. Abel, M. Drage, J. Conway, A. Khosla, A. Taylor-Weiner: Financial Interests, Full or part-time Employment: PathAI, Inc. All other authors have declared no conflicts of interest.