97P - The impact of updated imaging software on the performance of machine learning models for breast cancer diagnosis: A multi-center, retrospective study

Date

16 May 2024

Session

Lunch and Poster Display session

Presenters

Lie Cai

Citation

Annals of Oncology (2024) 9 (suppl_4): 1-9. DOI: 10.1016/esmoop/esmoop103095

Authors

L. Cai1, A. Pfob2, C. Sidey-Gibbons3, R. Barr4, M. Golatta2

Author affiliations

  • 1 University Hospital Heidelberg, 69120 - Heidelberg/DE
  • 2 University Hospital Heidelberg, Heidelberg/DE
  • 3 The University of Texas MD Anderson Cancer Center, Houston/US
  • 4 NEOMED - Northeast Ohio Medical University, Rootstown/US


Abstract 97P

Background

Artificial intelligence models based on medical imaging data are increasingly being developed. However, the imaging software with which the original data are generated is frequently updated, and the impact of such updates on the performance of AI models is unclear. We aimed to develop machine learning models using shear wave elastography (SWE) data to identify malignant breast lesions and to test the models’ generalizability by validating them on external data generated by both the original and the updated software versions.

Methods

We developed and validated different machine learning models (GLM, MARS, XGBoost, SVM) on multicenter, international SWE data (NCT02638935) using 10-fold cross-validation. Predictions were compared to the histopathologic evaluation of the biopsy specimen or to 2-year follow-up. The outcome measure was the area under the receiver operating characteristic curve (AUROC).
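
As an illustrative sketch of this workflow only: the abstract does not specify the SWE features, preprocessing, or hyperparameters, so the Python example below uses placeholder data, approximates the GLM with logistic regression, and omits MARS (which requires a separate package such as pyearth).

    # Minimal sketch of 10-fold cross-validated model development with AUROC
    # as the outcome measure. Data, features, and hyperparameters are
    # placeholders, not the study's actual pipeline.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.model_selection import StratifiedKFold, cross_val_predict
    from sklearn.metrics import roc_auc_score
    from xgboost import XGBClassifier

    # Synthetic stand-in for SWE features and biopsy-confirmed labels.
    X, y = make_classification(n_samples=1288, n_features=10, random_state=42)

    models = {
        "GLM": LogisticRegression(max_iter=1000),   # logistic regression as the GLM
        "XGBoost": XGBClassifier(eval_metric="logloss"),
        "SVM": SVC(probability=True),
    }

    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    for name, model in models.items():
        # Out-of-fold predicted probabilities under 10-fold cross-validation.
        probs = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
        print(f"{name}: AUROC = {roc_auc_score(y, probs):.3f}")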

Results

We included 1288 cases in the development set, generated with the original imaging software, and 385 cases in the external validation set, generated with both the original and the updated software. In the external validation set, the GLM and XGBoost models performed better on the updated-software data than on the original-software data (AUROC 0.941 vs. 0.902, p < 0.001, and 0.934 vs. 0.872, p < 0.001, respectively). The MARS model performed worse on the updated-software data (AUROC 0.847 vs. 0.894, p = 0.045). The SVM model was not calibrated.
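
The abstract does not state which statistical test produced these p-values. As one plausible approach, the sketch below compares AUROCs on the two validation samples with a percentile bootstrap, using synthetic placeholder labels and probabilities.

    # Hedged sketch: percentile-bootstrap comparison of AUROC between the
    # original-software and updated-software validation data. All inputs are
    # synthetic placeholders; the study's actual test is not specified.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 385  # size of the external validation set

    # Placeholder labels and predicted probabilities for each software version.
    y_orig = rng.integers(0, 2, n)
    p_orig = np.clip(y_orig * 0.6 + rng.normal(0.2, 0.2, n), 0, 1)
    y_upd = rng.integers(0, 2, n)
    p_upd = np.clip(y_upd * 0.7 + rng.normal(0.15, 0.2, n), 0, 1)

    obs = roc_auc_score(y_upd, p_upd) - roc_auc_score(y_orig, p_orig)
    diffs = np.empty(2000)
    for b in range(2000):
        # Resample each validation set independently with replacement.
        i = rng.integers(0, n, n)
        j = rng.integers(0, n, n)
        diffs[b] = roc_auc_score(y_upd[j], p_upd[j]) - roc_auc_score(y_orig[i], p_orig[i])

    # Two-sided bootstrap p-value for the null of no AUROC difference.
    p_value = 2 * min((diffs <= 0).mean(), (diffs >= 0).mean())
    print(f"AUROC difference = {obs:.3f}, bootstrap p = {p_value:.3f}")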

Conclusions

Using multicenter, international SWE data, some machine learning models retained or improved their performance when validated on data generated by updated imaging software, whereas others showed weak generalizability across software versions.

Legal entity responsible for the study

The authors.

Funding

The study has not received any funding.

Disclosure

All authors have declared no conflicts of interest.
