
Poster display session

4320 - Development of a web-based application using machine learning algorithms to facilitate systematic literature reviews


10 Sep 2017




Hui-li Wong


Annals of Oncology (2017) 28 (suppl_5): v511-v520. 10.1093/annonc/mdx385


H. Wong1, T. Luechtefeld2, A. Prawira3, Z. Patterson4, J. Workman4, D. Day5, N. Chooback1, L. Nappi1, H.H. Samawi1, J. Lavoie1, A. Spreafico3, A.R. Hansen3, S. Sahebjam6, L.L. Siu3, S.P. Ivy7, C. Paller8, D. Renouf1

Author affiliations

  • 1 Department Of Medical Oncology, British Columbia Cancer Agency, V5Z 4E6 - Vancouver/CA
  • 2 Johns Hopkins Bloomberg School Of Public Health, Johns Hopkins University, Baltimore/US
  • 3 Division Of Medical Oncology And Hematology, Princess Margaret Cancer Centre, M5G 1Z5 - Toronto/CA
  • 4 Insilica LLC, Washington/US
  • 5 Division Of Medical Oncology And Hematology, Princess Margaret Cancer Centre, M5G 2M9 - Toronto/CA
  • 6 Department Of Medical Oncology, H. Lee Moffitt Cancer Center University of South Florida, 33612 - Tampa/US
  • 7 Cancer Therapy Evaluation Program, National Cancer Institute, Bethesda/US
  • 8 Department Of Medical Oncology, The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University, Baltimore/US


Abstract 4320


Systematic review is an important element of medical research, but the rapid proliferation of published literature makes manual review increasingly challenging. Advances in computer science can reduce reviewer workload by using algorithms to automatically select and extract data from articles. We initiated a systematic review of phase I immunotherapy clinical trials and used natural language processing to aid article screening.


A literature search was performed across MEDLINE, Embase and CENTRAL in September 2016 using more than 100 search terms in the categories “neoplasm”, “immunotherapy” and “phase I clinical trial”. Only English-language studies published since 1990 were included. We developed a web-based interface that allowed human reviewers to apply inclusion/exclusion labels based on title and abstract screening. Articles were screened by two independent reviewers who were blinded to results. An article-similarity-based algorithm that uses weighted logistic regression to predict “include” and “exclude” labels is being trained; we report interim results here.
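The abstract does not describe the features, weights, or software behind the weighted logistic regression, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes simple word-count features (a stand-in for the unspecified article-similarity features), a hypothetical `pos_weight` that makes errors on “include” articles cost more, and a handful of invented toy titles that are not data from the study.

```python
import math
from collections import Counter

# Hypothetical toy titles; 1 = "include", 0 = "exclude". Not study data.
TRAIN = [
    ("phase I trial of anti-PD-1 antibody in advanced solid tumors", 1),
    ("phase I dose escalation of cancer vaccine immunotherapy", 1),
    ("retrospective cohort of hypertension management", 0),
    ("survey of nursing workload in primary care", 0),
    ("randomized phase III trial of statin therapy", 0),
    ("case report of hip fracture surgery", 0),
]
STOPWORDS = {"of", "in", "a", "the"}  # assumed, illustrative list

def tokenize(text):
    """Lowercase, split on whitespace, drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def build_vocab(texts):
    """Map each distinct training word to a feature index."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}

def featurize(text, vocab):
    """Word-count vector; words outside the vocabulary are ignored."""
    x = [0.0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            x[vocab[word]] = float(count)
    return x

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, pos_weight, lr=0.1, epochs=500):
    """Stochastic gradient descent on a class-weighted logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # "include" examples contribute pos_weight times the gradient.
            g = (pos_weight if yi == 1 else 1.0) * (p - yi)
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict_proba(text, vocab, w, b):
    """Estimated probability that an article should be included."""
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

texts = [t for t, _ in TRAIN]
labels = [y for _, y in TRAIN]
vocab = build_vocab(texts)
X = [featurize(t, vocab) for t in texts]
w, b = train(X, labels, pos_weight=2.0)

p_in = predict_proba("phase I immunotherapy trial in solid tumors", vocab, w, b)
p_out = predict_proba("survey of hypertension management in primary care", vocab, w, b)
print(f"likely include: {p_in:.2f}, likely exclude: {p_out:.2f}")
```

Raising `pos_weight` trades specificity for sensitivity, which is the usual reason to weight a screening classifier when included articles are rare.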


The literature search identified 28,235 articles; 19,000 remained after duplicates and conference abstracts were excluded. Of these, 4,034 (21.2%) were screened, of which 532 (13.2%) were labeled “include” by at least one reviewer. Two reviewers screened 1,944 articles (10.2%), with concordance of 93.7%. The prediction algorithm was weighted to improve the detection of “include” labels and achieved 80.6% sensitivity and 78.2% specificity when compared with manual review results. The positive and negative predictive values were 34.4% and 96.6%, respectively.
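The gap between the two predictive values follows from the low include rate. The short calculation below applies Bayes' rule to the reported sensitivity and specificity. The abstract does not state the include rate in the evaluation set, so the 13.2% rate among screened articles is used here as an assumption; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    tp = sensitivity * prevalence              # true positives (as a fraction)
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return tp / (tp + fp), tn / (tn + fn)

# Reported sensitivity and specificity; prevalence is an assumption (see above).
ppv, npv = predictive_values(0.806, 0.782, 0.132)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")  # PPV = 36.0%, NPV = 96.4%
```

Under that assumption the result is close to, though not identical with, the reported 34.4% and 96.6%, which would be consistent with a slightly different include rate in the evaluation set. Either way, when only about one article in eight is relevant, even a reasonably specific classifier produces more false than true “include” predictions, while an “exclude” prediction is rarely wrong.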


A machine learning algorithm trained on manual reviews predicted systematic review article inclusion with approximately 80% accuracy (80.6% sensitivity, 78.2% specificity). Algorithm performance was affected by the low rate of included articles, but irrelevant articles could be excluded with high confidence. Further development is ongoing to optimize the algorithm and improve sensitivity. Once optimized, this innovative machine learning process could transform the conduct of systematic reviews.

All authors have declared no conflicts of interest.
