Machine learning on H&E slides could help identify patients who would benefit from additional therapy after surgery

Pathologists have long recognized that the microscopic appearance of cancer is associated with its aggressiveness. Here we report on a computational assay that uses machine learning to recognize patterns in routine slides relating to prostate cancer aggressiveness, outcome, and need for additional therapy.

Over 75,000 radical prostatectomies (RPs), a curative therapy for prostate cancer, are performed annually in the United States alone [1]. However, 10-40% of these patients will experience cancer recurrence [2, 3]. While adjuvant therapy can lower a patient’s risk of disease-specific death [4], the lack of accurate tools for post-RP outcome prognosis leads to overtreatment, with 10 patients having to be offered adjuvant therapy to avoid a single death [5]. Simultaneously, life-extending therapy is delayed for other patients until recurrence becomes apparent. Traditionally, prostate cancer is characterized visually by a pathologist examining the surgical specimen. However, evaluating morphology is a subjective process, and different pathologists may disagree on the grade of a tumor [6].

Methods using automated analysis of histology have been developed in response to these limitations, with the aim of finding digital biomarkers which transcend grade and relate morphology directly to outcome. Unlike traditional pathological analysis, artificial intelligence systems are objective and capable of quantitatively analyzing every detail of an image. In our article, we demonstrate an image-based assay, called “Histotyping”, which can identify patients at high risk of post-RP cancer recurrence using just a digital image of an H&E slide. Our method can be contrasted with other recent studies in the field, which have largely focused on automated prostate cancer detection and Gleason grading [7, 8, 9]. While these are important objectives for computer-aided diagnostic systems, they are not directly prognostic, instead attempting to recapitulate human-derived tumor scoring systems. Furthermore, many of these approaches rely entirely on deep learning, a specific type of machine learning in which a neural network attempts to learn the ideal representations to distinguish the categories of interest. Unfortunately, deep learning methods tend to be black-box approaches because the feature representations they derive for diagnosis and grading are abstract.

Histotyping also uses deep learning, but only to segment gland lumens in the tumor region of digital H&E specimens, from which 216 quantitative descriptors of lumen shape and architecture are extracted. The use of a pre-defined set of handcrafted lumen morphology features increases the interpretability of our approach so that the model’s decisions can be better explained and scrutinized. These features were then used in a regression model to predict risk of post-surgical biochemical recurrence, an event associated with greatly elevated risk of metastasis and disease-specific mortality [10]. Gland lumen features were used because gland morphology is known to be associated with prostate cancer aggressiveness, being the foundation of the Gleason grading system used by pathologists. From our feature set, we filtered out features which were found to be susceptible to inter-site variability in specimen preservation, staining, and scanning procedures in order to enhance the generalizability of the model. This allowed us to train Histotyping on data from just two institutions and validate it on patients from five institutions.
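The two feature steps described above can be illustrated with a minimal sketch. This is not the published Histotyping implementation: the three shape descriptors, the function names, the toy values, and the stability criterion (site means within half a pooled standard deviation) are all assumptions chosen only to show the idea of computing handcrafted lumen descriptors and discarding those that shift between sites.

```python
import math
from statistics import mean, pstdev


def polygon_shape_features(outline):
    """Area, perimeter, and circularity of one closed lumen outline.

    `outline` is a list of (x, y) vertices; the last vertex connects to the first.
    """
    area_twice = 0.0
    perimeter = 0.0
    for i, (x0, y0) in enumerate(outline):
        x1, y1 = outline[(i + 1) % len(outline)]
        area_twice += x0 * y1 - x1 * y0  # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(area_twice) / 2.0
    # Circularity is 1.0 for a perfect circle and smaller for irregular shapes.
    circularity = 4.0 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "circularity": circularity}


def site_stable_features(values_by_site, max_shift=0.5):
    """Return names of features whose per-site means stay close together.

    `values_by_site` maps a site name to {feature name: list of patient values}.
    A feature is kept when the gap between the highest and lowest site mean is
    at most `max_shift` pooled standard deviations (an assumed criterion).
    """
    feature_names = next(iter(values_by_site.values())).keys()
    kept = []
    for name in feature_names:
        site_means = [mean(site[name]) for site in values_by_site.values()]
        pooled = [v for site in values_by_site.values() for v in site[name]]
        spread = pstdev(pooled)
        shift = 0.0 if spread == 0 else (max(site_means) - min(site_means)) / spread
        if shift <= max_shift:
            kept.append(name)
    return kept


# Hypothetical toy data: one shape feature and one stain-dependent feature.
toy_values = {
    "site_A": {"lumen_area": [1.0, 2.0, 3.0], "stain_intensity": [10.0, 11.0, 12.0]},
    "site_B": {"lumen_area": [1.1, 2.1, 2.9], "stain_intensity": [20.0, 21.0, 22.0]},
}

print(polygon_shape_features([(0, 0), (2, 0), (2, 2), (0, 2)]))
print(site_stable_features(toy_values))
```

In this toy example the shape feature has nearly identical means at both sites and is retained, while the stain-dependent feature shifts strongly between sites and is dropped. Only the retained features would then be passed to the regression model.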

Our approach was compared with genomic companion diagnostic tests which are currently used for prognosis in high-risk prostate cancer patients. In this study, we focused on the Decipher genomic test, which examines expression levels of 22 genes and uses a proprietary algorithm to convert these readings to a risk score. While Decipher is prognostic of metastasis and included in the National Comprehensive Cancer Network guidelines, it involves destructive tissue testing and typically samples only a small representative tissue section. Additionally, Decipher costs several thousand dollars, requires sophisticated laboratory equipment, and has a turnaround time on the order of weeks, putting it out of reach of low- and middle-income countries.

Because it does not destroy any tissue, Histotyping can interrogate a patient’s entire tumor area and capture intra-tumoral heterogeneity that may not be assessed via a molecular test. Because it is a digital image test, it is lower cost and can be hosted in the cloud (requiring no physical shipping of tissue blocks), so Histotyping could have a significant impact in low- and middle-income countries [11]. Critically, Histotyping needs only a slide scanner and a desktop computer to provide prognostic and potentially actionable information to the clinician in minutes, relieving patient anxiety and enabling more informed patient management planning.

In our study, Histotyping was found to be prognostic of recurrence even after controlling for tumor grade, stage, and other clinical markers. In particular, Histotyping was able to identify high-risk patients even among those with negative surgical margins or in intermediate-risk groups, such as Gleason grade group 3. These patients would not normally be recommended for adjuvant therapy, making it valuable for Histotyping to detect those who, even though they were identified as low-risk based on clinicopathologic factors, were likely to have a poor outcome. Most importantly, Histotyping performed similarly to Decipher and outperformed Decipher when combined with Gleason grade and preoperative prostate-specific antigen level. To our knowledge, this is the first study to compare Decipher head-to-head with automated analysis of H&E slides.

Several avenues of future work will further enhance the usefulness of Histotyping. First, while the endpoint in this study was biochemical recurrence, Histotyping could also be used to identify patients at elevated risk of metastasis. Second, Histotyping could also be validated as a predictive assay, allowing it to help identify which patients would receive added benefit from adjuvant therapy following surgery or definitive therapy; currently, Histotyping has only been validated as prognostic of biochemical recurrence. Fortunately, retrospective data from completed clinical trials could provide an opportunity to develop and validate Histotyping as not just a prognostic test but also a predictive assay. Finally, while Histotyping was successful on a diverse population of patients, our previous research has suggested that morphologic differences in the tissue appearance of prostate cancer might exist between African American and Caucasian American men [12]. This suggests that future versions of Histotyping might need to explicitly account for population-specific differences, resulting in multiple population-specific Histotyping models.

In summary, our article demonstrates that automated image analysis has the potential to supplement traditional treatment planning and companion diagnostics by providing a low-cost test to detect patients at high risk of recurrence post-RP, especially among patients considered low-risk by traditional methods.

Works cited

  1. Tyson, M. D., Andrews, P. E., Ferrigni, R. F., Humphreys, M. R., Parker, A. S., and Castle, E. P. Mayo Clinic Proceedings 91(1), 10–16 jan (2016).
  2. Bolla, M., van Poppel, H., Tombal, B., Vekemans, K., Pozzo, L. D., de Reijke, T. M., Verbaeys, A., Bosset, J.-F., van Velthoven, R., Colombel, M., van de Beek, C., Verhagen, P., van den Bergh, A., Sternberg, C., Gasser, T., van Tienhoven, G., Scalliet, P., Haustermans, K., and Collette, L. The Lancet 380(9858), 2018–2027 dec (2012).
  3. Hamdy, F. C., Donovan, J. L., Lane, J. A., Mason, M., Metcalfe, C., Holding, P., Davis, M., Peters, T. J., Turner, E. L., Martin, R. M., et al. New England Journal of Medicine 375(15), 1415–1424 (2016).
  4. James, N. D., Sydes, M. R., Clarke, N. W., Mason, M. D., Dearnaley, D. P., Spears, M. R., Ritchie, A. W., Parker, C. C., Russell, J. M., Attard, G., et al. The Lancet 387(10024), 1163–1177 (2016).
  5. Daly, T., Hickey, B. E., Lehman, M., Francis, D. P., and See, A. M. Cochrane Database of Systematic Reviews  dec (2011).
  6. Ozkan, T. A., Eruyar, A. T., Cebeci, O. O., Memik, O., Ozcan, L., and Kuskonmaz, I. Scandinavian Journal of Urology 50(6), 420–424 jul (2016).
  7. Bulten, W., Pinckaers, H., van Boven, H., Vink, R., de Bel, T., van Ginneken, B., van der Laak, J., van de Kaa, C. H., and Litjens, G. The Lancet Oncology  jan (2020).
  8. Ström, P., Kartasalo, K., Olsson, H., Solorzano, L., Delahunt, B., Berney, D. M., Bostwick, D. G., Evans, A. J., Grignon, D. J., Humphrey, P. A., Iczkowski, K. A., Kench, J. G., Kristiansen, G., van der Kwast, T. H., Leite, K. R. M., McKenney, J. K., Oxley, J., Pan, C.-C., Samaratunga, H., Srigley, J. R., Takahashi, H., Tsuzuki, T., Varma, M., Zhou, M., Lindberg, J., Lindskog, C., Ruusuvuori, P., Wählby, C., Grönberg, H., Rantalainen, M., Egevad, L., and Eklund, M. The Lancet Oncology  jan (2020).
  9. Pantanowitz, L., Quiroga-Garza, G. M., Bien, L., Heled, R., Laifenfeld, D., Linhart, C., Sandbank, J., Shach, A. A., Shalev, V., Vecsler, M., Michelow, P., Hazelhurst, S., and Dhir, R. The Lancet Digital Health 2(8), e407–e416 aug (2020).
  10. Freedland, S. J., Humphreys, E. B., Mangold, L. A., Eisenberger, M., Dorey, F. J., Walsh, P. C., and Partin, A. W. JAMA 294(4), 433 jul (2005).
  11. Sung, H., Ferlay, J., Siegel, R. L., Laversanne, M., Soerjomataram, I., Jemal, A., and Bray, F. CA: A Cancer Journal for Clinicians 71(3), 209–249 feb (2021).
  12. Bhargava, H. K., Leo, P., Elliott, R., Janowczyk, A., Whitney, J., Gupta, S., Fu, P., Yamoah, K., Khani, F., Robinson, B. D., Rebbeck, T. R., Feldman, M., Lal, P., and Madabhushi, A. Clinical Cancer Research 26(8), 1915–1923 mar (2020).

Patrick Leo

Graduate student, Case Western Reserve University