Exploiting convergent phenotypes to derive a pan-cancer cisplatin response gene expression signature


Motivation and clinical significance

Despite rich collections of cancer “-omic” data, precision medicine research has largely focused on producing therapies that target somatic mutations in previously documented driver genes. These therapies have produced some inspiring successes, extending the lives of patients with targetable mutations by months to years.1-3 However, the reach of these drugs is narrow and most patients without targetable mutations simply have not seen the benefits of personalized medicine. In fact, it was estimated that in 2020, just 7.04% of cancer patients in the United States could benefit from genome-driven care.4 Even among the patients who do benefit from mutation-targeted therapies, these drugs are expensive and clinical response is often not sustained, as tumors evolve resistance in response to the targeted selection pressure.

Without an actionable mutation, patients often receive conventional cytotoxic chemotherapy. These drugs have been selected by showing benefit across a population in a clinical trial, but we believe there is significant opportunity for expanding the reach of precision medicine here. Gene expression signatures can be used to predict individual response to traditional chemotherapeutics without relying on targetable mutations.

Key findings

A gene expression signature is a set of genes (typically fewer than 100) whose expression changes with a particular trait. Certain signatures have already been incorporated into standard-of-care and clinical decision-making algorithms (e.g., Oncotype DX,5 MammaPrint6). In this manuscript, we establish a novel approach to extracting gene expression signatures that uses the power of large cell line repositories to identify genes of interest while refining the results with more clinically translatable tissue samples.

With this methodology, we extract the cisplatin sensitivity signature, CisSig, and show that it can predict drug response in cell lines, clinical trends in tumor samples, and retrospective survival outcome in patients with muscle-invasive bladder cancer (MIBC) who receive cisplatin-containing chemotherapy. CisSig’s performance within cell lines from the Genomics of Drug Sensitivity in Cancer (GDSC) database is shown below, where expression of CisSig genes is higher in more sensitive cell lines (a, below) and a composite CisSig score tends to be higher in sensitive cell lines (b, below).

Visualization of CisSig expression within GDSC Dataset

a. An unclustered heatmap showing gene expression of the CisSig genes (rows) in cell lines (columns) from the top and bottom quintiles of cisplatin IC50. Color of the heatmap represents the Z-score of gene expression, normalized to each gene. Cell lines denoted as sensitive (right, yellow bar) tend to display higher expression of CisSig genes than cell lines denoted as resistant (left, green bar). Z-scores above 2.5 are displayed as 2.5, and Z-scores below -2.5 are displayed as -2.5. b. Violin plots comparing the distribution of CisSig scores between the cell lines in the highest and lowest quintiles of cisplatin IC50. A Wilcoxon rank-sum test found that the median CisSig scores of these two cohorts were significantly different (p < 0.001).
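The per-gene normalization and cohort selection described in this caption can be sketched roughly as follows. This is a minimal illustration rather than the code used in the paper: the function names are hypothetical, the use of the population standard deviation is an assumption, and the quintiles are approximated by taking the lowest and highest 20% of cell lines ranked by IC50 (lower IC50 meaning greater sensitivity).

```python
from statistics import mean, pstdev


def clipped_z_scores(values, limit=2.5):
    """Z-score one gene's expression across cell lines, then clip to [-limit, limit].

    Assumption: population standard deviation; the paper may use the sample version.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A gene with constant expression carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [max(-limit, min(limit, (v - mu) / sigma)) for v in values]


def extreme_quintiles(ic50_by_line):
    """Return (sensitive, resistant) cell-line names.

    Sensitive = lowest 20% of IC50 values, resistant = highest 20%.
    """
    ranked = sorted(ic50_by_line, key=ic50_by_line.get)
    k = max(1, len(ranked) // 5)
    return ranked[:k], ranked[-k:]


# Hypothetical expression values for one gene across ten cell lines.
expression = [1.0] * 9 + [100.0]
z = clipped_z_scores(expression)

# Hypothetical IC50 values keyed by made-up cell-line names.
ic50 = {f"line{i}": float(i) for i in range(1, 11)}
sensitive, resistant = extreme_quintiles(ic50)
```

Clipping only affects how extreme values are colored in the heatmap; a statistical comparison such as the rank-sum test in panel b would be run on the unclipped signature scores.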

Next, we can see below that CisSig expression within disease sites tends to correlate with clinical practice. Here, groups that have higher CisSig expression (greater predicted sensitivity to the drug) are more likely to have cisplatin included in their standard-of-care treatment plans.

Cancer subtypes with greater CisSig expression tend to have cisplatin included in standard of care guidelines.

Cancer subtypes are ranked by median CisSig score in three datasets: GDSC (left), TCGA (middle), and TCC (right). The color of each violin plot represents the rank of the cancer subtype. The ranks of the subtypes shared between each pair of datasets are compared with Spearman’s rank correlation, reported with correlation r and p-value. Rank correlation r is 0.78 (p = 0.0002) between the GDSC and TCGA datasets and 0.92 (p ≪ 0.0001) between the GDSC and TCC datasets. Rank correlation r between the TCGA and TCC datasets is 0.93 (p ≪ 0.0001). Violin plots display the distribution of CisSig scores for each cancer subtype. Within each violin, a boxplot denotes the median signature score for each subtype (middle horizontal line) and the 25th/75th percentiles of signature scores (box edges). Numbers to the left of each violin plot represent the sample size included in each cancer subtype. Created with BioRender.com.
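As a rough illustration of the comparison in this caption, Spearman’s rank correlation can be computed by ranking the subtype medians within each dataset and then taking the Pearson correlation of those ranks. The sketch below is a plain-Python version under that standard definition; the subtype names and median scores are made up for the example and are not values from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


# Hypothetical median signature scores for subtypes present in both datasets.
dataset_one = {"subtype_a": 0.2, "subtype_b": 0.5, "subtype_c": 0.9}
dataset_two = {"subtype_a": 0.1, "subtype_b": 0.8, "subtype_c": 0.6}

shared = sorted(set(dataset_one) & set(dataset_two))
r = spearman_r([dataset_one[s] for s in shared], [dataset_two[s] for s in shared])
```

Only subtypes present in both datasets enter the calculation, which mirrors the caption’s restriction to intersecting subtypes.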

Finally, we demonstrate CisSig performance retrospectively by predicting survival outcomes in patients with MIBC who were treated with cisplatin-containing chemotherapy. Patients predicted to be low risk using a CisSig-trained model had markedly improved survival outcomes in independent cohorts of patients. The other important detail seen in the figure below is that these trained models do not predict survival in patients who didn’t receive cisplatin. This means that the models aren’t simply prognosticating which patients have more aggressive tumors. Instead, they are more likely to be predicting response to treatment. Notably, the group sizes for these cohorts are small (~16-22 patients in each), which means we can’t draw grand conclusions about the translation of this signature to this patient population.

Take home message and future directions

Genetically distant organisms can independently evolve similar traits (convergent phenotypes) in order to increase fitness in their distinct environments. In cancer, therefore, we cannot ignore the possibility that different mutations may lead to the same drug response phenotype. Our novel method groups convergent phenotypes and uses expression profiling to better predict drug response in cancer. We harnessed the power of a large cell line repository (GDSC) to extract CisSig, with potential for use in predicting cisplatin response in epithelial-origin tumors.

 With CisSig, we demonstrate the following:

  • CisSig is predictive of drug response in cell lines using a variety of modeling methods
  • Elevated expression of CisSig within a disease site correlates with regular use of cisplatin in standard-of-care treatments
  • A CisSig-trained MIBC model can predict survival outcomes in a novel MIBC dataset for patients who received cisplatin-containing chemotherapy

Validation of CisSig’s use in MIBC remains preliminary due to the relatively small sample sizes used to build our models. More complete validation of CisSig in MIBC will require robust analysis with a greater number of samples, which is a key future direction of this work. Additional future directions include assessing CisSig’s utility in other disease sites (e.g., HPV+ head and neck cancer or ovarian cancer). Finally, expanding this methodology to predict response to combination chemotherapy will improve its clinical utility even further, as this is how most chemotherapy is administered in practice today.

CisSig-trained model is predictive in patients who have received cisplatin, but lacks signal in patients who have not received cisplatin.

a. Schematic description of model training and testing, where a model is trained using patients from Dataset A who did receive cisplatin-containing treatment. Testing of the trained model is done using patients from Dataset A who did not receive cisplatin-containing treatment and patients from Dataset B who did receive cisplatin-containing treatment. b. Test samples that did receive cisplatin-containing treatment are separated into “high risk” and “low risk” groups based on the model’s predictions using a median cutoff. Kaplan-Meier curves show a significant separation between the two groups. c. The same analysis shown in b, using an optimal cutpoint (determined by chi-square statistic) instead of the median to separate the cohorts. d-e. The same analyses shown in b-c, separating the samples into “high”, “middle”, and “low risk” groups using tertiles and the optimal two cutpoints, respectively. f-g. The same analyses shown in b and d, using samples from Dataset A that did not receive cisplatin-containing treatment, demonstrating no significant separation between the groups. Created with BioRender.com.
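For readers unfamiliar with the analysis in panel b, the sketch below shows one way to split patients at the median predicted risk and compute a Kaplan-Meier survival estimate for a group. It is an illustration under stated assumptions, not the pipeline used in the paper: the function names and all data are hypothetical, scores above the median are assumed to mean "high risk", and the significance test between curves is omitted.

```python
from statistics import median


def split_by_median(risk_scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median.

    Assumption: a larger model score means higher predicted risk.
    """
    cut = median(risk_scores)
    return ["high" if s > cut else "low" for s in risk_scores]


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time per patient
    events : truthy if the event (death) was observed, falsy if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time point.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical predicted risk scores for four patients.
groups = split_by_median([0.1, 0.9, 0.4, 0.6])

# Hypothetical follow-up times (months) and event indicators (1 = death, 0 = censored).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice one such curve would be computed per risk group and the curves compared; with cohorts of roughly 16-22 patients, each step of the curve reflects very few events.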

  1. Hirsch, F. R., Scagliotti, G. V., Mulshine, J. L., Kwon, R., Curran, W. J., Wu, Y. L., & Paz-Ares, L. (2017). Lung cancer: current therapies and new targeted treatments. The Lancet, 389(10066), 299-311.
  2. Solomon, B. J., Mok, T., Kim, D. W., Wu, Y. L., Nakagawa, K., Mekhail, T., ... & Blackhall, F. (2014). First-line crizotinib versus chemotherapy in ALK-positive lung cancer. N Engl J Med, 371, 2167-2177.
  3. Prasad, V., De Jesús, K., & Mailankody, S. (2017). The high price of anticancer drugs: origins, implications, barriers, solutions. Nature Reviews Clinical Oncology, 14(6), 381-390.
  4. Haslam, A., Kim, M. S., & Prasad, V. (2021). Updated estimates of eligibility for and response to genome-targeted oncology drugs among US cancer patients, 2006-2020. Annals of Oncology, 32(7), 926-932.
  5. Sparano, J. A., Gray, R. J., Makower, D. F., Pritchard, K. I., Albain, K. S., Hayes, D. F., ... & Sledge Jr, G. W. (2018). Adjuvant chemotherapy guided by a 21-gene expression assay in breast cancer. New England Journal of Medicine, 379(2), 111-121.
  6. Soliman, H., Shah, V., Srkalovic, G., Mahtani, R., Levine, E., Mavromatis, B., ... & Audeh, W. (2020). MammaPrint guides treatment decisions in breast cancer: results of the IMPACt trial. BMC Cancer, 20, 1-13.
