Prediction of breast cancer proteins involved in immunotherapy, metastasis, and RNA-binding using molecular descriptors and artificial neural networks

Consensus strategy, OncoOmics, and artificial intelligence to beat Breast Cancer.


Breast cancer (BC) is the leading cause of cancer-related death among women and the most commonly diagnosed cancer worldwide [1]. BC is a heterogeneous disease and a significant health problem, characterized by an intricate interplay between environmental determinants, metabolic abnormalities, signaling pathway alterations, gene expression deregulation, genomic alterations, and ethnicity [2,3]. Scientific advances made to date mark the era called the ‘end of the beginning’ of cancer omics [4]. In other words, each large-scale omics analysis needs to be understood as part of a complex network in order to clarify BC pathogenesis, apply precision medicine in clinical oncology, and discover novel therapeutic targets using artificial intelligence.

In our first study, we developed a Consensus Strategy that proved highly efficient at recognizing gene-disease associations. The main objective was to apply several bioinformatics methods to explore BC pathogenic genes [2]. In our second study, we validated these genes against experimental databases of BC patients following our OncoOmics strategy. The main objective was to analyze genomic alterations, signaling pathways, the protein-protein interactome network, protein expression, BC dependencies, and mutations for precision medicine. As a result, we identified 3,500 oncogenic mutations and 140 proteins highly associated with BC [5].

The prediction of proteins involved in BC is a trending topic in drug design. We used these 140 essential proteins to predict BC proteins involved in immunotherapy, metastasis, and RNA-binding using molecular descriptors and artificial neural networks [6]. To that end, we proposed an accurate prediction classifier for BC proteins using six sets of protein sequence descriptors and 13 machine-learning methods. After applying univariate feature selection to the mix of five descriptor families, the best classifier was obtained with the multilayer perceptron method (an artificial neural network) and 300 features. Lastly, 1,232 cancer immunotherapy proteins (CIPs), 1,903 metastasis driver proteins (MDPs), and 1,369 RNA-binding proteins (RBPs) were screened with the final prediction model.
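To make the descriptor and feature-selection steps more concrete, here is a minimal sketch in plain Python. It is not the pipeline from the paper: this post does not name the descriptor families, so amino-acid composition is used as an assumed, illustrative sequence descriptor, and a one-way ANOVA F score stands in as one common univariate selection criterion. The sequences, labels, and function names (`aa_composition`, `f_score`, `select_top_k`) are hypothetical, and the multilayer perceptron itself is not reproduced.

```python
from collections import Counter
from statistics import mean

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = mean(values)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    if within == 0:
        # Perfect separation (or a constant feature): avoid division by zero.
        return float("inf") if between > 0 else 0.0
    return between / within


def select_top_k(feature_rows, labels, k):
    """Score every feature independently and return the indices of the k best."""
    n_features = len(feature_rows[0])
    scores = [f_score([row[j] for row in feature_rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: two lysine-rich and two acidic peptides (not real BC proteins).
toy_sequences = ["KKKRKAAL", "RKKKGALK", "DDEEDLAG", "EEDDEGAL"]
toy_labels = [1, 1, 0, 0]

X = [aa_composition(s) for s in toy_sequences]
top = select_top_k(X, toy_labels, k=3)
print([AMINO_ACIDS[j] for j in top])
```

In a setting like the one described above, the same idea would be applied to a much larger descriptor matrix, keeping the top 300 features before training the classifier. Scoring each feature independently is cheap, but it ignores interactions between features, which is a known trade-off of univariate selection.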

CIPs are promising in clinical oncology owing to successful, durable long-term responses in advanced stages [7]. According to our machine-learning predictions, the 10 CIPs best related to BC were RPS27, SUPT4H1, CLPSL2, POLR2K, RPL38, AKT3, CDK3, RPS20, RASL11A, and UNTD1. Metastasis, often preceded or accompanied by therapeutic resistance, is the most lethal and insidious aspect of cancer. Due to treatment pressure or tumor evolution, the genomic alterations of metastatic tumors can differ substantially from those of primary tumors [8]. The 10 MDPs best related to BC were S100A9, DDA1, TXN, PRNP, RPS27, S100A14, S100A7, MAPK1, AGR3, and NDUFA13. Lastly, RNA biology is an under-investigated field of cancer research, even though pleiotropic changes in the transcriptome are a key feature of cancer cells [9]. RBPs can control every aspect of RNA metabolism, such as translation, splicing, stability, degradation of mRNA, nucleocytoplasmic transport, capping, and polyadenylation [10]. The 10 RBPs best related to BC were S100A9, TXN, RPS27L, RPS27, RPS27A, RPL38, MRPL54, PPAN, RPS20, and CSRP1.

In conclusion, our research on BC over recent years has led to the discovery of new therapeutic targets to improve precision medicine in clinical oncology, giving our patients more effective treatments and a better quality of life.


1. Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA. Cancer J. Clin. (2018) doi:10.3322/caac.21492.

2. López-Cortés, A. et al. Gene prioritization, communality analysis, networking and metabolic integrated pathway to better understand breast cancer pathogenesis. Sci. Rep. 8, 16679 (2018).

3. Guerrero, S. et al. Analysis of Racial/Ethnic Representation in Select Basic and Applied Cancer Research Studies. Sci. Rep. 8, 13978 (2018).

4. Ding, L. et al. Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics. Cell 173, 305-320.e10 (2018).

5. López-Cortés, A. et al. OncoOmics approaches to reveal essential genes in breast cancer: a panoramic view from pathogenesis to precision medicine. Sci. Rep. 10, 5285 (2020).

6. López-Cortés, A. et al. Prediction of breast cancer proteins involved in immunotherapy, metastasis, and RNA-binding using molecular descriptors and artificial neural networks. Sci. Rep. 10, 8515 (2020).

7. Finotello, F., Rieder, D., Hackl, H. & Trajanoski, Z. Next-generation computational tools for interrogating cancer immunity. Nat. Rev. Genet. (2019) doi:10.1038/s41576-019-0166-7.

8. Angus, L. et al. The genomic landscape of metastatic breast cancer highlights changes in mutation and signature frequencies. Nat. Genet. (2019) doi:10.1038/s41588-019-0507-7.

9. García-Cárdenas, J. M. et al. Post-transcriptional Regulation of Colorectal Cancer: A Focus on RNA-Binding Proteins. Front. Mol. Biosci. (2019) doi:10.3389/fmolb.2019.00065.

10. Guerrero, S. et al. In silico analyses reveal new putative Breast Cancer RNA-binding proteins. bioRxiv (2020) doi:10.1101/2020.01.08.898965.





Andrés López-Cortés

Principal Investigator, Universidad UTE
