RCSB Protein Data Bank Tools for 3D Structure-guided Cancer Research: Human Papillomavirus (HPV) Case Study

The RCSB Protein Data Bank provides ready access to 168,000 public domain structures of biomolecules, empowering researchers and the general public to advance and understand the structural biology of cancer.

Figure 1. Structure of the Human Papillomavirus Type 16 capsid (red and orange) covered with antibody Fab molecules (blue), as determined by cryo-electron microscopy. This structure (PDB ID 6bt3) and many others are available in the PDB archive at RCSB.org. Image from the Molecule of the Month feature on Human Papillomavirus and Vaccines.

We are in the midst of a revolution in our structural knowledge of the inner workings of cells and the pathogens that invade them. This information explosion reflects dramatic improvements in the methods used to determine three-dimensional (3D) atomic-level structures of biomolecules, including streamlined pipelines for determining crystallographic structures using bright synchrotron X-ray sources and breakthrough developments in atomic-resolution cryo-electron microscopy. The Protein Data Bank (PDB) is the single global archive providing open access to this exponentially growing corpus of data, for use by researchers, educators, and the general public.

Cancer research is one of the major drivers of this revolution in structural biology. Atomic structures are used to understand the basic processes underlying malignant transformation of human cells, from structural understanding of UV-induced mutagenesis to the molecular mechanisms of regulation by oncogenes. 3D structures also reveal Achilles’ heels in these processes that may be exploited for targeted pharmacologic intervention. Structure-guided approaches contributed to the discovery and development of more than 70% of the new small-molecule anti-neoplastic drugs approved by the US Food and Drug Administration between 2010 and 2018.

The field of structural biology, however, suffers from an “embarrassment of riches,” and it is often difficult for new users to navigate the more than 168,000 public domain structures in the PDB and extract relevant information. The US-funded RCSB Protein Data Bank, a founding member of the Worldwide Protein Data Bank partnership, is tasked with gathering, curating, annotating, and disseminating this information, and with streamlining its use in the fight against cancer and in many other applications. In this paper, we describe easy-to-use tools that are available at the RCSB PDB Web Portal (RCSB.org) for helping users access the information they need and use it in their research. The RCSB PDB education and outreach website (pdb101.rcsb.org) also provides valuable information about cancer for oncologists, cancer researchers, and patients and their families. We use the human papillomavirus as a case study to demonstrate how PDB data may be accessed and applied, including methods for querying structures that span the HPV proteome, multiple methods for visualizing structures, and methods for exploring mutations of these proteins and their interactions with cellular host proteins.

RCSB PDB is committed to lowering the barrier for access to this seminal archive of information, empowering structural biologists and the larger research community to advance research into basic cancer biology, identify promising drug discovery targets, and establish new treatment paradigms for cancer. In "RCSB Protein Data Bank tools for 3D structure-guided cancer research: human papillomavirus (HPV) case study", we describe the tools that are available for finding, exploring and analysing this data, using human papillomavirus as a case study.

RCSB PDB is funded by the National Science Foundation (DBI-1832184), the US Department of Energy (DE-SC0019749), and the National Cancer Institute, National Institute of Allergy and Infectious Diseases, and National Institute of General Medical Sciences of the National Institutes of Health under grant R01GM133198.

David Goodsell

Professor of Computational Biology, The Scripps Research Institute and RCSB Protein Data Bank, Rutgers University