Cancer detection using cell-free RNA

You may not have heard of cell-free nucleic acids - the DNA and RNA molecules that float freely outside of cells in your blood - but you or one of your relatives has likely used a product based on cell-free DNA (cfDNA). Non-Invasive Prenatal Testing (NIPT), which analyzes the blood of pregnant women to detect genetic abnormalities in a developing fetus, is a cfDNA test that has become a staple of prenatal care. Furthermore, it was the observation of cancer signals in NIPT samples from pregnant women by Drs. Diana Bianchi and Meredith Halks-Miller¹ that led to the development of a new category of cfDNA diagnostic tests: the detection of cancer from a blood draw. While cfDNA has become a household name in the field of molecular diagnostics, fewer people may have heard of its free-floating counterpart: cell-free RNA (cfRNA).

In 2000, Dr. Dennis Lo and colleagues identified fetal Y-chromosome RNA in the plasma of pregnant women carrying male fetuses². They followed up on this observation in 2003, when they identified the mRNA of two pregnancy-related hormones - human placental lactogen (hPL) and human chorionic gonadotropin (hCG) - in the plasma of pregnant women³. Both of these hormones are synthesized exclusively in the placenta (hCG protein is the analyte detected in a home pregnancy test), so finding their mRNA in plasma was an early indication that RNA molecules in circulation could originate from solid tissues in the body and survive long enough to be quantified in a blood draw. These observations led researchers to posit that circulating RNA molecules could serve as tissue-specific biomarkers of other biological processes, including the development of cancer.

Studies of cfRNA prior to the next-generation sequencing (NGS) era were largely hypothesis-driven. Scientists anticipated that certain RNA transcripts might be present in the plasma, and then designed PCR primers to detect these transcripts. By 2011, NGS had become affordable enough for academic labs, and I embarked on a new data-driven approach to study cfRNA as a graduate student in Stephen Quake’s lab at Stanford. We recruited pregnant women and collected their blood at each trimester and postpartum. We sequenced all RNA molecules in their plasma samples and found not only mRNA transcripts from protein-coding genes, but also several types of non-coding RNA. This work identified a comprehensive list of genes - most of which are placenta-specific⁴ - that correlated with pregnancy progression, and it paved the way for the use of cfRNA as a biomarker for preterm birth⁵.

In March of 2016, I was preparing the findings for my Ph.D. defense when I heard about GRAIL, a spin-off from Illumina. I was instantly drawn to its mission “to detect cancer early, when it can be cured.” As a scientist working on cfRNA-based prenatal diagnosis for five years, I always wondered if cfRNA could be used as a biomarker for early cancer detection. GRAIL seemed like a perfect place to explore this hypothesis, and I leapt at the opportunity to join.

At that time, cancer cfRNA was still uncharted territory. There had been no systematic analysis of plasma cfRNA in cancer patients. As part of the Circulating Cell-free Genome Atlas study (NCT02889978), we sequenced whole transcriptomes from the plasma of 46 breast cancer patients, 30 lung cancer patients, and 89 non-cancer participants. There were 18,256 transcripts present in non-cancer plasma (32% of 57,820 annotated genes), the majority of which were released by healthy blood cells. Searching for cancer-specific RNA transcripts in this background of healthy RNA is like looking for a needle in a haystack, so we focused on transcripts that were absent in non-cancer plasma but present in cancer plasma. We call these transcripts “dark-channel biomarkers” because they were unexpressed (i.e., “dark”) in non-cancer samples. Instead of a needle in a haystack, a dark-channel biomarker is like a needle lying next to the haystack.
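To make the dark-channel idea concrete, here is a minimal sketch of the filtering step: keep genes that are undetected in every non-cancer sample but detected in some cancer samples. The function name, the thresholds, and the toy RPM values are all assumptions for illustration; they are not the criteria or data used in the study.

```python
def dark_channel_candidates(non_cancer_rpm, cancer_rpm,
                            dark_max=0.0, detect_min=0.0,
                            min_cancer_samples=2):
    """Return genes that are 'dark' in non-cancer plasma but detected in cancer plasma.

    non_cancer_rpm, cancer_rpm: dict mapping gene name -> list of reads-per-million
    values, one per sample. All thresholds are illustrative assumptions.
    """
    candidates = []
    for gene, cancer_values in cancer_rpm.items():
        control_values = non_cancer_rpm.get(gene, [])
        # "Dark" means no non-cancer sample exceeds dark_max.
        # A gene with no non-cancer measurements is treated as dark here.
        is_dark = all(value <= dark_max for value in control_values)
        # Count the cancer samples in which the gene is detected.
        n_detected = sum(1 for value in cancer_values if value > detect_min)
        if is_dark and n_detected >= min_cancer_samples:
            candidates.append(gene)
    return sorted(candidates)


# Toy values, invented for illustration only.
non_cancer = {
    "SCGB2A2": [0.0, 0.0, 0.0],        # dark in non-cancer plasma
    "HBB": [5200.0, 4800.0, 5100.0],   # abundant blood-cell transcript
    "GENE_X": [0.0, 0.0, 0.0],         # hypothetical gene name
}
cancer = {
    "SCGB2A2": [0.0, 3.1, 1.4],
    "HBB": [5000.0, 4900.0, 5300.0],
    "GENE_X": [0.0, 0.0, 0.8],
}

candidates = dark_channel_candidates(non_cancer, cancer)
```

Requiring detection in more than one cancer sample is one simple way to avoid keeping a gene on the strength of a single noisy measurement; a real analysis would choose these thresholds from the data.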

We identified 20 dark-channel biomarkers from cancer plasma (Fig. 1), half of which were tissue-specific. Mammaglobin-A (SCGB2A2), for example, is predominantly expressed in breast tissue, and is one of the dark-channel biomarkers detected exclusively in the plasma of breast cancer patients. Similarly, many of the lung cancer dark-channel biomarkers are associated with surfactant proteins, which function to lower surface tension at the air/liquid interface in alveolar cells. We also observed that patients with higher shedding of tumor material into the blood are more likely to have detectable dark-channel biomarkers in their plasma. Based on these observations, we built a statistical model to search for other cancer cfRNA biomarkers. We implemented a generalized linear model that accounted for both gene expression in tumor tissue and tumor shedding rate in plasma. This method led to the identification of four additional cancer cfRNA biomarkers, and 23 biomarkers overall.
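As a rough illustration of how a generalized linear model can combine tumor-tissue expression with tumor shedding, the sketch below fits the simplest possible case: a Poisson GLM with a log link, an intercept only, and an offset equal to the log of (tissue expression × tumor fraction × library size). This is a deliberately simplified stand-in, not the model from the paper; the variable names, the choice of offset, and the toy numbers are all assumptions.

```python
import math


def fit_poisson_offset_glm(counts, exposures):
    """Fit E[count_i] = exp(b0) * exposure_i by maximum likelihood.

    This is a Poisson GLM with log link, intercept b0, and offset log(exposure_i).
    Setting the derivative of the log-likelihood to zero gives the closed form
    exp(b0) = sum(counts) / sum(exposures).
    """
    total_counts = sum(counts)
    total_exposure = sum(exposures)
    if total_exposure <= 0 or total_counts <= 0:
        raise ValueError("need positive total counts and total exposure")
    return math.log(total_counts / total_exposure)


def expected_counts(b0, exposures):
    """Model-predicted read counts for each sample."""
    return [math.exp(b0) * exposure for exposure in exposures]


# Toy inputs for one gene across four cancer plasma samples (invented values).
tissue_expression = 120.0                          # expression in tumor tissue, arbitrary units
tumor_fraction = [0.001, 0.01, 0.05, 0.02]         # assumed per-sample proxy for tumor shedding
library_size_millions = [20.0, 25.0, 22.0, 18.0]   # sequencing depth per sample
observed_reads = [0, 6, 30, 9]

# Exposure: how many reads we would expect, up to a constant, if plasma signal
# scaled with both tissue expression and tumor shedding.
exposures = [tissue_expression * fraction * depth
             for fraction, depth in zip(tumor_fraction, library_size_millions)]

b0 = fit_poisson_offset_glm(observed_reads, exposures)
predicted = expected_counts(b0, exposures)
```

Under this kind of model, a gene that is highly expressed in tumor tissue is expected to appear in plasma mainly in patients with high tumor shedding, which is the pattern described above. A fuller model would add covariates and compare against a background-only fit.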

We discovered that some of these cancer cfRNA biomarkers are not only tissue-specific (originating from either breast or lung tissue), but are also specific to a certain cancer subtype (such as HR+ breast cancer vs triple-negative breast cancer). Tissue-specific markers can help locate the origin of cancer signals in the body, while subtype-specific biomarkers may help guide therapy and predict patient outcome. Numerous cancer biomarkers are reported in the literature every year. Sadly, most of them fail to replicate in larger clinical studies because of factors such as small sample size, inappropriate statistical testing, and differences in sample processing. To validate our findings, we quantified these cfRNA biomarkers in an independent cohort of 38 breast cancer and 18 lung cancer patients, and were reassured to see that the signals for these cancer cfRNA biomarkers were replicable.

Fig. 1 Dark-channel biomarker expression in cell-free RNA from breast cancer, lung cancer, and non-cancer plasma samples. Samples are shown in columns and dark-channel biomarker genes in rows. Cancer type is indicated above the heatmap, and the tissue specificity of each dark-channel biomarker is indicated on the left side of the heatmap. Gene expression values in reads per million (RPM) are shown in purple, and darkness in non-cancer plasma samples is illustrated by the standard deviation of RPM (green gradient bar).

Our work on cfRNA cancer diagnostics was just published in Nature Communications, 20 years after the first discovery of fetal cfRNA in plasma and 10 years after I started working on cfRNA. It has been a long journey, but I hope this work paves the way for future innovations in the use of cfRNA for cancer care.


  1. Bianchi, D. W. et al. Noninvasive Prenatal Testing and Incidental Detection of Occult Maternal Malignancies. JAMA 314, 162–169 (2015).
  2. Poon, L. L. M., Leung, T. N., Lau, T. K. & Lo, Y. M. D. Presence of Fetal RNA in Maternal Plasma. Clin. Chem. 46, 1832–1834 (2000).
  3. Ng, E. K. O. et al. mRNA of placental origin is readily detectable in maternal plasma. Proc. Natl. Acad. Sci. U. S. A. 100, 4748–4753 (2003).
  4. Koh, W. et al. Noninvasive in vivo monitoring of tissue-specific global gene expression in humans. Proc. Natl. Acad. Sci. U. S. A. 111, 7361–7366 (2014).
  5. Ngo, T. T. M. et al. Noninvasive blood tests for fetal development predict gestational age and preterm delivery. Science 360, 1133–1136 (2018).

Wenying Pan

Chief scientist, GeneGenieDX