What do advances in sequencing technology mean for research in cancer genomics?
We’ve known for a long time that cancer is a disease of the genome. We’ve also known that applying insights from genomic sequencing to cancer studies can unlock its highly complex and variable pathology. The knowledge gained from sequencing data enables oncologists to dive deeper into the complicated biology of tumors and tumorigenesis and gives researchers a deeper understanding of cancer at the genomic and transcriptomic level. Yet historically, embedding genomics widely into cancer research has been challenging because of limitations in sequencing technology. But now, a wave of innovations looks set to fuel new breakthroughs in cancer research, diagnostics, and treatment.
Where are we today?
Each of the two main approaches to genomic sequencing employed today – long- and short-read – has its own applications for different cancer scenarios. High-throughput short-read sequencing is used for applications where the depth and length of long reads are not necessary, such as single-nucleotide polymorphism (SNP) calling or sequencing short microRNAs. This short-read data is useful for tracking residual disease, ongoing cancer screening, and, in some cases, early detection of cancer. In contrast, long reads are typically kilobases long, which allows them to cover challenging variant types, such as large structural variants and tandem repeats. The accuracy and completeness of long reads afford a deeper understanding of individual cancers and enable a precision oncology approach.
To date, long-read sequencing has had lower throughput, making it more challenging to incorporate into large-scale cancer studies. Yet, without the ability to gather whole genome sequences from significant numbers of study participants, researchers are unable to identify trends in genetic change and discover new biomarkers associated with risk of cancer. For example, structural variants (genomic differences of ≥50 base pairs) are a main driver of cancer but are too large to be reliably discovered with short-read sequencing – and that is true of both germline variants inherited from parents and somatic variants triggered by environmental factors, such as damaging UV rays.
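To make the ≥50 base pair threshold concrete, here is a minimal Python sketch that labels a variant by the length difference between its reference and alternate alleles. The function name and the category labels are illustrative assumptions, not a standard API, and the sketch only covers insertions and deletions; other structural variant types, such as inversions and translocations, cannot be recognized from allele length alone.

```python
# Threshold quoted in the article: structural variants are >= 50 base pairs.
SV_MIN_LENGTH_BP = 50


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant by the size difference between its alleles.

    Hypothetical helper for illustration only: it handles insertions and
    deletions, and treats equal-length alleles as substitutions.
    """
    size = abs(len(alt) - len(ref))
    if size >= SV_MIN_LENGTH_BP:
        return "structural variant"
    if size > 0:
        return "small indel"
    if len(ref) == 1:
        return "single-nucleotide variant"
    return "multi-nucleotide substitution"
```

Under these assumptions, a 60-base insertion is labeled a structural variant, while a two-base deletion is labeled a small indel.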
Multiomics at scale
The good news? Developments in long-read sequencing have made the technology increasingly accessible. Overall, the cost of long-read sequencing has come down and throughput has increased. A single whole genome sequencing machine can now deliver more than 1,300 human genomes per year, with reduced sample sizes, far fewer consumables, and exceptional accuracy. As a result, there is less need to batch samples, which improves time-to-result – so labs will no longer need to choose between cost and turnaround speed. In short, we’ve unlocked the possibility of the $1,000 genome with a 24-hour turnaround for patients.
Integrating long-read sequencing into large cancer studies is also now feasible. We’ve already seen the benefits of such innovation in breast cancer; one study that employed long-read data revealed 3,059 breast tumor-specific splicing events, including 35 that are significantly associated with patient survival (1).
Another significant evolution in long-read sequencing is the ability to gain insights into the epigenome in a single experiment. Previously, multiple tests were required to evaluate epigenomic changes, such as methylation, but it is now possible to capture both genetic and epigenetic variation together. This ability has noteworthy implications for oncology research because many genetic changes related to cancer show up in the methylation layer first (2). Understanding subtle patterns in this rich source of information will uncover new opportunities for diagnosing specific cancers before solid tumors begin to grow.
Short-read sequencing still has a seat at the table
In addition to long-read innovation, progress in the sensitivity and specificity of short-read sequencing is enabling further oncology use cases. Highly accurate modern short-read systems produce results with far fewer errors in read data, reducing the number of false positives while increasing biological insight. Improved short-read sensitivity also lowers detection limits, allowing alleles present at lower frequencies to be detected in samples. This increased sensitivity is particularly important in scaling the use of liquid biopsy in cancer – a far less invasive sampling method than the tissue biopsies commonly preserved as Formalin-Fixed Paraffin-Embedded (FFPE) samples. Such innovations will accelerate the development of diagnostic tools to improve therapy selection, recurrence monitoring, and early detection.
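The link between detection limits and low-frequency alleles can be sketched with some simple arithmetic. The Python example below computes a variant allele frequency from read counts and compares it with a detection limit. The function names, the read counts, and the limits are hypothetical values chosen for illustration; they do not describe any particular instrument or assay.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def is_detectable(alt_reads: int, total_reads: int, detection_limit: float) -> bool:
    """True if the observed allele frequency meets the assay's detection limit."""
    return variant_allele_frequency(alt_reads, total_reads) >= detection_limit


# Hypothetical liquid biopsy example: 5 variant reads among 10,000 reads.
vaf = variant_allele_frequency(5, 10_000)
print(f"Variant allele frequency: {vaf:.4%}")
print("Detectable at a 1% limit:", is_detectable(5, 10_000, 0.01))
print("Detectable at a 0.01% limit:", is_detectable(5, 10_000, 0.0001))
```

In this illustrative case, the same variant is missed by an assay with a 1% detection limit but caught by one with a 0.01% limit, which is why lower detection limits matter for samples where tumor-derived DNA is scarce.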
Finally, contemporary sequencing machines, both short- and long-read, are increasingly backed by massive computing power and use advanced AI and deep learning techniques. Combining AI methods with genomics technology improves both the accuracy and yield of a single experiment, resulting in more precise identification of genetic variants. This enables deeper analysis of large data sets, helping unlock patterns that will give us greater insight into the pathology of cancer and its progression at a molecular level.
A sequencing step change
The pace of oncology research is about to go through another step change thanks to rapid innovation in sequencing technology. Promising new applications that will benefit patients are coming closer to fruition. We know that health outcomes are poorer once cancer reaches stage 3 or 4; here, a deeper understanding of biomarkers and genetic changes associated with specific cancers will edge us closer to the goal of diagnosing at the earliest possible stage and informing precision oncology treatment decisions.
The powerful combination of highly accurate short- and long-read sequencing will unlock the pathology of this variable and complex disease, empowering researchers to work towards better outcomes for patients.
Neil Ward is General Manager EMEA at PacBio, based in Reading, UK.