The preliminary results from the 100,000 Genomes Project: the genomic landscape of colorectal cancer




William Cross1, Giulio Caravagna2, Jacob Houseman1, Alona Sosinsky3, Nirupa Murugaesu3, Daniel Chubb2, Alexander Cornish2, Anna Frangou4, John Ambrose3, David Wedge4, Richard Houlston2, Andrea Sottoriva2, Trevor Graham1, Ian Tomlinson5
1 Queen Mary University, Barts Cancer Institute; 2 The Institute of Cancer Research; 3 Genomics England; 4 The Big Data Institute, University of Oxford; 5 Institute of Cancer and Genomic Sciences, Birmingham University



The 100,000 Genomes Project, a hybrid research and NHS transformation initiative delivered by Genomics England in partnership with NHS England, aims to facilitate research on a large number of human conditions, including more than 20 types of cancer. Here, we present the preliminary analysis of 1000 colorectal cancer genomes, which together represent the most comprehensive dataset of its kind in the world.


We performed bioinformatic processing and analysis using several tools. The Illumina ISAAC workflow was used in combination with Strelka, Starling and Sequenza to obtain somatic and germline mutation calls. The Ensembl Variant Effect Predictor (VEP) was used to annotate and score mutations, and we referred to TCGA, COSMIC and other sources to identify known driver mutations and to compare our results with those published so far.


The current cohort comprises 1077 patients with 400 detailed clinical annotations and at least 120X whole-genome sequencing of each primary and normal pair. Hierarchical clustering of the mutations present in 264 known driver genes (defined by COSMIC as tier 1 epithelial) revealed a multitude of differing genotypes, including clusters dominated by several rarer Wnt signalling disruptions and by mutations in SMAD4, FBXW7, PIK3CA, FAT4 and HLA-A (the last two having unknown etiology in colorectal cancer). Notably, although the canonical genes APC, KRAS and TP53 were the most frequently mutated, only a small fraction of cancers carried mutations in all three. Analysis of karyotypes revealed a core set of copy number aberrations that is mostly independent of the associated drivers.
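The kind of driver-gene analysis described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the toy data, the function names, the Jaccard distance and the average-linkage rule are all assumptions chosen for illustration. It computes per-gene mutation frequencies, the fraction of tumours mutated in all of a given gene set, and a simple agglomerative (hierarchical) clustering of tumours by their driver-gene profiles.

```python
from itertools import combinations

# Hypothetical toy data: tumour ID -> set of mutated driver genes.
# These values are invented for illustration and are NOT cohort results.
TUMOURS = {
    "T1": {"APC", "KRAS", "TP53"},
    "T2": {"APC", "TP53"},
    "T3": {"APC", "KRAS"},
    "T4": {"SMAD4", "FBXW7"},
    "T5": {"SMAD4", "FBXW7", "PIK3CA"},
    "T6": {"APC", "TP53", "PIK3CA"},
}


def gene_frequencies(tumours):
    """Return the fraction of tumours carrying a mutation in each gene."""
    counts = {}
    for genes in tumours.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: n / len(tumours) for gene, n in counts.items()}


def fraction_with_all(tumours, genes):
    """Return the fraction of tumours mutated in every gene of `genes`."""
    required = set(genes)
    hits = sum(1 for mutated in tumours.values() if required <= mutated)
    return hits / len(tumours)


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty profiles are treated as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def average_linkage_clusters(tumours, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in sorted(tumours)]

    def dist(c1, c2):
        # Average-linkage: mean pairwise distance between cluster members.
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(tumours[x], tumours[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


if __name__ == "__main__":
    print(gene_frequencies(TUMOURS))
    print(fraction_with_all(TUMOURS, ["APC", "KRAS", "TP53"]))
    print(average_linkage_clusters(TUMOURS, 2))
```

On real data, a library routine such as scipy.cluster.hierarchy.linkage applied to a binary tumour-by-gene matrix would normally replace the hand-written loop; the pure-Python version is used here only to keep the sketch self-contained.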


Colorectal cancers can be driven by a highly varied set of driver mutations, and in these preliminary results we point to several new features that become apparent only through large-scale analysis of whole genomes. This suggests that the true genetic landscape of epithelial cancers is vast.