Image-based consensus molecular subtype classification (imCMS) of colorectal cancer using deep learning


Korsuk Sirinukunwattana1, Enric Domingo1, Susan Richman2, Keara Redmond3, Andrew Blake1, Clare Verrill1, Simon Leedham1, Katerina Chatzipli4, Claire Hardy4, Celina Whalley5, Chieh-Hsi Wu1, Andrew Beggs5, Ultan McDermott6, Philip Dunne3, Angela Meade7, Steven Walker8, Graeme Murray9, Leslie Samuel10, Matt Seymour11, Ian Tomlinson5, Philip Quirke2, Tim Maughan1, Jens Rittscher1, Viktor Koelzer12
1University of Oxford, 2University of Leeds, 3Queen's University Belfast, 4Wellcome Sanger Institute, 5University of Birmingham, 6AstraZeneca, 7University College London, 8Almac Group Ltd, 9University of Aberdeen, 10NHS, 11NIHR, 12University of Zürich



Complex phenotypes captured on histological slides reflect the biological processes at play in an individual cancer, but their link to the underlying molecular classification has not been clarified or systematized. In colorectal cancer (CRC), histological grading is of little clinical value, and consensus molecular subtypes (CMS) cannot be distinguished without cohort-based gene expression profiling. We hypothesize that the histological phenotype reflects the transcriptional classification, and that image analysis is a cost-effective tool to capture complex features of tissue organization, to call the CMS subtype of an individual case, and to resolve unclassifiable or heterogeneous cases.


In this study, we present a deep learning approach that predicts CRC CMS from standard H&E sections using the information provided by tissue morphology. The study was performed using 1,553 tissue sections with comprehensive multi-omic data from three independent datasets (the MRC FOCUS trial of 380 stage 4 CRC, 205 rectal cancer biopsies, and The Cancer Genome Atlas, TCGA).


Image-based consensus molecular subtyping (imCMS) reached an AUC of 0.88 for CMS class prediction in the FOCUS training cohort and accurately classified whole-slide images in unseen datasets from TCGA (n=366 slides, AUC 0.82) and pre-operative rectal cancer biopsies (n=205 slides, AUC 0.85). The imCMS classification spatially resolved intra-tumoural heterogeneity, providing accurate secondary calls with higher discriminatory power than bioinformatic prediction. In all three cohorts, imCMS classified samples that were previously unclassifiable by RNA expression profiling. The imCMS classification reproduced the expected correlations with genomic and epigenetic alterations in CRC and effectively stratified CRC patients into prognostic subgroups using H&E sections only.
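
As an illustration of how a slide-level call and a secondary call could be derived from spatially resolved predictions, the sketch below aggregates per-tile class probabilities by majority vote. The abstract does not describe the aggregation used in imCMS, so the tile-level input, the majority-vote rule, and the function name `call_slide` are all assumptions made for illustration only.

```python
from collections import Counter

CMS_CLASSES = ("CMS1", "CMS2", "CMS3", "CMS4")


def call_slide(tile_probs):
    """Aggregate per-tile CMS probabilities into slide-level calls.

    Hypothetical helper, not the published imCMS method.
    tile_probs: list of 4-element probability lists, one per image tile,
    ordered as in CMS_CLASSES.
    Returns (primary_call, secondary_call, class_fractions).
    """
    if not tile_probs:
        raise ValueError("at least one tile is required")

    # Assign each tile to its most probable class and count the votes.
    votes = Counter()
    for probs in tile_probs:
        best = max(range(len(CMS_CLASSES)), key=lambda i: probs[i])
        votes[CMS_CLASSES[best]] += 1

    # Fraction of tiles assigned to each class: a crude summary of
    # intra-tumoural heterogeneity on the slide.
    total = len(tile_probs)
    fractions = {c: votes[c] / total for c in CMS_CLASSES}

    # Rank classes by tile fraction; ties keep the CMS_CLASSES order.
    ranked = sorted(CMS_CLASSES, key=lambda c: fractions[c], reverse=True)
    primary = ranked[0]
    # Report a secondary call only if some tile actually voted for it.
    secondary = ranked[1] if fractions[ranked[1]] > 0 else None
    return primary, secondary, fractions


if __name__ == "__main__":
    # Made-up tile probabilities for a single slide.
    tiles = [
        [0.10, 0.70, 0.10, 0.10],
        [0.10, 0.60, 0.20, 0.10],
        [0.10, 0.20, 0.10, 0.60],
        [0.05, 0.80, 0.10, 0.05],
    ]
    print(call_slide(tiles))
```

In this toy input, three of four tiles favour CMS2 and one favours CMS4, so the slide would receive a primary CMS2 call with a secondary CMS4 call. A probability-averaging rule would be an equally plausible alternative under the same assumptions.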


This study shows that, with state-of-the-art image analysis techniques, RNA expression classifiers, including those mediated through epithelial morphology (CMS2 vs CMS3), can be recognised and called from routine H&E images, opening the door to simple, cheap and reliable biological stratification within routine workflows.