PIXdb: Exploring the transcriptomic landscape of prostate cancer development and progression



Helen Ross-Adams1, Jacek Marzec1, Stefano Pirro1, Yong-Jie Lu1, Claude Chelala1
1Barts Cancer Institute, QMUL



Prostate cancer (PC) is the second most common cancer diagnosis in men worldwide, and a leading cause of cancer-related death. It is clinically and pathologically heterogeneous, confounding efforts to identify reliable molecular biomarkers for patient management. Several studies have investigated the molecular mechanisms of PC, but none has presented a global view of the transcriptomic landscape linking key molecular events across different disease stages. Next-generation RNA sequencing of primary tumours is now standard, but microarray-based expression data still constitutes the majority of available information on many other clinically valuable samples. Integration of data from both technologies is essential to link distinct disease stages and facilitate translational research.


Using PC as a model, we developed an analytical framework to integrate data from different platforms and disease sub-types. We performed the largest multi-cohort analysis of PC mRNA profiles to date, comprising 1,488 profiles from 2 RNA-seq and 18 array-based datasets across 8 different platforms.


We reconstructed the molecular history of PC to yield the first comprehensive insight into its molecular make-up, by tracking changes in mRNA levels in the transition from normal prostate to high-grade prostatic intraepithelial neoplasia (HGPIN), primary tumour and metastatic disease. We identified established risk genes and molecular pathways known to be enriched at specific stages of disease development. We found HGPIN to be molecularly distinct from primary tumour, and identified nine candidate PC risk genes.


We have developed a robust method to integrate RNA-seq and microarray expression data from disparate sources into one coherent whole, yielding a molecular alteration map of PC development and progression. The collected and integrated data, along with analytical and visualisation tools (principal component analysis, gene expression heatmaps, Pearson correlations, tumour content estimation, gene network and survival analyses), are freely available online at the user-friendly Prostate Integrative Expression Database: www.pixdb.org.uk.