Automated extraction and visualisation of data from the TARGET (Tumour chARacterisation to Guide Experimental Targeted therapy) trial to support interim analysis




Paul O'Regan1, Julie Stevenson1, Donal Landers1, Matthew Krebs2, Richard Hoskins3
1 Cancer Research UK, 2 The Christie NHS Foundation Trust, 3 The University of Manchester



The TARGET trial aims to match patients to early phase clinical trials based on next-generation sequencing (NGS) results from their tumour tissue and circulating tumour DNA (ctDNA). The trial generates large volumes of clinical and sequencing data. Here, we demonstrate the utility of scripted analysis pipelines for the automated extraction and analysis of clinical and genomic data, including the creation of bespoke visualisations of large, complex datasets.


Cancer-associated gene panels were sequenced by NGS in ctDNA (N=641 genes) and tumour tissue (N=24 genes). The genomic sequencing data were integrated with the clinical data in eTARGET, a cloud-based platform. Data were extracted from the eTARGET database using Structured Query Language (SQL) queries, then processed and visualised in R. Analyses and visualisations were refined based on clinician feedback. Finalised analyses and visualisations were scripted to enable automated reanalysis as new data become available.
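The extraction step described above might look roughly like the following sketch. It is illustrative only: the pipeline reported here was written in R, whereas this example uses Python with an in-memory SQLite database so that it is self-contained. The table names (`patients`, `mutations`), column names and all rows are hypothetical toy data; the real eTARGET schema is not described in this abstract.

```python
import sqlite3

# Hypothetical schema and toy rows, invented for illustration only.
# They do not reflect the real eTARGET database or any trial data.
def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patients (patient_id TEXT PRIMARY KEY, tumour_type TEXT);
        CREATE TABLE mutations (patient_id TEXT, source TEXT, gene TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [("P1", "Colorectal"), ("P2", "Breast"), ("P3", "Lung")],
    )
    conn.executemany(
        "INSERT INTO mutations VALUES (?, ?, ?)",
        [
            ("P1", "ctDNA", "TP53"),
            ("P1", "tumour", "TP53"),
            ("P1", "ctDNA", "KRAS"),
            ("P2", "ctDNA", "PIK3CA"),
            ("P2", "tumour", "PIK3CA"),
            ("P3", "ctDNA", "TP53"),
        ],
    )
    return conn

# Count how many distinct patients carry a mutation in each gene,
# separately for ctDNA and tumour samples.
FREQUENCY_QUERY = """
SELECT m.source, m.gene, COUNT(DISTINCT m.patient_id) AS n_patients
FROM mutations AS m
JOIN patients AS p ON p.patient_id = m.patient_id
GROUP BY m.source, m.gene
ORDER BY m.source, n_patients DESC, m.gene
"""

def mutation_frequencies(conn):
    """Return (source, gene, n_patients) rows from the scripted query."""
    return conn.execute(FREQUENCY_QUERY).fetchall()

if __name__ == "__main__":
    for row in mutation_frequencies(build_demo_db()):
        print(row)
```

Because the query is stored in the script rather than typed by hand, rerunning the script against an updated database repeats exactly the same extraction.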


As of January 2019, data for 172 TARGET patients were included in eTARGET. Clinical questions focussed on the frequency of mutations in ctDNA versus tumour, concordance between ctDNA and tumour, and patterns of mutation according to tumour type. Data were visualised in a series of histograms, heat maps and dendrograms.
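As a rough illustration of the concordance question, the sketch below scores each patient by the overlap between the genes mutated in ctDNA and those mutated in tumour. The definition used here (shared genes divided by all mutated genes) is an assumption chosen for simplicity, not the measure used in the trial, and the patient records are invented. The example is in Python for self-containment; the analyses reported here were performed in R.

```python
def concordance(ctdna_genes, tumour_genes):
    """Fraction of mutated genes found in both sample types.

    Assumed definition (not taken from the trial): |shared| / |union|.
    Returns None when neither sample has any reported mutation.
    """
    ctdna, tumour = set(ctdna_genes), set(tumour_genes)
    union = ctdna | tumour
    if not union:
        return None
    return len(ctdna & tumour) / len(union)

def per_patient_concordance(records):
    """Map patient_id -> concordance for {patient_id: (ctdna, tumour)}."""
    return {
        patient_id: concordance(ctdna, tumour)
        for patient_id, (ctdna, tumour) in records.items()
    }

if __name__ == "__main__":
    # Hypothetical toy records: (ctDNA genes, tumour genes) per patient.
    demo = {
        "P1": (["TP53", "KRAS"], ["TP53"]),
        "P2": (["PIK3CA"], ["PIK3CA"]),
        "P3": (["TP53"], []),
    }
    print(per_patient_concordance(demo))
```

In practice, a fair comparison would probably be restricted to genes covered by both panels, since the ctDNA panel is much larger than the tumour panel.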


Scripted analysis of data has several key benefits compared with manual, spreadsheet-based approaches. These include: the ability to easily repeat analyses as new data become available; a full audit trail from raw data to visualisation to support quality control; and the ability to generate and customise visualisations of large, multi-dimensional datasets.