A161: Automated quantitative imaging feature extraction from the Oncology Medical Image Database

Mishal Patel1, Mark Halling-Brown1

1Royal Surrey County Hospital, Guildford, UK

Presenting date: Monday 2 November
Presenting time: 12.20-13.10


With the advent of digital imaging modalities and the rapid growth in both diagnostic and therapeutic imaging, the ability to harness this large influx of data is of paramount importance. The Oncology Medical Image Database (OMI-DB)1 was created to provide a centralised, fully annotated dataset for research. Medical imaging can detect and localise many of the changes that determine whether a disease is present or a therapy is effective, by depicting alterations in anatomic, physiologic and biochemical/molecular processes. Quantitative imaging features (QIFs) are sensitive, specific, accurate and reproducible measures of these changes. Here we describe an extension to the OMI-DB whereby a range of QIFs can be calculated automatically on image collection.


The database contains unprocessed and processed images, associated data and expert-determined ground truths. The process of collection, annotation and storage is fully automated and adaptable and has been described extensively elsewhere1. Currently our efforts have focused on collecting mammographic images; however, the system has been designed to be easily extended to any modality. An automated feature extraction framework has been developed that processes images from the OMI-DB and extracts QIFs, which are subsequently stored in a database. In addition, breast density predictions and CAD features were determined by commercial CAD packages and inserted into the database.
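As an illustration of the storage step only, the following is a minimal sketch of how extracted QIFs could be written to a relational table. The abstract does not describe the actual OMI-DB schema or database engine, so the table name `qif`, its columns, the image identifier and the use of SQLite are all assumptions made for this example.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical long-format table: one row per feature value."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS qif (
               image_id TEXT NOT NULL,
               region   TEXT NOT NULL,
               feature  TEXT NOT NULL,
               value    REAL NOT NULL,
               PRIMARY KEY (image_id, region, feature)
           )"""
    )


def store_features(conn, image_id, region, features):
    """Insert or update the feature values computed for one image region."""
    rows = [(image_id, region, name, float(value)) for name, value in features.items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR REPLACE INTO qif VALUES (?, ?, ?, ?)", rows)


# In-memory database and made-up values, for demonstration only.
conn = sqlite3.connect(":memory:")
create_schema(conn)
store_features(conn, "img-0001", "whole_breast", {"mean": 20.0, "std": 11.5})
# Re-running extraction for the same image overwrites the earlier value.
store_features(conn, "img-0001", "whole_breast", {"mean": 21.0})
stored = conn.execute("SELECT feature, value FROM qif ORDER BY feature").fetchall()
```

A long-format table of this kind lets new feature types be added without changing the schema, which may suit a framework intended to be extended to other modalities.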


At present we have collected 4,180 patient cases, consisting of 67,082 2D images, of which 680 are normal, 3,390 malignant and 207 benign. For each image, QIFs are calculated for the whole breast, parenchymal tissue and any expert-determined ROI, resulting in over 10 million data points.
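To make the per-region calculation concrete, the sketch below computes a few simple first-order intensity statistics over each region of a toy image. The specific features, the function name `first_order_features` and the binary-mask representation of regions are assumptions for illustration; the abstract does not list which QIFs the framework calculates or how regions are encoded.

```python
import math
from collections import Counter


def first_order_features(image, mask):
    """Compute simple intensity statistics over pixels where the mask is truthy."""
    values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance
    # Shannon entropy of the discrete intensity histogram, in bits.
    counts = Counter(values)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"n_pixels": n, "mean": mean, "std": math.sqrt(variance), "entropy": entropy}


# Toy 3x3 "image" with hypothetical region masks (1 = pixel belongs to the region).
image = [[0, 10, 20], [10, 20, 30], [20, 30, 40]]
regions = {
    "whole_breast": [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    "roi": [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
}
features = {name: first_order_features(image, mask) for name, mask in regions.items()}
```

Applying the same function to every region of every image is what makes the number of data points grow quickly: each additional feature adds one value per region per image.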


The availability of a resource that contains medical images, associated data, QIFs and expert-determined ground truth is highly valuable. It can be used to build predictive models to aid image classification and treatment response assessment, as well as to identify prognostic imaging biomarkers.