Acceleration of Gene Expression Connectivity Mapping using GPUs within the QUADrATiC System


Jessica Black1, Darragh McArt2, Hans Vandierendonck3, Paul O'Reilly2, Eamonn McKernan3, Manuel Salto-Tellez2
1Bioinformatics and Integromics Laboratory, CCRCB; 2Centre for Cancer Research and Cell Biology, Queen's University Belfast; 3School of Electronics, Electrical Engineering and Computer Science (EEECS), Queen's University Belfast

Abstract

Background

There is a growing requirement for accelerated data analysis within biomedical research. Glioma, with a median survival of 14 months, exemplifies the need for efficient and novel approaches. To address this, the Queen’s University Accelerated Drug and Transcriptomic Connectivity (QUADrATiC) connectivity mapping platform allows investigation of small-molecule compounds as potential candidates to treat disease. QUADrATiC harnesses the computational power of multi-core Central Processing Units (CPUs) to parallelise the calculation of connection strengths and p-values. Currently, 87,000 LINCS reference profiles are used for comparison; however, this will increase to more than 1.3 million reference profiles, at which point performance may be impacted. We therefore enhanced the framework to leverage Graphics Processing Unit (GPU) acceleration.

Method

Building upon the Java Akka framework used for multi-threaded execution on CPUs, we used the Aparapi system to convert Java bytecode to OpenCL at runtime, streamlining GPU acceleration. To run on the GPU, the data structures were changed: hash maps were converted to arrays containing the IDs and values for each signature. The ScoreKernal Java class delivered performance enhancements that GPU acceleration alone could not, by improving elements of the scoring algorithm, namely the calculation of all median values.
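
The hash-map-to-array conversion described above can be illustrated with a minimal Java sketch. It is not the QUADrATiC source: the class and method names (ArrayScoreSketch, FlatSignature, flatten, scoreProfile, scoreAll) are hypothetical, the flat profile layout is an assumption, and the score is assumed to be a simple sum of products of reference and signature values. The sketch uses plain Java so that it runs without a GPU; in an Aparapi kernel, the outer loop over profiles would typically be replaced by one work-item per profile.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from QUADrATiC.
public class ArrayScoreSketch {

    /** A signature flattened into two parallel primitive arrays. */
    static final class FlatSignature {
        final int[] ids;      // gene IDs, used as indices into a profile
        final float[] values; // signed value for the gene at the same position
        FlatSignature(int[] ids, float[] values) {
            this.ids = ids;
            this.values = values;
        }
    }

    /** Converts a gene-ID -> value hash map into parallel arrays. */
    static FlatSignature flatten(Map<Integer, Float> signature) {
        int[] ids = new int[signature.size()];
        float[] values = new float[signature.size()];
        int i = 0;
        for (Map.Entry<Integer, Float> e : signature.entrySet()) {
            ids[i] = e.getKey();
            values[i] = e.getValue();
            i++;
        }
        return new FlatSignature(ids, values);
    }

    /**
     * Scores one reference profile against the signature.
     * Assumed layout: profile p occupies profiles[p * geneCount .. (p + 1) * geneCount - 1],
     * indexed by gene ID. Assumed score: sum of profile value * signature value.
     */
    static float scoreProfile(int p, int geneCount, float[] profiles, FlatSignature sig) {
        float sum = 0f;
        int base = p * geneCount;
        for (int k = 0; k < sig.ids.length; k++) {
            sum += profiles[base + sig.ids[k]] * sig.values[k];
        }
        return sum;
    }

    /** Scores every profile; each iteration is independent of the others. */
    static float[] scoreAll(int profileCount, int geneCount, float[] profiles, FlatSignature sig) {
        float[] scores = new float[profileCount];
        for (int p = 0; p < profileCount; p++) {
            scores[p] = scoreProfile(p, geneCount, profiles, sig);
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<Integer, Float> signature = new HashMap<>();
        signature.put(0, 2f);   // gene 0 up-regulated
        signature.put(2, -1f);  // gene 2 down-regulated

        // Two toy reference profiles over three genes, stored back to back.
        float[] profiles = { 3f, 1f, -2f,   -1f, 2f, 3f };

        float[] scores = scoreAll(2, 3, profiles, flatten(signature));
        System.out.println(scores[0]); // prints 8.0
        System.out.println(scores[1]); // prints -5.0
    }
}
```

Primitive arrays matter here because code translated to OpenCL generally cannot operate on Java collection objects, and because contiguous arrays avoid the boxing and hashing overhead of a map even on the CPU.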

Results

Initial results demonstrate a significant improvement in computation time for a Rand1000 signature execution across the original method, the array-enhanced implementation and the GPU implementation, with execution times of 165, 80 and 25 seconds respectively. This represents a 6.5x overall improvement, with an 8.5x speed improvement in the calculation of connection scores.

Conclusion

Initial tests show that changing the data structures improves computation time for smaller signatures, and that utilizing the GPU improves performance for larger signatures. Inclusion of the larger 1.3M LINCS data set is therefore achievable, giving the system increased scalability. With the low cost of GPU cards, this system removes the need for expensive computing setups, creating a feasible option for candidate compound discovery in modern research.