Can we screen for pancreatic cancer? Identifying a sub-population of patients at high risk of subsequent diagnosis using machine learning techniques applied to primary care data
Year: 2020
Session type: E-poster/poster
Theme: Big data and AI
Abstract
Background
Pancreatic cancer patients are predominantly diagnosed too late to be treated. Whole-population screening is not appropriate because so few people develop the disease. A simple blood or urine test for the disease may soon be available and has the potential to rapidly improve outcomes if used as part of a targeted screening programme aimed at high-risk patients. We examined whether such patients are identifiable from routinely collected data.
Method
We conducted a retrospective case-control study using individually linked electronic health records from primary care and cancer registrations. We examined 1,139 pancreatic cancer patients, aged 15-99 years and diagnosed between January 2005 and June 2008, each individually matched by age, sex and time of diagnosis to four controls with other (non-pancreatic) cancers. Clinical symptom and prescription codes for the 24 months preceding diagnosis were used to identify the reporting of 57 individual symptoms. Using a machine learning approach, we trained a logistic regression model on 75% of the data to recognise a combination of atypical symptoms experienced by patients who later developed pancreatic cancer.
Results
Using patients’ medical history recorded 20-24 months before diagnosis, we were able to identify 41.3% of the population aged up to 60 years as being at high risk of developing pancreatic cancer, with 72.5% sensitivity, 59% specificity and 66% AUC. Among patients above age 60, 43.2% were similarly identified up to 17 months before diagnosis, with 66% sensitivity, 57% specificity and 61% AUC.
Conclusion
A sub-population of patients at higher risk was detectable 17-20 months prior to diagnosis. The use of cancer patient controls would have led to more false positive results, so further work is required using population-based controls. Nevertheless, the model has the potential to be used alongside an accurate and acceptable pre-screening (biomarker) test to increase early diagnosis.
This would result in a greater number of patients surviving this devastating disease.
Impact statement
Pairing our model with an accurate pre-screening biomarker test for pancreatic cancer, we can identify patients in primary care with early stage pancreatic tumours.
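The sensitivity, specificity and AUC figures reported in the Results can be illustrated with a minimal sketch. This is not the study's code: the function names, the risk threshold and the toy scores below are all hypothetical, chosen only to show how these three metrics are typically computed from a model's predicted risk scores. The toy data mirror the 1:4 case-to-control ratio described in the Method, but the values are invented.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are flagged as high risk."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)  # cases flagged
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)   # cases missed
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)   # controls not flagged
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)  # controls flagged
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control (ties count half)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Invented toy data: 2 cases (label 1) and 8 controls (label 0), i.e. a 1:4 ratio.
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
scores = [0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.5, 0.35, 0.15, 0.25]

# The 0.45 cut-off is an arbitrary assumption for illustration.
sens, spec = sensitivity_specificity(labels, scores, threshold=0.45)
area = auc(labels, scores)
print(sens, spec, area)  # 0.5 0.75 0.875
```

Lowering the threshold flags more people as high risk, which raises sensitivity at the cost of specificity; the AUC summarises that trade-off across all possible thresholds.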