Development and validation of a multivariable machine learning algorithm to predict risk of cancer in patients referred urgently from primary care



Richard Savage, Mike Messenger, Richard Neal, Rosie Ferguson, Geoff Hall, Colin Johnston, Kat Lloyd, Matt Neal, Nigel Sansom, Nisha Sharma, Beth Shinkins, Jim Skinner, Giles Tully, Sean Duffy, Peter Selby



Urgent suspected cancer (two-week wait, 2WW) referrals have been successful in improving early cancer detection, but they place an increasingly heavy burden on NHS services. This burden has been exacerbated by the COVID-19 pandemic.


We have developed and validated a test to assess the risk of any cancer in 2WW patients. The test uses routine blood tests (FBC, U&E, LFTs, tumour markers), combining them using machine learning and statistical modelling. Algorithms were developed and validated for nine 2WW pathways using retrospective data from 371,799 referrals to Leeds Teaching Hospitals Trust (development set 2011-16, validation set 2017-19). A minimum set of blood measurements was required for inclusion, and missing data were modelled internally by the algorithms.
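The development and validation sets described above are separated by referral year rather than at random. A minimal sketch of such a temporal split is given below, assuming each referral record carries its year. The `Referral` class, its field names and the `temporal_split` function are hypothetical and are not taken from the study's code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Referral:
    """Hypothetical record for one 2WW referral (illustrative fields only)."""
    referral_year: int
    pathway: str
    has_cancer: bool


def temporal_split(
    referrals: List[Referral],
    development_years: range = range(2011, 2017),  # 2011-16 inclusive
    validation_years: range = range(2017, 2020),   # 2017-19 inclusive
) -> Tuple[List[Referral], List[Referral]]:
    """Split referrals by year; records outside both periods are dropped."""
    development = [r for r in referrals if r.referral_year in development_years]
    validation = [r for r in referrals if r.referral_year in validation_years]
    return development, validation
```

Splitting by time means the validation set contains only referrals made after every referral in the development set, which mimics how the test would be applied to future patients.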


We present results for two clinical use-cases. In use-case 1, the algorithms correctly identify 20% of patients who do not have cancer and so may not need an urgent 2WW referral. In use-case 2, they identify 90% of cancer cases as having a high probability of cancer; these patients could be prioritised for review.


Use-case 1: Negative Predictive Value (NPV), the proportion of negative test results which are correct (95% CI)
Use-case 2: proportion of non-cancer cases not red-flagged (95% CI)

Pathway          Use-case 1                 Use-case 2
                 0.9936 (0.9883-0.9981)     0.4582 (0.4450-0.4715)
Lower GI         0.9823 (0.9762-0.9877)     0.2723 (0.2637-0.2811)
Upper GI         0.9880 (0.9806-0.9946)     0.3363 (0.3227-0.3503)
                 0.9895 (0.9799-0.9979)     0.4674 (0.4473-0.4879)
                 0.9525 (0.9358-0.9680)     0.3548 (0.3379-0.3710)
                 0.9630 (0.9281-0.9924)     0.3625 (0.3238-0.3987)
                 0.9375 (0.8795-0.9868)     0.4330 (0.3807-0.4849)
Head and Neck    0.9748 (0.9623-0.9858)     0.2733 (0.2579-0.2885)
                 0.9406 (0.9232-0.9570)     0.3905 (0.3745-0.4068)
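The quantities reported above follow the standard confusion-matrix definitions. The sketch below shows how they are computed from counts of true and false positives and negatives. It is illustrative only: the function names are assumptions, the counts in the usage note are invented and are not study data, and the method used for the 95% confidence intervals is not stated in this abstract, so it is not reproduced here.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Divide two counts, refusing an empty denominator."""
    if denominator == 0:
        raise ValueError("metric is undefined when the denominator is zero")
    return numerator / denominator


def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """Proportion of negative test results which are correct."""
    return _ratio(true_negatives, true_negatives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of non-cancer cases that are not red-flagged."""
    return _ratio(true_negatives, true_negatives + false_positives)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of cancer cases that are red-flagged."""
    return _ratio(true_positives, true_positives + false_negatives)
```

For example, with invented counts of 990 true negatives and 10 false negatives, `negative_predictive_value(990, 10)` gives 0.99. Unlike sensitivity and specificity, NPV depends on how common cancer is among the patients tested.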


Combining a panel of widely available blood markers produces an effective blood test for cancer in NHS 2WW patients. The test is cost-effective, can be deployed to any NHS pathology laboratory with no additional hardware requirements, and is of particular value during the COVID-19 pandemic. It is CE marked and is currently undergoing observational service evaluation in the West Yorkshire and Harrogate region.

Impact statement

This test could spare more than 400,000 patients per year in England alone from unnecessary cancer diagnostic tests, avoiding the stress, worry and medical risk that accompany those tests, and reducing pressure on NHS cancer pathways at a time of particularly great need.