Cancer data management and analysis using a pre-planned multi-database approach across Nordic countries and the United Kingdom


Morten Andersen¹
¹Karolinska Institute


Combining data from multiple countries is valuable when studying rare exposures and outcomes. In the CARING (CAncer Risk and INsulin analoGues) project, data from national health registers in four Nordic countries (Denmark, Finland, Norway and Sweden) and from the United Kingdom Clinical Practice Research Datalink were combined to investigate cancer risks associated with specific insulins. A major challenge in multi-country studies is how to combine databases that differ in structure and in their classification systems for drugs and diagnoses. One approach is to analyse country-specific data separately and combine the results using aggregate data (AD) meta-analysis. Another approach is to collect all data centrally at one site and perform a combined individual patient data (IPD) meta-analysis. Implementing a common data model (CDM) to harmonise the databases has several advantages. Fewer resources are needed both for generating analysis datasets and for the statistical analysis when data share a uniform structure. Furthermore, an integrated concept dictionary makes it possible to map the exposure, confounder and outcome concepts used in the study to the country-specific code systems for drugs and diagnoses in an efficient and transparent way. Cohorts of new insulin users were formed, and Poisson regression was used to estimate incidence rate ratios of colorectal cancer, breast cancer and prostate cancer, comparing exposure to different insulins. Analyses were performed both as an IPD meta-analysis on a common dataset from all countries (adjusted for common covariates), and on separate datasets for each country (adjusted for all available covariates), with the country-specific results subsequently pooled using AD meta-analyses with fixed and random effects. No consistent differences between insulins in the risk of the cancers investigated were found.
Low power in individual cohorts and uniform distribution of available covariates between cohorts favoured the use of IPD over AD meta-analysis in this specific case.
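
As an illustration of the AD meta-analysis step described above, the following minimal Python sketch pools country-specific log incidence rate ratios using inverse-variance weights, once with fixed effects and once with random effects. It is not the CARING analysis code: the function name pool_log_rate_ratios is hypothetical, the input values are invented for illustration only, and the DerSimonian-Laird estimator of between-country variance is an assumption, since the abstract does not state which random-effects method was used.

```python
import math


def pool_log_rate_ratios(log_irrs, ses):
    """Pool per-country log IRRs (hypothetical helper, not from the study).

    log_irrs: country-specific log incidence rate ratios
    ses:      their standard errors
    Returns fixed-effect and DerSimonian-Laird random-effects estimates.
    """
    k = len(log_irrs)
    # Fixed effect: inverse-variance weighted mean of the log IRRs
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_irrs)) / sw
    se_fixed = math.sqrt(1.0 / sw)

    # Cochran's Q measures heterogeneity around the fixed-effect estimate
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_irrs))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # DerSimonian-Laird between-country variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random effects: weights also include the between-country variance
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    sw_re = sum(w_re)
    random = sum(wi * y for wi, y in zip(w_re, log_irrs)) / sw_re
    se_random = math.sqrt(1.0 / sw_re)

    return {
        "fixed": fixed, "se_fixed": se_fixed,
        "random": random, "se_random": se_random,
        "q": q, "tau2": tau2,
    }


if __name__ == "__main__":
    # Invented inputs for five databases; NOT results from the study
    example_log_irrs = [0.05, -0.10, 0.12, 0.00, -0.04]
    example_ses = [0.20, 0.25, 0.30, 0.15, 0.10]
    res = pool_log_rate_ratios(example_log_irrs, example_ses)
    for label in ("fixed", "random"):
        est, se = res[label], res["se_" + label]
        lo, hi = est - 1.96 * se, est + 1.96 * se
        print(f"{label}: IRR {math.exp(est):.2f} "
              f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

When the estimated between-country variance is zero, the two pooled estimates coincide. With few databases and low power in each, that variance is estimated imprecisely, which is one reason an IPD analysis on a common dataset can be preferable.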