Cancer Epidemiology

Volume 36, Issue 5, October 2012, Pages 425-429

Validity of cancer diagnosis in a primary care database compared with linked cancer registrations in England. Population-based cohort study

https://doi.org/10.1016/j.canep.2012.05.013

Abstract

Aims

The present study aimed to evaluate the validity of cancer diagnoses and death recording in a primary care database compared with cancer registry (CR) data in England.

Methods

The eligible cohort comprised 42,556 participants registered with English general practices in the General Practice Research Database (GPRD) that had consented to CR linkage. CR and primary care records were compared for cancer diagnosis, date of cancer diagnosis and death. Read and ICD cancer code sets were reviewed and agreed by two authors.

Results

There were 5216 (91% of CR total) cancer events diagnosed in both sources. There were 494 (9%) diagnosed in CR only and 213 (4%) that were diagnosed in GPRD only. The predictive value of a GPRD cancer diagnosis was 96% for lung cancer, 92% for urinary tract cancer, 96% for gastro-oesophageal cancer and 98% for colorectal cancer. ‘False negative’ primary care records were sometimes accounted for by registration end dates being shortly before cancer diagnosis dates. The date of cancer diagnosis was median 11 (interquartile range −6 to 30) days later in GPRD compared with CR. Death records were consistent for the two sources for 3337/3397 (99%) of cases.
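The headline agreement figures can be reproduced from the three counts reported above. The following is a minimal sketch, assuming that CR is treated as the reference standard and that the denominators are the CR total (both + CR only) and the GPRD total (both + GPRD only); the variable names are illustrative and the paper's own calculations, including the site-specific predictive values, may use different definitions.

```python
# Counts taken from the Results paragraph of the abstract.
both = 5216       # cancer events recorded in both GPRD and CR
cr_only = 494     # events recorded in CR only
gprd_only = 213   # events recorded in GPRD only

# Assumption: CR is the reference standard, so the CR total is the
# denominator for completeness of GPRD recording.
cr_total = both + cr_only
share_of_cr_found_in_gprd = both / cr_total

# Assumption: the GPRD total is the denominator for the proportion of
# GPRD diagnoses confirmed by CR (an overall predictive value).
gprd_total = both + gprd_only
share_of_gprd_confirmed_by_cr = both / gprd_total

print(f"CR events also recorded in GPRD: {share_of_cr_found_in_gprd:.1%}")
print(f"GPRD events confirmed by CR:     {share_of_gprd_confirmed_by_cr:.1%}")
```

Under these assumptions the first figure rounds to the 91% reported in the abstract, and the second is an overall value that sits within the range of the four site-specific predictive values.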

Conclusion

Recording of cancer diagnosis and mortality in primary care electronic records is generally consistent with CR in England. Linkage studies must pay careful attention to selection of codes to define eligibility and timing of diagnoses in relation to beginning and end of record.
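The two timing checks mentioned here, the lag between sources and registration ending before diagnosis, might be sketched as follows. This is an illustration only: the records, field names (`cr_dx`, `gprd_dx`, `reg_end`) and dates are hypothetical and are not taken from the study data or from any GPRD or CR schema.

```python
from datetime import date
from statistics import median, quantiles

# Hypothetical linked records (invented for illustration only).
# gprd_dx is None when no cancer diagnosis appears in primary care.
records = [
    {"cr_dx": date(2005, 3, 1),   "gprd_dx": date(2005, 3, 12),  "reg_end": date(2008, 1, 1)},
    {"cr_dx": date(2006, 7, 10),  "gprd_dx": date(2006, 7, 4),   "reg_end": date(2009, 5, 1)},
    {"cr_dx": date(2004, 11, 20), "gprd_dx": date(2004, 12, 20), "reg_end": date(2007, 2, 1)},
    {"cr_dx": date(2007, 1, 15),  "gprd_dx": None,               "reg_end": date(2007, 1, 2)},
    {"cr_dx": date(2003, 5, 5),   "gprd_dx": date(2003, 5, 25),  "reg_end": date(2010, 1, 1)},
    {"cr_dx": date(2008, 9, 1),   "gprd_dx": date(2008, 9, 3),   "reg_end": date(2011, 6, 1)},
]

# Lag in days for events found in both sources; positive values mean
# the primary care date is later than the registry date.
lags = [(r["gprd_dx"] - r["cr_dx"]).days for r in records if r["gprd_dx"] is not None]
median_lag = median(lags)
q1, _, q3 = quantiles(lags, n=4)  # quartile cut points (default 'exclusive' method)

# Apparent 'false negatives' where primary care registration ended
# before the registry diagnosis date.
ended_before = [
    r for r in records
    if r["gprd_dx"] is None and r["reg_end"] < r["cr_dx"]
]

print(f"Median lag: {median_lag} days (quartiles {q1} to {q3})")
print(f"Missing in primary care, registration ended first: {len(ended_before)}")
```

Separating out records whose registration ended before the registry diagnosis date keeps them from being counted as recording failures in primary care.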

Introduction

Electronic health records are an increasingly utilised resource for epidemiological research. In the UK, records from primary care databases have been used in studies of cancer diagnosis and prognosis [1]. Validation studies have confirmed the accuracy and completeness of UK electronic patient records with respect to several clinical conditions as well as pharmacological treatment, and death [2], [3], [4], [5], [6], [7], [8], [9]. In the UK, a national system for cancer registration aims to record all new cancer diagnoses. Cancer Registry (CR) data are considered to represent an accurate resource for studies of cancer incidence and prognosis [10]. The validity of cancer diagnoses in primary care electronic records in comparison with cancer registrations has not been well described.

The present study builds on an earlier analysis of data from the General Practice Research Database (GPRD) [1] that evaluated the incidence of cancer in patients presenting with four ‘alarm’ symptoms: haematuria, haemoptysis, dysphagia and rectal bleeding. At the time of the initial analysis, linked cancer data were not available to ascertain the frequency and validity of cancer diagnoses recorded in the GPRD. In order to confirm our initial findings, we wanted to ascertain the validity of cancer diagnoses and dates of cancer diagnosis in GPRD. We have made use of the opportunity presented by a novel linkage of cancer registrations with primary care electronic records to compare data from the two sources. The present report therefore aims to evaluate the validity of cancer diagnoses in primary care electronic health records by comparing the occurrence and timing of cancer diagnoses between GPRD and cancer registrations. We specifically evaluated diagnoses of lung cancer, colorectal cancer, cancer of the oesophagus or stomach and urinary tract cancers.

Cancer registry

Cancer registries in England represent the only available source of reliable population-based data on cancer incidence, prevalence and survival, excluding non-melanoma skin cancer which is not collected systematically. Information is collected on new diagnoses of cancer from hospitals, pathology laboratories, hospices, cancer treatment centres, cancer screening programmes, Hospital Episode Statistics (HES), cancer waiting times (CWT) and death certificates. Within hospitals, data are collected

Results

The initial GPRD sample comprised 83,841 participants. Fig. 1 provides a flowchart of the sample selection process. Of the original 334 GPRD practices, 173 (53%) had no cancer occurrence recorded in the CR and were excluded from further analyses as not consenting to linkage. This left 158 (47%) eligible practices that participated in linkage between GPRD and CR. Consequently, 49% (N = 37,283) of participants with alarm symptoms (N = 76,143) and 46% (N = 5254) of participants with

Discussion

The present study aimed to estimate the validity of four cancer diagnoses in GPRD in comparison with CR data as a reference source of cancer diagnoses in England. Overall, the present findings endorse the validity of colorectal, gastro-oesophageal, respiratory and urinary tract cancer diagnosis in the GPRD, including the timing of these diagnoses and subsequent mortality.

High sensitivity is extremely important when the aim is to identify severe but treatable disease such as the four cancer

Conclusion

The present study documented that cancer diagnoses in the GPRD are valid and accurate more than 90% of the time and can be used with reasonable confidence for studies of clinical care, auditing, and prevention relating to the four cancer diagnoses included in the present study.

Conflict of interest statement

None to be declared.

Funding

None.

References (15)

  • R. Jones et al. Alarm symptoms in early diagnosis of cancer in primary care: cohort study using General Practice Research Database. BMJ (2007)
  • H. Jick et al. Validation of information recorded on general practitioner based computerized data resource in the United Kingdom. BMJ (1991)
  • H. Jick et al. Further validation of information recorded on a general practitioner based computerized data resource in the United Kingdom. Pharmacoepidemiol Drug Saf (1992)
  • S.S. Jick et al. Validity of the general practice research database. Pharmacotherapy (2003)
  • N.F. Khan et al. Validity of diagnostic coding within the General Practice Research Database: a systematic review. Br J Gen Pract (2010)
  • E. Herrett et al. Validation and validity of diagnoses in the General Practice Research Database: a systematic review. Br J Clin Pharmacol (2010)
  • M.D. Manser et al. Colorectal cancer registration: the central importance of pathology. J Clin Pathol (2000)