Clinical Investigation
Outcomes, Health Policy, and Managed Care
Epidemiology of angina pectoris: Role of natural language processing of the medical record

https://doi.org/10.1016/j.ahj.2006.12.022

Background

The diagnosis of angina is challenging because it relies on symptom descriptions. Natural language processing (NLP) of the electronic medical record (EMR) can provide access to such information contained in free text that may not be fully captured by conventional diagnostic coding.

Objective

To test the hypothesis that NLP of the EMR improves angina pectoris ascertainment over diagnostic codes.

Methods

Billing records of inpatients and outpatients were searched for International Classification of Diseases, Ninth Revision (ICD-9) codes for angina pectoris, chronic ischemic heart disease, and chest pain. EMR clinical reports were searched electronically for 50 specific nonnegated natural language synonyms of these ICD-9 codes. The 2 methods were compared with a standardized assessment of angina by Rose questionnaire at 3 diagnostic levels: unspecified chest pain, exertional chest pain, and Rose angina.
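To illustrate what a search for nonnegated terms can look like, the following is a minimal sketch and not the system used in the study. The term list, the negation cues, and the 5-word window are hypothetical assumptions; the study's actual 50 synonyms and its negation rules are not given here.

```python
import re

# Hypothetical examples only; the study's list of 50 synonyms is not reproduced here.
TERMS = ["angina", "chest pain", "chest tightness"]
# Hypothetical negation cues for a simple rule-based check.
NEGATION_CUES = ["no", "not", "denies", "denied", "without", "negative for"]

TERM_RE = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in TERMS) + r")\b", re.IGNORECASE
)
CUE_RE = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in NEGATION_CUES) + r")\b", re.IGNORECASE
)


def nonnegated_mentions(note, window=5):
    """Return the terms found in `note` that have no negation cue among the
    `window` words preceding them in the same sentence."""
    found = []
    # Crude sentence split on periods, semicolons, and line breaks.
    for sentence in re.split(r"[.;\n]+", note):
        for match in TERM_RE.finditer(sentence):
            preceding_words = sentence[:match.start()].split()[-window:]
            if not CUE_RE.search(" ".join(preceding_words)):
                found.append(match.group(1).lower())
    return found


if __name__ == "__main__":
    example = "Patient denies chest pain. Reports angina on exertion."
    print(nonnegated_mentions(example))
```

A fixed word window is a deliberate simplification: it can wrongly suppress a term that follows a contrast word such as "but", so a production system would need more careful handling of negation scope.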

Results

Compared with the Rose questionnaire, the true-positive rate of EMR-NLP for unspecified chest pain was 62% (95% CI 55-67) versus 51% (95% CI 44-58) for diagnostic codes (P < .001). For exertional chest pain, the EMR-NLP true-positive rate was 71% (95% CI 61-80) versus 62% (95% CI 52-73) for diagnostic codes (P = .10). Both approaches had an 88% (95% CI 65-100) true-positive rate for Rose angina. The EMR-NLP method consistently identified more patients with exertional chest pain over a 28-month follow-up.
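For readers less familiar with the metric, the true-positive rate is the share of questionnaire-positive patients that a method also flags. The sketch below computes it with a Wilson score interval. The choice of interval is an assumption, since the method the authors used for their confidence intervals is not stated here, and the counts in the usage lines are hypothetical rather than study data.

```python
import math


def true_positive_rate(true_positives, reference_positives, z=1.96):
    """Return (rate, lower, upper) for a true-positive rate.

    The bounds use the Wilson score interval (an assumed choice);
    z=1.96 corresponds to a 95% interval.
    """
    if reference_positives <= 0:
        raise ValueError("reference_positives must be positive")
    if not 0 <= true_positives <= reference_positives:
        raise ValueError("true_positives must lie between 0 and reference_positives")
    n = reference_positives
    p = true_positives / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Hypothetical counts: 50 of 100 reference-positive patients detected.
    rate, lower, upper = true_positive_rate(50, 100)
    print(f"{rate:.0%} (95% CI {lower:.0%}-{upper:.0%})")
```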

Conclusion

The EMR-NLP method improves the detection of unspecified and exertional chest pain cases compared with diagnostic codes. These findings have implications for epidemiological and clinical studies of angina pectoris.

Section snippets

Rose Angina Questionnaire as reference

The Rose Angina Questionnaire is the only validated instrument for assessing symptoms of typical angina pectoris in the general population, independent of medical care. It is highly (>90%) specific when compared against physician-diagnosed angina [20, 21, 22] and is strongly associated with coronary artery calcification [23] and subsequent risk of coronary events [24, 25]. The Rose Angina Questionnaire was administered using a survey of a random sample of the Olmsted County, Minnesota, population (124

Rose chest pain and subsequent clinical notes

Of 892 participants, 871 (98%) had at least one clinical note dictated between January 1, 2003, and November 1, 2005. Of these 871, 202 (23%) reported chest pain. Of 871, 85 (10%) reported exertional chest pain. Baseline characteristics are reported in Table I. Restricting the cohort by date and presence of clinical notes in the EMR did not change the results.

Comparison between the EMR-NLP and the ICD codes

The true-positive rate of the EMR-NLP system in identifying patients who reported any chest pain on the questionnaire was 62% (95% CI

Discussion

Our findings indicate that, compared with diagnostic coding, NLP of the EMR yields a higher true-positive rate for identifying patients in the general population with self-reported exertional chest pain and any chest pain. To the best of our knowledge, this is the first study that relies on an advanced EMR system and expertise in population-based research, medical informatics, computer science, and linguistics to validate the use of NLP of the EMR to identify a chronic symptomatic condition within

References (42)

  • K.W. Fung et al.

    Integrating SNOMED CT into the UMLS: an exploration of different views of synonymy and quality of editing

    J Am Med Inform Assoc

    (2005)
  • J. Hsia et al.

    Predictors of angina pectoris versus myocardial infarction from the Women's Health Initiative Observational Study

    Am J Cardiol

    (2004)
  • S.J. Wang et al.

    Using patient-reportable clinical history factors to predict myocardial infarction

    Comput Biol Med

    (2001)
  • Committee on Quality of Healthcare in America

    Crossing the quality chasm: a new health system for the 21st century

    (2001)
  • N. Sager et al.

    Natural language processing and the representation of clinical data

    J Am Med Inform Assoc

    (1994)
  • C. Friedman

    A broad-coverage natural language processing system

    Proc AMIA Symp

    (2000)
  • H.S. Javitz et al.

    Cost of illness of chronic angina

    Am J Manag Care

    (2004)
  • H. Hemingway et al.

    Incidence and prognostic implications of stable angina pectoris among women and men

    JAMA

    (2006)
  • K.A. Schulman et al.

    The effect of race and sex on physicians' recommendations for cardiac catheterization

    N Engl J Med

    (1999)
  • H. Hemingway et al.

    Prognosis of angina with and without a diagnosis: 11 year follow up in the Whitehall II prospective cohort study

    BMJ

    (2003)
  • D. Gans et al.

    Medical groups' adoption of electronic health records and information systems

    Health Aff

    (2005)

This work was supported in part by grants from the Public Health Service (RO1 HL 59205, HL 72435) (Bethesda, MD), the Rochester Epidemiology Project (GM14321 and AR30582) (Bethesda, MD), and the National Institutes of Health Roadmap Multidisciplinary Clinical Research Career Development Award Grant (K12/NICHD)-HD49078 (Bethesda, MD). Dr Hemingway is supported by a Department of Health Public Health Career Scientist Award (London, UK).