Published in: Journal of Medical Systems 9/2020

September 1, 2020 | Education & Training

A Transparent and Adaptable Method to Extract Colonoscopy and Pathology Data Using Natural Language Processing

Authors: Helene B. Fevrier, Liyan Liu, Lisa J. Herrinton, Dan Li

Abstract

Key variables recorded as text in colonoscopy and pathology reports have been extracted using natural language processing (NLP) tools that were not easily adaptable to new settings. We aimed to develop a reliable NLP tool with broad adaptability. During 1996–2016, Kaiser Permanente Northern California performed 401,566 colonoscopies with linked pathology. We randomly sampled 1000 linked reports into a Training Set and developed an NLP tool using SAS® PERL regular expressions. The NLP tool captured five colonoscopy and pathology variables: type, size, and location of polyps; extent of procedure; and quality of bowel preparation. We used a Validation Set (N = 3000) to confirm the variables’ classifications using manual chart review as the reference. Performance of the NLP tool was assessed using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and Cohen’s κ. Cohen’s κ ranged from 93% to 99%. The sensitivity and specificity ranged from 95% to 100% across all categories. For categories with prevalence exceeding 10%, the PPV ranged from 97% to 100% except for adequate quality of preparation (prevalence 92%), for which the PPV was 65%. For categories with prevalence below 10%, the PPVs ranged from 62% to 100%. NPVs ranged from 94% to 100% except for the “complete” extent of procedure, for which the NPV was 73%. Using information from a large community-based population, we developed a transparent and adaptable NLP tool for extracting five colonoscopy and pathology variables. The tool can be readily tested in other healthcare settings.
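
The five performance measures named in the abstract can all be derived from a 2×2 table that cross-classifies the NLP result against manual chart review. The following is a minimal Python sketch of those standard formulas, not the authors' SAS code. The `Agreement` class, the `evaluate` function, and the example counts are hypothetical and were chosen only for illustration, and the sketch assumes every row and column total of the table is nonzero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agreement:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    kappa: float


def evaluate(tp: int, fp: int, fn: int, tn: int) -> Agreement:
    """Compare a binary NLP classification with chart review (the reference).

    tp/fp/fn/tn are the cells of the 2x2 table; all margins must be nonzero.
    """
    n = tp + fp + fn + tn
    # Observed agreement: share of reports where NLP and chart review match.
    observed = (tp + tn) / n
    # Chance agreement: product of the "yes" margins plus product of the "no" margins.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return Agreement(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        kappa=(observed - expected) / (1 - expected),
    )


# Hypothetical counts for one category; these are NOT figures from the study.
result = evaluate(tp=90, fp=5, fn=10, tn=895)
print(result)
```

Sensitivity and specificity are conditioned on the chart-review result, whereas PPV and NPV are conditioned on the NLP result, so the predictive values shift with the prevalence of a category even when sensitivity and specificity stay fixed.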
Metadata
Title: A Transparent and Adaptable Method to Extract Colonoscopy and Pathology Data Using Natural Language Processing
Authors: Helene B. Fevrier, Liyan Liu, Lisa J. Herrinton, Dan Li
Publication date: September 1, 2020
Publisher: Springer US
Published in: Journal of Medical Systems, Issue 9/2020
Print ISSN: 0148-5598
Electronic ISSN: 1573-689X
DOI: https://doi.org/10.1007/s10916-020-01604-8