From indexing the biomedical literature to coding clinical text: experience with MTI and machine learning approaches

Authors:
Alan R. Aronson, Lister Hill Center, Bethesda, MD
Olivier Bodenreider, Lister Hill Center, Bethesda, MD
Dina Demner-Fushman, Lister Hill Center, Bethesda, MD
Kin Wah Fung, Lister Hill Center, Bethesda, MD
Vivian K. Lee, Lister Hill Center, Bethesda, MD and Vanderbilt University, Nashville, TN
James G. Mork, Lister Hill Center, Bethesda, MD
Aurélie Névéol, Lister Hill Center, Bethesda, MD
Lee Peters, Lister Hill Center, Bethesda, MD
Willie J. Rogers, Lister Hill Center, Bethesda, MD

BioNLP '07: Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing, June 2007, Pages 105–112

Published: 29 June 2007

ABSTRACT

This paper describes the application of an ensemble of indexing and classification systems, which have been shown to be successful in information retrieval and classification of medical literature, to a new task of assigning ICD-9-CM codes to the clinical history and impression sections of radiology reports. The basic methods used are: a modification of the NLM Medical Text Indexer system, SVM, k-NN and a simple pattern-matching method. The basic methods are combined using a variant of stacking. Evaluated in the context of a Medical NLP Challenge, fusion produced an F-score of 0.85 on the Challenge test set, which is considerably above the mean Challenge F-score of 0.77 for 44 participating groups.
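
The fusion and evaluation steps named in the abstract can be illustrated with a small sketch. It is not the authors' system: precision-weighted voting is used here as a much simpler stand-in for the stacking variant, the micro-averaged F-score is an assumption about how the Challenge score was computed, and every function name, method label, threshold, and toy code set below is hypothetical.

```python
from collections import defaultdict


def learn_method_weights(heldout_predictions, gold):
    """Meta-level step (simplified): weight each base method by its
    precision on held-out reports.

    heldout_predictions: {method_name: [set of codes per report, ...]}
    gold: [set of correct codes per report, ...]
    """
    weights = {}
    for method, predicted in heldout_predictions.items():
        correct = sum(len(p & g) for p, g in zip(predicted, gold))
        total = sum(len(p) for p in predicted)
        weights[method] = correct / total if total else 0.0
    return weights


def fuse(report_predictions, weights, threshold=0.5):
    """Keep a code when the weighted share of methods proposing it
    reaches the threshold (an assumed, untuned value).

    report_predictions: {method_name: set of codes} for one report.
    """
    total_weight = sum(weights[m] for m in report_predictions)
    if total_weight == 0:
        return set()
    scores = defaultdict(float)
    for method, codes in report_predictions.items():
        for code in codes:
            scores[code] += weights[method]
    return {c for c, s in scores.items() if s / total_weight >= threshold}


def micro_f_score(predicted, gold):
    """Micro-averaged F-score over all (report, code) assignments."""
    true_pos = sum(len(p & g) for p, g in zip(predicted, gold))
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(len(p) for p in predicted)
    recall = true_pos / sum(len(g) for g in gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy held-out data: two reports, four hypothetical base methods.
    gold = [{"786.2"}, {"486"}]
    heldout = {
        "mti": [{"786.2"}, {"486", "780.6"}],
        "svm": [{"786.2"}, {"486"}],
        "knn": [{"780.6"}, {"486"}],
        "pattern": [{"786.2"}, {"486"}],
    }
    weights = learn_method_weights(heldout, gold)
    new_report = {
        "mti": {"786.2", "780.6"},
        "svm": {"786.2"},
        "knn": {"780.6"},
        "pattern": {"786.2"},
    }
    print(fuse(new_report, weights))
    print(micro_f_score(heldout["mti"], gold))
```

A true stacking setup would train a meta-level classifier on the base methods' outputs rather than use fixed precision weights; the sketch only shows the general shape of combining several coders and scoring the result.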

Index Terms
1. Applied computing
  1. Life and medical sciences
  2. Operations research
2. Computing methodologies

Published in
BioNLP '07: Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
June 2007
241 pages
Conference Chairs:
K. Bretonnel Cohen, University of Colorado School of Medicine
Dina Demner-Fushman, Lister Hill National Center for Biomedical Communications
Carol Friedman, Columbia University
Lynette Hirschman, MITRE
John Pestian, Computational Medicine Center, University of Cincinnati, Cincinnati Children's Hospital Medical Center
Publisher
Association for Computational Linguistics
United States
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 33 of 92 submissions, 36%