DOI: 10.5555/1596374.1596399
research-article

Design challenges and misconceptions in named entity recognition

Published: 04 June 2009

ABSTRACT

We analyze some of the fundamental design challenges and misconceptions that underlie the development of an efficient and robust NER system. In particular, we address issues such as the representation of text chunks, the inference approach needed to combine local NER decisions, the sources of prior knowledge and how to use them within an NER system. In the process of comparing several solutions to these challenges we reach some surprising conclusions, as well as develop an NER system that achieves 90.8 F1 score on the CoNLL-2003 NER shared task, the best reported result for this dataset.

Published in

CoNLL '09: Proceedings of the Thirteenth Conference on Computational Natural Language Learning
June 2009
243 pages
ISBN: 9781932432299

Publisher

Association for Computational Linguistics, United States
