ABSTRACT
We discuss two named-entity recognition models which use characters and character n-grams either exclusively or as an important part of their data representation. The first model is a character-level HMM with minimal context information, and the second model is a maximum-entropy conditional markov model with substantially richer context features. Our best model achieves an overall F1 of 86.07% on the English test data (92.31% on the development data). This number represents a 25% error reduction over the same model without word-internal (substring) features.
- Daniel M. Bikel, Scott Miller, Richard Schwartz, and Ralph Weischedel. 1997. Nymble: a high-performance learning name-finder. In Proceedings of ANLP-97, pages 194--201. Google ScholarDigital Library
- Andrew Borthwick. 1999. A Maximum Entropy Approach to Named Entity Recognition. Ph.D. thesis, New York University. Google ScholarDigital Library
- Silviu Cucerzan and David Yarowsky. 1999. Language independent named entity recognition combining morphological and contextual evidence. In Joint SIGDAT Conference on EMNLP and VLC.Google Scholar
- Shai Fine, Yoram Singer, and Naftali Tishby. 1998. The hierarchical hidden markov model: Analysis and applications. Machine Learning, 32:41--62. Google ScholarDigital Library
- Andrew McCallum, Dayne Freitag, and Fernando Pereira. 2000. Maximum entropy Markov models for information extraction and segmentation. In ICML-2000. Google ScholarDigital Library
- Andrei Mikheev. 1997. Automatic rule induction for unknown-word guessing. Computational Linguistics, 23(3):405--423. Google ScholarDigital Library
- Adwait Ratnaparkhi. 1996. A maximum entropy model for part-of-speech tagging. In EMNLP 1, pages 133--142.Google Scholar
- Joseph Smarr and Christopher D. Manning. 2002. Classifying unknown proper noun phrases without context. Technical Report dbpubs/2002-46, Stanford University, Stanford, CA.Google Scholar
- Nina Wacholder, Yael Ravin, and Misook Choi. 1997. Disambiguation of proper names in text. In ANLP 5, pages 202--208. Google ScholarDigital Library
Recommendations
Single character Chinese named entity recognition
SIGHAN '03: Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17Single character named entity (SCNE) is a name entity (NE) composed of one Chinese character, such as "[Abstract contained text which could not be captured.]" (zhong1, China) and "[Abstract contained text which could not be captured.]" (e2, Russia). ...
Chinese Named Entity Recognition with Character-Word Mixed Embedding
CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge ManagementNamed Entity Recognition (NER) is an important basis for the tasks in natural language processing such as relation extraction, entity linking and so on. The common method of existing Chinese NER systems is to use the character sequence as the input, and ...
NERA: Named Entity Recognition for Arabic
Name identification has been worked on quite intensively for the past few years, and has been incorporated into several products revolving around natural language processing tasks. Many researchers have attacked the name identification problem in a ...
Comments