The C-value/NC-value Method of Automatic Recognition for Multi-word Terms

Frantzi, Katerina T.; Ananiadou, Sophia; Tsujii, Junichi

doi:10.1007/3-540-49653-X_35

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1513)

Included in the following conference series:

International Conference on Theory and Practice of Digital Libraries

Abstract

Technical terms (henceforth simply terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora.

The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term. The second part, NC-value, provides: 1) a method for the extraction of term context words (words that tend to appear with terms), and 2) the incorporation of information from term context words into the extraction of terms.
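The abstract does not give the C-value formula. As a rough illustration of how a frequency measure can be made sensitive to nested terms, the sketch below follows the formulation commonly cited for C-value in later literature: a candidate's frequency is reduced by the average frequency of the longer candidates that contain it, and the result is scaled by log2 of the candidate's length in words. The function name `c_value`, the tuple-of-words representation, and the toy frequencies are assumptions made for this example, not details taken from the paper, and the linguistic filtering step that selects candidates is omitted.

```python
import math

def c_value(freq):
    """Simplified C-value sketch (assumed formulation, not the paper's exact one).

    freq maps each candidate term, given as a tuple of words, to its
    corpus frequency. Returns a dict mapping each candidate to a score.
    """
    def contains(longer, shorter):
        # True if `shorter` occurs as a contiguous word sequence in `longer`.
        n, m = len(longer), len(shorter)
        return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))

    scores = {}
    for a, f_a in freq.items():
        # Longer candidates in which `a` appears as a nested term.
        containers = [b for b in freq if contains(b, a)]
        if containers:
            # Subtract the average frequency of the containing candidates.
            adjusted = f_a - sum(freq[b] for b in containers) / len(containers)
        else:
            adjusted = f_a
        # Longer candidates are weighted up by log2 of their length in words.
        scores[a] = math.log2(len(a)) * adjusted
    return scores

# Toy frequencies, invented for illustration only.
freq = {
    ("soft", "contact", "lens"): 5,
    ("contact", "lens"): 8,
}
scores = c_value(freq)
```

In this toy input, "contact lens" occurs 8 times, but 5 of those occurrences are inside "soft contact lens", so its adjusted frequency drops to 3, while the longer candidate keeps its full frequency. Note that log2 of 1 is 0, so this scoring only makes sense for multi-word candidates.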

Author information

Authors and Affiliations

Dept. of Computing and Mathematics, Manchester Metropolitan University, Chester Str., Manchester, M1 5GD, UK
Katerina T. Frantzi & Sophia Ananiadou
Dept. of Information Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113, Japan
Junichi Tsujii

Editor information

Editors and Affiliations

Foundation for Research and Technology - Hellas (FORTH) Science and Technology Park of Crete, Institute of Computer Science (ICS), GR-71110, Heraklion, Crete, Greece
Christos Nikolaou & Constantine Stephanidis

Cite this paper

Frantzi, K.T., Ananiadou, S., Tsujii, J. (1998). The C-value/NC-value Method of Automatic Recognition for Multi-word Terms. In: Nikolaou, C., Stephanidis, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1998. Lecture Notes in Computer Science, vol 1513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49653-X_35

DOI: https://doi.org/10.1007/3-540-49653-X_35
Published: 15 March 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65101-7
Online ISBN: 978-3-540-49653-3
eBook Packages: Springer Book Archive
