The C-value/NC-value Method of Automatic Recognition for Multi-word Terms

  • Conference paper
Research and Advanced Technology for Digital Libraries (ECDL 1998)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1513))

Abstract

Technical terms (henceforth called simply terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora.

The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested term. The second part, NC-value, gives: 1) a method for the extraction of term context words (words that tend to appear with terms), and 2) the incorporation of information from term context words into the extraction of terms.
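The abstract does not state the C-value formula, so the following is only a minimal sketch of how a frequency measure can be made sensitive to nested terms. It assumes the commonly cited form of the measure: the frequency f(a) of a candidate a is multiplied by log2 of its length in words, and if a is nested in longer candidates, the average frequency of those longer candidates is first subtracted from f(a). The names `is_nested`, `c_value`, and the toy frequencies are hypothetical, and the sketch omits the linguistic filter and the bookkeeping for multiply nested candidates.

```python
import math


def is_nested(short, long):
    """True if `short` occurs as a contiguous word sequence inside a strictly longer `long`."""
    n = len(short)
    return len(long) > n and any(
        long[i:i + n] == short for i in range(len(long) - n + 1)
    )


def c_value(freq):
    """Score candidates given `freq`: tuple of words -> total corpus frequency.

    Simplified sketch (assumption): every longer candidate that contains `a`
    counts once, using its raw frequency.
    """
    scores = {}
    for a, f_a in freq.items():
        # Longer candidate terms in which `a` appears as a nested string.
        containers = [b for b in freq if is_nested(a, b)]
        adjusted = f_a
        if containers:
            # Discount occurrences that are explained by the longer candidates.
            adjusted -= sum(freq[b] for b in containers) / len(containers)
        # log2 of the length in words favours longer multi-word candidates.
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Toy frequencies (invented for illustration, not taken from the paper).
freq = {
    ("soft", "contact", "lens"): 5,
    ("contact", "lens"): 8,
}
scores = c_value(freq)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 2))
```

In this toy input, "contact lens" occurs 8 times in total, 5 of them inside "soft contact lens", so its score is based on the 3 remaining occurrences rather than on raw frequency.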

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Frantzi, K.T., Ananiadou, S., Tsujii, J. (1998). The C-value/NC-value Method of Automatic Recognition for Multi-word Terms. In: Nikolaou, C., Stephanidis, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1998. Lecture Notes in Computer Science, vol 1513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49653-X_35

  • DOI: https://doi.org/10.1007/3-540-49653-X_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65101-7

  • Online ISBN: 978-3-540-49653-3

  • eBook Packages: Springer Book Archive
