Abstract
This paper presents the design, implementation and evaluation of GATE, a General Architecture for Text Engineering. GATE lies at the intersection of human language computation and software engineering, and constitutes an infrastructural system supporting research and development of language processing software.
Cite this article
Cunningham, H. GATE, a General Architecture for Text Engineering. Computers and the Humanities 36, 223–254 (2002). https://doi.org/10.1023/A:1014348124664
DOI: https://doi.org/10.1023/A:1014348124664