"οικΟΝΟΜΙΑ": A project for annotated economic corpora" (in collaboration).
In Proceedings of the 22nd Symposium on Applied Linguistics, organised by the Department of Philology and Linguistics, Aristotle University of Thessaloniki, pp. 20-31, Thessaloniki 2001.
This article presents the corpus of economic texts that was collected, annotated and processed at several levels of linguistic analysis under the ESTO "οικΟΝΟΜΙΑ" project, for the development and testing of an information extraction system. The system under development handles the recognition of named entities (persons, organizations, place names, temporal expressions, numerical expressions) in electronic free text, in accordance with the standards of the international evaluation conferences for information extraction systems (Message Understanding Conferences - MUC), adapted to Greek data.
The paper describes the process of collecting the texts (transcribed printed texts or texts in electronic form), the methodology followed, the standards set for linguistic annotation at each level, and the computational tools used to process the texts. The work required the design and implementation of a prototype database, built by the author, for collecting and recording the corpus (about 120,000 words).