no code implementations • LREC 2022 • Christian Chiarcos, Christian Fäth, Maxim Ionov
Large-scale diachronic corpus studies covering longer time periods are difficult if more than one corpus is to be consulted and, as a result, different formats and annotation schemas need to be processed and queried in a uniform, comparable and replicable manner.
no code implementations • LREC 2022 • Christian Chiarcos, Christian Fäth, Maxim Ionov
The OntoLex vocabulary has become a widely used community standard for machine-readable lexical resources on the web.
no code implementations • GWLL (LREC) 2022 • Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică
Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC.
no code implementations • LREC 2020 • Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae
Therefore, the OntoLex community has put forward a proposal for a novel module for frequency, attestation and corpus information (FrAC) that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing.
no code implementations • LREC 2020 • Christian Chiarcos, Christian Fäth, Maxim Ionov
In this paper, we report the release of the ACoLi Dictionary Graph, a large-scale collection of multilingual open source dictionaries available in two machine-readable formats: a graph representation in RDF, using the OntoLex-Lemon vocabulary, and a simple tabular data format to facilitate their use in NLP tasks, such as translation inference across dictionaries.
no code implementations • LREC 2020 • Christian Fäth, Christian Chiarcos, Björn Ebbrecht, Maxim Ionov
We introduce the Flexible and Integrated Transformation and Annotation eNgineering (Fintan) platform for converting heterogeneous linguistic resources to RDF.