no code implementations • GWC 2018 • John P. McCrae
Lexical resources differ from encyclopaedic resources and represent two distinct types of resource covering general language and named entities respectively.
no code implementations • CSRNLP (LREC) 2022 • Tapan Auti, Rajdeep Sarkar, Bernardo Stearns, Atul Kr. Ojha, Arindam Paul, Michaela Comerford, Jay Megaro, John Mariano, Vall Herard, John P. McCrae
Pharmaceutical text classification is an important area of research for commercial and research institutions working in the pharmaceutical domain.
no code implementations • LREC 2022 • Fahad Khan, Francisco J. Minaya Gómez, Rafael Cruz González, Harry Diakoff, Javier E. Diaz Vera, John P. McCrae, Ciara O’Loughlin, William Michael Short, Sander Stolk
In this paper we will discuss our preliminary work towards the construction of a WordNet for Old English, taking our inspiration from other similar WN construction projects for ancient languages such as Ancient Greek, Latin and Sanskrit.
no code implementations • LREC 2022 • Priya Rani, John P. McCrae, Theodorus Fransen
This data-set is the first Magahi-Hindi-English code-mixed data-set for the similar language identification task.
no code implementations • WMT (EMNLP) 2020 • Atul Kr. Ojha, Priya Rani, Akanksha Bansal, Bharathi Raja Chakravarthi, Ritesh Kumar, John P. McCrae
The NUIG-Panlingua-KMI submission to WMT 2020 seeks to push the state of the art in the Similar Language Translation Task for the Hindi↔Marathi language pair.
no code implementations • GWC 2018 • John P. McCrae, Ian Wood, Amanda Hicks
Princeton WordNet is one of the most widely-used resources for natural language processing, but is updated only infrequently and cannot keep up with the fast-changing usage of the English language on social media platforms such as Twitter.
no code implementations • GWC 2018 • Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae
In addition to that, we carried out a manual evaluation of the translations for the Tamil language, where we demonstrate that our approach can aid in improving wordnet resources for under-resourced Dravidian languages.
no code implementations • EMNLP 2021 • Koustava Goswami, Sourav Dutta, Haytham Assem, Theodorus Fransen, John P. McCrae
We demonstrate the efficacy of an unsupervised as well as a weakly supervised variant of our framework on STS, BUCC and Tatoeba benchmark tasks.
no code implementations • EMNLP (NLLP) 2021 • Rajdeep Sarkar, Atul Kr. Ojha, Jay Megaro, John Mariano, Vall Herard, John P. McCrae
This method allows predictive coding methods to be rapidly developed for new regulations and markets.
no code implementations • COLING (CogALex) 2020 • Saurav Karmakar, John P. McCrae
This paper presents a bidirectional transformer-based approach for recognising semantic relationships between a pair of words, as proposed by the CogALex VI shared task in 2020.
1 code implementation • EACL (GWC) 2021 • John P. McCrae, Michael Wayne Goodman, Francis Bond, Alexandre Rademaker, Ewa Rudnicka, Luis Morgado Da Costa
The Global Wordnet Formats have been introduced to enable wordnets to have a common representation that can be integrated through the Global WordNet Grid.
no code implementations • LREC 2022 • Cécile Robin, Gautham Vadakkekara Suresh, Víctor Rodriguez-Doncel, John P. McCrae, Paul Buitelaar
Language resources are a key component of natural language processing and related research and applications.
no code implementations • WILDRE (LREC) 2022 • Pritha Majumdar, Deepak Alok, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae
A preliminary set of sentences was annotated manually: 600 for Bengali and 200 for Magahi.
no code implementations • VarDial (COLING) 2020 • Bharathi Raja Chakravarthi, Navaneethan Rajasekaran, Mihael Arcan, Kevin McGuinness, Noel E. O’Connor, John P. McCrae
Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art approaches to this leverage pretrained monolingual word embeddings using supervised or semi-supervised approaches.
1 code implementation • GWC 2019 • John P. McCrae, Alexandre Rademaker, Francis Bond, Ewa Rudnicka, Christiane Fellbaum
We describe the release of a new wordnet for English based on the Princeton WordNet, but now developed under an open-source model.
no code implementations • NAACL (SMM4H) 2021 • Atul Kr. Ojha, Priya Rani, Koustava Goswami, Bharathi Raja Chakravarthi, John P. McCrae
Social media platforms such as Twitter and Facebook have been utilised for various research studies, from the cohort-level discussion to community-driven approaches to address the challenges in utilizing social media data for health, clinical and biomedical information.
no code implementations • EACL (DravidianLangTech) 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Navya Jose, Anand Kumar M, Thomas Mandl, Prasanna Kumar Kumaresan, Rahul Ponnusamy, Hariharan R L, John P. McCrae, Elizabeth Sherly
Detecting offensive language in social media in local languages is critical for moderating user-generated content.
no code implementations • EACL (DravidianLangTech) 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Shubhanker Banerjee, Richard Saldanha, John P. McCrae, Anand Kumar M, Parameswari Krishnamurthy, Melvin Johnson
This paper describes the datasets used, the methodology used for the evaluation of participants, and the experiments’ overall results.
no code implementations • EACL (GWC) 2021 • John P. McCrae, David Cillessen
WordNet is the most widely used lexical resource for English, while Wikidata is one of the largest knowledge graphs of entities and concepts available.
no code implementations • EACL (GWC) 2021 • Sina Ahmadi, John P. McCrae
Words are defined based on their meanings in various ways in different resources.
no code implementations • 7 Mar 2024 • Priya Rani, Gaurav Negi, Theodorus Fransen, John P. McCrae
The present paper introduces new sentiment data, MaCMS, for Magahi-Hindi-English (MHE) code-mixed language, where Magahi is a less-resourced minority language.
no code implementations • 12 Feb 2024 • Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek
This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved.
1 code implementation • 9 Nov 2023 • Koustava Goswami, Priya Rani, Theodorus Fransen, John P. McCrae
We train an encoder to gain morphological knowledge of a language and transfer the knowledge to perform unsupervised and weakly-supervised cognate detection tasks with and without the pivot language for the closely-related languages.
1 code implementation • 11 Jul 2023 • Ghanshyam Verma, Shovon Sengupta, Simon Simanta, Huan Chen, Janos A. Perge, Devishree Pillai, John P. McCrae, Paul Buitelaar
Personalized recommendations have a growing importance in direct marketing, which motivates research to enhance customer experiences by knowledge graph (KG) applications.
no code implementations • 18 Nov 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Sajeetha Thavareesan, Dhivya Chinnappa, Durairaj Thenmozhi, Elizabeth Sherly, John P. McCrae, Adeep Hande, Rahul Ponnusamy, Shubhanker Banerjee, Charangan Vasantharajan
We received 22 systems for Tamil-English, 15 systems for Malayalam-English, and 15 for Kannada-English.
1 code implementation • 17 Jun 2021 • Bharathi Raja Chakravarthi, Ruba Priyadharshini, Vigneshwaran Muralidaran, Navya Jose, Shardul Suryawanshi, Elizabeth Sherly, John P. McCrae
This paper describes the development of a multilingual, manually annotated dataset for three under-resourced Dravidian languages generated from social media comments.
no code implementations • 9 Jun 2021 • Bharathi Raja Chakravarthi, Jishnu Parameswaran P. K, Premjith B, K. P Soman, Rahul Ponnusamy, Prasanna Kumar Kumaresan, Kingston Pal Thamburaj, John P. McCrae
This is the first multimodal sentiment analysis dataset for Tamil and Malayalam by volunteer annotators.
no code implementations • COLING 2020 • Koustava Goswami, Rajdeep Sarkar, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae
Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects is one of the primary steps in many natural language processing pipelines.
1 code implementation • COLING 2020 • Rajdeep Sarkar, Koustava Goswami, Mihael Arcan, John P. McCrae
Conversational recommender systems focus on the task of suggesting products to users based on the conversation flow.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Omnia Zayed, John P. McCrae, Paul Buitelaar
Identifying metaphors in text is very challenging and requires comprehending the underlying comparison.
no code implementations • 4 Aug 2020 • Bharathi Raja Chakravarthi, Priya Rani, Mihael Arcan, John P. McCrae
It introduces under-resourced languages in terms of machine translation and how orthographic information can be utilised to improve machine translation.
no code implementations • SEMEVAL 2020 • Koustava Goswami, Priya Rani, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae
Code mixing is a common phenomenon in multilingual societies where people switch from one language to another for various reasons.
1 code implementation • LREC 2020 • Bharathi Raja Chakravarthi, Vigneshwaran Muralidaran, Ruba Priyadharshini, John P. McCrae
One such application is to analyse the popular sentiments of videos on social media based on viewer comments.
1 code implementation • LREC 2020 • Bharathi Raja Chakravarthi, Navya Jose, Shardul Suryawanshi, Elizabeth Sherly, John P. McCrae
However, very few resources are available for code-mixed data to create models specific for this data.
1 code implementation • LREC 2020 • Georg Rehm, Dimitrios Galanis, Penny Labropoulou, Stelios Piperidis, Martin Welß, Ricardo Usbeck, Joachim Köhler, Miltos Deligiannis, Katerina Gkirtzou, Johannes Fischer, Christian Chiarcos, Nils Feldhus, Julián Moreno-Schneider, Florian Kintzel, Elena Montiel, Víctor Rodríguez Doncel, John P. McCrae, David Laqua, Irina Patricia Theile, Christian Dittmar, Kalina Bontcheva, Ian Roberts, Andrejs Vasiljevs, Andis Lagzdiņš
With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows.
1 code implementation • 11 Apr 2020 • Md. Rezaul Karim, Bharathi Raja Chakravarthi, John P. McCrae, Michael Cochez
Evaluations against several baseline embedding models, e.g., Word2Vec and GloVe yield up to 92.30%, 82.25%, and 90.45% F1-scores in case of document classification, sentiment analysis, and hate speech detection, respectively during 5-fold cross-validation tests.
no code implementations • 12 Dec 2018 • Narumol Prangnawarat, John P. McCrae, Conor Hayes
We then show that integrating multiple time frames in our methods can give a better overall similarity demonstrating that temporal evolution can have an important effect on entity relatedness.
no code implementations • COLING 2018 • Abigail Walsh, Claire Bonial, Kristina Geeraert, John P. McCrae, Nathan Schneider, Clarissa Somers
This paper describes the construction and annotation of a corpus of verbal MWEs for English, as part of the PARSEME Shared Task 1.1 on automatic identification of verbal MWEs.