1 code implementation • CODI 2021 • Zae Myung Kim, Vassilina Nikoulina, Dongyeop Kang, Didier Schwab, Laurent Besacier
This paper presents an interactive data dashboard that provides users with an overview of the preservation of discourse relations among 28 language pairs.
1 code implementation • Findings (ACL) 2022 • Cécile Macaire, Didier Schwab, Benjamin Lecouteux, Emmanuel Schang
We investigate the exploitation of self-supervised models for two Creole languages with few resources: Gwadloupéyen and Morisien.
Automatic Speech Recognition (ASR) +1
no code implementations • GWC 2016 • Michael Zock, Didier Schwab
Next we will show under what conditions WN is suitable for word access, and finally we will present a roadmap showing the obstacles to be overcome to build a resource allowing the text producer to find the word s/he is looking for.
no code implementations • JEP/TALN/RECITAL 2021 • Aidan Mannion, Thierry Chevalier, Didier Schwab, Lorraine Goeuriot
This article presents a summary of our submission to Task 1 of DEFT 2021.
no code implementations • ACL (IWSLT) 2021 • Hang Le, Florentin Barbier, Ha Nguyen, Natalia Tomashenko, Salima Mdhaffar, Souhir Gabiche Gahbiche, Benjamin Lecouteux, Didier Schwab, Yannick Estève
This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2021, low-resource speech translation and multilingual speech translation.
Automatic Speech Recognition (ASR) +4
no code implementations • 11 Sep 2023 • Titouan Parcollet, Ha Nguyen, Solene Evain, Marcely Zanon Boito, Adrien Pupier, Salima Mdhaffar, Hang Le, Sina Alisamir, Natalia Tomashenko, Marco Dinarelli, Shucong Zhang, Alexandre Allauzen, Maximin Coavoux, Yannick Esteve, Mickael Rouvier, Jerome Goulian, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains including computer vision and natural language processing.
no code implementations • 20 Jul 2023 • Aidan Mannion, Thierry Chevalier, Didier Schwab, Lorraine Goeuriot
In the biomedical domain, significant progress has been made in adapting this paradigm to NLP tasks that require the integration of domain-specific knowledge as well as statistical modelling of language.
1 code implementation • 27 Jan 2023 • Phuong-Hang Le, Hongyu Gong, Changhan Wang, Juan Pino, Benjamin Lecouteux, Didier Schwab
Nevertheless, CTC is only a partial solution and thus, in our second contribution, we propose a novel pre-training method combining CTC and optimal transport to further reduce this gap.
no code implementations • 27 Sep 2021 • Nairit Bandyopadhyay, Sébastien Riou, Didier Schwab
We trained a multimodal convolutional neural network and analysed its performance with and without calibration. This evaluation provides clear insights into how calibration improved the performance of the deep learning model in estimating gaze in the wild.
2 code implementations • ACL 2021 • Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier
Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP.
Ranked #1 on Speech-to-Text Translation on MuST-C EN->ES
Automatic Speech Recognition (ASR) +4
no code implementations • Findings (ACL) 2021 • Zae Myung Kim, Laurent Besacier, Vassilina Nikoulina, Didier Schwab
Recent studies on the analysis of the multilingual representations focus on identifying whether there is an emergence of language-independent representations, or whether a multilingual model partitions its weights among different languages.
1 code implementation • 23 Apr 2021 • Solene Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Esteve, Benjamin Lecouteux, Francois Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier
In this paper, we propose LeBenchmark: a reproducible framework for assessing SSL from speech.
Automatic Speech Recognition (ASR) +6
1 code implementation • COLING 2020 • Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier
We propose two variants of these architectures corresponding to two different levels of dependencies between the decoders, called the parallel and cross dual-decoder Transformers, respectively.
Ranked #1 on Speech-to-Text Translation on MuST-C EN->FR
Automatic Speech Recognition (ASR) +4
no code implementations • JEPTALNRECITAL 2020 • Solène Evain, Adrien Contesse, Antoine Pinchaud, Didier Schwab, Benjamin Lecouteux, Nathalie Henrich Bernardoni
We propose a beatbox sound recognition system inspired by automatic speech recognition.
no code implementations • JEPTALNRECITAL 2020 • Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab
Pre-trained language models are now indispensable for obtaining state-of-the-art results in many NLP tasks.
no code implementations • LREC 2020 • Didier Schwab, Pauline Trial, Céline Vaschalde, Loïc Vial, Emmanuelle Esperanca-Rodier, Benjamin Lecouteux
In order to make it possible to use pictograms automatically in NLP applications, we propose a database that links them to semantic knowledge.
no code implementations • 24 Apr 2020 • Jibril Frej, Phillipe Mulhem, Didier Schwab, Jean-Pierre Chevallet
Document indexing is a key component for efficient information retrieval (IR).
7 code implementations • LREC 2020 • Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab
Language models have become a key step to achieve state-of-the-art results in many different Natural Language Processing (NLP) tasks.
Ranked #1 on Natural Language Inference on XNLI French
1 code implementation • LREC 2020 • Jibril Frej, Didier Schwab, Jean-Pierre Chevallet
Since most standard ad-hoc information retrieval datasets publicly available for academic research (e.g. Robust04, ClueWeb09) have at most 250 annotated queries, the recent deep learning models for information retrieval perform poorly on these datasets.
no code implementations • EMNLP (IWSLT) 2019 • Loïc Vial, Benjamin Lecouteux, Didier Schwab, Hang Le, Laurent Besacier
Therefore, we implemented a Transformer-based encoder-decoder neural system which is able to use the output of a pre-trained language model as input embeddings, and we compared its performance under three configurations: 1) without any pre-trained language model (constrained), 2) using a language model trained on the monolingual parts of the allowed English-Czech data (constrained), and 3) using a language model trained on a large quantity of external monolingual data (unconstrained).
no code implementations • WS 2019 • Raki Lachraf, El Moatez Billah Nagoudi, Youcef Ayachi, Ahmed Abdelali, Didier Schwab
Word Embeddings (WE) are getting increasingly popular and widely applied in many Natural Language Processing (NLP) applications due to their effectiveness in capturing semantic properties of words; Machine Translation (MT), Information Retrieval (IR) and Information Extraction (IE) are among such areas.
no code implementations • JEPTALNRECITAL 2019 • Didier Schwab, Pauline Trial, Céline Vaschalde, Loïc Vial, Benjamin Lecouteux
This article presents a resource that links WordNet and Arasaac, the largest freely available pictogram database.
no code implementations • JEPTALNRECITAL 2019 • Loïc Vial, Benjamin Lecouteux, Didier Schwab
In Word Sense Disambiguation (WSD), supervised systems largely dominate evaluation campaigns.
2 code implementations • GWC 2019 • Loïc Vial, Benjamin Lecouteux, Didier Schwab
In this article, we tackle the issue of the limited quantity of manually sense annotated corpora for the task of word sense disambiguation, by exploiting the semantic relationships between senses such as synonymy, hypernymy and hyponymy, in order to compress the sense vocabulary of Princeton WordNet, and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database.
Ranked #1 on Word Sense Disambiguation on SemEval 2015 Task 13
no code implementations • 2 Nov 2018 • Loïc Vial, Benjamin Lecouteux, Didier Schwab
Our method leads to state-of-the-art results on most WSD evaluation tasks, while improving the coverage of supervised systems, reducing the training time and the size of the models, without additional training data.
Ranked #2 on Word Sense Disambiguation on SemEval 2013 Task 12
no code implementations • JEPTALNRECITAL 2018 • Loïc Vial, Benjamin Lecouteux, Didier Schwab
In word sense disambiguation, the use of neural networks is still uncommon and very recent.
no code implementations • JEPTALNRECITAL 2018 • Marwa Hadj Salah, Loïc Vial, Hervé Blanchon, Mounir Zrigui, Didier Schwab
We evaluate the quality of our disambiguation systems using a newly available Arabic evaluation corpus.
no code implementations • JEPTALNRECITAL 2018 • Marwa Hadj Salah, Hervé Blanchon, Mounir Zrigui, Didier Schwab
OntoNotes includes the only freely available manually sense-annotated corpus for Arabic.
no code implementations • 6 Feb 2018 • Marwa Hadj Salah, Didier Schwab, Hervé Blanchon, Mounir Zrigui
Machine translation (MT) is the process of translating text written in a source language into text in a target language.
no code implementations • SEMEVAL 2017 • El Moatez Billah Nagoudi, Jérémy Ferrero, Didier Schwab
This article describes our proposed system named LIM-LIG.
no code implementations • JEPTALNRECITAL 2017 • Loïc Vial, Benjamin Lecouteux, Didier Schwab
In this article, we propose a new method for representing the senses of a dictionary as vectors.
no code implementations • JEPTALNRECITAL 2017 • Loïc Vial, Benjamin Lecouteux, Didier Schwab
For English word sense disambiguation, there are currently about fifteen sense-annotated corpora, often in different formats and derived from different versions of Princeton WordNet.
1 code implementation • WS 2017 • Jeremy Ferrero, Laurent Besacier, Didier Schwab, Frederic Agnes
This paper is a deep investigation of cross-language plagiarism detection methods on a new recently introduced open dataset, which contains parallel and comparable collections of documents with multiple characteristics (different genres, languages and sizes of texts).
no code implementations • 7 Apr 2017 • Loïc Vial, Andon Tchechmedjiev, Didier Schwab
We find that CSA, GA and SA all eventually converge to similar results (0.98 F1 score), but CSA gets there faster (in fewer scorer calls) and reaches up to 0.95 F1 before SA in fewer scorer calls.
1 code implementation • SEMEVAL 2017 • Jeremy Ferrero, Frederic Agnes, Laurent Besacier, Didier Schwab
We present our submitted systems for Semantic Textual Similarity (STS) Track 4 at SemEval-2017.
no code implementations • WS 2017 • El Moatez Billah Nagoudi, Didier Schwab
Semantic textual similarity is the basis of countless applications and plays an important role in diverse areas, such as information retrieval, plagiarism detection, information extraction and machine translation.
no code implementations • EACL 2017 • Jérémy Ferrero, Laurent Besacier, Didier Schwab, Frédéric Agnès
This paper proposes to use distributed representation of words (word embeddings) in cross-language textual similarity detection.
no code implementations • JEPTALNRECITAL 2016 • Loïc Vial, Andon Tchechmedjiev, Didier Schwab
The semantic proximity of two definitions is evaluated by counting the number of words they share in the corresponding dictionary definitions.
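The word-overlap measure mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the regex tokenizer, the choice to count distinct shared words without stop-word filtering, and the example definitions are all assumptions made for illustration.

```python
import re


def tokenize(definition: str) -> set[str]:
    """Lowercase a definition and return its set of distinct word tokens."""
    return set(re.findall(r"\w+", definition.lower()))


def overlap_score(def_a: str, def_b: str) -> int:
    """Count the distinct words shared by two dictionary definitions."""
    return len(tokenize(def_a) & tokenize(def_b))


# Hypothetical definitions, invented for this example.
bank_river = "sloping land beside a body of water"
bank_money = "a financial institution that accepts deposits"
river = "a large natural stream of water"

# The riverside sense of "bank" shares more words with "river"
# than the financial sense does.
print(overlap_score(bank_river, river))
print(overlap_score(bank_money, river))
```

A practical system would likely filter function words such as "a" and "of" before counting, since they inflate the score without carrying meaning.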
no code implementations • JEPTALNRECITAL 2016 • Marwa Hadj Salah, Hervé Blanchon, Mounir Zrigui, Didier Schwab
In this article, we present a method to improve the machine translation of an annotated corpus and port its annotations from English to a target language.
1 code implementation • LREC 2016 • Jérémy Ferrero, Frédéric Agnès, Laurent Besacier, Didier Schwab
In this paper we describe our effort to create a dataset for the evaluation of cross-language textual similarity detection.
no code implementations • JEPTALNRECITAL 2015 • Mohammad Nasiruddin, Andon Tchechmedjiev, Hervé Blanchon, Didier Schwab
We present a method for quickly creating a word sense disambiguation (WSD) system for an under-resourced language L, provided that a statistical machine translation (SMT) system is available from a language rich in sense-annotated corpora (here, English) to L. It is indeed easier to obtain the resources needed to build an SMT system than the dedicated resources needed to build a WSD system for L. Our method consists of automatically translating a sense-annotated corpus into L, then building the disambiguation system for L using classical supervised methods.