no code implementations • WS 2020 • Jerry Liu, Nathan O'Hara, Alexander Rubin, Rachel Draelos, Cynthia Rudin
The detection of metaphors can provide valuable information about a given text and is crucial to sentiment analysis and machine translation.
no code implementations • WS 2020 • Egon Stemle, Alexander Onysko
The particular focus of our approach is on the potential influence that the metadata given in the ETS Corpus of Non-Native Written English might have on the automatic detection of metaphors in this dataset.
no code implementations • WS 2020 • Felix Schneider, Alexander Waibel
Simultaneous machine translation systems rely on a policy to schedule read and write operations in order to begin translating a source sentence before it is complete.
Automatic Speech Recognition (ASR) +4
no code implementations • ACL 2020 • Emily M. Bender, Alexander Koller
The success of the large neural language models on many NLP tasks is exciting.
no code implementations • ACL 2020 • Alexander Erdmann, Tom Kenter, Markus Becker, Christian Schallhart
Lexica distinguishing all morphologically related forms of each lexeme are crucial to many language technologies, yet building them is expensive.
no code implementations • WS 2020 • Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang
The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.
no code implementations • WS 2020 • Ngoc-Quan Pham, Felix Schneider, Tuan-Nam Nguyen, Thanh-Le Ha, Thai Son Nguyen, Maximilian Awiszus, Sebastian Stüker, Alexander Waibel
This paper describes KIT's submissions to the IWSLT2020 Speech Translation evaluation campaign.
no code implementations • ACL 2020 • Marion Weller-Di Marco, Alexander Fraser
This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology.
no code implementations • LREC 2020 • Alexander Zahrer, Andrej Zgank, Barbara Schuppler
The experiments are based on recordings from an ongoing documentation project for the endangered Muyu language in New Guinea.
Automatic Speech Recognition (ASR) +1
no code implementations • LREC 2020 • Leah Michel, Viktor Hangya, Alexander Fraser
We use a publicly available Hiligaynon corpus with only 300K words, and match it with a comparable corpus in English.
no code implementations • LREC 2020 • Oddur Kjartansson, Alexander Gutkin, Alena Butryna, Isin Demirsahin, Clara Rivera
This paper introduces new open speech datasets for three of the languages of Spain: Basque, Catalan and Galician.
Automatic Speech Recognition (ASR) +2
no code implementations • LREC 2020 • Rudolf Schneider, Tom Oberhauser, Paul Grundmann, Felix Alexander Gers, Alexander Loeser, Steffen Staab
We present PubMedSection, a novel topic classification dataset focussed on the biomedical domain.
no code implementations • LREC 2020 • Verena Lyding, Alexander König, Monica Pretti
The major European language infrastructure initiatives like CLARIN (Hinrichs and Krauwer, 2014), DARIAH (Edmond et al., 2017) or Europeana (Europeana Foundation, 2015) have been built by focusing in the first place on institutions of larger scale, like specialized research departments and larger official units like national libraries, etc.
no code implementations • LREC 2020 • Adriana Guevara-Rukoz, Isin Demirsahin, Fei He, Shan-Hui Cathy Chu, Supheakmungkol Sarin, Knot Pipatsrisawat, Alexander Gutkin, Alena Butryna, Oddur Kjartansson
In this paper we present a multidialectal corpus approach for building a text-to-speech voice for a new dialect in a language with existing resources, focusing on various South American dialects of Spanish.
no code implementations • LREC 2020 • Alexander Henlein, Giuseppe Abrami, Attila Kett, Alexander Mehler
People's visual perception is very pronounced and therefore it is usually no problem for them to describe the space around them in words.
no code implementations • LREC 2020 • Giuseppe Abrami, Manuel Stoeckel, Alexander Mehler
The annotation of texts and other material in the field of digital humanities and Natural Language Processing (NLP) is a common task of research projects.
no code implementations • LREC 2020 • Fei He, Shan-Hui Cathy Chu, Oddur Kjartansson, Clara Rivera, Anna Katanova, Alexander Gutkin, Isin Demirsahin, Cibu Johny, Martin Jansche, Supheakmungkol Sarin, Knot Pipatsrisawat
We present free high-quality multi-speaker speech corpora for Gujarati, Kannada, Malayalam, Marathi, Tamil and Telugu, which are six of the twenty-two official languages of India spoken by 374 million native speakers.
no code implementations • LREC 2020 • Ildar Kagirov, Denis Ivanko, Dmitry Ryumin, Alexander Axyonov, Alexey Karpov
The database includes lexical units (single words and phrases) from Russian sign language within one subject area, namely, "food products at the supermarket", and was collected using an MS Kinect 2.0 device, including both FullHD video and depth map modes, which provides new opportunities for the lexicographical description of the Russian sign language vocabulary and enhances research in the field of automatic gesture recognition.
no code implementations • LREC 2020 • Yin May Oo, Theeraphol Wattanavekin, Chenfang Li, Pasindu De Silva, Supheakmungkol Sarin, Knot Pipatsrisawat, Martin Jansche, Oddur Kjartansson, Alexander Gutkin
This paper introduces an open-source crowd-sourced multi-speaker speech corpus along with the comprehensive set of finite-state transducer (FST) grammars for performing text normalization for the Burmese (Myanmar) language.
no code implementations • LREC 2020 • Konstantina Lazaridou, Alexander Löser, Maria Mestre, Felix Naumann
Yet, political propaganda and one-sided views can be found in the news and can cause distrust in media.
no code implementations • LREC 2020 • Lionel Nicolas, Verena Lyding, Claudia Borg, Corina Forascu, Karën Fort, Katerina Zdravkova, Iztok Kosem, Jaka Čibej, Špela Arhar Holdt, Alice Millour, Alexander König, Christos Rodosthenous, Federico Sangati, Umair ul Hassan, Anisia Katinskaia, Anabela Barreiro, Lavinia Aparaschivei, Yaakov HaCohen-Kerner
We introduce in this paper a generic approach to combine implicit crowdsourcing and language learning in order to mass-produce language resources (LRs) for any language for which a crowd of language learners can be involved.
no code implementations • LREC 2020 • Antonio Roque, Alexander Tsuetaki, Vasanth Sarathy, Matthias Scheutz
Resolving Indirect Speech Acts (ISAs), in which the intended meaning of an utterance is not identical to its literal meaning, is essential to enabling the participation of intelligent systems in people's everyday lives.
no code implementations • LREC 2020 • Isin Demirsahin, Oddur Kjartansson, Alexander Gutkin, Clara Rivera
This paper presents a dataset of transcribed high-quality audio of English sentences recorded by volunteers speaking with different accents of the British Isles.
no code implementations • LREC 2020 • Alexander Gutkin
While the auditory periphery mechanisms responsible for transducing the sound pressure wave into the auditory nerve discharge are relatively well understood, the models that describe them are usually very complex because they try to faithfully simulate the behavior of several functionally distinct biological units involved in hearing.
no code implementations • LREC 2020 • Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae
Therefore, the OntoLex community has put forward the proposal for a novel module for frequency, attestation and corpus information (FrAC), that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing.
no code implementations • LREC 2020 • Christos Rodosthenous, Verena Lyding, Federico Sangati, Alexander König, Umair ul Hassan, Lionel Nicolas, Jolita Horbacauskiene, Anisia Katinskaia, Lavinia Aparaschivei
In this work, we report on a crowdsourcing experiment conducted using the V-TREL vocabulary trainer which is accessed via a Telegram chatbot interface to gather knowledge on word relations suitable for expanding ConceptNet.
no code implementations • LREC 2020 • Alexander Henlein, Alexander Mehler
The inclusion of CR as a pre-processing step is expected to lead to improvements in downstream tasks.
no code implementations • LREC 2020 • Iva Marinova, Laska Laskova, Petya Osenova, Kiril Simov, Alexander Popov
The paper reports on the usage of deep learning methods for improving a Named Entity Recognition (NER) training corpus and for predicting and annotating new types in a test corpus.
no code implementations • LREC 2020 • Maria Eskevich, Franciska de Jong, Alexander König, Darja Fišer, Dieter van Uytvanck, Tero Aalto, Lars Borin, Olga Gerassimenko, Jan Hajic, Henk van den Heuvel, Neeme Kahusk, Krista Liin, Martin Matthiesen, Stelios Piperidis, Kadri Vider
CLARIN is a European Research Infrastructure providing access to digital language resources and tools from across Europe and beyond to researchers in the humanities and social sciences.
no code implementations • LREC 2020 • Oliver Czulo, Alexander Ziem, Tiago Timponi Torrent
Framenets as an incarnation of frame semantics have been set up to deal with lexicographic issues (cf.
no code implementations • LREC 2020 • Manuel Stoeckel, Alexander Henlein, Wahed Hemati, Alexander Mehler
Since most of the available Latin word embeddings were trained on either few or inaccurate data, we trained several embeddings on better data in the first step.
1 code implementation • LREC 2020 • Ossama Obeid, Nasser Zalmout, Salam Khalifa, Dima Taji, Mai Oudah, Bashar Alhafni, Go Inoue, Fadhl Eryani, Alexander Erdmann, Nizar Habash
We present CAMeL Tools, a collection of open-source tools for Arabic natural language processing in Python.
no code implementations • LREC 2020 • Silvia Severini, Viktor Hangya, Alexander Fraser, Hinrich Schütze
We participate in both the open and closed tracks of the shared task and we show improved results of our method compared to simple vector similarity based approaches.
no code implementations • TACL 2020 • Alexander Clark, Nathanaël Fijalkow
Learning probabilistic context-free grammars (PCFGs) from strings has been a classic problem in computational linguistics since Horning (1969).
no code implementations • WS 2019 • Chris van der Lee, Tess van der Zanden, Emiel Krahmer, Maria Mos, Alexander Schouten
Results show that LIWC and machine learning models correlate with human evaluations in terms of content-related labels.
no code implementations • WS 2019 • Aleksander Wawer, Grzegorz Wojdyga, Justyna Sarzyńska-Wawer
The goal of our paper is to compare psycholinguistic text features with fact checking approaches to distinguish lies from true statements.
no code implementations • CONLL 2019 • Sajawel Ahmed, Manuel Stoeckel, Christine Driller, Adrian Pachzelt, Alexander Mehler
The Specialized Information Service Biodiversity Research (BIOfid) has been launched to mobilize valuable biological data from printed literature hidden in German libraries for over the past 250 years.
no code implementations • CONLL 2019 • Lucia Donatelli, Meaghan Fowlie, Jonas Groschwitz, Alexander Koller, Matthias Lindemann, Mario Mina, Pia Weißenhorn
We describe the Saarland University submission to the shared task on Cross-Framework Meaning Representation Parsing (MRP) at the 2019 Conference on Computational Natural Language Learning (CoNLL).
no code implementations • WS 2019 • Sebastian Gehrmann, Zachary Ziegler, Alexander Rush
Neural abstractive document summarization is commonly approached by models that exhibit a mostly extractive behavior.
no code implementations • WS 2019 • Saar Hommes, Chris van der Lee, Felix Clouth, Jeroen Vermunt, Xander Verbeek, Emiel Krahmer
In this paper, we present a novel data-to-text system for cancer patients, providing information on quality of life implications after treatment, which can be embedded in the context of shared decision making.
no code implementations • WS 2019 • Chris van der Lee, Albert Gatt, Emiel van Miltenburg, Sander Wubben, Emiel Krahmer
Currently, there is little agreement as to how Natural Language Generation (NLG) systems should be evaluated.
no code implementations • WS 2019 • Arne Köhn, Alexander Koller
When generating technical instructions, it is often necessary to describe an object that does not exist yet.
no code implementations • RANLP 2019 • Nikolay Arefyev, Boris Sheludko, Alexander Panchenko
Word Sense Induction (WSI) is the task of grouping occurrences of an ambiguous word according to their meaning.
no code implementations • RANLP 2019 • Aleksander Wawer, Julita Sobiczewska
In the second we train models on all available data except the given test collection, which we use for testing (one vs rest cross-domain).
no code implementations • RANLP 2019 • Alexander Popov, Kiril Simov, Petya Osenova
This paper introduces several improvements over the current state of the art in knowledge-based word sense disambiguation.
no code implementations • RANLP 2019 • Alexander Popov, Jennifer Sikos
Lexical resources such as WordNet (Miller, 1995) and FrameNet (Baker et al., 1998) are organized as graphs, where relationships between words are made explicit via the structure of the resource.
no code implementations • RANLP 2019 • Verena Lyding, Christos Rodosthenous, Federico Sangati, Umair ul Hassan, Lionel Nicolas, Alexander König, Jolita Horbacauskiene, Anisia Katinskaia
In this paper, we present our work on developing a vocabulary trainer that uses exercises generated from language resources such as ConceptNet and crowdsources the responses of the learners to enrich the language resource.
no code implementations • WS 2019 • Alexander Erdmann, Salam Khalifa, Mai Oudah, Nizar Habash, Houda Bouamor
We present de-lexical segmentation, a linguistically motivated alternative to greedy or other unsupervised methods, requiring only minimal language-specific input.
no code implementations • WS 2019 • Segun Taofeek Aroyehun, Alexander Gelbukh
This paper details our approach to the task of detecting reportage of adverse drug reaction in tweets as part of the 2019 social media mining for healthcare applications shared task.
no code implementations • WS 2019 • Dario Stojanovski, Viktor Hangya, Matthias Huck, Alexander Fraser
We describe LMU Munich's machine translation system for German→Czech translation which was used to participate in the WMT19 shared task on unsupervised news translation.
no code implementations • WS 2019 • Dario Stojanovski, Alexander Fraser
We describe LMU Munich's machine translation system for English→German translation which was used to participate in the WMT19 shared task on supervised news translation.
no code implementations • WS 2019 • Alexander Kuhnle, Ann Copestake
The correct interpretation of quantifier statements in the context of a visual scene requires non-trivial inference mechanisms.
no code implementations • WS 2019 • Alexander Molchanov
This paper describes the PROMT submissions for the WMT 2019 Shared News Translation Task.
1 code implementation • WS 2019 • Dmitry Puzyrev, Artem Shelmanov, Alexander Panchenko, Ekaterina Artemova
This paper presents the first gold-standard resource for Russian annotated with compositionality information of noun compounds.
1 code implementation • ACL 2019 • Sergey Golovanov, Rauf Kurbanov, Sergey Nikolenko, Kyryl Truskovskyi, Alexander Tselousov, Thomas Wolf
Large-scale pretrained language models define state of the art in natural language processing, achieving outstanding performance on a variety of tasks.
no code implementations • ACL 2019 • Özge Sevgili, Alexander Panchenko, Chris Biemann
Entity Disambiguation (ED) is the task of linking an ambiguous entity mention to a corresponding entry in a knowledge base.
1 code implementation • ACL 2019 • Artem Chernodub, Oleksiy Oliynyk, Philipp Heidenreich, Alexander Bondarenko, Matthias Hagen, Chris Biemann, Alexander Panchenko
We present TARGER, an open source neural argument mining framework for tagging arguments in free input texts and for keyword-based retrieval of arguments from an argument-tagged web-scale corpus.
no code implementations • ACL 2019 • Matthias Huck, Viktor Hangya, Alexander Fraser
In our experiments we use a system trained on Europarl and mine sentences containing medical terms from monolingual data.
1 code implementation • ACL 2019 • Viktor Hangya, Alexander Fraser
Mining parallel sentences from comparable corpora is important.
1 code implementation • ACL 2019 • Alexander Koller, Stephan Oepen, Weiwei Sun
This tutorial is on representing and processing sentence meaning in the form of labeled directed graphs.
no code implementations • ACL 2019 • Abhik Jana, Dima Puzyrev, Alexander Panchenko, Pawan Goyal, Chris Biemann, Animesh Mukherjee
In particular, we use hypernymy information of the multiword and its constituents encoded in the form of the recently introduced Poincaré embeddings in addition to the distributional information to detect compositionality for noun phrases.
no code implementations • WS 2019 • Matthias Huck, Diana Dutka, Alexander Fraser
We tackle the important task of part-of-speech tagging using a neural model in the zero-resource scenario, where we have no access to gold-standard POS training data.
no code implementations • WS 2019 • Lisa Ferro, John Aberdeen, Karl Branting, Craig Pfeifer, Alexander Yeh, Amartya Chakraborty
Recent research has demonstrated that judicial and administrative decisions can be predicted by machine-learning models trained on prior decisions.
2 code implementations • NAACL 2019 • Alexander Erdmann, David Joseph Wrisley, Benjamin Allen, Christopher Brown, Sophie Cohen-Bodénès, Micha Elsner, Yukun Feng, Brian Joseph, Béatrice Joyeux-Prunel, Marie-Catherine de Marneffe
Scholars in inter-disciplinary fields like the Digital Humanities are increasingly interested in semantic annotation of specialized corpora.
no code implementations • SEMEVAL 2019 • Alexander Oberstrass, Julia Romberg, Anke Stoll, Stefan Conrad
We present our results for OffensEval: Identifying and Categorizing Offensive Language in Social Media (SemEval 2019 - Task 6).
no code implementations • SEMEVAL 2019 • Iqra Ameer, Muhammad Hammad Fahim Siddiqui, Grigori Sidorov, Alexander Gelbukh
The goal of this paper is to detect (A) Hate speech against immigrants and women, (B) Aggressive behavior and target classification, both for English and Spanish.
no code implementations • SEMEVAL 2019 • Nikolay Arefyev, Boris Sheludko, Adis Davletov, Dmitry Kharchev, Alexander Nevidomsky, Alexander Panchenko
We describe our solutions for semantic frame and role induction subtasks of SemEval 2019 Task 2.
1 code implementation • WS 2018 • Chris van der Lee, Emiel Krahmer, Sander Wubben
The current study investigated novel techniques and methods for trainable approaches to data-to-text generation.
no code implementations • WS 2018 • Hendrik Strobelt, Sebastian Gehrmann, Michael Behrisch, Adam Perer, Hanspeter Pfister, Alexander Rush
Neural attention-based sequence-to-sequence models (seq2seq) (Sutskever et al., 2014; Bahdanau et al., 2014) have proven to be accurate and robust for many sequence prediction tasks.
no code implementations • WS 2018 • Alexander Shvets, Simon Mille, Leo Wanner
An increasing amount of research tackles the challenge of text generation from abstract ontological or semantic structures, which are in their very nature potentially large connected graphs.
1 code implementation • WS 2018 • Thiago Castro Ferreira, Diego Moussallem, Emiel Krahmer, Sander Wubben
This paper describes the enrichment of WebNLG corpus (Gardent et al., 2017a, b), with the aim to further extend its usefulness as a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization and Referring Expression Generation.
no code implementations • WS 2018 • Henry Elder, Sebastian Gehrmann, Alexander O'Connor, Qun Liu
In natural language generation (NLG), the task is to generate utterances from a more abstract input, such as structured data.
no code implementations • WS 2018 • Alexander Erdmann, Nizar Habash
Morphologically rich languages are challenging for natural language processing tasks due to data sparsity.
no code implementations • WS 2018 • Dario Stojanovski, Viktor Hangya, Matthias Huck, Alexander Fraser
We describe LMU Munich's unsupervised machine translation systems for English↔German translation.
1 code implementation • EMNLP 2018 • Luke Melas-Kyriazi, Alexander Rush, George Han
Image paragraph captioning models aim to produce detailed descriptions of a source image.
no code implementations • WS 2018 • Dario Stojanovski, Alexander Fraser
We show that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation.
no code implementations • WS 2018 • Segun Taofeek Aroyehun, Alexander Gelbukh
We describe our submissions to the Third Social Media Mining for Health Applications Shared Task.
1 code implementation • EMNLP 2018 • Navonil Majumder, Soujanya Poria, Alexander Gelbukh, Md. Shad Akhtar, Erik Cambria, Asif Ekbal
Sentiment analysis has immense implications in e-commerce through user feedback mining.
no code implementations • WS 2018 • Viktor Hangya, Alexander Fraser
In this paper we describe LMU Munich's submission for the WMT 2018 Parallel Corpus Filtering shared task which addresses the problem of cleaning noisy parallel corpora.
no code implementations • WS 2018 • Matthias Huck, Dario Stojanovski, Viktor Hangya, Alexander Fraser
The systems were used for our participation in the WMT18 biomedical translation task and in the shared task on machine translation of news.
no code implementations • WS 2018 • Ngoc-Quan Pham, Jan Niehues, Alexander Waibel
We present our experiments in the scope of the news translation task in WMT 2018, in the English→German direction.
no code implementations • WS 2018 • Alexander Molchanov
This paper describes the PROMT submissions for the WMT 2018 Shared News Translation Task.
1 code implementation • COLING 2018 • Juan Diego Rodriguez, Adam Caldwell, Alexander Liu
Our results empirically demonstrate when each of the published approaches tends to do well.
Entity Extraction using GAN Named Entity Recognition (NER) +2
no code implementations • COLING 2018 • Florian Dessloch, Thanh-Le Ha, Markus Müller, Jan Niehues, Thai-Son Nguyen, Ngoc-Quan Pham, Elizabeth Salesky, Matthias Sperber, Sebastian Stüker, Thomas Zenkel, Alexander Waibel
Combining these techniques, we are able to provide an adapted speech translation system for several European languages.
no code implementations • COLING 2018 • Chris van der Lee, Bart Verduijn, Emiel Krahmer, Sander Wubben
We present an evaluation of PASS, a data-to-text system that generates Dutch soccer reports from match statistics which are automatically tailored towards fans of one club or the other.
no code implementations • COLING 2018 • Segun Taofeek Aroyehun, Alexander Gelbukh
On this task, we investigate the efficacy of deep neural network models of varying complexity.
no code implementations • COLING 2018 • Daniel Baumartz, Tolga Uslu, Alexander Mehler
In this paper we present LTV, a website and API that generates labeled topic classifications based on the Dewey Decimal Classification (DDC), an international standard for topic classification in libraries.
1 code implementation • COLING 2018 • Florian Kunneman, Sander Wubben, Antal Van den Bosch, Emiel Krahmer
In the second evaluation, the gold-standard pros and cons were assessed along with the system output.
no code implementations • WS 2018 • Jean Senellart, Dakun Zhang, Bo Wang, Guillaume Klein, Jean-Pierre Ramatchandirin, Josep Crego, Alexander Rush
We present a system description of the OpenNMT Neural Machine Translation entry for the WNMT 2018 evaluation.
no code implementations • ACL 2018 • Hannah Rohde, Alexander Johnson, Nathan Schneider, Bonnie Webber
Theories of discourse coherence posit relations between discourse segments as a key feature of coherent text.
1 code implementation • WS 2018 • Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer
This study describes the approach developed by the Tilburg University team to the shallow task of the Multilingual Surface Realization Shared Task 2018 (SR18).
1 code implementation • WS 2018 • Alexander Rush
A major goal of open-source NLP is to quickly and accurately reproduce the results of new work, in a manner that the community can easily use and modify.
no code implementations • ACL 2018 • Mikhail Burtsev, Alexander Seliverstov, Rafael Airapetyan, Mikhail Arkhipov, Dilyara Baymurzina, Nickolay Bushkov, Olga Gureenkova, Taras Khakhulin, Yuri Kuratov, Denis Kuznetsov, Alexey Litinsky, Varvara Logacheva, Alexey Lymar, Valentin Malykh, Maxim Petrov, Vadim Polulyakh, Leonid Pugachev, Alexey Sorokin, Maria Vikhreva, Marat Zaynutdinov
It supports modular as well as end-to-end approaches to implementation of conversational agents.
1 code implementation • ACL 2018 • Viktor Hangya, Fabienne Braune, Alexander Fraser, Hinrich Schütze
Bilingual tasks, such as bilingual lexicon induction and cross-lingual classification, are crucial for overcoming data sparsity in the target language.
no code implementations • ACL 2018 • Alexander Erdmann, Nasser Zalmout, Nizar Habash
Arabic dialects lack large corpora and are noisy, being linguistically disparate with no standardized spelling.
no code implementations • WS 2018 • Alexander Rich, Pamela Osborn Popp, David Halpern, Anselm Rothe, Todd Gureckis
Psychological research on learning and memory has tended to emphasize small-scale laboratory studies.
no code implementations • WS 2018 • Segun Taofeek Aroyehun, Jason Angel, Daniel Alejandro Pérez Alvarez, Alexander Gelbukh
We describe the systems of NLP-CIC team that participated in the Complex Word Identification (CWI) 2018 shared task.
no code implementations • NAACL 2018 • Nasser Zalmout, Alexander Erdmann, Nizar Habash
User-generated text tends to be noisy with many lexical and orthographic inconsistencies, making natural language processing (NLP) tasks more challenging.
no code implementations • SEMEVAL 2018 • Elena Mikhalkova, Yuri Karyakin, Alexander Voronov, Dmitry Grigoriev, Artem Leoznov
The paper describes our search for a universal algorithm of detecting intentional lexical ambiguity in different forms of creative language.
no code implementations • WS 2018 • Agnieszka Mykowiecka, Aleksander Wawer, Malgorzata Marciniak
The paper addresses detection of figurative usage of words in English text.
1 code implementation • WS 2018 • Egon Stemle, Alexander Onysko
This article describes the system that participated in the shared task on metaphor detection on the Vrije University Amsterdam Metaphor Corpus (VUA).
no code implementations • WS 2018 • Filip Skurniak, Maria Janicka, Aleksander Wawer
This paper describes multiple solutions designed and tested for the problem of word-level metaphor detection.
no code implementations • WS 2018 • Agnieszka Mykowiecka, Malgorzata Marciniak, Aleksander Wawer
The paper addresses the classification of isolated Polish adjective-noun phrases according to their metaphoricity.
no code implementations • SEMEVAL 2018 • Lena Hettinger, Alexander Dallmann, Albin Zehe, Thomas Niebler, Andreas Hotho
In this paper we describe our system for SemEval-2018 Task 7 on classification of semantic relations in scientific literature for clean (subtask 1.1) and noisy data (subtask 1.2).
1 code implementation • NAACL 2018 • Fabienne Braune, Viktor Hangya, Tobias Eder, Alexander Fraser
Bilingual word embeddings are useful for bilingual lexicon induction, the task of mining translations of given words.
no code implementations • SEMEVAL 2018 • Alexander Zhang, Marine Carpuat
We describe the University of Maryland's submission to SemEval-2018 Task 10, "Capturing Discriminative Attributes": given word triples (w1, w2, d), the goal is to determine whether d is a discriminating attribute belonging to w1 but not w2.
no code implementations • LREC 2018 • Nizar Habash, Fadhl Eryani, Salam Khalifa, Owen Rambow, Dana Abdulrahim, Alexander Erdmann, Reem Faraj, Wajdi Zaghouani, Houda Bouamor, Nasser Zalmout, Sara Hassan, Faisal Al-Shargi, Sakhar Alkhereyf, Basma Abdulkareem, Ramy Eskander, Mohammad Salameh, Hind Saddiki
no code implementations • IJCNLP 2017 • Somnath Banerjee, Partha Pakray, Riyanka Manna, Dipankar Das, Alexander Gelbukh
In this paper, we describe a deep learning framework for analyzing the customer feedback as part of our participation in the shared task on Customer Feedback Analysis at the 8th International Joint Conference on Natural Language Processing (IJCNLP 2017).
no code implementations • IJCNLP 2017 • Bill McDowell, Nathanael Chambers, Alexander Ororbia II, David Reitter
Within this prediction reranking framework, we propose an alternative scoring function, showing an 8.8% relative gain over the original CAEVO.
no code implementations • WS 2017 • Thiago Castro Ferreira, Iacer Calixto, Sander Wubben, Emiel Krahmer
In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT).
no code implementations • WS 2017 • Jeffrey Ling, Alexander Rush
Sequence-to-sequence models with attention have been successful for a variety of NLP problems, but their speed does not scale well for tasks with long source sequences such as document summarization.
Ranked #25 on Document Summarization on CNN / Daily Mail
no code implementations • RANLP 2017 • Seid Muhie Yimam, Steffen Remus, Alexander Panchenko, Andreas Holzinger, Chris Biemann
In this paper, we describe the concept of entity-centric information access for the biomedical domain.
no code implementations • RANLP 2017 • Alexander Popov
This paper presents a neural network architecture for word sense disambiguation (WSD).
no code implementations • WS 2017 • Christoph Teichmann, Alexander Koller, Jonas Groschwitz
We generalize coarse-to-fine parsing to grammar formalisms that are more expressive than PCFGs and/or describe languages of trees or graphs.
no code implementations • WS 2017 • Alexander Koller, Nikos Engonopoulos
Integrating surface realization and the generation of referring expressions into a single algorithm can improve the quality of the generated sentences.
no code implementations • RANLP 2017 • Aleksander Wawer, Agnieszka Mykowiecka
In this paper we describe experiments with automated detection of metaphors in the Polish language.
no code implementations • WS 2017 • Chris van der Lee, Emiel Krahmer, Sander Wubben
We present PASS, a data-to-text system that generates Dutch soccer reports from match statistics.
no code implementations • WS 2017 • Thomas Alexander Trost, Dietrich Klakow
Word embeddings are high-dimensional vector representations of words and are thus difficult to interpret.
no code implementations • WS 2017 • Alexander Prange, Margarita Chikobava, Peter Poller, Michael Barz, Daniel Sonntag
We present a multimodal dialogue system that allows doctors to interact with a medical decision support system in virtual reality (VR).
no code implementations • ACL 2017 • Martín Villalba, Christoph Teichmann, Alexander Koller
The referring expressions (REs) produced by a natural language generation (NLG) system can be misunderstood by the hearer, even when they are semantically correct.
no code implementations • CL 2017 • Hassan Sajjad, Helmut Schmid, Alexander Fraser, Hinrich Schütze
After training, the unlabeled data is disambiguated based on the posterior probabilities of the two sub-models.
no code implementations • EACL 2017 • Stefano Faralli, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto
In this paper, we present ContrastMedium, an algorithm that transforms noisy semantic networks into full-fledged, clean taxonomies.
no code implementations • EACL 2017 • Beto Boullosa, Richard Eckart de Castilho, Alexander Geyken, Lothar Lemnitzer, Iryna Gurevych
This paper describes an application system designed to help lexicographers extract example sentences for a given headword based on its different senses.
no code implementations • EACL 2017 • Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben
The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts.
no code implementations • EACL 2017 • Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
On the example of word sense induction and disambiguation (WSID), we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy.
no code implementations • EACL 2017 • Tolga Uslu, Wahed Hemati, Alexander Mehler, Daniel Baumartz
R is a very powerful framework for statistical modeling.
no code implementations • EACL 2017 • Johannes Gontrum, Jonas Groschwitz, Alexander Koller, Christoph Teichmann
We present Alto, a rapid prototyping tool for new grammar formalisms.
no code implementations • WS 2017 • Aleksander Wawer, Agnieszka Mykowiecka
This paper compares two approaches to word sense disambiguation using word embeddings trained on unambiguous synonyms.
no code implementations • EACL 2017 • Matthias Huck, Aleš Tamchyna, Ondřej Bojar, Alexander Fraser
Translating into morphologically rich languages is difficult.
no code implementations • EACL 2017 • Verena Henrich, Alexander Lang
Understanding the social media audience is becoming increasingly important for social media analysis.
no code implementations • EACL 2017 • Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Many errors in phrase-based SMT can be attributed to problems on three linguistic levels: morphological complexity in the target language, structural differences and lexical choice.
no code implementations • WS 2017 • Alexander Calderwood, Elizabeth A. Pruett, Raymond Ptucha, Christopher Homan, Cecilia Ovesdotter Alm
Interpersonal violence (IPV) is a prominent sociological problem that affects people of all demographic backgrounds.
no code implementations • WS 2017 • Alexander Panchenko, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
We introduce a new method for unsupervised knowledge-based word sense disambiguation (WSD) based on a resource that links two types of sense-aware lexical networks: one is induced from a corpus using distributional semantics, the other is manually constructed.
no code implementations • WS 2016 • Sowmya Vajjala, Detmar Meurers, Alexander Eitel, Katharina Scheiter
Computational approaches to readability assessment are generally built and evaluated using gold standard corpora labeled by publishers or teachers rather than being grounded in observations about human performance.
no code implementations • WS 2016 • Christian Bentz, Tatyana Ruzsics, Alexander Koplenig, Tanja Samardžić
Language complexity is an intriguing phenomenon argued to play an important role in both language learning and processing.
no code implementations • COLING 2016 • Rudolf Schneider, Cordula Guder, Torsten Kilias, Alexander Löser, Jens Graupmann, Oleksandr Kozachuk
We present INDREX-MM, a main memory database system for interactively executing two interwoven tasks, declarative relation extraction from text and their exploitation with SQL.
no code implementations • COLING 2016 • Wahed Hemati, Tolga Uslu, Alexander Mehler
More and more disciplines require NLP tools for performing automatic text analyses on various levels of linguistic resolution.
no code implementations • WS 2016 • Markus Kreuzthaler, Michel Oleynik, Alexander Avian, Stefan Schulz
The disambiguation of period characters is therefore an important task for sentence and abbreviation detection.
no code implementations • COLING 2016 • Yimai Fang, Haoyue Zhu, Ewa Muszyńska, Alexander Kuhnle, Simone Teufel
It is a further development of an existing summariser that has an incremental, proposition-based content selection process but lacks a natural language (NL) generator for the final output.
no code implementations • WS 2016 • Alexander Erdmann, Christopher Brown, Brian Joseph, Mark Janse, Petra Ajaka, Micha Elsner, Marie-Catherine de Marneffe
Although spanning thousands of years and genres as diverse as liturgy, historiography, lyric and other forms of prose and poetry, the body of Latin texts is still relatively sparse compared to English.
no code implementations • COLING 2016 • Sebastian Arnold, Robert Dziuba, Alexander Löser
We introduce TASTY (Tag-as-you-type), a novel text editor for interactive entity linking as part of the writing process.
no code implementations • WS 2016 • Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer
no code implementations • WS 2016 • Jan-Thorsten Peter, Tamer Alkhouli, Hermann Ney, Matthias Huck, Fabienne Braune, Alexander Fraser, Aleš Tamchyna, Ondřej Bojar, Barry Haddow, Rico Sennrich, Frédéric Blain, Lucia Specia, Jan Niehues, Alex Waibel, Alexandre Allauzen, Lauriane Aufrant, Franck Burlot, Elena Knyazeva, Thomas Lavergne, François Yvon, Mārcis Pinnis, Stella Frank
Ranked #12 on Machine Translation on WMT2016 English-Romanian
no code implementations • ACL 2016 • Seid Muhie Yimam, Heiner Ulrich, Tatiana von Landesberger, Marcel Rosenbach, Michaela Regneri, Alexander Panchenko, Franziska Lehmann, Uli Fahrer, Chris Biemann, Kathrin Ballweg
no code implementations • NAACL 2016 • Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben
no code implementations • SEMEVAL 2016 • Alexander Panchenko, Stefano Faralli, Eugen Ruppert, Steffen Remus, Hubert Naets, Cédrick Fairon, Simone Paolo Ponzetto, Chris Biemann
1 code implementation • LREC 2016 • Alice Frain, Sander Wubben
We test the viability of our data on the task of classification of satire.
no code implementations • LREC 2016 • Alexander Panchenko
Word sense embeddings represent a word sense as a low-dimensional numeric vector.
no code implementations • LREC 2016 • Andy Luecking, Alexander Mehler, Désirée Walther, Marcel Mauri, Dennis Kurfürst
The stimulus terms have been compiled mainly from image schemata from psycholinguistics, since such schemata provide a panoply of abstract contents derived from natural language use.
no code implementations • LREC 2016 • Aleksander Wawer
The paper contains a description of OPFI: Opinion Finder for the Polish Language, a freely available tool for opinion target extraction.
no code implementations • LREC 2016 • Maxim Sidorov, Alexander Schmitt, Eugene Semenkin, Wolfgang Minker
Emotion Recognition (ER) is an important part of dialogue analysis which can be used in order to improve the quality of Spoken Dialogue Systems (SDSs).
no code implementations • LREC 2016 • Ann Copestake, Guy Emerson, Michael Wayne Goodman, Matic Horvat, Alexander Kuhnle, Ewa Muszyńska
We describe resources aimed at increasing the usability of the semantic representations utilized within the DELPH-IN (Deep Linguistic Processing with HPSG) consortium.
no code implementations • LREC 2016 • Tim vor der Brück, Alexander Mehler
We present a morphological tagger for Latin, called TTLab Latin Tagger based on Conditional Random Fields (TLT-CRF), which uses a large Latin lexicon.
no code implementations • LREC 2016 • Andy Luecking, Armin Hoenen, Alexander Mehler
In order to introduce TGermaCorp in comparison to more homogeneous corpora of contemporary everyday language, quantitative assessments of syntactic and lexical diversity are provided.
no code implementations • LREC 2016 • Steffen Eger, Rüdiger Gleim, Alexander Mehler
This paper addresses the challenge of morphological tagging and lemmatization in morphologically rich languages, using German and Latin as examples.
no code implementations • LREC 2016 • Alexander Gutkin, Linne Ha, Martin Jansche, Knot Pipatsrisawat, Richard Sproat
We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh.