no code implementations • WS 2020 • Aquia Richburg, Ramy Eskander, Smaranda Muresan, Marine Carpuat
Byte-Pair Encoding (BPE) (Sennrich et al., 2016) has become a standard pre-processing step when building neural machine translation systems.
no code implementations • LREC 2020 • Petra Galuscakova, Douglas Oard, Joe Barrow, Suraj Nair, Shing Han-Chin, Elena Zotkina, Ramy Eskander, Rui Zhang
At about the midpoint of the IARPA MATERIAL program in October 2019, an evaluation was conducted on systems' abilities to find Lithuanian documents based on English queries.
no code implementations • LREC 2020 • Ramy Eskander, Francesca Callejas, Elizabeth Nichols, Judith Klavans, Smaranda Muresan
Computational morphological segmentation has been an active research topic for decades as it is beneficial for many natural language processing tasks.
no code implementations • WS 2019 • Ramy Eskander, Judith Klavans, Smaranda Muresan
Polysynthetic languages pose a challenge for morphological analysis due to the root-morpheme complexity and to the word class "squish".
no code implementations • WS 2018 • Ramy Eskander, Owen Rambow, Smaranda Muresan
Morphological segmentation is beneficial for several natural language processing tasks dealing with large vocabularies.
no code implementations • LREC 2018 • Nizar Habash, Fadhl Eryani, Salam Khalifa, Owen Rambow, Dana Abdulrahim, Alexander Erdmann, Reem Faraj, Wajdi Zaghouani, Houda Bouamor, Nasser Zalmout, Sara Hassan, Faisal Al-Shargi, Sakhar Alkhereyf, Basma Abdulkareem, Ramy Eskander, Mohammad Salameh, Hind Saddiki
no code implementations • COLING 2016 • Ramy Eskander, Owen Rambow, Tianchun Yang
We investigate using Adaptor Grammars for unsupervised morphological segmentation.
no code implementations • COLING 2016 • Ramy Eskander, Nizar Habash, Owen Rambow, Arfath Pasha
Arabic dialects present a special problem for natural language processing because there are few resources, they have no standard orthography, and have not been studied much.
no code implementations • LREC 2016 • Faisal Al-Shargi, Aidan Kaplan, Ramy Eskander, Nizar Habash, Owen Rambow
We present new language resources for Moroccan and Sanaani Yemeni Arabic.
no code implementations • LREC 2016 • Mohamed Al-Badrashiny, Arfath Pasha, Mona Diab, Nizar Habash, Owen Rambow, Wael Salloum, Ramy Eskander
Text preprocessing is an important and necessary task for all NLP applications.
no code implementations • WS 2014 • Ann Bies, Zhiyi Song, Mohamed Maamouri, Stephen Grimes, Haejoong Lee, Jonathan Wright, Stephanie Strassel, Nizar Habash, Ramy Eskander, Owen Rambow
no code implementations • LREC 2014 • Mohamed Maamouri, Ann Bies, Seth Kulick, Michael Ciul, Nizar Habash, Ramy Eskander
This paper describes the parallel development of an Egyptian Arabic Treebank and a morphological analyzer for Egyptian Arabic (CALIMA).
no code implementations • LREC 2014 • Mona Diab, Mohamed Al-Badrashiny, Maryam Aminian, Mohammed Attia, Heba Elfardy, Nizar Habash, Abdelati Hawwari, Wael Salloum, Pradeep Dasigi, Ramy Eskander
Multiple levels of quality checks are performed on the output of each step in the creation process.
no code implementations • LREC 2014 • Arfath Pasha, Mohamed Al-Badrashiny, Mona Diab, Ahmed El Kholy, Ramy Eskander, Nizar Habash, Manoj Pooleery, Owen Rambow, Ryan Roth
In this paper, we present MADAMIRA, a system for morphological analysis and disambiguation of Arabic that combines some of the best aspects of two previously commonly used systems for Arabic processing, MADA (Habash and Rambow, 2005; Habash et al., 2009; Habash et al., 2013) and AMIRA (Diab et al., 2007).