JCTDHS at SemEval-2019 Task 5: Detection of Hate Speech in Tweets using Deep Learning Methods, Character N-gram Features, and Preprocessing Methods

SEMEVAL 2019 · Yaakov HaCohen-Kerner, Elyashiv Shayovitz, Shalom Rochman, Eli Cahn, Gal Didi, Ziv Ben-David

In this paper, we describe our submissions to the SemEval-2019 contest. We tackled subtask A, "a binary classification where systems have to predict whether a tweet with a given target (women or immigrants) is hateful or not hateful", which is part of Task 5, "Multilingual detection of hate speech against immigrants and women in Twitter (hatEval)". Our system, JCTDHS (Jerusalem College of Technology Detects Hate Speech), was developed for tweets written in English. We applied various supervised ML methods and various combinations of n-gram features using the TF-IDF scheme. In addition, we applied various combinations of eight basic preprocessing methods. Our best submission was a special bidirectional RNN, which ranked 11th out of 68 submissions.
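
As a rough illustration of the feature pipeline the abstract mentions, the following minimal sketch computes TF-IDF weights over character n-grams after light preprocessing. The abstract does not say which eight preprocessing methods, which n-gram sizes, or which TF-IDF variant were used, so everything here is an assumption: the two preprocessing steps (lowercasing and URL removal), the trigram default, the plain `log(N / df)` IDF formula, and the function names `preprocess`, `char_ngrams`, and `tfidf_vectors` are all hypothetical choices for illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter


def preprocess(tweet):
    """Two illustrative preprocessing steps (assumed, not from the paper):
    lowercase the text and strip URLs, then collapse whitespace."""
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", "", tweet)
    return re.sub(r"\s+", " ", tweet).strip()


def char_ngrams(text, n):
    """Return all overlapping character n-grams of the given size."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n=3):
    """Map each document to a sparse {ngram: weight} dict.

    Weight = (relative term frequency) * log(N / document frequency).
    This is one common TF-IDF variant, chosen here as an assumption.
    """
    counts = [Counter(char_ngrams(preprocess(d), n)) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    num_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append(
            {g: (tf / total) * math.log(num_docs / df[g]) for g, tf in c.items()}
        )
    return vectors
```

The resulting sparse vectors could then be passed to any supervised classifier. An n-gram that appears in every document receives weight zero under this formula, which is why smoothed IDF variants are often preferred in practice.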
