DAICT: A Dialectal Arabic Irony Corpus Extracted from Twitter

Identifying irony in user-generated social media content has a wide range of applications; however to date Arabic content has received limited attention. To bridge this gap, this study builds a new open domain Arabic corpus annotated for irony detection. We query Twitter using irony-related hashtags to collect ironic messages, which are then manually annotated by two linguists according to our working definition of irony. Challenges which we have encountered during the annotation process reflect the inherent limitations of Twitter messages interpretation, as well as the complexity of Arabic and its dialects. Once published, our corpus will be a valuable free resource for developing open domain systems for automatic irony recognition in Arabic language and its dialects in social media text.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here