DAICT is a new Arabic irony-detection corpus extracted from Twitter. The dataset includes 5,588 tweets, written in both Modern Standard Arabic (MSA) and dialectal Arabic, manually annotated by two professional linguists from HBKU. Tweets were collected using four irony-related hashtags.
This new dataset fills a gap, since very few Arabic corpora are currently annotated for irony.
This corpus is a valuable resource for work on irony detection, Arabic dialects, Arabic social media, and sentiment analysis.
This project was supported by the generous grant NPRP 09-175-1-033 from the Qatar National Research Fund (a member of Qatar Foundation).
By downloading the dataset from HERE, you agree to the terms and conditions.