Paper title
TUNIZI: A Tunisian Arabizi Sentiment Analysis Dataset
Paper authors
Paper abstract
On social media, Arabic speakers tend to express themselves in their own local dialects. Tunisians in particular use an informal form called "Tunisian Arabizi". Analytical studies seek to explore and recognize online opinions in order to exploit them for planning and prediction purposes, such as measuring customer satisfaction and establishing sales and marketing strategies. However, analytical studies based on Deep Learning are data hungry, while African languages and dialects are considered low-resource languages. For instance, to the best of our knowledge, no annotated Tunisian Arabizi dataset exists. In this paper, we introduce TUNIZI, a Tunisian Arabizi sentiment analysis dataset collected from social networks, preprocessed for analytical studies, and annotated manually by Tunisian native speakers.