Nora S. AlTwairesh

Assistant Professor

Head, Information Technology Department

Computer and Information Sciences
KSU Female Campus - Building 6 T121
Publications
Journal Article
2019

Surface and Deep Features Ensemble for Sentiment Analysis of Arabic Tweets

Sentiment analysis (SA) of Arabic tweets is a complex task due to the rich morphology of the Arabic language and the informal nature of language on Twitter. Previous research on the SA of tweets focused mainly on manually extracting features from the text. Recently, neural word embeddings have been used as less labor-intensive representations than manual feature engineering. Most of these word embeddings model the syntactic information of words while ignoring the sentiment context. In this paper, we propose to learn sentiment-specific word embeddings from Arabic tweets and use them in Arabic Twitter sentiment classification. Moreover, we propose a feature ensemble model of surface and deep features. The surface features are manually extracted features, and the deep features are generic word embeddings and sentiment-specific word embeddings. Extensive experiments are performed to test the effectiveness of the surface and deep features ensemble, pooling functions, embedding size, and cross-dataset models. The recent language representation model BERT is also evaluated on the task of SA of Arabic tweets. The models are evaluated on three different datasets of Arabic tweets, and they outperform the previous results on all of these datasets with a significant increase in the F-score. The experimental results demonstrate that: 1) the highest-performing model is the ensemble of surface and deep features and 2) the approach achieves state-of-the-art results on several benchmark datasets.
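
The feature ensemble described in the abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not list the actual surface features, the pooling functions tested, or the classifier, so the function names (`pool`, `embed`, `ensemble_features`), the two toy surface features, the placeholder English tokens, and the 2-dimensional embedding tables are all hypothetical. It shows per-token embeddings being pooled into fixed-size tweet vectors and then concatenated with the hand-crafted features.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def pool(vectors: Sequence[Vector], mode: str = "avg") -> Vector:
    """Collapse per-token vectors into one fixed-size vector per tweet."""
    if not vectors:
        raise ValueError("need at least one token vector")
    columns = list(zip(*vectors))  # one tuple per embedding dimension
    if mode == "avg":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "min":
        return [min(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def embed(tokens: Sequence[str], table: Dict[str, Vector], dim: int) -> List[Vector]:
    """Look up each token; unknown tokens map to a zero vector (an assumption)."""
    zero = [0.0] * dim
    return [table.get(tok, zero) for tok in tokens]


def ensemble_features(
    tokens: Sequence[str],
    surface: Sequence[float],
    generic: Dict[str, Vector],
    sentiment: Dict[str, Vector],
    dim: int,
    mode: str = "avg",
) -> Vector:
    """Concatenate surface features with pooled generic and sentiment embeddings."""
    deep_generic = pool(embed(tokens, generic, dim), mode)
    deep_sentiment = pool(embed(tokens, sentiment, dim), mode)
    return list(surface) + deep_generic + deep_sentiment


# Toy, hypothetical data: two 2-dimensional embedding tables.
generic_table = {"good": [0.5, 0.1], "day": [0.1, 0.3]}
sentiment_table = {"good": [0.9, -0.2], "day": [0.0, 0.0]}
# Hypothetical surface features, e.g. [has_positive_emoji, has_negation].
surface_features = [1.0, 0.0]

features = ensemble_features(
    ["good", "day"], surface_features, generic_table, sentiment_table, dim=2, mode="max"
)
print(len(features))  # 2 surface + 2 generic + 2 sentiment dimensions
```

The resulting vector would then be passed to any standard classifier. Swapping `mode` between `"avg"`, `"max"`, and `"min"` is one simple way to compare pooling functions, which the abstract lists among its experiments.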

Volume number
7
Journal/Newspaper
https://ieeexplore.ieee.org/abstract/document/8743359
Pages
84122 - 84131
More publications
publications

The field of natural language processing (NLP) has witnessed a boom in language representation models with the introduction of pretrained language models that are trained on massive textual data…

publications

Approaches for developing Dialogue Systems (DSs) are typically categorized into rule-based and data-driven. Data-driven DSs require a massive quantity of training data, while rule-based DSs rely…

2021
publications

Over the past few years, Twitter has experienced massive growth and the volume of its online content has increased rapidly. This content has been a rich source for several studies that focused on…

2021