AraSenTi: Large-Scale Twitter-Specific Arabic Sentiment Lexicons

Sentiment Analysis (SA) is an active research area due to the tremendous interest in aggregating and evaluating opinions disseminated by users on the Web. SA of English has been thoroughly researched; however, research on SA of Arabic has only recently begun to flourish. Twitter is considered a powerful tool for disseminating information and a rich resource for opinionated text containing views on many different topics. In this paper we attempt to bridge a gap in Arabic SA of Twitter, namely the lack of sentiment lexicons tailored to the informal language of Twitter.

Combining Instance Weighting and Fine Tuning for Training Naïve Bayesian Classifiers with Scant Data

This work addresses the problem of training a Naïve Bayesian classifier using limited data. It first presents an improved instance-weighting algorithm that is accurate and robust to noise, and then shows how to combine it with a fine-tuning algorithm to achieve even better classification accuracy. Our empirical work using 49 benchmark data sets shows that the improved instance-weighting method outperforms the original algorithm on both noisy and noise-free data sets.
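
To make the idea of instance weighting concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier that accumulates per-instance weights instead of raw counts. It is not the algorithm from the paper: the abstract does not say how the weights are computed or how fine tuning is applied, so the weights here are simply supplied by the caller. The function names `train_weighted_nb` and `predict` and the Laplace smoothing parameter `alpha` are assumptions made for illustration.

```python
import math
from collections import defaultdict


def train_weighted_nb(rows, labels, weights, alpha=1.0):
    """Fit a categorical Naive Bayes model from weighted counts.

    Assumption: `weights` is given by the caller; how the paper derives
    instance weights is not described in the abstract.
    """
    class_w = defaultdict(float)  # label -> total weight of its instances
    feat_w = defaultdict(float)   # (feature index, value, label) -> weight
    values = defaultdict(set)     # feature index -> distinct values seen
    for row, y, w in zip(rows, labels, weights):
        class_w[y] += w
        for j, v in enumerate(row):
            feat_w[(j, v, y)] += w
            values[j].add(v)
    return class_w, feat_w, values, alpha


def predict(model, row):
    """Return the label with the highest smoothed log-posterior."""
    class_w, feat_w, values, alpha = model
    total = sum(class_w.values())
    best, best_score = None, -math.inf
    for y, cw in class_w.items():
        # Laplace-smoothed prior from weighted class totals.
        score = math.log((cw + alpha) / (total + alpha * len(class_w)))
        for j, v in enumerate(row):
            num = feat_w.get((j, v, y), 0.0) + alpha
            den = cw + alpha * len(values[j])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = y, score
    return best


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes"]
model = train_weighted_nb(rows, labels, weights=[1.0, 1.0, 1.0, 1.0])
print(predict(model, ("sunny", "hot")))
```

With uniform weights this reduces to ordinary Naïve Bayes; raising the weight of an instance makes its feature values count more toward its class, which is the lever an instance-weighting method adjusts.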

Arabic Spam Detection in Twitter

Spam in Twitter has emerged due to the proliferation of this social network among users worldwide, coupled with the ease of creating content. Because Twitter spam has different characteristics from Web or email spam, detecting it has become a new research problem. This study aims to analyse the content of Saudi tweets to detect spam by developing both a rule-based approach that exploits a spam lexicon extracted from the tweets and a supervised learning approach that utilizes statistical methods based on the bag-of-words model and several features.
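
The two ingredients named in the abstract above, a spam lexicon and a bag-of-words representation, can be illustrated with a short sketch. Everything specific here is hypothetical: the lexicon entries are English placeholders rather than terms extracted from Saudi tweets, the threshold of two hits is arbitrary, and the tokenizer is a naive word split rather than the study's preprocessing.

```python
import re
from collections import Counter

# Hypothetical placeholder lexicon; the study extracts its lexicon from tweets.
SPAM_LEXICON = {"free", "win", "followers", "click"}


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def bag_of_words(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


def is_spam_rule_based(text, lexicon=SPAM_LEXICON, threshold=2):
    """Flag a tweet when it contains at least `threshold` lexicon hits.

    Assumption: a simple hit-count rule; the study's actual rules are
    not given in the abstract.
    """
    bow = bag_of_words(text)
    hits = sum(count for token, count in bow.items() if token in lexicon)
    return hits >= threshold


print(is_spam_rule_based("Click here to win free followers"))
print(is_spam_rule_based("Meeting moved to noon"))
```

The same `bag_of_words` counts could serve as input features for a supervised classifier, which is the second approach the abstract mentions.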
