Arabic Tweets Spam Detection Based on Various Supervised Machine Learning and Deep Learning Classifiers

Hassan, Shimaa I.; Elrefaei, Lamiaa; Andraws, Mina Shoukrey

doi:10.21608/msaeng.2023.291931

MSA Engineering Journal
Volume 2, Issue 2 - Serial Number 6, March 2023, Pages 1099-1119
Document Type: Original Article
DOI: 10.21608/msaeng.2023.291931
Authors
Shimaa I. Hassan¹; Lamiaa Elrefaei¹; Mina Shoukrey Andraws²
¹Electrical Engineering Department, Faculty of Engineering at Shoubra, Benha University, Cairo, Egypt
²Engineering Department, Nuclear Research Centre, Egyptian Atomic Energy Authority, Cairo, Egypt
Abstract
In this paper, various machine learning, ensemble, and deep learning algorithms are applied to Arabic tweets to detect whether they are human-generated. Each tweet is used in two forms, preprocessed and non-preprocessed, to measure the effect of Arabic preprocessing on classification. The data is also tokenized with various methods, such as unigram, trigram, and Term Frequency–Inverse Document Frequency (TF-IDF). The experiments show that the support vector machine with non-preprocessed tweets and unigram tokenization achieves the best performance, 83.11%, with a precision of 0.9516, while classifying a tweet as spam or not in a relatively short time.
Keywords
Machine Learning; Ensemble; Deep Learning; Arabic Tweets; Twitter spam
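
The feature schemes named in the abstract (unigram, trigram, TF-IDF) can be illustrated with a minimal sketch. It is not the authors' pipeline. The abstract does not say whether the n-grams are word-level or character-level, nor which TF-IDF variant is used, so the sketch assumes word-level n-grams on whitespace-split text and an unsmoothed idf = log(N / df). The function names and the toy documents are hypothetical placeholders.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text` as a list of tuples.

    Assumption: tokens are whitespace-separated words (n=1 gives
    unigrams, n=3 gives trigrams).
    """
    tokens = text.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {n-gram: weight} dict per document.

    Assumption: tf = count / number of n-grams in the document,
    idf = log(N / df), with no smoothing.
    """
    grams = [word_ngrams(d, n) for d in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for g in grams:
        df.update(set(g))

    num_docs = len(docs)
    weights = []
    for g in grams:
        counts = Counter(g)
        total = len(g)
        weights.append(
            {t: (c / total) * math.log(num_docs / df[t])
             for t, c in counts.items()}
        )
    return weights


# Hypothetical toy documents, not taken from the paper's dataset.
docs = [
    "win free prize now",
    "free prize inside",
    "meeting at noon today",
]

unigrams = word_ngrams(docs[0], 1)
trigrams = word_ngrams(docs[0], 3)
weights = tfidf(docs, n=1)
```

In this sketch, a term that occurs in fewer documents (such as "win") receives a higher weight than one shared across documents (such as "free"). Vectors of this kind would then be passed to a classifier such as a support vector machine.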
