Arabic Tweets Spam Detection Based on Various Supervised Machine Learning and Deep Learning Classifiers
MSA Engineering Journal
Volume 2, Issue 2 - Serial Number 6, March 2023, Page 1099-1119
Document Type: Original Article
DOI: 10.21608/msaeng.2023.291931
Authors
Shimaa I. Hassan¹; Lamiaa Elrefaei¹; Mina Shoukrey Andraws²
¹ Electrical Engineering Department, Faculty of Engineering at Shoubra, Benha University, Cairo, Egypt
² Engineering Department, Nuclear Research Centre, Egyptian Atomic Energy Authority, Cairo, Egypt
Abstract
In this paper, different machine learning, ensemble, and deep learning algorithms are applied to Arabic tweets to detect whether they are human-generated or not. Each tweet is used twice, once preprocessed and once non-preprocessed, to measure the effect of Arabic preprocessing on classification. The data is also tokenized with various methods, such as unigram, trigram, and Term Frequency–Inverse Document Frequency (TF-IDF). The experiments show that the support vector machine with non-preprocessed tweets and unigram tokenization achieves the best performance, 83.11%, with a precision of 0.9516, while classifying a tweet as spam or not in a relatively short time.
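As a rough illustration of the tokenization schemes and the precision metric named in the abstract, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names (`word_ngrams`, `tf_idf`, `precision`) are hypothetical, the n-grams are assumed to be word-level and split on whitespace, and the TF-IDF weighting shown (term frequency times log(N / df)) is only one common variant.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word-level n-grams of `text` (assumption: whitespace tokens)."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs, n=1):
    """Return one {ngram: weight} dict per document.

    Weight = (count / document length) * log(N / document frequency).
    This is one common TF-IDF variant, chosen here for illustration only.
    """
    grams_per_doc = [word_ngrams(doc, n) for doc in docs]

    # Document frequency: in how many documents each n-gram appears.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))

    total_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        length = len(grams) or 1  # avoid division by zero for empty documents
        weights.append({
            gram: (count / length) * math.log(total_docs / df[gram])
            for gram, count in counts.items()
        })
    return weights


def precision(y_true, y_pred, positive=1):
    """Precision = true positives / (true positives + false positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0
```

Calling `word_ngrams(tweet, 1)` gives unigram features and `word_ngrams(tweet, 3)` gives trigram features; either feature set, or the weights from `tf_idf`, could then be passed to a classifier such as a support vector machine.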
Keywords
Machine Learning; Ensemble; Deep Learning; Arabic Tweets; Twitter spam