Study of different Statistical Machine Learning Techniques for Text Sentiment Classification
Fayoum University Journal of Engineering
Article 4, Volume 5, Issue 1, January 2022, Pages 66-73
Document Type: Original Article
DOI: 10.21608/fuje.2022.124088.1013
Authors
Abdelrahman Nadi Taha 1; Rania Ahmed Abuelsoud 2
1 Teaching Assistant, Electrical Engineering Department, Faculty of Engineering, Fayoum University, Fayoum 63514, Egypt
2 Professor of Electrical Engineering, Faculty of Engineering, Fayoum University, Fayoum 63514, Egypt
Abstract
Text classification is an important NLP task with applications ranging from movie review classification to market analysis. NLP provides the capability to process huge amounts of text and draw conclusions from it. In this paper we investigate statistical machine learning for document classification in NLP. The target problem is sentiment analysis; we explore various techniques for text pre-processing, feature selection, and model selection to find a well-fitting model. This paper acts as both a system proposal and a primer for those who want to start practicing NLP: we try to provide insight and intuition about modelling choices for text classification that extend beyond the scope of this task to general NLP. We propose a feature-based approach to text sentiment analysis that relies heavily on the BoN (Bag of N-grams) model and uses these features with a statistical ML classifier. We use the IMDB movie review dataset (Maas et al. 2011) for benchmarking.
Keywords
Machine Learning; Sentiment Analysis; Natural Language Processing; IMDB Sentiment Analysis; Text Classification
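
The Bag of N-grams idea named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the whitespace tokenizer, the choice of unigrams plus bigrams, the multinomial Naive Bayes classifier with add-one smoothing, and the four toy reviews standing in for the IMDB data. The abstract does not say which pre-processing steps or which classifier the authors settled on.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all 1..n_max-grams."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def bag_of_ngrams(text, n_max=2):
    """Map a document to its n-gram counts (word order beyond n is discarded)."""
    return Counter(ngrams(text, n_max))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed classifier)."""

    def fit(self, docs, labels, n_max=2):
        self.n_max = n_max
        self.doc_counts = Counter(labels)   # label -> number of documents
        self.counts = {}                    # label -> Counter of n-gram counts
        for doc, label in zip(docs, labels):
            self.counts.setdefault(label, Counter()).update(
                bag_of_ngrams(doc, n_max)
            )
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, doc):
        bag = bag_of_ngrams(doc, self.n_max)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.counts.items():
            total = sum(counts.values())
            # log prior + sum of smoothed log likelihoods of known n-grams
            score = math.log(self.doc_counts[label] / total_docs)
            for gram, k in bag.items():
                if gram in self.vocab:
                    score += k * math.log(
                        (counts[gram] + 1) / (total + len(self.vocab))
                    )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews; a real experiment would load the IMDB dataset.
train_docs = [
    "a great and moving film",
    "great acting and a great story",
    "a boring and terrible film",
    "terrible plot and boring acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("a great story"))
print(model.predict("boring and terrible plot"))
```

Including bigrams lets the features capture short phrases that single words miss, at the cost of a larger and sparser vocabulary; that trade-off is one of the modelling choices a feature-based approach has to weigh.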