Voting Ensemble Learning for Multilingual Sentiment Analysis: A Comparative Study

Salah, Ahmad; Elhadidi, Eslam Ashraf; Abdallah, Marwa M; Darwish, Saad M.

doi:10.21608/jocc.2025.446637

Journal of Computing and Communication
Article 3, Volume 4, Issue 2, July 2025, Pages 32-44
Document Type: Original Article
DOI: 10.21608/jocc.2025.446637
Authors
Ahmad Salah¹; Eslam Ashraf Elhadidi²; Marwa M Abdallah³; Saad M. Darwish⁴
¹Faculty of Computers and Informatics, Zagazig University
²70, Maktabat Elghad Street, Elzera'a, Mubarak district
³Elzera'a, Mubarak district
⁴Institute of Graduate Studies and Research, Alexandria University
Abstract
Multilingual sentiment analysis (MSA) faces many challenges due to language differences, cultural considerations, and data sparsity across languages. Although ensemble learning shows promise for improving robustness and accuracy by combining classifiers, there are relatively few systematic comparisons of Max Voting against other ensemble approaches (bagging, boosting, and stacking) built on modern transformer models for MSA. This study addresses that gap with an empirical examination. We assessed Max Voting ensembles built from diverse Transformer models (e.g., LaBSE, DistilBERT, and XLM-R variants) fine-tuned on a multilingual Twitter dataset. We compared Max Voting against bagging, boosting (XGBoost), and stacking ensembles to identify the scenarios where Max Voting performed best, most notably when the base models produced solid and stable predictions. The study shows that Max Voting delivers competitive and consistent performance and typically reaches its best results with only two to four models, which keeps it efficient. While more complex methods, such as boosting and stacking, can achieve higher performance in a few cases, Max Voting serves as a highly effective baseline that also simplifies the analysis. Additionally, this work identifies baseline language-specific performance considerations and provides practical, data-driven guidance for choosing an ensemble based on accuracy requirements, computational constraints, and linguistic considerations. Ultimately, the findings can assist practitioners building real-world multilingual applications.
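As a rough illustration of the Max Voting idea described in the abstract, the sketch below takes the per-sample majority label across several models' predictions. It is not the authors' implementation: the function name `max_vote`, the label strings, and the example predictions are hypothetical, and the tie-breaking rule (the earliest-listed model wins) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def max_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Return the majority label per sample across several models.

    predictions[m][i] is the label that model m assigns to sample i.
    Assumption: ties go to the label seen first, i.e. the one proposed
    by the earliest-listed model.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions):
        # Counter keeps first-seen order, so most_common(1) resolves ties
        # in favour of the earliest-listed model.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


# Hypothetical labels from three fine-tuned models on four tweets.
model_a = ["pos", "neg", "neu", "pos"]
model_b = ["pos", "neg", "pos", "neg"]
model_c = ["neg", "neg", "neu", "neu"]
ensemble = max_vote([model_a, model_b, model_c])
```

The last sample is a three-way tie, so under the assumed rule the first model's label is kept. With an odd number of models and binary labels no ties can occur, which is one reason small ensembles of that shape are convenient.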
Keywords
Sentiment analysis; Natural Language Processing; Machine Learning; Deep Learning; BERT

