Classifiers Fusion for Arabic Named Entity Recognition
The Egyptian Journal of Language Engineering
Article 3, Volume 1, Issue 2, September 2014, Pages 19-34
Document Type: Original Article
DOI: 10.21608/ejle.2014.59923
Authors
Wasim M. Abdulwasea1; Sherif M. Abdou1; Hassanin Barhamtoshy2
1Computers Department, Faculty of Engineering, Cairo University
2King Abdel Aziz City for Sciences
Abstract
This paper presents a new approach to Arabic Named Entity Recognition (ANER). The approach uses sets of both language-independent and language-specific features within discriminative and generative machine learning frameworks, namely conditional random fields (CRF), support vector machines (SVM), Naive Bayes (NB), decision trees (DT), SVM for sequence tagging using hidden Markov models (SVMhmm), k-nearest neighbors (K-NN), a logistic classifier, and the Weka SVM implementation known as SMO. All of these classifiers were also fused together, and the fused configuration provided more accurate ANER than any of the classifiers used individually. The approach was evaluated on two datasets: the first is a recently published corpus, the ALTEC Named Entity Corpus for Modern Standard Arabic, proposed by the Arabic Language Technology Center (ALTEC); the second is ANERcorp, a standard Arabic NER dataset proposed by Benajiba. The proposed approach outperforms state-of-the-art Arabic NER systems on both datasets under the 6-fold evaluation criterion.
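The abstract does not say how the classifier outputs are combined, so the following is only an illustrative sketch of one common fusion scheme, per-token majority voting, and not necessarily the method used in the paper. The function name `fuse_by_majority_vote`, the BIO-style tags, and the three example tag sequences are all hypothetical.

```python
from collections import Counter


def fuse_by_majority_vote(predictions):
    """Fuse per-token label sequences from several classifiers.

    `predictions` is a list of label sequences, one per classifier,
    all of the same length. Ties are broken in favour of the classifier
    that appears earliest in the list (an assumption of this sketch).
    """
    if not predictions:
        raise ValueError("at least one classifier output is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all label sequences must have the same length")

    fused = []
    for token_labels in zip(*predictions):
        counts = Counter(token_labels)
        best = max(counts.values())
        # Pick the first label (in classifier order) with the top vote count.
        fused.append(next(lab for lab in token_labels if counts[lab] == best))
    return fused


# Hypothetical outputs of three classifiers for a four-token sentence.
crf_tags = ["B-PER", "I-PER", "O", "B-LOC"]
svm_tags = ["B-PER", "O", "O", "B-LOC"]
nb_tags = ["O", "I-PER", "O", "B-ORG"]

fused_tags = fuse_by_majority_vote([crf_tags, svm_tags, nb_tags])
print(fused_tags)
```

In this sketch each token receives the label chosen by most classifiers, so an error made by a single classifier can be outvoted by the others. Weighted voting or a trained meta-classifier are alternative fusion schemes that would fit the same interface.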
Keywords
Information Retrieval; Named Entity Recognition; Classifiers Fusion