Enhancing Arabic Text Mining Using Linguistic Factors

doi:10.21608/asc.2011.158213

Journal of the ACS Advances in Computer Science
Article 4, Volume 5, Issue 1, 2011, Pages 49-62
Document Type: Original Article
Abstract
The World Wide Web overwhelms people with an immense amount of widely distributed, interconnected, rich, and dynamic hypertext information. Text mining is concerned with extracting knowledge from unstructured textual data. The most important task in achieving this is finding the rules that relate specific words and phrases. This research shows how Arabic morphology and Arabic synonyms, as linguistic factors, can be used to extract the required knowledge from Arabic texts. The contribution of this research is the design and implementation of a system that combines morphology, synonyms, indexing, and databases for text mining and information retrieval, with different modes of operation regarding morphology and synonyms. The approach is based on preprocessing the Arabic text to convert it into a semi-structured database. A suitable indexing method and an appropriate search mechanism are then used to extract the required information. The proposed model was tested and showed promising results. The work also revealed a shortage of Arabic computational linguistics tools, such as an Arabic lexicon tagged with semantic features.
Keywords
Data Mining; Arabic Text Mining; Arabic Natural Language Processing; Information Retrieval; Information Extraction; Database; Indexing
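
The pipeline outlined in the abstract (preprocess the text, index it, then search with morphology and synonym modes switched on or off) can be illustrated with a minimal sketch. This is not the authors' system: the affix lists, the `light_stem` function, the `SYNONYMS` table, and the sample `DOCS` are all hypothetical stand-ins, since the abstract does not describe the actual morphological analyzer, lexicon, or database schema.

```python
from collections import defaultdict

# Hypothetical toy resources (assumptions, not taken from the paper).
PREFIXES = ("ال",)                  # definite article only
SUFFIXES = ("ات", "ون", "ين", "ة")  # a few common noun endings
SYNONYMS = {"مدرسة": {"معهد"}}      # illustrative synonym entry

DOCS = {
    1: "الكتاب في المكتبة",
    2: "قرأ الطالب كتاب",
    3: "المعلم في المعهد",
}

def light_stem(word):
    """Strip at most one prefix and one suffix, keeping at least 3 letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word

def normalize(word, use_morphology):
    """Return the index key for a word under the chosen morphology mode."""
    return light_stem(word) if use_morphology else word

def build_index(docs, use_morphology):
    """Build an inverted index: normalized term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[normalize(token, use_morphology)].add(doc_id)
    return index

def search(term, index, use_morphology=False, use_synonyms=False):
    """Look up a term; use_morphology must match how the index was built."""
    terms = {term}
    if use_synonyms:
        # Expand with synonyms before normalization.
        terms |= SYNONYMS.get(term, set())
    hits = set()
    for t in terms:
        hits |= index.get(normalize(t, use_morphology), set())
    return hits

exact_index = build_index(DOCS, use_morphology=False)
stem_index = build_index(DOCS, use_morphology=True)
```

With these toy inputs, `search("كتاب", exact_index)` matches only document 2, while `search("كتاب", stem_index, use_morphology=True)` also matches document 1, where the word carries the definite article. Likewise, `search("مدرسة", stem_index, use_morphology=True, use_synonyms=True)` reaches document 3 only through the synonym entry. A real system would replace the affix stripping with a proper Arabic morphological analyzer and the dictionary with a database-backed index.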