Enhancing Arabic Text Mining Using Linguistic Factors
Journal of the ACS Advances in Computer Science
Article 4, Volume 5, Issue 1, 2011, Pages 49-62
Document Type: Original Article
DOI: 10.21608/asc.2011.158213
Abstract
The World Wide Web overwhelms people with an immense amount of widely distributed, interconnected, rich and dynamic hypertext information. Text mining is concerned with extracting knowledge from unstructured textual data. The most important task in this effort is finding the rules that relate specific words and phrases. This research shows how Arabic morphology and Arabic synonyms, as linguistic factors, can be used to extract the required knowledge from Arabic texts. The contribution of this research is the design and implementation of a system that combines morphology, synonyms, indexing and databases for Text Mining and Information Retrieval, with different modes of operation regarding morphology and synonyms. The approach is based on preprocessing the Arabic text to convert it into a semi-structured database. A suitable indexing method and an appropriate searching mechanism are then used to extract the required information. The proposed model was tested and showed promising results. The work also revealed a shortage of Arabic computational linguistics tools, such as an Arabic lexicon tagged with semantic features.
Keywords
Data Mining; Arabic Text Mining; Arabic Natural Language Processing; Information Retrieval; Information Extraction; Database; Indexing
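
As a rough illustration of the approach summarized in the abstract (preprocessing text into an indexed structure, then searching with morphology and synonym modes switched on or off), the following minimal Python sketch may help. It is not the authors' implementation, which this page does not describe in detail. The light stemmer, the affix lists, the synonym table, the sample sentences and all function names are hypothetical stand-ins; a real system would rely on a full Arabic morphological analyzer and a synonym lexicon.

```python
from collections import defaultdict

# Hypothetical toy resources, assumed for illustration only.
PREFIXES = ("ال",)                      # definite article
SUFFIXES = ("ات", "ون", "ين", "ة")      # a few common endings
SYNONYMS = {"مدرس": {"معلم"}, "معلم": {"مدرس"}}


def light_stem(token: str) -> str:
    """Strip at most one known prefix and one known suffix (toy stemmer)."""
    for p in PREFIXES:
        if token.startswith(p) and len(token) - len(p) >= 3:
            token = token[len(p):]
            break
    for s in SUFFIXES:
        if token.endswith(s) and len(token) - len(s) >= 3:
            token = token[:-len(s)]
            break
    return token


def build_index(docs: dict, use_morphology: bool) -> dict:
    """Map each (optionally stemmed) token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            key = light_stem(token) if use_morphology else token
            index[key].add(doc_id)
    return index


def search(index: dict, query: str, use_morphology: bool, use_synonyms: bool) -> set:
    """Look up a single-word query, optionally stemming it and adding synonyms."""
    base = light_stem(query) if use_morphology else query
    terms = {base}
    if use_synonyms:
        terms |= SYNONYMS.get(base, set())
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


# Made-up sample documents.
docs = {
    1: "حضر المعلم الدرس",
    2: "شرح مدرس القواعد",
    3: "المعلمون في الفصل",
}

exact_index = build_index(docs, use_morphology=False)
stem_index = build_index(docs, use_morphology=True)

exact_hits = search(exact_index, "معلم", use_morphology=False, use_synonyms=False)
stem_hits = search(stem_index, "معلم", use_morphology=True, use_synonyms=False)
full_hits = search(stem_index, "معلم", use_morphology=True, use_synonyms=True)
```

In this toy setup the exact-match mode finds nothing for the bare query word, because the documents contain only inflected forms. The morphology mode matches documents 1 and 3, and adding the synonym mode also retrieves document 2. The index must be built with the same morphology setting that is used at query time, otherwise the keys will not line up.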