Evaluating Parallel Ward Algorithm for Drug Discovery

Malhat, M. G.; Mousa, Hamdy

doi:10.21608/ijci.2015.33959

IJCI. International Journal of Computers and Information
Article 4, Volume 4, Issue 1, June 2015, Pages 29-35
Document Type: Original Article
DOI: 10.21608/ijci.2015.33959
Authors
M. G. Malhat ¹; Hamdy Mousa ²
¹Computer Science Dept., Faculty of Computers and Information, Menoufia University, Egypt
²Faculty of Computer and Information, Menoufia University
Abstract
Millions of compounds are now available in chemical libraries, and scientists must test these compounds against biological targets in order to identify lead compounds. The identification of lead compounds is a key step in the drug discovery process, so many hierarchical clustering algorithms have been developed and modified for that purpose. The Ward algorithm is one of the most popular hierarchical clustering algorithms and is used in many drug discovery applications because of its accuracy. However, it cannot handle large data sets within reasonable time and memory limits. In this paper, we evaluate and compare two parallel approaches to running the Ward algorithm: a parallel for loop and the MapReduce framework. The results show that the parallel for loop failed to reduce the computational time of the Ward algorithm because of the overhead of data communication. The MapReduce framework, in contrast, shows a considerable reduction in computational time: the parallel Ward algorithm saves 17% of the time using three nodes and 58% of the time using six nodes.
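
For readers unfamiliar with the method, the following is a minimal serial sketch of Ward clustering, not the authors' implementation. It uses the standard Ward criterion: at each step, merge the two clusters whose union gives the smallest increase in within-cluster sum of squares, |A||B| / (|A| + |B|) * ||c_A - c_B||^2, where c_A and c_B are the cluster centroids. The function names (`centroid`, `ward_cost`, `ward_clustering`) and the toy two-dimensional points are hypothetical and chosen only for illustration; real compound descriptors would have many more dimensions.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clustering(points, n_clusters):
    """Naive agglomerative Ward clustering down to n_clusters clusters."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Start with every point in its own cluster.
    clusters = [[tuple(p)] for p in points]
    while len(clusters) > n_clusters:
        # Exhaustive search over all cluster pairs for the cheapest merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy data standing in for compound descriptor vectors.
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
result = ward_clustering(toy, 3)
for cluster in result:
    print(sorted(cluster))
```

The exhaustive pair search inside the loop is what makes this naive version expensive on large libraries. It is presumably the kind of work that a parallel for loop or a MapReduce job would distribute across workers, although the abstract does not say how the authors partitioned the computation.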
Keywords
Drug Discovery; Hierarchical Clustering; Ward Clustering; Parallel for; MapReduce

