Predicting DNA Methylation State of CpG Islands Using Machine Learning
Journal of Advanced Engineering Trends
Volume 43, Issue 2, June 2024, Pages 11-17
Document Type: Original Article
DOI: 10.21608/jaet.2022.147975.1214
Authors
Esraa Mamdouh Hashem 1; Asmaa Kamal 2; Mai S. Mabrouk 3; Mohamed W. Fakhre 4
1 Biomedical Engineering Department, Faculty of Engineering, Misr University for Science and Technology
2 College of Computing and Information Technology (CCIT), Arab Academy for Science, Technology and Maritime Transport (AASTMT), Cairo, Egypt
3 Misr University for Science and Technology
4 Computer Engineering Department, Arab Academy for Science, Technology & Maritime Transport, Cairo, Egypt
Abstract
DNA methylation is the primary and best-understood epigenetic mechanism influencing human health, and it is an essential regulator of gene transcription. Methylation may underlie diseases such as Parkinson's disease, cardiovascular disease, chronic kidney disease, cancer, and Alzheimer's disease. Bioinformatics researchers have concentrated on building models to predict DNA methylation because the task is difficult: methylation is very sensitive to changes in lifestyle and pollution exposure. Recent improvements in methylation sequencing methods permit genome-wide identification of methylated sites in DNA. In the present work, computational methods are used to predict DNA methylation at every CpG and non-CpG locus in the whole genome, using Illumina 450K array data from the 250 bp region around every CpG site in human embryonic stem cells. Three classifiers were evaluated: logistic regression, support vector machine, and random forest. The random forest approach performed best, predicting methylation status with an accuracy of 99.9%, ahead of the other two classifiers. Incorporating more features will lead to higher prediction performance and wider detection coverage for the methylation of CpG loci.
Keywords
DNA methylation; logistic regression; support vector machine; CpG islands; random forest
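The abstract describes drawing features from the 250 bp region around every CpG site. As a minimal sketch of that windowing step only, the Python below locates CpG sites in a sequence and summarises the surrounding window. The helper names (find_cpg_sites, window_features), the two features (GC content and CpG count), and the reading of "250 bp" as the total window width centred on the site are all assumptions made for illustration; the abstract does not list the features actually used, and no classifier is trained here.

```python
def find_cpg_sites(seq):
    """Return 0-based positions of the C in every CG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def window_features(seq, pos, window=250):
    """Summarise the region of about `window` bp centred on `pos`.

    Hypothetical features, chosen only for illustration:
    gc_content is the fraction of G and C bases in the region,
    cpg_count is the number of CG dinucleotides in the region.
    The window is clipped at the sequence ends.
    """
    seq = seq.upper()
    half = window // 2
    start = max(0, pos - half)
    end = min(len(seq), pos + half)
    region = seq[start:end]
    gc_content = (region.count("G") + region.count("C")) / len(region)
    cpg_count = region.count("CG")
    return {
        "start": start,
        "end": end,
        "gc_content": gc_content,
        "cpg_count": cpg_count,
    }


if __name__ == "__main__":
    # Toy sequence with a tiny window so the result is easy to inspect.
    toy = "ATCGCGTA"
    for pos in find_cpg_sites(toy):
        print(pos, window_features(toy, pos, window=4))
```

Feature dictionaries of this kind, one per CpG site, could then be assembled into a matrix and passed to any of the three classifiers named in the abstract.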