Logistic Regression Hyperparameter Optimization for Cancer Classification
Menoufia Journal of Electronic Engineering Research
Article 1, Volume 31, Issue 1, January 2022, Pages 1-8
Document Type: Original Article
DOI: 10.21608/mjeer.2021.70512.1034
Authors
Ahmed Hamdy Ahmed Arafa 1; Marwa Radad 2; Mohammed M Badawy 3; Nawal El-Fishawy 4
1 Computer Science & Engineering Dept., Faculty of Electronic Engineering, Menoufia, Egypt
2 Computer Science & Engineering Dept., Faculty of Electronic Engineering, Menoufia, Egypt
3 Computer Science and Engineering Dept., Faculty of Electronic Engineering, Menoufia University
4 Computer Science and Engineering, Faculty of Electronic Engineering, Menoufia University, Egypt
Abstract
In machine learning, hyperparameter optimization aims to find the hyperparameter values that yield a model with minimum prediction error. It is among the most important steps in model building because it directly affects the performance of the learned model. Many techniques have been proposed to optimize hyperparameters for different predictive models. In this paper, the performance of grid search, random search, the Bayesian Tree-structured Parzen Estimator (TPE), and Simulated Annealing (SA) is evaluated to determine the best hyperparameters for a logistic regression model used in cancer classification. The Wisconsin Breast Cancer Dataset (WBCD) is used to evaluate these optimization techniques. The results show that Bayesian TPE outperformed the other techniques in terms of number of iterations and running time. The number of iterations TPE needs to reach the optimal parameters is 75.75% lower than for SA and 77.1% lower than for random search. The running time of TPE is lower than that of SA, random search, and grid search by 79.9%, 86.1%, and 99.9%, respectively. The resulting optimal hyperparameter values were used to train a logistic regression model to classify cancer on the WBCD. The optimized model classified cancer with a test accuracy of 98.2%, a kappa statistic of 0.962, and a Matthews correlation coefficient (MCC) of 0.963 when evaluated using 10-fold cross-validation.
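For readers less familiar with the two agreement metrics reported in the abstract, the following is a minimal sketch of how Cohen's kappa and MCC can be computed from binary confusion-matrix counts. The function names and the example counts are illustrative assumptions; the counts are hypothetical and are not taken from the paper.

```python
import math

def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa from binary confusion-matrix counts."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement, computed from the predicted and actual class totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present
    return (observed - expected) / (1.0 - expected)

def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0.0:
        return 0.0  # conventional value when a row or column total is zero
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for illustration only (NOT results from the paper).
tp, fp, fn, tn = 40, 1, 2, 57
accuracy = (tp + tn) / (tp + fp + fn + tn)
kappa = cohen_kappa(tp, fp, fn, tn)
mcc = matthews_corrcoef(tp, fp, fn, tn)
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f} mcc={mcc:.3f}")
```

Both metrics correct for agreement expected by chance, which is why they are typically lower than raw accuracy on the same predictions.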
Keywords
Hyperparameter Optimization; Random Search; Grid Search; Tree Parzen Estimator; Simulated Annealing
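To illustrate how three of the search strategies named above differ, here is a minimal standard-library sketch that tunes a single hyperparameter, log10 of a regularization strength C. Everything in it is an assumption made for illustration: `toy_cv_error` is a hypothetical stand-in for a cross-validated error, the search range and iteration counts are arbitrary, and none of it reproduces the paper's setup. TPE is omitted because it needs a density model that does not fit a short sketch.

```python
import math
import random

def toy_cv_error(log10_c):
    """Hypothetical stand-in for a cross-validated error; minimum at 0.5."""
    return 0.02 + 0.01 * (log10_c - 0.5) ** 2

def grid_search(objective, low, high, steps):
    """Evaluate every point of an evenly spaced grid."""
    grid = [low + i * (high - low) / (steps - 1) for i in range(steps)]
    best = min(grid, key=objective)
    return best, objective(best), len(grid)

def random_search(objective, low, high, n_iter, rng):
    """Evaluate n_iter points drawn uniformly from the range."""
    samples = [rng.uniform(low, high) for _ in range(n_iter)]
    best = min(samples, key=objective)
    return best, objective(best), n_iter

def simulated_annealing(objective, low, high, n_iter, rng,
                        temp=1.0, cooling=0.95, step=0.5):
    """Local random walk that sometimes accepts worse points while hot."""
    current = rng.uniform(low, high)
    current_err = objective(current)
    best, best_err = current, current_err
    for _ in range(n_iter):
        # Propose a neighbour and clip it to the search range.
        candidate = min(high, max(low, current + rng.uniform(-step, step)))
        cand_err = objective(candidate)
        delta = cand_err - current_err
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_err = candidate, cand_err
        if current_err < best_err:
            best, best_err = current, current_err
        temp *= cooling
    return best, best_err, n_iter + 1  # +1 for the starting point

rng = random.Random(0)  # fixed seed so runs are repeatable
LOW, HIGH = -3.0, 3.0   # assumed search range for log10(C)
grid_best, grid_err, grid_evals = grid_search(toy_cv_error, LOW, HIGH, 13)
rand_best, rand_err, rand_evals = random_search(toy_cv_error, LOW, HIGH, 13, rng)
sa_best, sa_err, sa_evals = simulated_annealing(toy_cv_error, LOW, HIGH, 12, rng)
for name, b, e, k in [("grid", grid_best, grid_err, grid_evals),
                      ("random", rand_best, rand_err, rand_evals),
                      ("annealing", sa_best, sa_err, sa_evals)]:
    print(f"{name:>9}: log10(C)={b:.3f} error={e:.4f} candidates={k}")
```

In a real study the objective would be the cross-validated error of the logistic regression model, so each candidate costs a full round of training; that cost is why the number of candidates a method needs matters as much as the error it reaches.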