A Comparative Analysis of Logistic Regression and Random Forest for Intrusion Detection Systems: A Reproducible, Deployment-Oriented Study
Journal of Communication Sciences and Information Technology
Volume 2025, Issue 3, October 2025, Pages 1-24
Document Type: Original Article
DOI: 10.21608/jcsit.2025.427990.1021
Author
Ahmed Taisser Shawky*
El Madina Higher Institute of Administration and Technology, Egypt
Abstract
Intrusion Detection Systems (IDS) must deliver high detection rates with controllable false alarms under class imbalance, non-stationarity, and strict latency constraints. We present a rigorous, end-to-end comparison of Logistic Regression (LR) and Random Forest (RF) for IDS, emphasizing reproducibility, statistical validity, and deployability. Using standard benchmark datasets (NSL-KDD for tabular network connections; BoT-IoT and TON_IoT for modern IoT traffic), we build a unified pipeline covering feature preprocessing, class-imbalance mitigation, hyperparameter tuning, threshold calibration, and uncertainty-aware evaluation. Results show that RF consistently achieves higher recall and F1 on attack classes, especially under heterogeneous traffic, while LR offers superior interpretability and competitive precision when calibrated. We quantify the operating regimes where LR is preferable (auditable environments, tight latency, scarce features) and where RF dominates (nonlinear patterns, richer features). We release a full protocol (metrics, statistical tests, ablations, and deployment guidelines) to enable reproducible benchmarking and practical adoption. Keywords: Intrusion Detection Systems; Network Security; Class Imbalance; Logistic Regression; Random Forest; Threshold Calibration; ROC/PR Analysis; Reproducibility.
Keywords
Comparative Analysis; Logistic Regression; Random Forest; Intrusion Detection Systems
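The abstract lists threshold calibration as one pipeline stage and stresses controllable false alarms. As a minimal sketch of what that stage could look like, and not the study's actual procedure, the snippet below picks an alert threshold on held-out benign scores so that the false-positive rate stays within a budget, then reports the attack recall at that threshold. The function names, the 10% budget, and all score values are hypothetical and chosen only for illustration; the scores stand in for the predicted attack probabilities of any classifier, such as LR or RF.

```python
import math

# Hypothetical classifier scores (predicted attack probability) on a
# held-out validation split. These numbers are illustrative only and
# are NOT taken from the study.
benign_scores = [0.05, 0.10, 0.12, 0.20, 0.25, 0.30, 0.35, 0.40, 0.62, 0.81]
attack_scores = [0.30, 0.55, 0.70, 0.85, 0.90, 0.95]


def false_positive_rate(benign, threshold):
    """Fraction of benign samples that would raise an alert (score >= threshold)."""
    return sum(s >= threshold for s in benign) / len(benign)


def recall(attacks, threshold):
    """Fraction of attack samples that would raise an alert (score >= threshold)."""
    return sum(s >= threshold for s in attacks) / len(attacks)


def calibrate_threshold(benign, max_fpr):
    """Return the lowest observed benign score whose FPR is within max_fpr.

    Candidates are scanned in ascending order, so the first one that
    satisfies the budget keeps recall as high as possible among them.
    Returns math.inf (never alert) if no observed score meets the budget.
    """
    for t in sorted(set(benign)):
        if false_positive_rate(benign, t) <= max_fpr:
            return t
    return math.inf


if __name__ == "__main__":
    t = calibrate_threshold(benign_scores, max_fpr=0.10)
    print("threshold:", t)
    print("validation FPR:", false_positive_rate(benign_scores, t))
    print("validation recall:", recall(attack_scores, t))
```

Calibrating on a validation split rather than the training data keeps the false-alarm estimate honest. Under non-stationary traffic the threshold would presumably need periodic re-estimation.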