Machine Learning Algorithms to Improve Insurance Claim Prediction

Soltan, Heba; Saad, Fathy; Khadary, Hanan

doi:10.21608/hjcr.2025.431975

Horus International Journal for Commercial Research
Article 2, Volume 1, Issue 2, January 2025, Pages 20-36
Document Type: Scientific research
DOI: 10.21608/hjcr.2025.431975
Authors
Heba Soltan¹; Fathy Saad²; Hanan Khadary³
¹Department of Statistics and Quantitative Methods, Faculty of Business Administration, Horus University, Damietta, Egypt
²Demonstrator of Insurance and Statistics at the Faculty of Business Administration, Horus University, Damietta, Egypt
³Department of Statistics, Nile Higher Institute of Commercial Sciences and Computer Technology
Abstract
In the insurance industry, predicting the probability that a policyholder will make a claim in the near future is an important challenge, because it enables companies to manage risk well and to price policies more accurately. This research uses data from motor insurance companies, covering various features of policyholders and insured vehicles, to explore the application of six popular machine learning algorithms to predict whether a policyholder will make a claim in the next six months: Random Forest, Adaboost, SVM, Naive Bayes, KNN and Logistic Regression. The performance of each algorithm is analyzed in terms of classification accuracy, sensitivity and specificity to determine the most suitable one for the prediction process. The study provides findings on the relative strengths and weaknesses of each method, giving insurers guidance on selecting the most appropriate model for their forecasting process and on using machine learning techniques to predict claims and manage their insurance portfolios efficiently. These results can help insurance companies plan well, price their products better and reduce the financial risks associated with future claims.
The aim of this research is to give insurance companies the knowledge to make use of these analytics and machine learning results, ultimately leading to more effective risk management, better pricing strategies and reduced exposure to financial risk from future claims. The study also demonstrates how the insurance industry's ability to make data-driven decisions can improve overall operational efficiency, and it aims to reduce the potential failure of machine learning algorithms in predicting insurance claims.
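The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scikit-learn, the synthetic data from `make_classification`, the 70/30 split, the feature scaling and all hyperparameters are assumptions made for illustration only. The six classifiers and the three metrics (accuracy, sensitivity, specificity) are the ones named in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for motor insurance data (assumption, not the paper's dataset):
# label 1 = claim within six months, label 0 = no claim; claims are the minority class.
X, y = make_classification(n_samples=2000, n_features=12, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# The six algorithms named in the abstract; hyperparameters are illustrative defaults.
# Distance- and margin-based models are scaled first (a design assumption).
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Adaboost": AdaBoostClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
}


def evaluate(model):
    """Fit on the training split and score on the held-out test split."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # With labels=[0, 1], ravel() yields TN, FP, FN, TP in that order.
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of actual claims that were caught
        "specificity": tn / (tn + fp),  # share of non-claims correctly identified
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, metrics in results.items():
    print(f"{name:20s} "
          f"acc={metrics['accuracy']:.3f} "
          f"sens={metrics['sensitivity']:.3f} "
          f"spec={metrics['specificity']:.3f}")
```

Because claims are usually much rarer than non-claims, accuracy alone can look high for a model that rarely predicts a claim, which is one reason to report sensitivity and specificity alongside it.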
Keywords
Insurance claim prediction; Machine learning; Random Forest; Adaboost; SVM; Naive Bayes; KNN; Logistic Regression; Automobile insurance
