Type-2 Diabetes Prediction Using Machine Learning Algorithms And Ensembles with Hyperparameters


Kunal Verma, Pon Harshavardhanan


Diabetes, a complex metabolic disorder characterized by chronic hyperglycemia (high blood sugar), is emerging as one of the main health concerns of the 21st century. The worldwide prevalence of diabetes has reached unprecedented levels, affecting over 463 million individuals as of 2019, according to the International Diabetes Federation.

This number is predicted to rise to 700 million by 2045, reflecting an alarming upward trend. Diabetes is a major contributor to morbidity and mortality. It is responsible for approximately 4.2 million deaths every year, making it one of the top ten leading causes of death globally.

This article proposes a hybrid prediction model to aid in the diagnosis of type 2 diabetes. The study uses the Vanderbilt Biostatistics diabetes dataset as a reference to evaluate the efficacy of various machine learning (ML) methods applied to diabetes prediction. We compare ensemble methods (AdaBoost, LightGBM, CatBoost, and Gradient Boosting) with standard ML algorithms: Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), and Logistic Regression (LR). To improve the models' accuracy, we tuned their hyperparameters using grid search (GridSearchCV) and randomized search (RandomizedSearchCV), both with cross-validation. After a comparative analysis, the best-performing model for diabetes prediction was selected: CatBoost tuned with RandomizedSearchCV, which achieved an accuracy of 95.7%.
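The tuning step described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes scikit-learn and SciPy are installed, uses synthetic data from `make_classification` in place of the Vanderbilt dataset, and substitutes scikit-learn's `GradientBoostingClassifier` for CatBoost to avoid an extra dependency. The parameter ranges are illustrative guesses, not values from the study.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)

# Synthetic, imbalanced binary data standing in for a diabetes dataset
# (assumption: the real study uses the Vanderbilt data, not this).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Grid search: evaluates every combination in a fixed grid (2 * 2 * 2 = 8).
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={
        "n_estimators": [50, 100],
        "learning_rate": [0.05, 0.1],
        "max_depth": [2, 3],
    },
    cv=3,
    scoring="accuracy",
)
grid.fit(X_train, y_train)

# Randomized search: samples n_iter candidates from distributions instead.
rand = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(50, 151),       # integers in [50, 150]
        "learning_rate": uniform(0.01, 0.2),    # floats in [0.01, 0.21]
        "max_depth": randint(2, 5),             # integers in [2, 4]
    },
    n_iter=8,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
rand.fit(X_train, y_train)

# Compare the two strategies on the held-out test split.
for name, search in [("grid", grid), ("randomized", rand)]:
    test_acc = search.score(X_test, y_test)
    print(name, search.best_params_, round(test_acc, 3))
```

Both searches refit the best configuration on the full training split, so `search.score` reports held-out accuracy for the tuned model. With the same budget of eight candidates, the randomized search draws from continuous ranges rather than a fixed grid, which is one common reason it can find better settings at equal cost.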
