Spam Detection of SMS Messages Using Random Forest Classifier Algorithm

K Ranjith Reddy, Ganpat Joshi

Abstract

As cell phones have grown in popularity in recent years, the short message service (SMS) has become a multi-billion-dollar industry. Lower messaging costs have led to an increase in unsolicited (spam) mobile advertising. A 2012 study found that 31% of text messages sent in some Asian countries were spam. Given such figures, email filtering algorithms may not achieve the quality expected of them. In this paper, we use an efficient random forest algorithm to classify the SMS spam dataset from the UCI Machine Learning Repository, which contains around 5572 samples. After preprocessing, we create embeddings for the dataset and pass the resulting vectors to the random forest algorithm for the classification task. The results are reported with the data imbalance problem taken into account, and the model achieves an accuracy of about 96%. The experiments aim to differentiate between spam and ham messages by building a sensitive and efficient classification model that provides good accuracy with fewer false positives. We conclude that the Random Forest classifier achieves high accuracy on this task.
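
The pipeline described above (vectorize the messages, then classify them with a random forest) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: scikit-learn is used as the library, TF-IDF vectors stand in for the unspecified embeddings, `class_weight="balanced"` is one possible way to account for the ham/spam imbalance, and the handful of messages are hypothetical examples rather than rows from the UCI dataset. The helper name `build_spam_classifier` is also hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy messages standing in for the SMS corpus (not real dataset rows).
messages = [
    "WINNER! Claim your free prize now, call this number",
    "Free entry in a weekly competition, text WIN to claim",
    "Urgent: your account won a cash prize, reply now",
    "Are we still meeting for lunch today?",
    "I will call you when I get home",
    "Can you send me the notes from class?",
    "Happy birthday! See you tonight",
    "Running late, be there in ten minutes",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]


def build_spam_classifier(seed=0):
    """Return a text-to-label pipeline: TF-IDF vectors fed to a random forest.

    TF-IDF is an assumed stand-in for the embeddings mentioned in the abstract.
    """
    return make_pipeline(
        TfidfVectorizer(lowercase=True),
        RandomForestClassifier(
            n_estimators=100,
            class_weight="balanced",  # assumption: reweight classes to offset imbalance
            random_state=seed,        # fixed seed so repeated runs build the same forest
        ),
    )


model = build_spam_classifier()
model.fit(messages, labels)

# Classify two unseen (hypothetical) messages.
predictions = model.predict(["Claim your free cash prize", "See you at lunch"])
print(list(predictions))
```

On the full dataset, the same pipeline would typically be fitted on a training split and scored on a held-out split. Reporting the confusion matrix alongside accuracy is what makes the false-positive count visible.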
