Enhanced Multilingual Sentiment Analysis Using Ensemble Learning and Tree Structured Parzen Estimator Hyperparameter Optimization
Abstract
Sentiment analysis (SA) is a machine learning task that uses natural language processing (NLP) to infer people's attitudes from text. Arabic SA is difficult to implement for several reasons: ambiguity, a wide range of dialects, scarce resources, morphological variation, missing contextual information, and sentiment that is implied rather than stated in the text. Deep learning (DL) models such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks have brought major advances to Arabic SA. Hybrid models that couple a CNN with an LSTM or a gated recurrent unit (GRU) have further improved on the performance of single DL models. To improve performance, this paper applies the Tree-structured Parzen Estimator (TPE) algorithm to optimize the hyperparameters of seven proposed neural network (NN) models for multilingual sentiment analysis, as well as ensembles of these models, and compares the predictive abilities of the resulting models. Only tweets labeled negative or positive were retained in the data we collected. The models were trained and tested on the Arabic Sentiment Tweets Dataset (ASTD), Depression Corpus of Arabic Tweets (DCAT), Arabic-Egyptian Corpus (AEC), and Hebrew Sentiment Dataset (HSD). The proposed model with TPE achieves the highest accuracy on ASTD, DCAT, AEC, and HSD: 97.7%, 92.2%, 91.1%, and 91.1%, respectively.
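To make the TPE step concrete, the following is a minimal, self-contained sketch of the general idea behind TPE for a single continuous hyperparameter. It is an illustration under stated assumptions, not the authors' implementation: the function names (`gaussian_kde`, `tpe_minimize`, `toy_loss`), the fixed kernel bandwidth, the quantile `gamma`, and the toy objective standing in for a validation loss are all hypothetical. At each step the past trials are split into a "good" and a "bad" group, a density is fitted to each, and the next candidate is the one that maximizes the ratio of the two densities.

```python
import math
import random


def gaussian_kde(points, bandwidth):
    """Return a density: an equal-weight mixture of Gaussians centred on `points`."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

    return density


def tpe_minimize(objective, low, high, n_startup=10, n_trials=40,
                 gamma=0.25, n_candidates=24, seed=0):
    """Simplified 1-D TPE loop. Returns every (x, loss) pair that was evaluated."""
    if n_startup < 2:
        raise ValueError("need at least two startup trials to form both groups")
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # assumption: fixed kernel width

    # Phase 1: random search to seed the density estimates.
    trials = []
    for _ in range(n_startup):
        x = rng.uniform(low, high)
        trials.append((x, objective(x)))

    # Phase 2: model-guided search.
    for _ in range(n_trials - n_startup):
        ranked = sorted(trials, key=lambda t: t[1])
        n_good = max(1, math.ceil(gamma * len(ranked)))
        n_good = min(n_good, len(ranked) - 1)  # keep the "bad" group non-empty
        good = [x for x, _ in ranked[:n_good]]
        bad = [x for x, _ in ranked[n_good:]]
        l = gaussian_kde(good, bandwidth)  # density of promising values
        g = gaussian_kde(bad, bandwidth)   # density of the remaining values

        # Draw candidates from l(x): pick a good point, perturb it, clip to bounds.
        candidates = [
            min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
            for _ in range(n_candidates)
        ]
        # Evaluate the candidate with the largest l(x) / g(x) ratio.
        x_next = max(candidates, key=lambda x: l(x) / (g(x) + 1e-12))
        trials.append((x_next, objective(x_next)))

    return trials


def toy_loss(log10_lr):
    """Hypothetical stand-in for a validation loss, minimized at log10(lr) = -3."""
    return (log10_lr + 3.0) ** 2


trials = tpe_minimize(toy_loss, low=-6.0, high=0.0)
best_x, best_loss = min(trials, key=lambda t: t[1])
print(f"best log10(lr) = {best_x:.3f}, loss = {best_loss:.4f}")
```

In practice the objective would train one of the NN models and return its validation error, and the search space would cover several hyperparameters at once. Established libraries implement full TPE with adaptive bandwidths and categorical parameters, and would normally be preferred over a hand-written loop like this one.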