Machine Translation Approach for Resolution of Part of Speech Ambiguity from English to the Sanskrit Language

Archana Sachindeo Maurya, Promila Bahadur, Divakar Yadav

Published: Jul 9, 2023

Abstract

Machine Translation (MT) is one of the most important techniques of Natural Language Processing (NLP). It is the automated translation of text by a computer system. Many efficient approaches to machine translation now exist. Machine Learning (ML) is used extensively in MT and has become an active area of research in the last three years. Resolving semantic ambiguity in natural language is a major challenge in MT, and ML has given promising results in terms of system learning and prediction. Text classification in ML is considered one of the most important methods for Word Sense Disambiguation (WSD). The dataset, used as both training and test data, is important for predicting the required results and plays an essential role in validating the output of the system. We collected a total of 2,000 sentences, which we divided into training and testing data. We also analyzed supervised machine learning text classification algorithms, namely Naïve Bayes, Decision Tree, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Neural Network, Logistic Regression, and Random Forest. The accuracy of these algorithms ranges from sixty-eight to eighty-four percent. We further developed a hybrid model that combines the Naïve Bayes, Support Vector Machine, and Decision Tree algorithms to achieve better results.
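The abstract does not state how the three classifiers are combined. One common way to build such a hybrid is majority voting, sketched below under that assumption. The function name `majority_vote` and the example tags are hypothetical, and the per-classifier predictions are taken as given rather than computed.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by most classifiers.

    `predictions` holds one predicted POS tag per classifier.
    Ties are broken in favour of the earliest classifier in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

# Hypothetical outputs for one ambiguous word,
# in the assumed order: Naive Bayes, SVM, Decision Tree.
votes = ["NOUN", "VERB", "NOUN"]
print(majority_vote(votes))
```

With three voters, a tie can only occur when all three disagree, in which case this sketch falls back to the first classifier's tag.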

We analyzed the proposed hybrid model for prediction of part-of-speech (POS) ambiguity. The model has a reported success rate of eighty-five percent. All the algorithms and the hybrid model were tested in the ML tool WEKA; the hybrid model was also tested in Java. The accuracy of the algorithms and the model is reported using ten-fold cross-validation.
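As a rough illustration of ten-fold cross-validation, the sketch below splits item indices into ten folds, each serving once as the test set. It is not the WEKA procedure used in the study: the helper `k_fold_indices` is hypothetical, the folds are contiguous rather than shuffled or stratified, and applying it to all 2,000 sentences is an assumption made only for the example.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size

# Assumed dataset size, taken from the 2,000 sentences mentioned in the abstract.
folds = list(k_fold_indices(2000, k=10))
for train, test in folds:
    # A classifier would be trained on `train` and scored on `test` here.
    pass
```

The reported accuracy is then the average of the ten per-fold accuracies, so every sentence is used for testing exactly once.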

The model also reported higher precision, recall, and F-score than all the other supervised machine learning classification algorithms. The correctness of the algorithms and the model is measured by the total number of correctly predicted POS tags. The algorithms and the proposed model were analyzed with the help of a machine learning tool named “AmbiF”.
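For reference, precision, recall, and F-score for a single POS tag can be computed from gold and predicted tags as in the minimal sketch below. The function name and the toy tag lists are hypothetical and are not data from the study.

```python
def precision_recall_f1(gold, predicted, label):
    """Compute precision, recall, and F1 for one label from paired tag lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Toy example: five words with gold and predicted POS tags (illustrative only).
gold = ["NOUN", "VERB", "NOUN", "ADJ", "VERB"]
pred = ["NOUN", "NOUN", "NOUN", "ADJ", "VERB"]

# Accuracy is the share of correctly predicted tags.
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(accuracy, precision_recall_f1(gold, pred, "NOUN"))
```

Accuracy counts the correctly predicted tags overall, while precision and recall expose how errors are distributed for each tag.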

Issue

Vol. 44 No. 7 (2023): Issue 7

Section

Articles
