Straggler Task Prediction Based Balanced Hadoop for Big Data Using HLAdagrad-ENN and LCV-ChOA Based Backup Node Selection

Hetal A. Joshiara, Chirag S. Thaker

Abstract

Data analysis speed is of great significance in Big Data processing. Hadoop is one of the popular frameworks for big data analytics. Built on the MapReduce (MR) model, Hadoop is a distributed computing framework that runs applications on clusters made up of large numbers of inexpensive commodity computing nodes. Owing to unbalanced load, inaccurate node selection, and unidentified straggler tasks, challenges remain, such as high response time, high runtime (execution time), and poor handling of resources. To address these problems, this work develops a Straggler Task Prediction based Balanced Hadoop (STP-BHADOOP) framework. The framework improves the efficiency of the Big Data computational infrastructure and accelerates data processing by identifying straggler tasks for Speculative Execution (SE). Hadoop computation is carried out with a balanced load, which is distributed using the ROS-Flubber (Random Over Sampler based Flubber) technique; the resulting setup is termed Balanced Hadoop (BHadoop). Task-centered features are then extracted and preprocessed to produce a structured data format. The data most relevant to straggler task prediction are selected using the XI-MO technique. Finally, the HLAdagrad-ENN technique predicts straggler tasks from the selected data. After straggler task prediction, node-centered features are extracted. If a straggler task is identified, node selection is performed using the LCV-ChOA technique with the aid of the extracted node features. Experimental outcomes show that, compared with prevailing methods, the proposed framework achieves lower runtime and response time and predicts straggler tasks with better precision and recall.
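The control flow described in the abstract (predict straggler tasks, then select a node for each one) can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not specify HLAdagrad-ENN or LCV-ChOA, so both are replaced here by simple stand-ins, namely a progress-rate threshold as the predictor and a least-loaded rule as the node selector. All names (`Task`, `Node`, `is_straggler`, `select_backup_node`, `plan_backups`) and the `slowdown` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    task_id: str
    progress: float   # fraction complete, 0.0 to 1.0
    elapsed_s: float  # seconds since the task started


@dataclass
class Node:
    node_id: str
    cpu_load: float   # fraction of CPU in use, 0.0 to 1.0


def is_straggler(task: Task, peers: List[Task], slowdown: float = 0.5) -> bool:
    """Stand-in predictor (NOT HLAdagrad-ENN): flag a task whose progress
    rate is below `slowdown` times the mean progress rate of its peers."""
    rates = [p.progress / p.elapsed_s for p in peers if p.elapsed_s > 0]
    if not rates or task.elapsed_s <= 0:
        return False
    mean_rate = sum(rates) / len(rates)
    return task.progress / task.elapsed_s < slowdown * mean_rate


def select_backup_node(nodes: List[Node]) -> Optional[Node]:
    """Stand-in selector (NOT LCV-ChOA): pick the node with the lowest CPU load."""
    return min(nodes, key=lambda n: n.cpu_load, default=None)


def plan_backups(tasks: List[Task], nodes: List[Node]) -> Dict[str, str]:
    """Map each predicted straggler task to the node chosen to re-run it."""
    plan: Dict[str, str] = {}
    for task in tasks:
        if is_straggler(task, tasks):
            node = select_backup_node(nodes)
            if node is not None:
                plan[task.task_id] = node.node_id
    return plan


tasks = [Task("t1", 0.9, 100.0), Task("t2", 0.8, 100.0), Task("t3", 0.1, 100.0)]
nodes = [Node("n1", 0.7), Node("n2", 0.2)]
print(plan_backups(tasks, nodes))  # {'t3': 'n2'}
```

In this toy run only `t3` progresses at less than half the mean rate, so it is the only task assigned to a node, and the least-loaded node `n2` is chosen. A learned predictor and an optimization-based selector would replace the two stand-in functions without changing the surrounding flow.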
