Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Sundara Kumar, M.R.a; * | Mohan, H.S.b
Affiliations: [a] Research Scholar, Department of ISE, New Horizon College of Engineering, Bangalore, Visvesvaraya Technological University, Belagavi, India | [b] Department of CSE (Data Science), RNS Institute of Technology, Bangalore, Visvesvaraya Technological University, Belagavi, India
Correspondence: [*] Corresponding author. M.R. Sundara Kumar, Research Scholar, Department of ISE, New Horizon College of Engineering, Bangalore, Visvesvaraya Technological University, Belagavi-590018, Bangalore, India. E-mail: sundar.infotech@gmail.com.
Abstract: Big Data Analytics (BDA) is an unavoidable technique in today’s digital world for dealing with massive amounts of digital data generated by online and internet sources. It is kept in repositories for data processing via cluster nodes that are distributed throughout the wider network. Because of its magnitude and real-time creation, big data processing faces challenges with latency and throughput. Modern systems such as Hadoop and SPARK manage large amounts of data with their HDFS, Map Reduce, and In-Memory analytics approaches, but the migration cost is higher than usual. With Genetic Algorithm-based Optimization (GABO), Map Reduce Scheduling (MRS) and Data Replication have provided answers to this challenge. With multi objective solutions provided by Genetic Algorithm, resource utilization and node availability improve processing performance in large data environments. This work develops a novel creative strategy for enhancing data processing performance in big data analytics called Map Reduce Scheduling Based Non-Dominated Sorting Genetic Algorithm (MRSNSGA). The Hadoop-Map Reduce paradigm handles the placement of data in distributed blocks as a chunk and their scheduling among the cluster nodes in a wider network. Best fit solutions with high latency and low accessing time are extracted from the findings of various objective solutions. Experiments were carried out as a simulation with several inputs of varied location node data and cluster racks. Finally, the results show that the speed of data processing in big data analytics was enhanced by 30–35% over previous methodologies. Optimization approaches developed to locate the best solutions from multi-objective solutions at a rate of 24–30% among cluster nodes.
Keywords: Big data analytics, hadoop distributed file system, non-dominated sorting genetic algorithm, map reduce scheduling based non-dominated sorting genetic algorithm, map reduce scheduling, genetic algorithm-based optimization
DOI: 10.3233/JIFS-240069
Journal: Journal of Intelligent & Fuzzy Systems, vol. 46, no. 4, pp. 10863-10882, 2024
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
sales@iospress.com
For editorial issues, like the status of your submitted paper or proposals, write to editorial@iospress.nl
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
info@iospress.nl
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office info@iospress.nl
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
china@iospress.cn
For editorial issues, like the status of your submitted paper or proposals, write to editorial@iospress.nl
如果您在出版方面需要帮助或有任何建, 件至: editorial@iospress.nl