From: MapReduce scheduling algorithms in Hadoop: a systematic study (Journal of Cloud Computing: Advances, Systems and Applications)
| Reference | Year | Key Ideas | Advantages | Disadvantages | Comparison Algorithms | Evaluation Techniques |
|---|---|---|---|---|---|---|
| Jabbari et al. [43] | 2021 | A cost-efficient resource provisioning and scheduling approach for deadline-sensitive MapReduce computations in the cloud | Guarantees the deadline; reduces the total hiring cost | Does not consider data locality; does not consider data distribution | No comparison; different input datasets in different possible scenarios | Simulation |
| Shao et al. [44] | 2018 | EFS: an efficient job-scheduling approach for big-data applications | Meets job deadlines | Does not consider data distribution or replication dependencies | AlwaysOn, OPT, AutoScale | Simulation, using the Scheduler Load Simulator (SLS) |
| Lin et al. [45] | 2019 | DGIA: a deadline-constrained and influence-aware design for allocating MapReduce jobs in cloud computing systems | Meets job deadlines; accounts for the performance influence on existing tasks | Does not consider data locality | O-Hadoop, OR-Hadoop, BGMRS, SDHP, EDFLWF | Simulation with 1,000 nodes |
| Chen et al. [46] | 2015 | DCMRS: bipartite-graph modelling to obtain the optimal scheduling solution | Reduces job execution time; dynamically adjusts task deadlines; a low proportion of jobs miss their deadlines | Does not improve data locality; does not guarantee that every job meets its deadline; high computational time | ORP, AUMD, ADAPT, MDF, MLF | Implementation: Hadoop cluster with 24 nodes (4 PMs in the Hadoop cluster, 20 VMs in the cloud); Simulation: MATLAB |
| Tang et al. [47] | 2013 | MTSD: a node-classification algorithm that groups nodes by processing capacity | Meets deadline constraints; increases map-task data locality; reduces completion time; improves the precision of a task's remaining-time estimate | Does not address reduce-task scheduling; provides no deadline guarantees for jobs; divides the job deadline into task deadlines statically | Fair, FIFO | Implementation on Hadoop 0.21 |
| Verma et al. [48] | 2011a | ARIA: uses a job profile to estimate job completion time and the resources required to finish within the deadline | Automatically allocates resources so jobs meet their deadlines; increases resource utilization; handles heterogeneous environments | Does not consider separate map and reduce deadlines; does not consider node failures | No comparison; different input datasets in different possible scenarios | Implementation on a Hadoop 0.20.2 cluster; Simulation: a discrete-event simulator |
| Polo et al. [49] | 2013 | Estimates job completion time from the average length of already-completed tasks | Dynamically allocates resources so jobs meet their deadlines; increases system throughput; maximizes data locality; handles hardware heterogeneity | Does not preempt tasks that are already executing; does not distinguish between map and reduce tasks | Fair, Basic Adaptive Scheduler | Implementation on Hadoop 0.21 |
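The profile-based reasoning behind ARIA [48] can be sketched as a small calculation: if a job profile gives the average task duration, then running n tasks on s slots takes roughly n × avg / s, so the minimum slot allocation that still meets a deadline follows directly. The function below is a hedged, single-phase illustration of that idea only; the name `min_slots_for_deadline` and the simplified model are assumptions, not ARIA's actual implementation.

```python
# Hedged sketch (not ARIA's code): estimate the minimum number of slots
# needed to finish n_tasks within a deadline, given the average task
# duration from a job profile. Uses the lower-bound makespan model
# n_tasks * avg_task_s / n_slots, ignoring the separate reduce phase.
import math

def min_slots_for_deadline(n_tasks: int, avg_task_s: float, deadline_s: float) -> int:
    """Smallest s such that n_tasks * avg_task_s / s <= deadline_s."""
    if deadline_s <= 0:
        raise ValueError("deadline must be positive")
    return max(1, math.ceil(n_tasks * avg_task_s / deadline_s))

# Example: 120 map tasks averaging 40 s each, 10-minute deadline.
print(min_slots_for_deadline(120, 40.0, 600.0))  # 8
```

A scheduler using such an estimate can then reserve exactly that many slots instead of a fixed share, which is how ARIA raises resource utilization while still meeting deadlines.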
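Similarly, the estimation step of Polo et al. [49] relies on the average length of tasks that have already completed. A minimal sketch under a simple wave-based model follows; the helper name `estimate_remaining_s` and the wave approximation are illustrative assumptions, not the paper's code.

```python
# Hedged sketch: estimate remaining job time as (waves of pending tasks)
# x (mean duration of the tasks observed so far), in the spirit of
# completion-time estimation from average completed-task length.
import math

def estimate_remaining_s(completed_durations, n_pending, n_slots):
    """Remaining-time estimate; completed_durations must be non-empty."""
    avg = sum(completed_durations) / len(completed_durations)
    waves = math.ceil(n_pending / n_slots)
    return waves * avg

# Example: three finished tasks averaging 32 s, 10 pending tasks, 4 slots.
print(estimate_remaining_s([30.0, 34.0, 32.0], 10, 4))  # 96.0
```

Comparing such an estimate against the job's deadline is what lets the scheduler grow or shrink a job's allocation dynamically as tasks complete.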