
Advances, Systems and Applications

Table 3 Deadline-aware schedulers in heterogeneous clusters and their properties

From: MapReduce scheduling algorithms in Hadoop: a systematic study

| Reference | Year | Key Ideas | Advantages | Disadvantages | Comparison Algorithms | Evaluation Techniques |
| --- | --- | --- | --- | --- | --- | --- |
| Jabbari et al. [43] | 2021 | A cost-efficient resource provisioning and scheduling approach for deadline-sensitive MapReduce computations in cloud environment | Guaranteeing the deadline; reducing the total hiring cost | Not considering data locality; not considering data distribution | No comparison; evaluated with different input datasets in different possible scenarios | Simulation |
| Shao et al. [44] | 2018 | EFS: efficient jobs scheduling approach for big data applications | Meeting the job deadline | Not considering data distribution and replication dependencies | AlwaysOn, OPT, AutoScale | Simulation: using Scheduler Load Simulator (SLS) |
| Lin et al. [45] | 2019 | DGIA: deadline-constrained and influence-aware design for allocating MapReduce jobs in cloud computing systems | Meeting the job deadline; considering the performance influence over existing tasks | Not considering data locality | O-Hadoop, OR-Hadoop, BGMRS, SDHP, EDFLWF | Simulation: 1000 nodes |
| Chen et al. [46] | 2015 | DCMRS: using bipartite graph modelling to obtain the optimal scheduling solution | Reducing job execution time; dynamically adjusting the task deadlines; a low proportion of jobs miss their deadline | Does not improve data locality; does not ensure that the deadline of every job is met; high computational time | ORP, AUMD, ADAPT, MDF, MLF | Implementation: Hadoop cluster with 24 nodes (4 PMs on Hadoop cluster, 20 VMs on cloud). Simulation: in MATLAB |
| Tang et al. [47] | 2013 | MTSD: using a node classification algorithm to classify the nodes according to processing capacity | Meeting the deadline constraints; increasing map tasks' data locality; reducing completion time; improving the precision of the tasks' remaining-time estimate | Does not solve the reduce task scheduling problem; does not provide deadline guarantees for the jobs; statically divides the job deadline into task deadlines | Fair, FIFO | Implementation: in Hadoop 0.21 |
| Verma et al. [48] | 2011a | ARIA: using the job profile to estimate the job completion time and the number of resources required for job completion within the deadline | Automatically allocating resources to the job to meet the deadline; increasing resource utilization; considering heterogeneous environments | Not considering map and reduce deadlines; not considering node failures | No comparison; evaluated with different input datasets in different possible scenarios | Implementation: on a Hadoop cluster using Hadoop 0.20.2. Simulation: a discrete event simulator |
| Polo et al. [49] | 2013 | Estimating the completion time of jobs based on the average task length of the completed tasks | Dynamically allocating resources to the jobs to meet their deadlines; increasing system throughput; maximizing data locality; considering hardware heterogeneity | Does not interrupt tasks that are already executing; not considering the differences between map and reduce tasks | Fair, Basic Adaptive Scheduler | Implementation: on Hadoop 0.21 |
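
The key idea listed for Polo et al. [49], estimating a job's completion time from the average length of its already completed tasks, can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `JobProgress`, the functions `estimated_remaining_time` and `slots_needed`, and the simplifying assumption that pending tasks run in equal-length "waves" across identical slots are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import ceil
from typing import List, Optional


@dataclass
class JobProgress:
    """Hypothetical snapshot of one job (all times in seconds)."""
    completed_task_durations: List[float]  # observed lengths of finished tasks
    pending_tasks: int                     # tasks not yet started
    elapsed: float                         # time since the job was submitted
    deadline: float                        # deadline, measured from submission


def mean_task_length(job: JobProgress) -> float:
    """Average length of the tasks completed so far."""
    if not job.completed_task_durations:
        raise ValueError("no completed tasks to estimate from")
    return sum(job.completed_task_durations) / len(job.completed_task_durations)


def estimated_remaining_time(job: JobProgress, slots: int) -> float:
    """Assumed model: pending tasks run in waves of `slots` tasks,
    and each wave lasts one mean task length."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    waves = ceil(job.pending_tasks / slots)
    return waves * mean_task_length(job)


def slots_needed(job: JobProgress) -> Optional[int]:
    """Smallest slot count that meets the deadline under the wave model,
    or None if the deadline can no longer be met."""
    time_left = job.deadline - job.elapsed
    if time_left <= 0:
        return None
    waves_available = int(time_left // mean_task_length(job))
    if waves_available == 0:
        return None
    return ceil(job.pending_tasks / waves_available)


job = JobProgress(
    completed_task_durations=[10.0, 20.0, 30.0],
    pending_tasks=12,
    elapsed=40.0,
    deadline=100.0,
)
print(estimated_remaining_time(job, slots=4))
print(slots_needed(job))
```

A scheduler built on this kind of estimate would recompute it as more tasks finish, since the mean task length becomes more reliable with more observations. The wave model ignores the differences between map and reduce tasks, which the table lists as a limitation of the original approach as well.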