Advances, Systems and Applications

Table 5 Summary of further analysis

From: Cloud resource management: towards efficient execution of large-scale scientific applications and workflows on complex infrastructures

| Work | Data transfers and imbalance | Dynamic scheduling | Hybrid and Multicloud | Workflow support |
| --- | --- | --- | --- | --- |
| PANDEY et al., 2010 | Transfers are evaluated via workflow DAG and resource allocation; transfer imbalance is not addressed. | Only addresses fluctuations in the transfer costs. Other aspects such as performance fluctuations and reliability are not mentioned. | No explicit support or experiments. | Modeled as DAGs; richer characterizations are not supported. |
| LIN; LU, 2011 | Transfer capacity of nodes in the same network is assumed to be uniform. Transfer imbalance is disregarded. | Not addressed. | No explicit support or experiments. | Supported; no details included. |
| XU et al., 2009 | Transfers and data properties are not explicitly addressed. | Not addressed. | No explicit support or experiments. | Multiple workflows supported via a common merging point; simple DAG modeling. |
| WEISSMAN; GRIMSHAW, 1996 | Data locality is a scheduling constraint; workers must be assigned close to the data. | Two levels: local and global. Rescheduling is first handled at the local level. Details are not provided. | Designed for wide-area systems (pre-dates cloud computing). | No explicit support. |
| CHEN; ZHANG, 2009 | Data communication and transfers are not explicitly addressed. | Not addressed. | No explicit support or experiments. | Simplified DAG model without edge costs. |
| RODRIGUEZ; BUYYA, 2014 | Rigidly modeled; fixed costs for transfers and no cost for local I/O. | Not addressed. | Not addressed. | DAG with fixed transfer costs and computation costs based on FLOPS. |
| FARD et al., 2012 | Transfers are considered but contention effects are not. Energy calculations ignore transfer times. | Not addressed. | No explicit support or experiments. | DAG with fixed transfer costs; no details on task costs. |
| MALAWSKI et al., 2012 | Algorithm does not consider the size of input data; transfer time is part of computation. | Initial scheduling plus periodic adjustment depending on the amount of idle resources. | No explicit support or experiments. | DAG with fixed transfer costs and computation costs with slight variability. |
| SAKELLARIOU; ZHAO, 2004 | Modeled as varying linearly with the input data size. | Rescheduling occurs immediately before task execution, bound to a condition that minimizes the number of reschedules. | Not addressed; solution originally designed for grids. | DAG with computation and transfer costs modeled with linear variation w.r.t. amount of input. |
| WANG; CHEN, 2012 | Not addressed. DAG does not specify transfer costs. | Not addressed. | No explicit support. | DAG with tasks and implicit costs. No transfer costs or richer characterization. |
| POOLA et al., 2014a | Based on data size and a single network bandwidth value. | Not addressed. | No explicit support. | DAG with task cost based on number of instructions. |
| BITTENCOURT; MADEIRA, 2011 | Based on data size and fixed network bandwidth values among nodes. | Two-step scheduling: static first, then including the public cloud to meet the deadline. | Initial scheduling step considers private resources; public resources are used if necessary. | DAG with compute cost based on number of instructions. |
| VECCHIOLA et al., 2012 | Not specified. | Not addressed. | Public resources used if necessary. | Supported, but no details provided. |
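
Several of the entries above describe the same basic cost model: a DAG in which a task's cost is its instruction count divided by a processing rate, and a transfer's cost is the data size divided by a fixed bandwidth. The following is a minimal sketch of that style of model, not the method of any of the cited works. The names `Task`, `transfer_time`, `compute_time` and `earliest_finish_times`, and all numeric values, are hypothetical. It assumes every edge crosses the network, a single bandwidth value, and no contention, which are the simplifications the table points out.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    instructions: float  # abstract work units (assumed)
    # Maps parent task name -> amount of data received from it (assumed MB).
    parents: dict = field(default_factory=dict)


def transfer_time(data_mb: float, bandwidth_mb_s: float) -> float:
    """Transfer cost from data size and one fixed bandwidth value."""
    return data_mb / bandwidth_mb_s


def compute_time(instructions: float, rate: float) -> float:
    """Computation cost from instruction count and a fixed processing rate."""
    return instructions / rate


def earliest_finish_times(tasks, rate, bandwidth_mb_s):
    """Earliest finish time per task; `tasks` must be in topological order.

    Assumes unlimited identical resources, so a task starts as soon as
    the slowest of its incoming transfers has completed.
    """
    finish = {}
    for task in tasks:
        ready = max(
            (finish[p] + transfer_time(mb, bandwidth_mb_s)
             for p, mb in task.parents.items()),
            default=0.0,
        )
        finish[task.name] = ready + compute_time(task.instructions, rate)
    return finish


# Hypothetical four-task diamond workflow: A feeds B and C, which feed D.
workflow = [
    Task("A", 100.0),
    Task("B", 200.0, {"A": 50.0}),
    Task("C", 100.0, {"A": 100.0}),
    Task("D", 50.0, {"B": 10.0, "C": 10.0}),
]
finish = earliest_finish_times(workflow, rate=100.0, bandwidth_mb_s=10.0)
print(finish)
```

In this example the larger transfer from A to C, not the longer computation in B, determines when D can start. A model with fixed, uniform transfer costs cannot capture this kind of imbalance.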