EDQWS: an enhanced divide and conquer algorithm for workflow scheduling in cloud
Journal of Cloud Computing volume 11, Article number: 13 (2022)
Abstract
A workflow is an effective way for modeling complex applications and serves as a means for scientists and researchers to better understand the details of applications. Cloud computing enables the running of workflow applications on many types of computational resources which become available on demand. As one of the most important aspects of cloud computing, workflow scheduling needs to be performed efficiently to optimize resources. Due to the existence of various resource types at different prices, workflow scheduling has evolved into an even more challenging problem on cloud computing. The present paper proposes a workflow scheduling algorithm in the cloud to minimize the execution cost of the deadline-constrained workflow. The proposed method, EDQWS, extends the current authors’ previous study (DQWS) and is a two-step scheduler based on divide and conquer. In the first step, the workflow is divided into subworkflows by defining, scheduling, and removing a critical path from the workflow, similar to DQWS. The process continues until only chain-structured subworkflows, called linear graphs, remain. In the second step, which is linear graph scheduling, a new merging algorithm is proposed that combines the resulting linear graphs so as to reduce the number of used instances and minimize the overall execution cost. In addition, the current work introduces a scoring function to select the most efficient instances for scheduling the linear graphs. Experiments show that EDQWS outperforms its competitors, both in terms of minimizing the monetary costs of executing scheduled workflows and meeting user-defined deadlines. Furthermore, in more than 50% of the examined workflow samples, EDQWS succeeds in reducing the number of resource instances compared to the previously introduced DQWS method.
Introduction
Nowadays, cloud computing offers an opportunity to execute scientific applications composed of hundreds or thousands of interrelated tasks [1]. In this context, the workflow model is an effective way to construct such complex applications. It consists of application tasks specified by nodes and the connection lines among nodes which create dependencies among these tasks in a directed acyclic graph (DAG) [2]. The workflow scheduling problem in the cloud aims to assign the tasks to computing resources in order to preserve task precedence while meeting some performance criteria [3].
Besides the total execution time of workflows, most research on workflow scheduling in the cloud has focused on optimizing the total usage cost of computing resources offered by cloud providers [2]. Moreover, faster and more powerful computing resources in the cloud are usually more expensive than slower ones. Therefore, the execution cost can be affected by employing powerful computing resources as this decreases the workflow execution time. Thus, the tradeoff between cost and time is a major challenge of workflow scheduling in the cloud [4]. To address this tradeoff, two common methods are employed: either minimizing the total execution time under a budget constraint or minimizing the monetary cost under a deadline constraint.
Depending on user requirements and the workflow application, both cost and time can be considered as constraints or optimization objectives. Some workflows are structured in such a way that the results can be used whenever the execution of the workflow is completed before the deadline. The workflows of medical simulations and weather forecasting are examples of deadline-constrained applications [4]. To address such deadline-constrained applications, the current study proposes a new workflow scheduling algorithm focused on minimizing the total execution cost while respecting the user-defined deadline.
The proposed algorithm, named enhanced divide and conquer workflow scheduling (EDQWS), is an extension of the present authors’ earlier study [5], which introduced a divide and conquer-based approach. Similar to this previous work, in EDQWS, a large workflow is broken into a number of subworkflows by identifying the critical path of a workflow and then scheduling and removing it. The same series of steps is repeated in all subworkflows. The stop condition of the algorithm is the creation of subworkflows with a chain structure, called linear graphs. The present authors extend their previous method of scheduling linear graphs. Since the scheduling of linear graphs as solvable problems is performed in the final step, the linear graphs resulting from different stages of the division process are independent of each other and can therefore be separately scheduled. However, the specific order or combination of their scheduling can affect the selected resource types and the total execution cost. To this end, the current study proposes a scoring function for determining the linear graph combination score on different resources. At each stage, the combination with the highest score is selected and the new combined graph replaces the original ones. This combination is binary and continues until the scoring function yields a negative score for all new combinations. Owing to the large number of linear graphs, the score calculation is performed in parallel. The present study also introduces a new merge list algorithm that combines the tasks of linear graphs with respect to the laxity of the tasks. To evaluate the performance of the proposed method, it is compared with several state-of-the-art scheduling algorithms. The experimental results show that the presented method outperforms the others in total execution cost and in the success rate of meeting deadlines.
Moreover, the introduced approach can reduce the number of instances required for scheduling when compared to the previous work of the current authors. The rest of the present article is organized as follows: Section 2 reviews related work on workflow scheduling. Section 3 formulates the scheduling model and describes its workflow application and resource model. Details of the proposed algorithm are provided in Section 4. Section 5 discusses the experimental results. Finally, the conclusion and future work are presented in Section 6.
Related work
In recent decades, workflow scheduling has been extensively investigated by academic and industrial researchers. For traditional distributed systems, such as grids and clusters, most existing research in workflow scheduling focuses on how to minimize the workflow execution time. However, workflow scheduling in the cloud environment is mainly a multi-objective problem. Thus, aside from the execution time, various criteria, such as monetary cost, energy usage, reliability, and security, are considered as QoS requirements [6,7,8,9,10]. Among these, monetary cost and execution time are substantial requirements for workflow scheduling algorithms [2, 11,12,13,14,15]. Normally, for multi-objective cases in which some objectives must be optimal, it is difficult to solve the workflow scheduling problem in the cloud due to its NP-completeness [16]. Therefore, various metaheuristic and heuristic techniques have been adopted to obtain near-optimal solutions. This section briefly reviews several well-known heuristic and metaheuristic workflow scheduling algorithms related to the present study’s proposed method. As the main category, deadline-aware workflow scheduling algorithms are first reviewed. This is followed by a discussion on budget-aware and then multi-objective schedulers.
Deadline-aware workflow scheduling
Malawski et al. [17] present a mathematical model to optimize the workflow scheduling cost under a deadline constraint. Their method considers a multicloud environment and formulates the scheduling problem as a mixed-integer program (MIP). Abrishami et al. [18] utilize the Partial Critical Path (PCP) concept to develop a deadline-constrained workflow scheduler, named IC-PCP, in a cloud environment. IC-PCP [18] aims to minimize the overall execution cost of the workflow by determining a sequence of tasks as the partial critical paths (PCPs) and mapping all of these tasks to the same VM instance. The algorithm prefers to utilize already leased instances that are able to meet the deadline. IC-PCP distributes the overall deadline to the PCPs. The Enhanced IC-PCP with Replication (EIPR) algorithm [19] attempts to further reduce costs by replicating tasks during the idle times of instances and eliminating some communications. Its experimental results show that the probability of meeting deadlines increases via task replications. The Deadline Constrained Critical Path (DCCP) [20] is a list-based scheduling algorithm on the cloud that aims to meet the user-defined deadline while minimizing the overall workflow execution cost. In the preprocessing step, tasks are partitioned into different levels, to each of which a sub-deadline is assigned. These deadlines are distributed non-uniformly among all levels so that the levels with a longer task execution time receive a longer sub-deadline. In the task prioritization step, DCCP utilizes a concept called the Constrained Critical Path (CCP) to assign all tasks on a path to one resource in order to reduce the communication time of the whole workflow. DCCP finds all CCPs and creates a list based on their modified rank. In each step, the ready tasks of each CCP are mapped to an appropriate resource and other tasks remain for the next steps.
Rodriguez and Buyya [21] propose a metaheuristic scheduler in the cloud that intends to minimize the execution cost for deadline-constrained workflows. In this algorithm, resource provisioning and task assignment are integrated as a particle swarm optimization problem. The algorithm produces a near-optimal schedule that determines the number and types of VMs with their leasing period and task assignment. Guo et al. [1] also introduce a PSO^{Footnote 1}-based algorithm for scheduling a deadline-constrained workflow across multiple clouds. Their algorithm minimizes the execution cost of the workflow while meeting the user-defined deadline. Furthermore, the algorithm optimizes the performance for both computation cost and data transfer cost across multiple clouds. Proportional Deadline Constrained (PDC) [12] is a workflow scheduling algorithm on the cloud that attempts to meet the user-defined deadline while minimizing the execution cost. In the preprocessing step, PDC partitions tasks into different levels and assigns a sub-deadline to each level. The user-defined deadline is distributed non-uniformly among all levels so that levels with a longer task execution time receive a longer sub-deadline. PDC creates a list of ready tasks and prioritizes them according to a downward rank. In [2], two schedulers, namely L-ACO^{Footnote 2} and ProLiS,^{Footnote 3} are presented to schedule the deadline-constrained workflow application. ProLiS is a list scheduling algorithm that performs deadline distribution based on a new definition of the probabilistic upward rank. L-ACO is a metaheuristic algorithm that accomplishes the cost optimization of a deadline-constrained workflow based on ant colony optimization. Deadline distribution and service selection in L-ACO are the same as in ProLiS.
Budget-aware workflow scheduling
A budget indicates the maximum amount of money that users are willing to pay for the execution of a workflow application in cloud resources [3]. In [22], the authors propose a Heterogeneous Budget Constrained Scheduling (HBCS) algorithm that minimizes the total workflow execution time while meeting the user’s specified budget. HBCS defines an attribute called worthiness which combines the time and cost factors for the resource selection of the current task. Faragardi et al. [4] introduce Greedy Resource Provisioning and a modified HEFT (GRP-HEFT) for minimizing the workflow execution time subject to a budget constraint. They propose a greedy algorithm to list the instance types according to their efficiency rate and modify the HEFT [23] algorithm so that it considers a budget constraint. In [24], Wu et al. present a heuristic algorithm, called PCPB, to schedule a workflow with a budget constraint. PCPB implements the idea of balancing a budget among the partial critical paths according to their parallel or sequential structural nature. Budget distribution is performed based on the binary search method.
Multi-objective workflow scheduling
In this category, most strategies try to find a suitable mapping of workflow tasks to cloud resources that respects deadline and budget constraints at the same time. Budget and Deadline Constraint Heterogeneous Earliest Finish Time (BDHEFT) [15] is a multi-objective algorithm proposed to schedule workflow applications on a cloud. BDHEFT leverages the upward ranks to assign a priority to each task. In addition, a set of best possible resources is constructed for each selected task via the following six variables: Spare Workflow Budget (SWB), Spare Workflow Deadline (SWD), Current Task Deadline (CTD), Current Task Budget (CTB), Budget Adjustment Factor (BAF), and Deadline Adjustment Factor (DAF). By considering the spare deadline and spare budget of each task, the resource is selected from the best possible resource set for each task, so that the overall execution time and execution cost of the workflow are simultaneously minimized. Durillo and Prodan propose the multi-objective heterogeneous earliest finish time (MOHEFT) algorithm [25] as an extension of HEFT [23]. MOHEFT computes a set of Pareto-based solutions from which users can select the best one. As noted by the authors, most of the solutions computing the Pareto front are based on genetic algorithms. The algorithm is generic in terms of the number and types of objectives and so the makespan and overall cost of the workflow applications can be optimized. Wu et al. [26] present a PSO-based strategy for workflow scheduling in the clouds. Their aim is to reduce either the makespan or cost while satisfying either the budget or deadline constraints. The elasticity of resource provision is ignored and it is assumed that several initialized VMs are available in advance. In [27], a heuristic-based scheduling algorithm is proposed to schedule the workflow under deadline and budget constraints.
The algorithm utilizes a novel tradeoff factor between time and cost to determine the most viable scheduling and the most appropriate instance type for provisioning.
Problem statement
The present study addresses the problem of the cost optimization of deadline-constrained workflow scheduling in the cloud. This section first explains the workflow model, resource model, and definitions related to this problem. Subsequently, the problem formulation is presented.
Workflow model
A workflow application can be modeled as a directed acyclic graph (DAG), G = (V, E), where V = {t_{1}, t_{2}, …, t_{n}} is the set of all workflow tasks, represented by graph vertices, and E = {e_{i, j} = (t_{i}, t_{j}) | t_{i}, t_{j} ∈ V} represents the dependencies among the tasks. Each e_{i, j} indicates the precedence between t_{i} and t_{j}, meaning that t_{j} can be performed only when t_{i} is completed. Besides the dependencies, the data transmission among tasks is represented by the weight attached to e_{i, j}. Furthermore, data_{i, j} shows the amount of data transferred to t_{j} after t_{i} is completed; hence, the execution of t_{j} can only start after data_{i, j} has been made available. In other words, a task can be executed only after all its predecessors have terminated. Note that, on each edge e_{i, j}, t_{i} is a predecessor of t_{j} and t_{j} is a successor of t_{i}. Each task may have one or more predecessors and successors except for t_{entry} and t_{exit}: t_{entry} is a task with no predecessor and t_{exit} is a task with no successor. To generalize the workflow to one entry and one exit, two dummy tasks, t_{entry} and t_{exit}, with zero execution time and without data transmission, are added to the beginning and the end of the workflow, respectively. Fig. 1 illustrates a sample workflow represented by a DAG.
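As a minimal sketch of the dummy-task step, assuming tasks are plain strings and edges are stored in a dictionary keyed by (parent, child) with the data size as value (both representation choices, and the function name, are illustrative assumptions rather than part of the proposed method), the entry and exit tasks could be attached as follows:

```python
def add_dummy_tasks(tasks, edges):
    """Return (tasks, edges) extended with zero-cost 't_entry' and 't_exit'.

    tasks: list of task names.
    edges: dict {(parent, child): data_size}.
    """
    has_parent = {child for (_, child) in edges}
    has_child = {parent for (parent, _) in edges}
    new_edges = dict(edges)
    for t in tasks:
        if t not in has_parent:
            new_edges[("t_entry", t)] = 0.0  # dummy edge, no data transferred
        if t not in has_child:
            new_edges[(t, "t_exit")] = 0.0
    return ["t_entry", *tasks, "t_exit"], new_edges
```

Every task without a predecessor becomes a child of t_entry and every task without a successor becomes a parent of t_exit, so the resulting DAG has exactly one entry and one exit.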
Resource model
The present article considers an IaaS cloud service provider. IaaS provides a variety of computational resources with different costs via virtual machines (VMs) that feature different processing capabilities, memory, and storage. VMs with higher processing capabilities are assumed to have higher costs. A running virtual machine is called an instance and users can request an unlimited number of instances from the cloud service provider. In the current study, instances are provisioned on demand and the pricing model is considered to be pay-as-you-go with hourly billing, which is widely used by large public cloud providers [4]. In this model, a user must pay for the whole hour even if an instance is used for less than an hour. The current work denotes VM = {vm_{1}, vm_{2}, …, vm_{n}} as the set of heterogeneous computational resources offered by the cloud service provider via VMs and Cost = {C_{1}, C_{2}, …, C_{n}} as the cost of using each VM for an hour. It is also assumed that all computational resources are in the same region. Thus, the average bandwidth between instances is roughly identical and internal data transfer is free of charge [2].
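The hourly billing rule can be illustrated with a short sketch. The function name, the use of seconds as the time unit, and the example price are assumptions made here for illustration only:

```python
import math

def instance_cost(lease_start, lease_end, price_per_hour):
    """Cost of one instance when every started hour is billed in full.

    lease_start and lease_end are in seconds (an assumed unit).
    """
    used_seconds = lease_end - lease_start
    billed_hours = math.ceil(used_seconds / 3600.0)
    return billed_hours * price_per_hour

# An instance used for 4000 s spans two billing hours.
print(instance_cost(0.0, 4000.0, 0.5))
```

Under this rule, packing more tasks into the already paid remainder of an hour costs nothing extra, which is what makes combining task chains on one instance attractive.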
Definitions
Task execution time
Given that the VMs offered by the service provider are heterogeneous, each task has a different execution time depending on the type of VM on which it runs. Therefore, the execution time of task t_{i} on vm_{j} is denoted by w_{i, j}. It should be noted that w_{i, j} is the worst-case execution time of t_{i} on vm_{j} and it is assumed that only one task can be executed on each VM instance at any time.
Communication time
The communication time between two tasks is the amount of time necessary to transfer data from t_{i} to t_{j}, as shown in (1). When both tasks are executed on the same instance, the communication time becomes zero.
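A form of (1) consistent with the definitions given here, offered as a reconstruction rather than a verbatim copy of the original equation, is:

\[
CM_{i,j} =
\begin{cases}
0, & Inst(t_{i}) = Inst(t_{j}) \\
\dfrac{data_{i,j}}{BW}, & \text{otherwise}
\end{cases}
\tag{1}
\]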
where Inst(t_{i}) represents the instance on which t_{i} is mapped, data_{i, j} shows the amount of data transferred to t_{j} after the completion of t_{i}, and BW is the bandwidth between the two instances.
Earliest start time (EST) and earliest finish time (EFT)
For each unscheduled task, t_{i}, EST(t_{i}) is defined as the earliest time when t_{i} can start its execution after all its predecessors have finished and after having received the associated data. EST is calculated as follows:
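A form of (2) consistent with this definition is given below as a reconstruction; pred(t_{i}), the set of predecessors of t_{i}, is notation assumed here:

\[
EST(t_{entry}) = 0, \qquad
EST(t_{i}) = \max_{t_{j} \in pred(t_{i})} \left\{ EST(t_{j}) + W(t_{j}) + CM_{j,i} \right\}
\tag{2}
\]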
where CM_{j, i} is the communication time and W(t_{j}) is the shortest execution time of t_{j}, which is defined as follows:
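Reading "shortest execution time" as the minimum over all offered VM types, a plausible form of (3) is:

\[
W(t_{j}) = \min_{vm_{k} \in VM} w_{j,k}
\tag{3}
\]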
According to the earliest start time definition, the earliest finish time of each unscheduled task, t_{i}, can be calculated as follows:
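With EST and W defined as above, a consistent form of (4), again a reconstruction, is:

\[
EFT(t_{i}) = EST(t_{i}) + W(t_{i})
\tag{4}
\]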
Actual start time (AST) and actual finish time (AFT)
After assigning a task to the desired instance, AST and AFT are obtained for each task. These values can be different from the EST and EFT that are determined before scheduling. Equation (5) shows the relation between these parameters.
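Since w_{i, j} is the execution time of t_{i} on an instance of type j, the relation in (5) can be read as follows (a reconstruction based on the surrounding definitions):

\[
AFT(t_{i}) = AST(t_{i}) + w_{i,j}
\tag{5}
\]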
where j is the type of instance to which t_{i} is assigned.
Critical path and critical tasks
The critical path (CP) in a workflow is the path from the entry task to the exit task which has the maximum summation of task execution times and inter-task communication times among all paths [23]. All the tasks on a critical path are called critical tasks.
Linear graph and nonlinear graph
The present paper divides the workflow graphs into two categories: linear graphs and nonlinear graphs. A linear graph is a connected graph with a chain structure. Each vertex in this graph, which corresponds to a task in the workflow, has exactly one parent and one child, except for the entry task, which has no parent, and the exit task, which has no child. If a graph contains other structures, it is called a nonlinear graph. Figure 2 provides a graphic representation of linear and nonlinear graphs.
Problem formulation
The issue presented in the current article is the deadline-constrained cost optimization of workflow scheduling in an IaaS cloud environment. Considering wf as the workflow and D as the workflow deadline, this subsection first defines the overall workflow execution cost and then formulates the problem.
The total execution cost of a workflow, Cost_{total}(wf), is the summation of all the costs of the used instances during scheduling, as follows:
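Written out from this description, (6) takes the following form (a reconstruction):

\[
Cost_{total}(wf) = \sum_{i=1}^{n} Cost(Inst_{i})
\tag{6}
\]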
where Inst_{i} is the i^{th} leased instance and n is the total number of leased instances. Cost(Inst_{i}) depends on two parameters: the total time of utilizing Inst_{i} and the price of Inst_{i} for an hour. As mentioned earlier, billing is hourly and the user must pay for the whole hour even if the instance is used for less than an hour. As a result, the instance cost is calculated as follows:
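A form of (7) consistent with the hourly billing rule is shown below as a reconstruction; T(Inst_{i}), the total time for which Inst_{i} is utilized, expressed in hours, is a symbol assumed here:

\[
Cost(Inst_{i}) = \left\lceil T(Inst_{i}) \right\rceil \times C(Inst_{i})
\tag{7}
\]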
where C(Inst_{i}) is the specified cost for utilizing Inst_{i} for 1 hour. According to the previous definitions, the present study’s scheduling problem is to find a mapping of tasks to suitable instances that minimizes the total monetary cost while not exceeding the deadline constraint. Relation (8) shows the problem formulation.
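Stated with the symbols introduced so far, the formulation can be read as follows (a reconstruction; the use of AFT(t_{exit}) as the workflow completion time is an assumption):

\[
\text{minimize } Cost_{total}(wf) \quad \text{subject to} \quad AFT(t_{exit}) \le D
\tag{8}
\]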
Proposed method
The proposed solution for the cloud workflow scheduling problem is a static method that aims to minimize the execution cost of the workflow while meeting its deadline. For minimizing the execution cost, the introduced method focuses on the repeated use of the critical path concept. Dividing the initial workflow into several subworkflows makes it possible to define the critical path in each resulting subworkflow. According to the workflow structure and the dependencies among tasks, the original divide and conquer method is modified to be consistent with the goals of the current study. The proposed algorithm consists of two main phases: the division phase and the linear graph scheduling phase. In the division phase, the workflow division algorithm performs the workflow dividing operations according to the current authors’ previous research (DQWS) [5]. The division of the workflow is performed by determining, scheduling, and removing the workflow critical path. By removing the critical path, the workflow leftovers are divided into one or more smaller subworkflows and this process is repeated for each subworkflow. The divide and conquer process stops when every remaining subworkflow is a linear graph, and the scheduling of each such subworkflow is considered a small solvable problem. By scheduling linear graphs with the scoring method, the proposed linear graph scheduling phase takes a completely different approach from the current authors’ previous work. As the difference between the proposed method and the present authors’ past work (DQWS) lies in this second phase of the introduced algorithm, Section 4–1 first presents the generalities of the proposed workflow division algorithm while Section 4–2 explains the details of the proposed linear graph scheduling algorithm.
Workflow division algorithm
The workflow division algorithm follows several steps to divide the initial workflow into subworkflows. Similar to DQWS [5], the proposed algorithm (Fig. 3) first determines the workflow’s critical path. This is achieved by the Find_CriticalPath function shown in Fig. 4. In determining the critical path, it is important to identify the critical parent of each task. The critical parent of t_{i} is the parent t_{p} that maximizes the expression EFT(t_{p}) + CM_{p, i}, where CM_{p, i} is the communication time between t_{p} and t_{i}.
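The critical-parent idea can be sketched as follows. This is not the Find_CriticalPath pseudocode of Fig. 4, only an illustration under assumed inputs: parents maps each task to its list of parents, exec_time holds W for each task, and comm_time holds CM for each edge (all three names are hypothetical). EFT values are computed in topological order and the path is traced back from the exit task through critical parents.

```python
from graphlib import TopologicalSorter

def find_critical_path(parents, exec_time, comm_time, exit_task):
    """Return the critical path as a list of tasks, entry first.

    parents:   {task: [parent, ...]}
    exec_time: {task: W(task)}
    comm_time: {(parent, child): CM}
    """
    eft = {}
    critical_parent = {}
    # static_order() yields each task only after all of its parents
    for task in TopologicalSorter(parents).static_order():
        best_parent, est = None, 0.0
        for p in parents.get(task, []):
            ready = eft[p] + comm_time.get((p, task), 0.0)
            if best_parent is None or ready > est:
                best_parent, est = p, ready
        critical_parent[task] = best_parent
        eft[task] = est + exec_time[task]
    # walk back from the exit task through critical parents
    path, task = [], exit_task
    while task is not None:
        path.append(task)
        task = critical_parent[task]
    return path[::-1]
```

The standard-library graphlib module (Python 3.9 or later) is used here only to obtain a valid processing order; any topological sort would serve.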
After determining a critical path, the Find_PossibleResources function defines a set of possible resources on which to schedule it. This function searches among the resource types offered by the service provider for those capable of completing the critical path tasks before the user-defined deadline and collects them into a possible resource set. The cheapest resource type from this set is then selected and the critical path tasks are preassigned to an instance of the selected resource. Because the critical path is eliminated from the workflow in the next steps, it is necessary to finalize the scheduling of the critical path tasks. The Check_Subpaths function is responsible for finalizing critical path scheduling. To accomplish this, the Check_Subpaths function checks all subpaths leading to the critical path and examines whether each subpath can be scheduled on at least one type of resource. The Find_Subpaths function determines the subpaths leading to each critical task, as shown in Fig. 5.
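The selection of the cheapest feasible resource type might look like the sketch below. It is a simplification, not the Find_PossibleResources function itself: it assumes that the critical path tasks run back to back on a single instance starting at time zero (so communication time is zero), and the function and parameter names are hypothetical.

```python
def cheapest_feasible_type(path, exec_time, price, deadline):
    """Pick the cheapest VM type that finishes the whole path by the deadline.

    path:      list of task names on the critical path
    exec_time: {(task, vm_type): execution time}
    price:     {vm_type: price per hour}
    Returns a VM type name, or None if no type meets the deadline.
    """
    feasible = [
        vm for vm in price
        if sum(exec_time[(t, vm)] for t in path) <= deadline
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda vm: price[vm])
```

Here "cheapest" is taken to mean the lowest hourly price; ranking by total billed cost instead would be an equally reasonable reading of the text.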
The successful output of the Check_Subpaths function indicates that the critical path scheduling on the selected resource is finalized. Fig. 6 presents the details of this function. After finalization of the critical path schedule, the tasks in this path are removed from the workflow and the workflow is divided into one or more connected graphs. Depending on its structure, each of the resulting graphs is added to either the linear graph set or the nonlinear graph set. This operation is repeated for all graphs in the nonlinear graph set.
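The removal and classification step can be sketched as follows, assuming the workflow is given as a task list and a list of (parent, child) edges (an assumed representation, with a hypothetical function name). A connected component is treated as a linear graph when every task in it has at most one parent and at most one child inside the component, which in an acyclic graph is equivalent to a chain.

```python
def split_after_removal(tasks, edges, critical_path):
    """Remove critical tasks and classify the remaining connected graphs.

    Returns (linear_graphs, nonlinear_graphs), each a list of task sets.
    """
    removed = set(critical_path)
    remaining = [t for t in tasks if t not in removed]
    neighbours = {t: set() for t in remaining}
    indeg = {t: 0 for t in remaining}
    outdeg = {t: 0 for t in remaining}
    for u, v in edges:
        if u in removed or v in removed:
            continue  # edges touching the critical path disappear with it
        neighbours[u].add(v)
        neighbours[v].add(u)
        outdeg[u] += 1
        indeg[v] += 1
    linear, nonlinear, seen = [], [], set()
    for start in remaining:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first search over undirected neighbours
            t = stack.pop()
            if t not in component:
                component.add(t)
                stack.extend(neighbours[t] - component)
        seen |= component
        is_chain = all(indeg[t] <= 1 and outdeg[t] <= 1 for t in component)
        (linear if is_chain else nonlinear).append(component)
    return linear, nonlinear
```

Each set in the nonlinear list would then go through the same critical path procedure again, until only linear graphs remain.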
Linear graph scheduling algorithm
Since the linear graphs obtained from different stages of the workflow division are independent of each other, there is no requirement to observe a specific order in their execution. However, a specific arrangement or combination in their scheduling can be effective in choosing the resource type, which, in turn, will affect the workflow execution cost. In contrast to the current authors’ previous study (DQWS) [5], which employs a greedy method to schedule linear graphs, the present paper utilizes a scoring function for its linear graph scheduling. As shown in Fig. 7, the proposed algorithm consists of three different phases: the initialization phase, the internal combination phase, and the external combination phase. The following presents the details of these different algorithm phases.
Initialization phase
The objective of the algorithm initialization phase (Lines 2–9) is to determine the candidate resources for each linear graph and distribute the linear graphs in the sets associated with their candidate resources. A candidate resource is a resource type on which the execution cost of the linear graph is the lowest among all resource types. Each linear graph may have more than one candidate resource. At the beginning of this phase, a dependent set, \({S}_{R_i}\), is defined for each resource type, R_{i}, offered by the service provider and, for each linear graph, candidate resources are determined. Then, each linear graph is prescheduled on a new instance of each candidate resource type and added to the sets that belong to its candidate resources.
Internal combination phase
The objective of the proposed algorithm’s second phase (Lines 10–19) is to reduce the workflow execution cost through pairwise combinations of linear graphs. During the process of combination, the algorithm prefers lowering the cost and, as a result, reducing the number of instances required for scheduling. In this phase, in each of the defined sets, \({S}_{R_i}\), all pairwise combinations of linear graphs are determined and the score of each combination is calculated by the Score_Combination function. As shown in (9), the combined score is calculated as the difference between the execution cost when both linear graphs are scheduled on one instance and the execution cost when each is scheduled on a separate instance.
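One way to write (9) that is consistent with selecting the highest score and stopping when all scores are negative is given below. The sign convention and the symbols Score and G_{1} ⊕ G_{2} (the combined graph) are assumptions of this reconstruction, with all costs evaluated on resource type R_{i}:

\[
Score(G_{1}, G_{2}) = \bigl( Cost(G_{1}) + Cost(G_{2}) \bigr) - Cost(G_{1} \oplus G_{2})
\tag{9}
\]

Under this reading, a positive score means that sharing one instance is cheaper than leasing two.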
Due to the large number of linear graphs in each set, determining all pairwise combinations of the linear graphs as well as calculating the score of each combination is quite time consuming. For this reason, a parallel algorithm performs this function. Subsection 4–2–2–1 provides the details of the Score_Combination function. After the scores of all combinations are calculated, the combination with the highest score and its set are determined and the combined graph is added to the selected set. Since the sets are not disjoint, the initial linear graphs making up the combined graph are then removed from all sets. Determining the pairwise combinations of linear graphs and selecting the best combination is repeated until the scores of all combinations in all sets are negative.
Details of the scoring function
The score of each binary combination is calculated by the Score_Combination function. Figure 8 presents the pseudocode of this function. At first, the possibility of combining two linear graphs is explored. If this is not possible, the combination cost is considered infinite.

Definition: G_{1} and G_{2} are linear graphs and D_{1} and D_{2} are their deadlines, respectively. These linear graphs are uncombinable if the total execution time of their tasks is greater than both D_{1} and D_{2}.
To determine the execution cost of the combinable graphs, two linear graphs are first prescheduled on one instance. Prescheduling is performed in two different ways: prescheduling for graphs with a time overlap and prescheduling for graphs without a time overlap.

Definition: Linear graphs G_{1} and G_{2} are not time-overlapped graphs if:
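One reading of this condition that is consistent with the sentence that follows, with t^{k}_{entry} and t^{k}_{exit} denoting the first and last tasks of G_{k} (notation assumed here), is that one graph finishes before the other starts:

\[
AFT(t^{1}_{exit}) \le AST(t^{2}_{entry}) \quad \text{or} \quad AFT(t^{2}_{exit}) \le AST(t^{1}_{entry})
\]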
In this case, when the tasks of both graphs are prescheduled on one instance, the order, start times, and finish times of the tasks do not change.
To preschedule linear graphs with a time overlap, the present study provides a new merge list algorithm. The proposed merge list algorithm receives two lists as input. Each list includes the tasks of one linear graph that are arranged from entry task to exit task. The output of this algorithm will also be a list that specifies the final sequence of the tasks in the combined graph. The current work’s contribution to the merge list algorithm is the header selection method. To select a header, the laxity of all tasks in both graphs is first determined and the following conditions are considered:

a. Selecting either header creates negative laxity for one or more tasks of the other list.

b. Selecting exactly one of the headers creates negative laxity for one or more tasks of the other list, while selecting the other does not.

c. Neither of the choices leads to negative laxity.
In case (a), it is not possible to combine two linear graphs; hence, the algorithm terminates. In case (b), a header is selected that does not create a negative laxity. In case (c), a header is chosen with an earlier start time in its initial linear graph and, if the start time of both headers is the same, then a header with lower laxity is selected. Fig. 9 shows an illustrative example for combining two linear graphs, G_{1} and G_{2}, with the proposed modified merge list algorithm.
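The header selection rules can be sketched as below. This is an interpretation, not the authors' merge list algorithm: the text does not define laxity, so it is assumed here to be a task's latest allowed finish time (lft) minus its finish time in the tentative merged schedule. The Task fields, both function names, and the precondition that each input list is feasible on its own are likewise assumptions.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    start: float     # start time in its original linear graph
    duration: float  # execution time on the shared instance type
    lft: float       # latest allowed finish time (assumed to be given)

def _others_stay_feasible(clock, header, others):
    """True if running `header` next keeps laxity >= 0 for all of `others`."""
    t = max(clock, header.start) + header.duration
    for task in others:
        t = max(t, task.start) + task.duration
        if task.lft - t < 0:  # negative laxity
            return False
    return True

def _priority(clock, task):
    """Sort key for case (c): earlier start first, then lower laxity."""
    finish = max(clock, task.start) + task.duration
    return (task.start, task.lft - finish)

def merge_lists(list_a, list_b):
    """Merge two entry-to-exit task lists; return None if not combinable."""
    a, b = list(list_a), list(list_b)
    merged, clock = [], 0.0
    while a and b:
        ok_a = _others_stay_feasible(clock, a[0], b)
        ok_b = _others_stay_feasible(clock, b[0], a)
        if not ok_a and not ok_b:   # case (a): combination impossible
            return None
        if ok_a != ok_b:            # case (b): only one header is safe
            src = a if ok_a else b
        else:                       # case (c): both headers are safe
            src = a if _priority(clock, a[0]) <= _priority(clock, b[0]) else b
        head = src.pop(0)
        clock = max(clock, head.start) + head.duration
        merged.append(head.name)
    merged.extend(task.name for task in a + b)
    return merged
```

For example, merging the lists [a1 (start 0, duration 10, lft 20), a2 (start 10, duration 10, lft 40)] and [b1 (start 5, duration 5, lft 30)] yields the order a1, b1, a2 under these assumptions.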
After examining the possibility of combining linear graphs, the execution cost of the combined graphs is calculated by (7).
External combination phase
In this phase (Lines 20–27), the proposed algorithm aims to transfer the scheduled tasks of the less powerful instances to the idle times of the more powerful ones. This transfer can reduce the number of used instances and consequently the overall execution cost. For this purpose, in each stage, the idle-time periods of the most powerful utilized instance are determined. For each idle time, the present study looks for an appropriate linear graph that can be scheduled in the idle time before its deadline. After the desired linear graph is determined, its tasks are transferred and the original instance is removed from the required instances.
Evaluation
This section first describes the current work’s experimental setting and evaluation criteria, and then compares the proposed method with state-of-the-art approaches that are similar in terms of objectives. The methods used for comparison are IC-PCP [18], BDHEFT [15], PDC [12], and the present authors’ previous study, DQWS [5]. All of these algorithms are reviewed in the Related Work section.
Experimental setting and evaluation criteria
To evaluate the performance of the proposed algorithm, different simulation scenarios are run. Simulation is a well-accepted approach for evaluating workflow scheduling algorithms because it makes it possible to test the performance of the algorithms under a controlled setting [27]. In the present study’s simulation, which is performed in MATLAB, the service provider offers five different VM types with different costs and processing powers. The characteristics of the VMs are based on the US-east Amazon region and are presented in Table 1. Unlimited instances of each VM type are assumed. VMs are in a single data center and the average bandwidth between instances is fixed at 20 MBps [28].
The current study conducts its experiments using four different scientific workflows: CyberShake, Epigenomics, LIGO, and Motif. These workflows are diverse in terms of structure, computational characteristics, and communication data. In addition, the workflows are employed in different scientific areas, such as earthquake science, biology, gravitational physics, and genetics. Full descriptions of these workflows are provided in [29, 30]. Fig. 10 depicts the structure of each workflow in a relatively small size. These workflows are provided by the Pegasus workflow generator in DAX (Directed Acyclic Graph in XML) format and, for each workflow, the details, including the DAG, the sizes of data transfers, and the task execution times, are published. These workflows have been widely used for evaluating the performance of scheduling algorithms, and thus they are included in the current study’s experiments. To determine the execution time of each workflow task on each VM type, the published execution time of each task is assumed to have been measured on a VM with ECU = 1. With this assumption, the execution time of each task on the other VM types is calculated by dividing its published execution time by the ECU value of the VM, which is shown in Table 1. As shown in Table 2, the present work chooses four different sizes for each workflow type and generates 50 random samples from each type and size of workflow.
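As a small worked illustration of the ECU scaling rule, the sketch below divides a published execution time by the ECU value of each VM type. The VM names and ECU values are placeholders chosen for the example; they are not the values of Table 1.

```python
# Hypothetical VM catalogue: name -> ECU value (placeholders, not Table 1).
HYPOTHETICAL_ECU = {"vm_small": 1.0, "vm_medium": 2.0, "vm_large": 4.0}


def runtime_on(published_seconds: float, ecu: float) -> float:
    """Execution time on a VM, assuming the published time was measured at ECU = 1."""
    return published_seconds / ecu


published = 120.0  # seconds, as it would appear in the workflow description
for vm, ecu in HYPOTHETICAL_ECU.items():
    print(vm, runtime_on(published, ecu))
```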
To assess the performance of the compared algorithms, it is necessary to determine the acceptable values for each workflow’s deadline. The concept of the fastest schedule is utilized to determine deadlines proportionate to each workflow structure. The fastest schedule is the scheduling of each workflow task on a distinct instance of the fastest VM type without considering the communication time between tasks [18]. The makespan obtained by the fastest schedule for each workflow, makespan_{LB}, is the lower bound of the overall workflow execution time. Using makespan_{LB}, the present study calculates a variation for a deadline, from tight to relaxed, as follows:
$$deadline = \alpha \cdot makespan_{LB}$$

Here, the deadline factor α starts from 2 and is increased in steps of 2 up to a value of 10. In this way, the impact of different deadlines (from tight to relaxed) on the performance of each algorithm is evaluated.
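The deadline generation can be sketched as follows, under two assumptions: the fastest schedule corresponds to the longest path through the DAG when every task runs on the fastest VM type with zero communication time, and each deadline is the product of the deadline factor α and makespan_{LB}. The function names, the toy DAG, and the ECU value are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List


def makespan_lb(runtime: Dict[str, float],
                parents: Dict[str, List[str]],
                fastest_ecu: float) -> float:
    """Longest path through the DAG on the fastest VM, ignoring communication."""
    finish: Dict[str, float] = {}
    for task in TopologicalSorter(parents).static_order():
        # A task starts as soon as all of its parents have finished.
        start = max((finish[p] for p in parents.get(task, ())), default=0.0)
        finish[task] = start + runtime[task] / fastest_ecu
    return max(finish.values())


def deadlines(lb: float) -> Dict[int, float]:
    """Deadlines for the factors 2, 4, 6, 8, 10 (assumed: deadline = alpha * lb)."""
    return {alpha: alpha * lb for alpha in range(2, 11, 2)}


runtime = {"a": 10.0, "b": 20.0, "c": 40.0, "d": 10.0}   # published times at ECU = 1
parents = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}       # task -> predecessors
lb = makespan_lb(runtime, parents, fastest_ecu=2.0)
print(lb, deadlines(lb))
```

Because each task is placed on its own instance in the fastest schedule, no two tasks compete for a resource, so the longest path alone determines the bound.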
To evaluate the performance of the algorithm, the present work analyzes the following metrics in its experiments: normalized cost, success rate, and instance reduction. Each metric is explained as follows:

Normalized cost: The overall workflow execution cost obtained by each of the algorithms is a suitable criterion for evaluating the compared algorithms. Because of the difference in the structure and characteristics of the benchmark workflows, the current paper employs a normalized cost metric for comparison. The normalized cost of executing each workflow with any of the benchmark algorithms is obtained by dividing its overall execution cost, cost(wf), by its scheduling cost under the cheapest scheduling strategy, cost_{lowest}(wf) [18], as follows:

$$\text{normalized cost}(wf) = \frac{cost(wf)}{cost_{lowest}(wf)}$$

Success rate: To evaluate the capability of each algorithm to meet the deadline constraints, the present study defines the metric of success rate as the ratio between the number of successful schedules and the total number of schedules.

Instance reduction: To compare the performance of the proposed algorithm against that of the current authors’ previous research (DQWS), an instance reduction parameter is introduced. This parameter is the percentage reduction in the number of instances that EDQWS needs to schedule each workflow type, relative to DQWS. Since the linear graph scheduling phase of the present study focuses on combining linear graphs, the reduction in the number of instances, in addition to cost minimization, can serve as a metric for comparing the two methods. Requiring fewer instances to schedule a workflow can lower the probability of instance failure. It is also helpful with cloud providers that limit the number of instances a user may request.
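The three metrics can be written down as short functions. This is a sketch only: the function names are hypothetical, a schedule is counted as successful when its makespan does not exceed the deadline, and the instance reduction is expressed as a percentage of the DQWS instance count.

```python
from typing import Sequence


def normalized_cost(cost: float, cost_lowest: float) -> float:
    """Execution cost divided by the cost of the cheapest scheduling strategy."""
    return cost / cost_lowest


def success_rate(makespans: Sequence[float], deadline: float) -> float:
    """Fraction of schedules whose makespan meets the deadline."""
    return sum(m <= deadline for m in makespans) / len(makespans)


def instance_reduction(n_dqws: int, n_edqws: int) -> float:
    """Percentage reduction in instances of EDQWS relative to DQWS."""
    return 100.0 * (n_dqws - n_edqws) / n_dqws


print(normalized_cost(12.0, 4.0))
print(success_rate([50.0, 70.0, 90.0, 110.0], deadline=100.0))
print(instance_reduction(20, 15))
```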
Experimental result
The performance of the proposed algorithm is evaluated in three separate subsections. In the first subsection, the execution cost of each workflow type is examined by all compared algorithms. In the second subsection, the success rate of all compared algorithms is evaluated. In the third subsection, the two methods, DQWS and EDQWS, are compared in more detail. For this purpose, a comparison is made between the average cost reduction in EDQWS and in DQWS. Additionally, the number of instances required for each algorithm to schedule each workflow type is studied.
Normalized cost analysis
In this subsection, the execution cost of each workflow is examined by all the compared algorithms. In these experiments, 50 random samples are generated from each type and size of the workflows, and each random sample is scheduled by all algorithms for five different deadline factors. Given that the results of running each algorithm on different workflow sizes are relatively similar, only the results of implementing the largest size of each workflow type are reported. Fig. 11 presents the normalized cost of all of the compared algorithms when executing the four workflow types (CyberShake, Epigenomics, LIGO, and Motif). The normalized cost values are the mean value of the 50 random samples generated for each workflow type. Besides the mean values, the standard deviation is also provided in Fig. 11. The reporting of ‘Fail’ in each graph indicates that the specified algorithm was unable to schedule the workflow within the given deadline factor.
Figure 11.a presents the results of the CyberShake workflow execution by the five compared algorithms. As observed in Fig. 11.a, none of the compared algorithms are able to schedule this type of workflow within the deadline factor of 2. The lowest execution cost for this type of workflow is obtained by the EDQWS algorithm. In this type of workflow, DQWS and EDQWS succeed in creating multiple linear graphs with a small number of tasks and short execution times. As a result, utilizing the idle time of instances for scheduling linear graphs can reduce the execution cost of this workflow type.
Although the two algorithms utilize the same linear graphs, a comparison of their normalized cost reduction indicates that EDQWS’ consideration of different linear graph combinations in instance type selection leads to more appropriate solutions than DQWS’ usage of the greedy method.
As seen in Fig. 11.a, the PDC algorithm failed to schedule any random samples of the CyberShake workflow within the specified deadlines. This failure is due to the way in which PDC sets the initial deadline. PDC determines the initial deadline for each workflow by considering the communication time between tasks. If the user-defined deadline is earlier than the initial deadline, the workflow is not scheduled. As mentioned earlier, the present study determines deadlines using the concept of the fastest schedule, which ignores the communication time between tasks. Because the execution times of CyberShake tasks are short and the communication times between them are long, the deadlines calculated from the fastest schedule are shorter than the PDC initial deadline. Therefore, the results of the PDC algorithm are not included in Fig. 11.a.
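A toy calculation makes this effect concrete. The sketch assumes a simple chain of tasks, treats the PDC initial deadline as the chain length including communication time, and uses invented numbers; PDC's actual computation is more involved, so this only illustrates the direction of the effect.

```python
# Toy chain with short tasks and long transfers (all numbers are invented).
exec_times = [1.0, 1.0, 1.0]   # execution times on the fastest VM
comm_times = [15.0, 15.0]      # transfer times between consecutive tasks

makespan_lb = sum(exec_times)                        # fastest schedule: no communication
initial_deadline = sum(exec_times) + sum(comm_times)  # assumed stand-in for PDC's bound

# A deadline factor is accepted only if the resulting deadline reaches the initial deadline.
accepted = {alpha: alpha * makespan_lb >= initial_deadline for alpha in range(2, 11, 2)}
print(makespan_lb, initial_deadline, accepted)
```

With these numbers even the most relaxed factor yields a deadline below the communication-aware bound, so every deadline factor would be rejected.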
In Epigenomics, the possibility that DQWS and EDQWS will utilize the resource idle time is less than that of the other workflows because of the high execution time of tasks. As seen in Fig. 11.b, the normalized cost obtained by DQWS, EDQWS, and PDC is almost the same, while BDHEFT reports a higher execution cost than the other algorithms. For the deadline factors that ICPCP is successful in meeting, its normalized execution cost nears that of DQWS, EDQWS, and PDC.
Figure 11.c shows the results of LIGO scheduling by all of the compared algorithms. In this type of workflow, the normalized cost obtained by EDQWS is the lowest among all compared algorithms. Due to the existence of several parallel tasks in the structure of this workflow, the creation of multiple linear graphs has increased the possibility of utilizing instance idle times. Furthermore, the usage of the scoring method and the examination of linear graph combinations may be the reason why the normalized cost reduction in EDQWS is greater than that of DQWS, which uses the greedy method.
Figure 11.d provides the results obtained from the present study’s simulations for the Motif workflow. As seen, ICPCP, DQWS, and EDQWS outperform the BDHEFT and PDC algorithms. Based on the results obtained from ICPCP, DQWS, and EDQWS, it is observed that, under tight deadlines, the DQWS and EDQWS algorithms perform better while, under relaxed deadlines, ICPCP’s performance is superior. For Motif scheduling, EDQWS performs much like DQWS and the normalized cost of these two algorithms is the same. A possible reason for this similarity is that the Motif structure creates single-task linear graphs during the division phase. The subsection “The comparison of DQWS and EDQWS” compares these two methods in more detail.
Success rate analysis
By considering different deadline factors, this section examines the success rate of the compared algorithms for each workflow type. All random samples created from any type of workflow are included in the success rate calculation. Fig. 12 provides the success rates of the five algorithms (EDQWS, DQWS, PDC, BDHEFT, and ICPCP). As the results show, the success rates of EDQWS and DQWS are exactly the same for all deadline factors and the different workflows. Given that these two algorithms employ the same method to create linear graphs, it can be concluded that the method of scheduling linear graphs does not impact the success or failure of the schedule but only affects the workflow execution cost.
As shown in Fig. 12, for all workflow types, the proposed method has a higher success rate than that of the PDC, ICPCP, and BDHEFT algorithms. PDC’s low success rate may be explained by the fact that this algorithm does not accept most of the deadline factors and so is unable to schedule the workflow.
The comparison of DQWS and EDQWS
This section compares the current authors’ newly introduced EDQWS algorithm and their previous algorithm, DQWS. As mentioned earlier, the steps for creating linear graphs in these two algorithms are similar, but their methods differ in the linear graph scheduling phase. For scheduling linear graphs, DQWS utilizes a greedy method while the present paper’s EDQWS algorithm employs a scoring method. To evaluate the performance of these two algorithms, two comparisons are made. First, the number of instances required by EDQWS and DQWS to schedule each workflow type is compared. Second, the differences between the two algorithms in normalized cost and in the number of instances are examined.
Figure 13 shows the number of instances required by DQWS and EDQWS to schedule the CyberShake, Epigenomics, LIGO, and Motif workflows. It should be noted that each reported number of instances is the mean over the 50 random samples generated for each workflow type. As seen in Fig. 13, in comparison to DQWS, the number of instances required for scheduling CyberShake and Epigenomics is greatly reduced by the EDQWS algorithm. However, this reduction is not observed in the LIGO and Motif workflows.
Table 3 summarizes the results obtained from the above experiments. This table presents three different parameters for each workflow type: the average percentage of cost reduction in EDQWS compared to DQWS, the average percentage of the reduction in the number of instances in EDQWS versus DQWS, and the percentage of experiments in which EDQWS succeeds in reducing the number of instances in comparison to DQWS.
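The three summary parameters can be computed from paired per-sample results as in the sketch below. The function name, the `(cost, instances)` tuple layout, the sample values, and the choice to average per-sample percentages (rather than compare averages) are assumptions; the paper does not state which averaging is used.

```python
from statistics import mean
from typing import List, Tuple

Result = Tuple[float, int]  # (normalized cost, number of instances) for one sample


def summarize(dqws: List[Result], edqws: List[Result]) -> Tuple[float, float, float]:
    """Return (avg cost reduction %, avg instance reduction %, % samples with fewer instances)."""
    cost_red = [100.0 * (d[0] - e[0]) / d[0] for d, e in zip(dqws, edqws)]
    inst_red = [100.0 * (d[1] - e[1]) / d[1] for d, e in zip(dqws, edqws)]
    fewer = [e[1] < d[1] for d, e in zip(dqws, edqws)]
    return mean(cost_red), mean(inst_red), 100.0 * sum(fewer) / len(fewer)


# Invented sample results, one tuple per random workflow sample.
dqws_runs = [(10.0, 8), (20.0, 10), (5.0, 4)]
edqws_runs = [(9.0, 6), (20.0, 10), (4.0, 5)]
print(summarize(dqws_runs, edqws_runs))
```

The invented data also show how the second and third parameters can disagree: the average instance reduction is zero here although one sample in three uses fewer instances, which mirrors the kind of pattern reported for LIGO.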
As seen in Table 3, the average normalized cost for the three types of workflow (CyberShake, Epigenomics, and LIGO) is reduced by the proposed algorithm when compared to the DQWS algorithm. Also, the current work’s algorithm succeeds in reducing the average number of instances required to schedule CyberShake and Epigenomics. In LIGO, although the average number of instances in the proposed algorithm increases slightly, the number of instances in more than half of the experiments decreases in comparison to DQWS. In Motif, despite the same normalized cost in both algorithms, the number of instances decreases in a small percentage of the proposed method’s experiments.
Conclusion
The present paper proposes a static scheduling algorithm called EDQWS, which is an extension of a previous study by the current authors [5]. EDQWS is a two-phase workflow scheduler based on divide and conquer and aims to minimize the overall workflow execution cost while meeting a user-defined deadline. In the first phase, similar to the present authors’ previous research, the workflow is divided into subworkflows by determining and scheduling the critical path and removing it from the workflow. By eliminating the critical path, the workflow is divided into several subworkflows, each of which undergoes this same division. The division stops when a subworkflow has a chain structure, called a linear graph. For scheduling linear graphs in the second phase, the current work proposes a new merging algorithm to combine the resulting linear graphs, reduce the number of used instances, and minimize the overall execution cost. Also introduced is a scoring function to select the most efficient instances for scheduling the linear graphs.
The experiments are conducted with four well-known workflows to determine whether EDQWS performs better overall than the state-of-the-art algorithms ICPCP, PDC, BDHEFT, and DQWS. In terms of normalized cost, EDQWS obtains the lowest cost for CyberShake and LIGO and performs on par with DQWS for Epigenomics and Motif. As for the success rate, EDQWS and DQWS are identical for all deadline factors and all workflows. Given that these two algorithms use the same method to create linear graphs, the method of scheduling linear graphs has no effect on the success or failure of the schedule and only affects the workflow execution cost. However, the success rate of both algorithms is higher than that of the other methods, especially under tight deadlines. In comparing the performance of EDQWS with that of the present authors’ previous research (DQWS), the results show that, in more than 50% of the examined workflow samples, the number of resource instances decreases in EDQWS in comparison to DQWS. Besides decreasing the probability of instance failure, reducing the number of resource instances also lowers the overall execution cost of the examined workflows. From these results, it can be concluded that defining the scoring function and relying on it to combine linear graphs and select virtual machine types has led to a more appropriate selection of VM types than in the baseline methods. The use of the new merge list algorithm, as well as the policy of transferring tasks from the less powerful instances to the idle times of the more powerful ones, has also had a significant impact on improving the results. The new merge list algorithm combines the tasks of linear graphs with respect to the laxity of tasks and improves the scheduling of tasks by changing the mapping of prescheduled tasks on the instances.
By transferring the prescheduled tasks from the less powerful instances to the idle times of the more powerful ones, the algorithm can use idle times that incur no extra charge and remove less powerful instances from the list of required instances. For future work, the current authors intend to extend their merging algorithm to more than two linear graphs. Furthermore, a non-greedy algorithm shall be proposed for selecting the instance idle time to which scheduled tasks are transferred in the external combination phase.
Availability of data and materials
The data used during the current study are available from the corresponding author on reasonable request.
Notes
Particle swarm optimization (PSO)
List scheduling ant colony optimization (LACO)
Deadlineconstrained probabilistic list scheduling (ProLiS)
Abbreviations
DAG: Directed Acyclic Graph
MIP: Mixed Integer Programming
VM: Virtual Machine
CP: Critical Path
DAX: Directed Acyclic Graph in XML
References
Guo W, Lin B, Chen G, Chen Y, Liang F (2018) Costdriven scheduling for deadlinebased workflow across multiple clouds. IEEE Trans Netw Serv Manag 15(4):1571–1585. https://doi.org/10.1109/TNSM.2018.2872066
Wu Q, Ishikawa F, Zhu Q, Xia Y, Wen J (2017) Deadlineconstrained cost optimization approaches for workflow scheduling in clouds. IEEE Trans Parallel Distributed Syst 28(12):3401–3412. https://doi.org/10.1109/TPDS.2017.2735400
Rodriguez MA, Buyya R (2017) A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments. Concurrency Comput 29(8). https://doi.org/10.1002/cpe.4041
Faragardi HR, Saleh Sedghpour MR, Fazliahmadi S, Fahringer T, Rasouli N (2020) GRPHEFT: a budgetconstrained resource provisioning scheme for workflow scheduling in IaaS clouds. IEEE Trans Parallel Distributed Syst 31(6):1239–1254. https://doi.org/10.1109/TPDS.2019.2961098
Khojasteh Toussi G, Naghibzadeh M (2021) A divide and conquer approach to deadline constrained costoptimization workflow scheduling for the cloud. Clust Comput. https://doi.org/10.1007/s1058602003223x
Singh V, Gupta I, Jana PK (2020) An energy efficient algorithm for workflow scheduling in IaaS cloud. J Grid Comput 18(3):357–376. https://doi.org/10.1007/s10723019094902
Garg N, Singh D, Goraya MS (2021) Energy and resource efficient workflow scheduling in a virtualized cloud environment. Clust Comput 24(2):767–797. https://doi.org/10.1007/s10586020031494
Jiang J, Lin Y, Xie G, Fu L, Yang J (2017) Time and energy optimization algorithms for the static scheduling of multiple workflows in heterogeneous computing system. J Grid Comput 15(4):435–456. https://doi.org/10.1007/s1072301793915
Sreenu K, Sreelatha M (2019) Wscheduler: whale optimization for task scheduling in cloud computing. Clust Comput 22:1087–1098. https://doi.org/10.1007/s1058601710555
Wang S, Li K, Mei J, Xiao G, Li K (2017) A reliabilityaware task scheduling algorithm based on replication on heterogeneous computing systems. J Grid Comput 15(1):23–39. https://doi.org/10.1007/s1072301693867
Kalyan Chakravarthi K, Shyamala L, Vaidehi V (2020) Budget aware scheduling algorithm for workflow applications in IaaS clouds. Clust Comput 23(4):3405–3419. https://doi.org/10.1007/s10586020030951
Arabnejad V, Bubendorfer K, Ng B (2017) Scheduling deadline constrained scientific workflows on dynamically provisioned cloud resources. Futur Gener Comput Syst 75:348–364. https://doi.org/10.1016/j.future.2017.01.002
Rizvi N, Ramesh D (2020) Fair budget constrained workflow scheduling approach for heterogeneous clouds. Clust Comput 23(4):3185–3201. https://doi.org/10.1007/s10586020030791
Cao S, Deng K, Ren K, Li X, Nie T, Song J (2019) A deadline-constrained scheduling algorithm for scientific workflows in clouds. Institute of Electrical and Electronics Engineers Inc., pp 98–105
Verma A, Kaushal S (2015) Costtime efficient scheduling plan for executing workflows in the cloud. J Grid Comput 13(4):495–506. https://doi.org/10.1007/s1072301593449
Ullman JD (1975) NPcomplete scheduling problems. J Comput Syst Sci 10(3):384–393. https://doi.org/10.1016/S00220000(75)800080
Malawski M, Figiela K, Bubak M, Deelman E, Nabrzyski J (2015) Scheduling multilevel deadlineconstrained scientific workflows on clouds based on cost optimization. Sci Program 2015. https://doi.org/10.1155/2015/680271
Abrishami S, Naghibzadeh M, Epema DHJ (2013) Deadlineconstrained workflow scheduling algorithms for infrastructure as a service clouds. Futur Gener Comput Syst 29(1):158–169. https://doi.org/10.1016/j.future.2012.05.004
Calheiros RN, Buyya R (2014) Meeting deadlines of scientific workflows in public clouds with tasks replication. IEEE Trans Parallel Distributed Syst 25(7):1787–1796. https://doi.org/10.1109/TPDS.2013.238
Arabnejad V, Bubendorfer K, Ng B, Chard K (2015) A deadline constrained critical path heuristic for cost-effectively scheduling workflows. In: Proceedings - 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing, UCC 2015. Institute of Electrical and Electronics Engineers Inc., pp 242–250
Rodriguez MA, Buyya R (2014) Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Trans Cloud Comput 2(2):222–235
Arabnejad H, Barbosa JG (2014) A budget constrained scheduling algorithm for workflow applications. J Grid Comput 12(4):665–679. https://doi.org/10.1007/s1072301492947
Topcuoglu H, Hariri S, Wu MY (2002) Performanceeffective and lowcomplexity task scheduling for heterogeneous computing. IEEE Trans Parallel Distributed Syst 13(3):260–274. https://doi.org/10.1109/71.993206
Wu F, Wu Q, Tan Y, Li R, Wang W (2016) PCPB2: partial critical path budget balanced scheduling algorithms for scientific workflow applications. Futur Gener Comput Syst 60:22–34. https://doi.org/10.1016/j.future.2016.01.004
Durillo JJ, Prodan R (2014) Multiobjective workflow scheduling in amazon EC2. Clust Comput 17(2):169–189. https://doi.org/10.1007/s1058601303250
Wu Z, Ni Z, Gu L, Liu X (2010) A revised discrete particle swarm optimization for cloud workflow scheduling. In: Proceedings - 2010 International Conference on Computational Intelligence and Security, CIS 2010, pp 184–188
Arabnejad V, Bubendorfer K, Ng B (2019) Budget and deadline aware escience workflow scheduling in clouds. IEEE Trans Parallel Distributed Syst 30(1):29–44. https://doi.org/10.1109/TPDS.2018.2849396
Palankar MR, Iamnitchi A, Ripeanu M, Garfinkel S (2008) Amazon S3 for science grids: a viable solution? pp 55–64
Juve G, Chervenak A, Deelman E, Bharathi S, Mehta G, Vahi K (2013) Characterizing and profiling scientific workflows. Futur Gener Comput Syst 29(3):682–692. https://doi.org/10.1016/j.future.2012.08.015
Bharathi S, Chervenak A, Deelman E, Mehta G, Su MH, Vahi K (2008) Characterization of scientific workflows. In: 2008 3rd Workshop on Workflows in Support of Large-Scale Science, WORKS 2008
Acknowledgments
Not applicable.
Funding
This work has no funding.
Author information
Contributions
All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Khojasteh Toussi, G., Naghibzadeh, M., Abrishami, S. et al. EDQWS: an enhanced divide and conquer algorithm for workflow scheduling in cloud. J Cloud Comp 11, 13 (2022). https://doi.org/10.1186/s13677022002848
DOI: https://doi.org/10.1186/s13677022002848
Keywords
 Workflow scheduling
 Cloud computing
 Critical path
 Merging algorithm
 Divide and conquer
 Scoring function