
Advances, Systems and Applications

Task scheduling and resource allocation in cloud computing using a heuristic approach

Abstract

Cloud computing is integral to modern technology, and task scheduling and resource allocation are important aspects of it. This paper proposes a heuristic approach that combines the modified analytic hierarchy process (MAHP), bandwidth aware divisible scheduling (BATS) + BAR optimization, longest expected processing time preemption (LEPT), and divide-and-conquer methods to perform task scheduling and resource allocation. In this approach, each task is processed before its actual allocation to cloud resources using a MAHP process. The resources are allocated using the combined BATS + BAR optimization method, which considers the bandwidth and load of the cloud resources as constraints. In addition, the proposed system preempts resource intensive tasks using LEPT preemption. The divide-and-conquer approach improves the proposed system, as is proven experimentally through comparison with the existing BATS and improved differential evolution algorithm (IDEA) frameworks when turnaround time and response time are used as performance metrics.

Introduction

Cloud computing is a rapidly advancing technology in the field of distributed computing. Cloud computing can be used in applications that include data storage, data analytics and IoT applications [1]. Cloud computing is a technology that has changed the traditional ways in which services are deployed by enterprises or individuals. It provides different types of services to registered users as web services so that the users do not need to invest in computing infrastructure. Cloud computing provides services such as IaaS (Infrastructure as a Service), PaaS (Platform as a Service), and SaaS (Software as a Service) [2]. In each type of service, the users submit their requests to the service provider through the medium of the Internet. The service provider is responsible for managing the resources to fulfill the requests generated by users. Service providers employ scheduling algorithms to schedule the incoming requests (tasks) and to manage their computing resources efficiently. Task scheduling and resource management permit providers to maximize revenue and the utilization of resources up to their limits. In practice, in terms of the performance of cloud computing resources, the scheduling and allocation of resources are important hurdles. For this reason, researchers have been attracted to studies of task scheduling in cloud computing. Task scheduling is the process of arranging incoming requests (tasks) in a certain manner so that the available resources will be properly utilized. Because cloud computing delivers services through the medium of the Internet, service users must submit their requests online. Because each service has a number of users, a number of requests (tasks) may be generated at a time. Systems that do not employ scheduling may feature longer waiting periods for tasks; moreover, some short-term tasks may terminate due to the waiting period.
At the time of scheduling, the scheduler needs to consider a number of constraints, including the nature of the task, the size of the task, the task execution time, the availability of resources, the task queue, and the load on the resources. Task scheduling is one of the core issues in cloud computing. Proper task scheduling may result in the efficient utilization of resources. The major advantage of cloud computing is that it promotes proper utilization of resources [3]. Thus, task scheduling and resource allocation are two sides of a single coin. Each affects the other.

Currently, Internet users can access content anywhere and anytime, without needing to consider the hosting infrastructure. Such hosting infrastructure consists of various machines with various capabilities that are maintained and managed by the service provider. Cloud computing enhances the capabilities of such Internet-connected infrastructure. Cloud service providers earn profits by providing services to cloud service users.

The cloud service end user can use the entire stack of computing services, which ranges from hardware to applications. Services in cloud computing are offered on a pay-as-you-go basis. The cloud service end user can reduce or increase the available resources per the demands of the applications. This is one of the major advantages of cloud computing, but service users may be responsible for paying additional costs for this advantage. The cloud service user can rent the resources at any time and release them with no difficulty. The cloud service user has the freedom to employ any service based on application need. This freedom of service choice has led to a problem: the next user request cannot be perfectly predicted. Thus, task scheduling and resource allocation are mandatory parts of cloud computing research. The efficiency of resource use depends on the scheduling and load-balancing methodologies, rather than on the random allocation of resources. Cloud computing is widely used for solving complex tasks (user requests). In solving such complex tasks, the use of a scheduling algorithm is recommended, because scheduling algorithms leverage the resources. The proposed system employs features of the Cybershake scientific workflow and the Epigenomics scientific workflow, which are described in Section “Input data”.

The major contributions of this paper are summarized as follows.

  1. The analytic hierarchy process is modified to rank scientific tasks.

  2. To manage the resources given bandwidth constraints and the load on the virtual machine, the proposed system incorporates a version of the existing BATS algorithm that has been modified by introducing BAR system optimization.

  3. Bipartite graphs are utilized to map tasks to appropriate virtual machines once the allocation condition is satisfied.

  4. A preemption methodology gives us the status of the virtual machine, and a modified divide-and-conquer methodology is proposed to aggregate the results after task preemption.

  5. The proposed solution is experimentally investigated using the CloudSim simulator.

The remainder of the paper is organized as follows. Section “Introduction” provides an introduction to cloud computing and its outstanding issues, especially task scheduling and resource allocation. Section “Related work” focuses on related studies that investigate task scheduling and resource allocation. Section “Input data” describes the input data provided to the Cybershake scientific workflows and the Epigenomics scientific workflow. Section “Proposed system” addresses the architecture of the proposed system. Section “Proposed methodology” explains the proposed methodology. Section “Evaluation of the proposed heuristic approach” focuses on evaluating the proposed heuristic approach. Section “Results and discussion” describes the results and discusses the proposed system in comparison with the existing BATS and IDEA algorithms. Finally, concluding remarks and future directions are presented in Section “Conclusion”.

Related work

This section provides a brief review of task scheduling and resource allocation strategies. Many researchers have proposed solutions to overcome the problems of scheduling and resource allocation. However, further improvements can still be made. Tsai et al. [4] proposed a multi-objective approach that employs the improved differential evolution algorithm. This method provides a cost and time model for cloud computing; however, variations in the tasks are not considered. Magukuri et al. [5] proposed a load balancing and scheduling algorithm that does not consider job sizes; the authors considered the refresh times of the server in fulfilling requests. Cheng et al. [6] introduced the scheduling of tasks based on a vacation queuing model. This methodology does not show the proper utilization of resources. Lin et al. [7] proposed the scheduling of tasks while considering bandwidth as a resource, forming a nonlinear programming model to allocate resources to tasks. Ergu et al. [8] proposed AHP ranking-based task scheduling. Zhu et al. [9] introduced a rolling-horizon scheduling architecture to schedule real-time tasks and illustrated the relationship between task scheduling and energy conservation through resource allocation. Lin et al. [10] proposed scheduling for parallel workloads, using the FCFS approach to order jobs when resources are available; their system does not address job abortion or starvation. Ghanbari et al. [11] proposed a priority-based job scheduling algorithm for use in cloud computing, in which multi-criteria decisions and multiple attributes are considered. Polverini et al. [12] optimized the cost of energy under queuing delay constraints. Alejandra et al. [13] proposed the use of meta-heuristic optimization and particle swarm optimization to reduce execution costs through scheduling. Keshk et al. [14] proposed the use of modified ant colony optimization in load balancing.
This method improves the makespan of a job but does not consider the availability of resources or the weight of tasks. Shamsollah et al. [15] proposed a system based on a multi-criteria algorithm for scheduling server load. Shamsollah et al. [16] proposed a priority-based system for divisible load scheduling that employs the analytic hierarchy process. Gougarzi et al. [17] addressed a resource allocation problem that aims to minimize the total energy cost of cloud computing systems while meeting the specified client-level SLAs in a probabilistic sense; the authors applied a reverse approach that imposes a penalty if the client does not meet the SLA agreements. Some authors have implemented heuristic algorithms to solve the task scheduling and resource allocation problems described above. Radojevic et al. [18] introduced a central load balancing decision model for use in cloud environments; this model automates the scheduling process and reduces the role of human administrators. However, this model is deficient in determining the capabilities of nodes and configuration details, and the complete system has no backup, thus resulting in a single point of failure. In addition, Ghanbari et al. [19] and Goswami et al. [20] focus on scheduling tasks while considering various constraints. This state of the art motivates the authors of this study to conduct additional research on task scheduling and resource allocation.

Input data

Cybershake scientific workflow

Cloud computing is a service-provider paradigm in which users submit requests for execution. Thus, the responsibility of the cloud service provider is to schedule the various requests and manage resources efficiently. To the best of the authors’ knowledge, most existing work involves scheduling tasks once they enter a task queue. However, the actual procedure of scheduling tasks and resource management begins with how the service provider addresses incoming tasks. The proposed system uses Cybershake scientific workflow data as input tasks [21]. Fig. 1 shows a visualization of the Cybershake scientific workflow, which is used by the Southern California Earthquake Center (SCEC) to characterize earthquake hazards using the Probabilistic Seismic Hazard Analysis (PSHA) technique. It also generates Green strain tensors (GSTs). Table 1 shows the Cybershake seismogram synthesis tasks with their sizes and execution times. Cybershake is a collection of various node data that are available for study [22]. The Cybershake scientific workflow samples are available with task sizes of 30, 50, 100 and 1000. From a computational point of view, the seismogram synthesis tasks are quite demanding: Cybershake spends much of its execution time on seismogram synthesis, and these tasks also require large amounts of computational resources, such as CPU time and memory.

Fig. 1 Cybershake scientific workflow

Table 1 Cybershake seismogram synthesis tasks

Cybershake scientific workflow is divided into five steps.

  1. Extract GST – This step of the workflow extracts the Green strain tensor (GST) data for processing.

  2. Seismogram synthesis – These tasks are the most computationally intensive; most of the time spent in running Cybershake is spent on this step.

  3. ZipSeis – This step aggregates the processed data.

  4. PeakValCalcOkaya – The highest-strength values of each seismogram are calculated in this step.

  5. ZipPSA – This step aggregates the processed data.
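As a sketch, the control-flow dependence among these stages (each stage runs only after its predecessor) can be modeled as an ordered dependency list; the representation below is illustrative only:

```python
# Simplified sketch of the Cybershake stage pipeline described above.
# Stage names follow the workflow description; the strictly linear
# dependency structure is a simplifying assumption for illustration.

CYBERSHAKE_STAGES = [
    "ExtractGST",           # extracts Green strain tensor (GST) data
    "SeismogramSynthesis",  # most computationally intensive stage
    "ZipSeis",              # aggregates processed seismogram data
    "PeakValCalcOkaya",     # computes peak values per seismogram
    "ZipPSA",               # final aggregation
]

def stage_dependencies(stages):
    """Each stage depends on its predecessor (control-flow dependent)."""
    return {s: ([stages[i - 1]] if i > 0 else []) for i, s in enumerate(stages)}

deps = stage_dependencies(CYBERSHAKE_STAGES)
print(deps["ZipSeis"])  # ['SeismogramSynthesis']
```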

Epigenomics scientific workflow

Figure 2 shows the Epigenomics scientific workflow [22], which is used to automate the process of genome sequencing. This operation is associated with resource-intensive tasks. The generated data are converted into files and forwarded to the Mag system. This process also involves many time-consuming operations.

Fig. 2 Epigenomics scientific workflow

Proposed system

Figure 3 shows the architecture of the proposed system. In practice, tasks of various types and sizes arrive at the cloud data centers for execution. The proposed system takes real tasks as input, as described in Section 3. In general, scientific tasks are collections of different types and sizes. To manage the tasks that come into a cloud data center, the proposed system uses the analytic hierarchy process (AHP). The primary aim of the proposed system is to manage incoming tasks. Therefore, the proposed system uses the AHP methodology to assign a rank to each task based on its length and run time. The procedure for ranking the tasks for scientific workflows is described in section 5.1. As soon as the tasks are assigned individual rankings, they are collected and arranged into task queues. The tasks in the task queue are strictly arranged following the AHP ranking. This completes the first stage of the proposed system. In the second stage, the proposed system addresses the computing resources of cloud data centers, such as CPU, memory and bandwidth, using the proposed BATS + BAR optimized allocation methodology. This methodology works as follows. It takes the task to be executed from the task queue. The assignment of resources to tasks follows the allocation condition in Eq. 4. A detailed explanation is given in section 5.2. This is the second part of the procedure, in which the allocation of resources is carried out using BATS + BAR. In the next part, the proposed system uses a preemption methodology, LEPT, which continuously checks the load on the virtual machine. If the load is exceeded, the proposed system uses a virtual machine status table to determine the current status of the other virtual machines (VMs). In this regard, if the current virtual machine is overloaded and others are idle, then such VMs are located.
After this identification, the proposed system uses a divide-and-conquer methodology, which breaks up the task and distributes it to other virtual machines, as described in detail in section 5.3. In this way, the proposed system overcomes the limitations of BATS in allocating resources on the basis of CPU, memory and bandwidth: if any one resource (CPU, memory, bandwidth) is not available in sufficient amounts, the tasks must wait. In addition, existing systems do not consider preemption, and the inputs to existing systems are tasks of the same size. Fig. 4 presents a flow chart that represents the proposed heuristic approach.

Fig. 3 Proposed system architecture

Fig. 4 Proposed system flowchart

Fig. 5 Complete bipartite graph

Proposed methodology

Here, we provide a detailed explanation of the proposed system to overcome the scheduling challenge.

Analytic hierarchy process

The analytic hierarchy process [23] is designed to solve complex problems with multiple criteria. The proposed system uses this procedure in cloud computing environments to rank the incoming tasks in a certain manner. The proposed system uses scientific workflow tasks, such as those of Cybershake and Epigenomics, for experiments because such tasks require long execution times. Initially, the workflow is divided into five stages, which are introduced in the input data section. Before proceeding with the proposed system, the AHP methodology is applied to the overall Cybershake workflow. The Cybershake workflow is control-flow dependent; thus, the second stage will execute only after the execution of the first stage. To evaluate preferences, the proposed system uses the Saaty preference table, which is given in Table 2 with its numerical ratings. To promote understanding while accounting for space limitations, the proposed system divides each calculation table into two parts. The first part extends from Task 3 to Task 18, whereas the other part shows the calculations from Task 20 to Task 28.

Table 2 Numerical Saaty preferences

Here, the proposed system considers two significant criteria that are involved in scientific tasks: task length and task run time. The numerical comparison ratings are given in Table 2, which is known as the Saaty preference table. Before the actual calculation begins, the proposed system assigns preference values to the tasks. Here, the preferences associated with the tasks are based on their lengths and the execution times of the different tasks. The proposed system slightly modifies the Saaty table preferences because, as tasks with different ranks arrive at a server, the ranks of subsequent tasks change, and new rankings must be calculated. The proposed system calculates such rankings of tasks. Tables 3 and 4 show the assignment of Saaty preferences based on comparisons of the sizes and runtimes of tasks. In the bottom row, the sum of each column is noted.

Table 3 Summation of each column-I
Table 4 Summation of each column-II

Tables 5 and 6 show the multiplication of the Saaty preference values by the column sums arranged in the bottom rows of Tables 3 and 4, and then present the results of adding each column at the bottom.

Table 5 Normalization of column value-I
Table 6 Normalization of column value-II

Tables 7 and 8 show the normalized values of Tables 5 and 6, with the average given at the bottom. The results show that the sum of each column is equal to 1.

Table 7 Compute weighted sum-I
Table 8 Compute weighted sum-II

After this calculation, the rankings of Cybershake seismogram synthesis scientific workflow tasks are given in Table 9.
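The column-normalize-and-average procedure carried out in Tables 3, 4, 5, 6, 7 and 8 can be sketched as follows; the 3×3 pairwise matrix below is illustrative only, not the paper's actual preference values:

```python
# Minimal sketch of the AHP ranking step described above: a pairwise
# Saaty comparison matrix is column-normalized and row-averaged to give
# task weights. The 3x3 matrix is a hypothetical example, not taken
# from the paper's tables.

def ahp_weights(matrix):
    n = len(matrix)
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    # Normalize each column so it sums to 1, then average across each row.
    normalized = [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(normalized[i]) / n for i in range(n)]

# Pairwise preferences for three hypothetical tasks (Saaty 1-9 scale).
pairwise = [
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 3.0],
    [1/5, 1/3, 1.0],
]
weights = ahp_weights(pairwise)
ranking = sorted(range(3), key=lambda i: -weights[i])
print(ranking)  # [0, 1, 2]: task 0 has the highest weight
```

The resulting weight vector sums to 1, mirroring the normalization check noted for Tables 7 and 8.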

Table 9 AHP rankings of Cybershake seismogram synthesis tasks

BATS + BAR system

The proposed system has two aspects, which involve scheduling tasks and managing resources. Here, we improve upon the BATS algorithm, which was originally proposed by Weiwei Lin [7]. Independent tasks of equal size are considered in the design of that system. However, in allocating resources, the system does not consider the load on virtual machines, so the waiting period for the tasks is long. In other cases, one virtual machine is busy executing a task, whereas others are idle and waiting for jobs. The bar systems (BS) algorithm was proposed by Acebo and Rosa (2008) [24]. The social behavior of bartenders is the basis of BS, and swarm intelligence has added an optimization aspect to it. In a bar, bartenders must act in a highly dynamic, asynchronous and time-critical environment, and no obvious greedy strategy (such as serving the best customer first, serving the nearest customer first or serving the first-arriving customer first) gives good results. Thus, multi-agent systems provide a good framework within which to address the challenge of developing a new class of adaptive and robust systems. In general, the crucial step in the BS algorithm is the choice of the task that the agent must execute in the next time step. In BS, agents acting as bartenders operate concurrently in an environment; that is, they execute tasks by deciding which drinks to pour. After an initial phase, the “bartenders” make their decisions according to different problem-dependent properties (e.g., weight, speed, location, response time, maximum load, etc.), instead of making decisions randomly. Over time, if an agent is unable to adapt the environment to the preconditions of a task (such as the cost for the agent to execute the task in the current state of the environment) or if it is unable to carry the task out by itself, it will be eliminated. To overcome this behavior, we propose modifying BATS by adding a BAR system.
The procedure is as follows:

1. Aggregate all of the task information that is ordered by rank.

2. Virtual machine (server) information is collected. This information includes the initial load on the virtual machine, its bandwidth and the time required to process the tasks on the server.

3. A bipartite graph is generated from the set of tasks. The ranking priorities can be used to construct the graph, by which each task is allocated to a virtual machine.

The load on the virtual machines (S) is calculated as,

$$ L^{ini}={L_s}^{ini}\mid s\in S $$
(1)

The bandwidth is calculated as,

$$ DB_w={b_i}'\le b_i $$
(2)

The total time taken to process the tasks is calculated as,

$$ {L_s}^{fin}(\alpha)=L_s(\alpha) $$
(3)

where α denotes any task.
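As an illustration of steps 1–3 and Eqs. 1–3, the sketch below gathers per-VM records and checks a bandwidth demand; the field names and numbers are hypothetical, invented for the example:

```python
# Sketch of the server-information step (Eqs. 1-3): gather each VM's
# initial load and check a task's demanded bandwidth against the VM's
# available bandwidth. Field names and values are hypothetical.

vms = [
    {"id": 0, "initial_load": 20.0, "bandwidth": 100.0},
    {"id": 1, "initial_load": 55.0, "bandwidth": 60.0},
]

def initial_loads(vms):
    """Eq. 1: collect L_s^ini for every server s in S."""
    return {vm["id"]: vm["initial_load"] for vm in vms}

def bandwidth_ok(vm, demanded_bw):
    """Eq. 2: the demanded bandwidth b'_i must not exceed b_i."""
    return demanded_bw <= vm["bandwidth"]

print(initial_loads(vms))        # {0: 20.0, 1: 55.0}
print(bandwidth_ok(vms[1], 80))  # False: 80 > 60
```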


Bipartite graph

A bipartite graph is produced based on the following conditions:

  1. A bipartite graph is constructed as G = (Tn ∪ S, E), in which ‘Tn’ represents the set of tasks, ‘S’ represents the servers, and ‘E ⊆ Tn × S’ is the set of edges between tasks and servers. An edge represents a task ‘Ti ∈ Tn’ that is present on virtual machine ‘s ∈ S’. The graph is thus constructed over the set of tasks using the bipartite structure.

  2. Balance the constructed graph with constraints including the local cost, the initial load and the bandwidth.

  3. Based on the local cost and the initial load, we compute the total load on the virtual machine:

$$ L_s={L_s}^{ini}+T_N(S_j)\cdot C_{loc} $$
(4)
  4. Next, we apply the condition represented by Eq. 4. If this condition is satisfied, then we allocate the tasks to that particular virtual machine. If this condition is not satisfied by that virtual machine, then we move on to the next server and check the condition again.

  5. After allocating the tasks, the constructed bipartite graph is updated if any tasks remain to be processed. Fig. 5 shows the bipartite graph of the set of virtual machines and the set of resources.

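The allocation steps above (ranked task queue, Eq. 4 load check, move to the next server on failure) can be sketched as follows; the capacity threshold, local costs, and task records are hypothetical:

```python
# Sketch of the bipartite-graph allocation steps above: walk the ranked
# task queue and assign each task to the first VM whose projected load
# (Eq. 4: L_s = L_s^ini + T_N(S_j) * C_loc) stays within a capacity
# threshold. VM loads, local costs, and the threshold are hypothetical.

def allocate(tasks, vms, capacity=100.0):
    """tasks: list of (task_id, local_cost), ordered by AHP rank."""
    edges = []  # bipartite edges (task_id, vm_id)
    for task_id, c_loc in tasks:
        for vm in vms:
            projected = vm["load"] + c_loc  # Eq. 4
            if projected <= capacity:       # allocation condition
                vm["load"] = projected
                edges.append((task_id, vm["id"]))
                break                       # else: try the next server
    return edges

vms = [{"id": 0, "load": 90.0}, {"id": 1, "load": 10.0}]
tasks = [("t1", 30.0), ("t2", 30.0)]
print(allocate(tasks, vms))  # [('t1', 1), ('t2', 1)]
```

Both tasks skip the overloaded VM 0 (projected 120 > 100) and land on VM 1, whose load is updated after each assignment.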

Preemption of the task

As described in earlier sections, the proposed system ranks the tasks and allocates them per the constraints of bandwidth and load on the virtual machine. The proposed system next checks the task preemption conditions according to the LEPT policy. Before a task is preempted, we must check the status of the existing virtual machine (i.e., whether it is free or busy).

The system considers situations in which the resource-intensive tasks are still running on allocated virtual machines while other VMs are waiting for tasks at the same time. Therefore, the tasks must be distributed among the free VMs. However, we must first consider how and when to preempt tasks.

Here, we propose a mathematical description of the preemption scenario,

  • Given the ranking of the tasks, they are allocated to the VMs using a bipartite graph, but the issue is now one of time.

  • Suppose ‘VM1’ has completed ‘t1’ at time ‘P1’, or VMn-1 has completed at time ∑ Pn-1. However, one VM is still running with high-priority tasks, and its processing time is also greater.

  • Therefore, we preempt the task per the following methodology. Before preempting, we should check the status of the VM (i.e., whether it is free or busy).

$$ V=\min_{t_n,t_m}\left(\frac{1}{\lambda_{t_n}+\lambda_{t_m}}+\frac{\lambda_{t_n}}{\lambda_{t_n}+\lambda_{t_m}}V^{\ast}(\{t_n\})+\frac{\lambda_{t_m}}{\lambda_{t_n}+\lambda_{t_m}}V^{\ast}(\{t_m\})\right) $$
(5)

Here

  • V(t) denotes the expected value of the minimum remaining time needed to finish all of the jobs given the set t = {t1, …, tn}.

  • V*(t) denotes the corresponding quantity for the set t = {t1, …, tn}.

  • V* denotes the expected value of the remaining completion time when no task has yet been completed, i.e., all tasks are in the running state.

  • λtn denotes the rate of the exponentially distributed time required for task tn.

  • λtm denotes the rate of the exponentially distributed time required for task tm.

$$ \left[\frac{1}{\lambda_{t_n}+\lambda_{t_m}}+\frac{\lambda_{t_n}}{\lambda_{t_n}+\lambda_{t_m}}V^{\ast}(\{t_n\})\right] $$
(6)

Eq. 6 is the probability that task tn is the first task to be completed, multiplied by the expected remaining time needed to complete the remaining n − 1 tasks.

$$ \frac{\lambda_{t_m}}{\lambda_{t_n}+\lambda_{t_m}}V^{\ast}(\{t_m\}) $$
(7)

We can rewrite Eq. 5 in the following way:

$$ 0=\min_{t_n,t_m}\left(1+\lambda_{t_n}\left(V^{\ast}(\{t_n\})-V^{\ast}\right)+\lambda_{t_m}\left(V^{\ast}(\{t_m\})-V^{\ast}\right)+\left(\lambda_{t_n}+\lambda_{t_m}\right)\left(V^{\ast}-V\right)\right) $$
(8)

Because λ1 and λ2 are the two smallest λt values, V* ≥ V. The second and third terms, as well as the last term, are minimized by tn, tm = 1, 2.

$$ C_{t_n}=\lambda_{t_n}\left(V^{\ast}(\{t_n\})-V^{\ast}\right) $$
(9)

Moreover, let,

$$ D_{t_n,t_m}=C_{t_n}-C_{t_m} $$
(10)

Here

  • Ctn denotes the time at which the first VM has finished its task execution.

  • Ctm denotes the time at which the last task is completed by another VM.

  • D denotes the difference between the completion of the first task and that of the last task, as in Eq. 10.

Substituting for Ctn and Ctm, Eq. 10 becomes,

$$ \lambda_{t_n}\left(V^{\ast}(\{t_n\})-V^{\ast}\right)+\lambda_{t_m}\left(V^{\ast}(\{t_m\})-V^{\ast}\right) $$
(11)

Equation 11 is minimized by tn, tm = 1, 2. Let,

$$ \lambda_{t_n}<\lambda_{t_m}\Rightarrow C_{t_n}\le C_{t_m} $$
(12)

Therefore, we must improve D to obtain better results. Ctn and Dtn,m are considered to be functions of λ1, …, λn.

Assuming tasks tn and tm are not members of set t, we define Ctn(t) and Dtn,m(t) analogously to Ctn and Dtn,m.

$$ C_{t_n}(t)=\lambda_{t_n}\left(V^{\ast}(t\cup\{t_n\})-V^{\ast}(t)\right) $$
(13)

Before the procedure can continue, a number of identities must be formally described. If tn and tm are the two tasks with the smallest rates (i.e., the longest expected processing times) in the set, the LEPT policy will execute tasks tn and tm first.

The following condition is the first task completion result.

$$ V^{\ast}(t)=\frac{1}{\lambda_{t_n}+\lambda_{t_m}}+\frac{\lambda_{t_n}}{\lambda_{t_n}+\lambda_{t_m}}V^{\ast}(t\cup\{t_n\})+\frac{\lambda_{t_m}}{\lambda_{t_n}+\lambda_{t_m}}V^{\ast}(t\cup\{t_m\}) $$
(14)

Equation 14 can be rewritten as follows:

$$ \left(\lambda_{t_n}+\lambda_{t_m}\right)V^{\ast}(t)=1+\lambda_{t_n}V^{\ast}(t\cup\{t_n\})+\lambda_{t_m}V^{\ast}(t\cup\{t_m\}) $$
(15)
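The recursion in Eqs. 14 and 15 can be evaluated numerically. The sketch below assumes two machines and a policy that always runs the two tasks with the smallest rates (i.e., the longest expected processing times, as in LEPT); the rates used are illustrative:

```python
from functools import lru_cache

# Sketch of the recursion in Eqs. 14-15: expected remaining completion
# time for exponentially distributed tasks on two machines, where the
# scheduler always runs the two smallest-rate (longest expected) tasks,
# as in LEPT. The two-machine setting and the rates are assumptions.

def expected_remaining_time(rates):
    rates = tuple(sorted(rates))

    @lru_cache(maxsize=None)
    def v(remaining):
        if not remaining:
            return 0.0
        if len(remaining) == 1:
            return 1.0 / remaining[0]  # one task left: mean 1/λ
        a, b = remaining[0], remaining[1]  # two smallest rates (LEPT)
        total = a + b
        rest_a = tuple(r for i, r in enumerate(remaining) if i != 0)
        rest_b = tuple(r for i, r in enumerate(remaining) if i != 1)
        # Eq. 14: wait 1/(λa+λb) for the first completion, then recurse
        # on the set without the finished task, weighted by the
        # probability that it finished first.
        return 1.0 / total + (a / total) * v(rest_a) + (b / total) * v(rest_b)

    return v(rates)

print(round(expected_remaining_time([1.0, 1.0]), 3))  # 1.5
```

For two unit-rate tasks the result, 1.5, matches the known mean of the maximum of two independent exponential(1) variables.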

Similarly, suppose that the processing times of three tasks are exponentially distributed with rates λ1, λ2 and λ3. Next, we substitute these values into Eq. 13.

Given λtn = λ1, λtm = λ2, and λtp = λ3, Eq. 13 becomes,

$$ \left(\lambda_1+\lambda_2+\lambda_3\right)C_1=\lambda_1\left(\lambda_1+\lambda_2+\lambda_3\right)V^{\ast}(\{1\})-\lambda_1\left(\lambda_1+\lambda_2+\lambda_3\right)V^{\ast} $$
(16)
$$ \left(\lambda_1+\lambda_2+\lambda_3\right)C_1=\lambda_1\left(1+\lambda_1V^{\ast}\{1\}+\lambda_2V^{\ast}\{1,2\}+\lambda_3V^{\ast}\{1,3\}\right)-\lambda_1\left(1+\lambda_1V^{\ast}\{1\}+\lambda_2V^{\ast}\{2\}+\lambda_3V^{\ast}\{3\}\right) $$
(17)
$$ \left(\lambda_1+\lambda_2+\lambda_3\right)C_1=\lambda_1\left(\lambda_2\left(V^{\ast}\{1,2\}-V^{\ast}\{2\}\right)+\lambda_3\left(V^{\ast}\{1,3\}-V^{\ast}\{3\}\right)\right) $$
(18)

Thus, Eq. 18 gives the value of the respective λ.

$$ \left(\lambda_1+\lambda_2+\lambda_3\right)C_1=\lambda_1\left(\lambda_2\left(V^{\ast}\{1,2\}-V^{\ast}\{2\}\right)+\lambda_3\left(V^{\ast}\{1,3\}-V^{\ast}\{3\}\right)\right) $$
(19)

Alternatively,

$$ \left(\lambda_1+\lambda_2\right)C_1=\lambda_1C_3(1)+\lambda_2C_1(2) $$
$$ \left(\lambda_1+\lambda_2\right)C_2=\lambda_1C_2(1)+\lambda_2C_3(2) $$
$$ \left(\lambda_1+\lambda_2\right)C_t=\lambda_1C_t(1)+\lambda_2C_t(2) $$

For,

t = 1, 2, …, n.

Thus,

D12 = Ct1 − Ct2, where

$$ D12=\frac{\lambda_{t_1}}{\lambda_{t_1}+{\lambda}_{t_2}}{D}_{32}(1)+\frac{\lambda_{t_2}}{\lambda_{t_1}+{\lambda}_{t_2}}{D}_{13}(2) $$
$$ D2t=\frac{\lambda_{t_1}}{\lambda_{t_1}+{\lambda}_{t_2}}{D}_{2t}(1)+\frac{\lambda_{t_2}}{\lambda_{t_1}+{\lambda}_{t_2}}{D}_{3t}(2) $$

We now assume as a hypothesis that

$$ \lambda_{t_n}<\lambda_{t_m}\quad\mathrm{and}\quad\lambda_1\le\dots\le\lambda_n, $$

In that case,

$$ D_{n,m}\le 0 $$

and,

$$ \frac{dD_{12}}{d\lambda_{t_1}}\ge 0 $$

In the second part of the proof, the two inequalities are shown by induction on n. When n = 2,

$$ D_{t_{n,m}}=\frac{\lambda_{t_n}-\lambda_{t_m}}{\lambda_{t_n}+\lambda_{t_m}} $$
(20)

The two inequalities then follow directly.
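For completeness, the base case can be checked directly from Eq. 20 (a short verification added here; it uses only the expression above):

$$ \lambda_{t_1}\le \lambda_{t_2}\;\Rightarrow\; D_{t_{1,2}}=\frac{\lambda_{t_1}-\lambda_{t_2}}{\lambda_{t_1}+\lambda_{t_2}}\le 0, $$
$$ \frac{dD_{t_{1,2}}}{d\lambda_{t_1}}=\frac{\left(\lambda_{t_1}+\lambda_{t_2}\right)-\left(\lambda_{t_1}-\lambda_{t_2}\right)}{\left(\lambda_{t_1}+\lambda_{t_2}\right)^2}=\frac{2\lambda_{t_2}}{\left(\lambda_{t_1}+\lambda_{t_2}\right)^2}\ge 0. $$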

Assume that the two inequalities of the induction hypothesis hold when there are fewer than ‘n’ tasks remaining to be processed. The induction hypothesis then implies that Dt13 (2) and Dt23 (1) are non-positive when there are ‘n’ tasks remaining to be completed. It also requires

$$ \frac{dD_{t_{13}}(2)}{d\lambda_{t_1}}\ge 0 $$
(21)

The inequality presented in Eq. 21 has the following implications.

As λt1 increases, Dt13(2) increases. The moment λt1 reaches the value of λt2, tasks 1 and 2 become interchangeable. Here, instead of exchanging task 1 and task 2, we break the respective tasks and migrate them onto other virtual machines.

figure c
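To make the interchange rule above concrete, the following minimal Python sketch (our illustration, not the authors' implementation; the function name and tolerance are hypothetical) applies the decision implied by the proof, where each λ is the exponential service rate of a task, so a smaller rate means a longer expected processing time:

```python
def lept_decision(lam_running, lam_waiting, eps=1e-9):
    """Action implied by the LEPT interchange argument (illustrative).

    lam_running / lam_waiting are the service rates of the currently
    running task and a waiting task, respectively.
    """
    if abs(lam_running - lam_waiting) <= eps:
        # Rates are equal: the tasks are interchangeable, so break them
        # up and migrate the pieces to other virtual machines.
        return "split-and-migrate"
    if lam_running < lam_waiting:
        return "keep"      # running task still has the longest expected time
    return "preempt"       # waiting task now has the longer expected time
```

For example, `lept_decision(0.5, 1.0)` keeps the running task, while equal rates trigger the split-and-migrate behavior described above.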

Divide-and-conquer methodology

After the tasks have been preempted, we apply the divide-and-conquer methodology by following the steps shown below.

figure d
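The divide-and-conquer steps above can be sketched as a recursive partition of the task list over the available virtual machines. This is a hedged illustration with hypothetical names; the actual algorithm may split by load rather than by count:

```python
def divide_and_assign(tasks, vms):
    """Recursively halve the task list across the VM list (illustrative)."""
    if len(vms) == 1:
        # Conquer step: a single VM takes the remaining chunk.
        return {vms[0]: list(tasks)}
    mid_v = len(vms) // 2
    # Divide step: split tasks in proportion to the VMs on each side.
    mid_t = len(tasks) * mid_v // len(vms)
    mapping = divide_and_assign(tasks[:mid_t], vms[:mid_v])
    mapping.update(divide_and_assign(tasks[mid_t:], vms[mid_v:]))
    return mapping
```

With four tasks and two VMs, each VM receives half of the task list; uneven counts are balanced by the proportional split.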

Evaluation of the proposed heuristic approach

Experimental setup

The proposed heuristic approach is simulated in a cloud computing environment [25] that provides a real-time cloud computing scenario. The configuration details of the data center used in the customized simulation setup are given in Table 10 and consist of general information on the data centers, such as the number of data centers, the number of hosts, the number of processing units, and their capacity. Every data center component generates a set of strategies for allocating bandwidth, memory and storage devices to hosts and virtual machines. Table 11 shows the configuration of the data center, including its allocation policy, architecture, OS, hypervisor, scheduling and monitoring interval, and threshold value, among other properties. Each host in the data center is built with a configuration specifying RAM, bandwidth, storage capacity, power, processing elements, etc.; the host configuration details are listed in Table 12. Table 13 provides the details of the customer configuration. Table 14 provides a detailed description of the virtual machines.

Table 10 Data center information
Table 11 Data center configuration details
Table 12 Host configuration details
Table 13 Customer configuration details
Table 14 Virtual machine configuration details
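As a rough stand-in for the configuration summarized in Tables 10, 11, 12, 13 and 14, a setup of this kind can be expressed as a simple structure. The field names and values below are illustrative placeholders, not the paper's exact configuration:

```python
# Hypothetical data-center configuration (placeholder values, not the
# actual entries of Tables 10-14).
datacenter_config = {
    "datacenters": 1,
    "hosts": 2,
    "processing_elements_per_host": 4,
    "host_ram_mb": 16384,
    "host_bandwidth_mbps": 10000,
    "host_storage_gb": 1024,
    "vm_allocation_policy": "time-shared",
    "monitoring_interval_s": 5,
}
```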

Results and discussion

This section briefly describes the performance of the proposed heuristic approach.

Evaluation of turnaround time

To check the performance of our proposed heuristic approach, we first apply the algorithm to Cybershake seismogram synthesis and Epigenomics scientific tasks, which are described in the input data section. The performance of the proposed heuristic approach is evaluated in terms of the turnaround time, which is the span of time from the submission of the task to the completion of the task. In Fig. 6, the Cybershake tasks are shown on the X axis, whereas time is shown on the Y axis. When we compare our proposed heuristic approach with the existing BATS [7] and IDEA [4] frameworks, we find that our approach displays a reduced turnaround time. Fig. 7 illustrates the results for the Epigenomics tasks, for which the turnaround time is also comparatively small. Table 15 and Table 16 show the results in tabular form.
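As defined above, turnaround time is completion time minus submission time; a small helper (ours, purely illustrative) makes the metric explicit:

```python
def turnaround_time(submit_ms, complete_ms):
    """Span from task submission to task completion, in milliseconds."""
    return complete_ms - submit_ms

def mean_turnaround(records):
    """Average turnaround over (submit_ms, complete_ms) pairs."""
    return sum(c - s for s, c in records) / len(records)
```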

Fig. 6 Turnaround time

Fig. 7 Turnaround time

Table 15 Comparison of the proposed heuristic approach with the BATS and IDEA frameworks in terms of TAT in ms
Table 16 Comparison of the proposed heuristic approach with the BATS and IDEA frameworks in terms of TAT in ms

Evaluation of response time

As a second performance metric, we consider the response time of the algorithm to incoming tasks. The response time is essentially the time until the request is actually considered. In other words, the response time is directly dependent on the availability of resources, which in turn depends on the scheduling of tasks. If the scheduling of tasks is performed properly, then resources will naturally become free early or in advance of deadlines, and the response times will be correspondingly lower.
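The dependence of response time on resource availability can be sketched as follows (our illustration, with hypothetical names; a request is first considered when the earliest resource frees up):

```python
def response_time(submit_ms, resource_free_times_ms):
    """Time until the request is first considered (illustrative)."""
    # The request is considered when it is submitted, or when the
    # earliest resource becomes free, whichever is later.
    start = max(submit_ms, min(resource_free_times_ms))
    return start - submit_ms
```

With good scheduling a resource is already free at submission, so the response time is zero; otherwise the request waits for the earliest resource.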

By comparing the response times obtained for our proposed heuristic approach with those obtained using the existing BATS and IDEA frameworks, we can see that our system’s response time is almost 50% less. The response time comparisons for Cybershake and Epigenomics are presented in Figs. 8 and 9, respectively. The comparison is also shown in tabular form in Tables 17 and 18. We consider two parameters, the response time and the turnaround time, to compare the proposed heuristic approach with the existing BATS and IDEA frameworks. Because we evaluate these frameworks in a cloud computing environment, the response time is generally lower.

Fig. 8 Response time for Cybershake tasks

Fig. 9 Response time for Epigenomics tasks

Table 17 Comparison of proposed heuristic approach with the BATS and IDEA frameworks in terms of RT in ms
Table 18 Comparison of the proposed heuristic approach with the BATS and IDEA frameworks in terms of RT in ms

On the other hand, we also evaluated our proposed heuristic approach to determine its resource utilization performance and compared it with that of the existing BATS and IDEA frameworks.

Evaluation of CPU utilization

Figures 10 and 11 show a key comparison of resource utilization between the proposed heuristic approach and the existing BATS and IDEA frameworks. The proper utilization of resources produces profits for cloud computing service providers. The experimental results show that the proposed heuristic approach utilizes the CPU resource more efficiently than the existing BATS framework.
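As used in these comparisons, CPU utilization is the fraction of the observation window a processing element spends busy. This is a hedged sketch of the metric (the simulator's own accounting may differ):

```python
def cpu_utilization_pct(busy_ms, total_ms):
    """Percentage of the observation window the CPU was busy."""
    if total_ms == 0:
        return 0.0  # guard against an empty observation window
    return 100.0 * busy_ms / total_ms
```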

Fig. 10 CPU utilization by Cybershake tasks

Fig. 11 CPU utilization by Epigenomics tasks

Evaluation of memory utilization

Figures 12 and 13 show the second key comparison of resource utilization between the proposed heuristic approach and the existing BATS and IDEA frameworks. The experimental results show that the proposed heuristic approach utilizes memory resources more efficiently than the existing BATS and IDEA frameworks.

Fig. 12 Memory utilization for Cybershake tasks

Fig. 13 Memory utilization for Epigenomics tasks

Evaluation of bandwidth utilization

Bandwidth, an important resource, is not considered in most existing frameworks. We take bandwidth into account as a third important aspect of cloud computing data centers. We also compare our proposed heuristic approach with the existing BATS and IDEA frameworks. Figures 14 and 15 show that our proposed heuristic approach utilizes bandwidth more efficiently than the existing BATS and IDEA frameworks.

Fig. 14 Bandwidth comparisons for Cybershake tasks

Fig. 15 Bandwidth comparisons for Epigenomics tasks

Conclusion

In this study, we proposed a heuristic algorithm that performs task scheduling and allocates resources efficiently in cloud computing environments. We used real Cybershake and Epigenomics scientific workflows as input tasks for the system. When we compare our proposed heuristic approach with the existing BATS and IDEA frameworks with respect to turnaround time and response time, we find that our approach gives improved results. From the viewpoint of resource utilization, the proposed heuristic approach efficiently allocates resources with high utility. We obtained maximum utilization results for computing resources such as CPU, memory and bandwidth. Most existing systems consider only two resources, CPU and memory, in evaluating their performance; the proposed system adds bandwidth as a third resource. Future work will focus on more effective scheduling algorithms in which turnaround time and response time are further improved.

Abbreviations

AHP:

Analytical hierarchy process

BATS:

Bandwidth aware divisible task scheduling

FCFS:

First come first served

GST:

Green strain tensor

IDEA:

Improved differential evolution algorithm

LEPT:

Longest expected processing time

PSHA:

Probabilistic seismic hazard analysis

SLA:

Service level agreement

References

  1. Gubbi J, Buyya R, Marusic S, Palaniswami M (2013) Internet of things (IoT): a vision, architectural elements, and future directions. Futur Gener Comput Syst 29(7):1645–1660

  2. Mezmaz M, Melab N, Kessaci Y, Lee YC, Talbi E-G, Zomaya AY, Tuyttens D (2011) A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems. J Parallel Distrib Comput 71(11):1497–1508

  3. Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, Lee G, Patterson D, Rabkin A, Stoica I et al (2010) A view of cloud computing. Commun ACM 53(4):50–58

  4. Tsai J-T, Fang J-C, Chou J-H (2013) Optimized task scheduling and resource allocation on cloud computing environment using improved differential evolution algorithm. Comput Oper Res 40(12):3045–3055

  5. Maguluri ST, Srikant R (2014) Scheduling jobs with unknown duration in clouds. IEEE/ACM Trans Netw 22(6):1938–1951

  6. Cheng C, Li J, Wang Y (2015) An energy-saving task scheduling strategy based on vacation queuing theory in cloud computing. Tsinghua Sci Technol 20(1):28–39

  7. Lin W, Liang C, Wang JZ, Buyya R (2014) Bandwidth-aware divisible task scheduling for cloud computing. Softw Pract Exper 44(2):163–174

  8. Ergu D, Kou G, Peng Y, Shi Y, Shi Y (2013) The analytic hierarchy process: task scheduling and resource allocation in cloud computing environment. J Supercomput 64(3):835–848

  9. Zhu X, Yang LT, Chen H, Wang J, Yin S, Liu X (2014) Real-time tasks oriented energy-aware scheduling in virtualized clouds. IEEE Trans Cloud Comput 2(2):168–180

  10. Liu X, Zha Y, Yin Q, Peng Y, Qin L (2015) Scheduling parallel jobs with tentative runs and consolidation in the cloud. J Syst Softw 104:141–151

  11. Ghanbari S, Othman M (2012) A priority based job scheduling algorithm in cloud computing. Procedia Eng 50:778–785

  12. Polverini M, Cianfrani A, Ren S, Vasilakos AV (2014) Thermal aware scheduling of batch jobs in geographically distributed data centers. IEEE Trans Cloud Comput 2(1):71–84

  13. Rodriguez MA, Buyya R (2014) Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Trans Cloud Comput 2(2):222–235

  14. Keshk AE, El-Sisi AB, Tawfeek MA (2014) Cloud task scheduling for load balancing based on intelligent strategy. Int J Intell Syst Appl 6(5):25

  15. Ghanbari S, Othman M, Leong WJ, Bakar MRA (2014) Multi-criteria based algorithm for scheduling divisible load. In: Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013), pp 547–554

  16. Ghanbari S, Othman M, Bakar MRA, Leong WJ (2015) Priority-based divisible load scheduling using analytical hierarchy process. Appl Math Inf Sci 9(5):25–41

  17. Goudarzi H, Ghasemazar M, Pedram M (2012) SLA-based optimization of power and migration cost in cloud computing. In: Proceedings of the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012), IEEE Computer Society, pp 172–179

  18. Radojevic B, Zagar M (2011) Analysis of issues with load balancing algorithms in hosted (cloud) environments. In: Proceedings of the 34th International Convention MIPRO, pp 416–420

  19. Ghanbari S, Othman M, Bakar MRA, Leong WJ (2016) Multi-objective method for divisible load scheduling in multi-level tree network. Futur Gener Comput Syst 54:132–143

  20. Goswami S, Das A (2017) Optimization of workload scheduling in computational grid. In: Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, pp 417–424

  21. Juve G, Chervenak A, Deelman E, Bharathi S, Mehta G, Vahi K (2013) Characterizing and profiling scientific workflows. Futur Gener Comput Syst 29(3):682–692

  22. Southern California Earthquake Center (2014) Cybershake and Epigenomics scientific workflows. https://confluence.pegasus.isi.edu/display/pegasus/WorkflowGenerator. Accessed 1 Jan 2016

  23. Handfield R, Walton SV, Sroufe R, Melnyk SA (2002) Applying environmental criteria to supplier assessment: a study in the application of the analytical hierarchy process. Eur J Oper Res 141(1):70–87

  24. Del Acebo E, de la Rosa JL (2008) Introducing bar systems: a class of swarm intelligence optimization algorithms. In: AISB Convention Communication, Interaction and Social Intelligence, pp 18–23

  25. Calheiros RN, Ranjan R, Beloglazov A, De Rose CA, Buyya R (2011) CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Softw Pract Exper 41(1):23–50


Acknowledgments

We would like to thank Mr. Amit Kolhe, the Managing Trustee of the Sanjivani College of Engineering, Kopargaon, India and the Principal of Thadomal Shahani Engineering College, Bandra (W), Mumbai, India, for providing the infrastructure needed to carry out the proposed research work.

Funding

Not applicable.

Availability of data and materials

The proposed heuristic approach is applied to real scientific tasks, i.e., Cybershake and Epigenomics, which are freely available for study and research. The web site for these resources is given in the reference section as Reference [22].

Author information

Authors and Affiliations

Authors

Contributions

The work presented in this paper is based on Mahendra’s Ph.D. research and thesis, which was supervised by Dr. SKS, who is a professor and contributed equally to this research. We implemented and performed the simulations presented in this study. Both of the authors read and approved the final manuscript.

Corresponding author

Correspondence to Mahendra Bhatu Gawali.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Gawali, M.B., Shinde, S.K. Task scheduling and resource allocation in cloud computing using a heuristic approach. J Cloud Comp 7, 4 (2018). https://doi.org/10.1186/s13677-018-0105-8
