A novel hybrid of Shortest-Job-First and Round Robin with dynamic variable quantum time task scheduling technique

Cloud computing is a ubiquitous network access model to a shared pool of configurable computing resources, where available resources must be checked and scheduled by an efficient task scheduler before being assigned to clients. Most existing task schedulers do not meet the required standards and requirements, as some of them concentrate only on reducing waiting time, response time, or both, while neglecting starved processes altogether. In this paper, we propose a novel hybrid task scheduling algorithm named SRDQ that combines the Shortest-Job-First (SJF) and Round Robin (RR) schedulers with a dynamic variable task quantum. The proposed algorithm relies mainly on two key ideas: the first is a dynamic task quantum that balances waiting time between short and long tasks, while the second is splitting the ready queue into two sub-queues, Q1 for the short tasks and Q2 for the long ones. Tasks are assigned to resources from Q1 and Q2 alternately, two tasks from Q1 followed by one task from Q2. For evaluation, three different datasets were used in simulations conducted with the CloudSim environment toolkit 3.0.3 against three different scheduling algorithms: SJF, RR and Time Slice Priority Based RR (TSPBRR). Experimental results indicate the superiority of the proposed algorithm over the state of the art in reducing waiting time, response time and, partially, the starvation of long tasks.


Introduction
The appearance of cloud computing systems represents a revolution in modern information technology (IT), one that requires an efficient and powerful architecture to be applied in different systems that demand complex, large-scale computing. The cloud is a platform that can support elastic applications in order to manage limited virtual machines and computing servers for application services at a given instance of time. It is a suitable environment for multi-tenant computing, which allows users to share resources. In the cloud, available resources must be checked and scheduled by an efficient task scheduler to be assigned to clients based on their requests [1][2][3].
An efficient task scheduler has become an urgent need with the rapid growth of modern computer systems that aim to achieve optimal performance. Task scheduling algorithms are responsible for mapping jobs submitted to the cloud environment onto available resources in such a way that the total response time and latency are minimized and the throughput and utilization of resources are maximized [3,4]. Conventional task scheduling algorithms such as Shortest-Job-First (SJF) [5], Round Robin (RR) [6], First-Come-First-Serve (FCFS) [7], Multilevel queue scheduling (MQ) [8], Max-Min [9] and Min-Min [10] have achieved remarkable results over the years in different types of computer systems, but they always suffer from major drawbacks, such as higher waiting time in RR and FCFS and starvation in SJF and Max-Min.
The starvation problem is one of the major challenges facing task scheduling in the cloud, where a task may wait a very long time for one or more of its requested resources. Starvation is frequently caused by errors in a scheduling algorithm or by resource leaks, and it can be caused deliberately by means of a denial-of-service attack. For example, if a poorly designed multitasking system always switches between the first two tasks while a third never gets the chance to run, then the third task is starving [11]. Many hybrids have been introduced to solve the starvation problem, such as FCFS&RR and SRTF&RR [12] (note that RR is a common denominator in these hybrids), but none has so far solved it or come close to solving it. A hybrid of SJF and RR is one of the most widely used and powerful hybrids for addressing starvation, since it benefits from SJF's performance in reducing turnaround time and from RR's in reducing task waiting time. However, the task quantum value has always been the obstacle to obtaining the optimum hybrid. Various studies have sought the best methodology for calculating the task quantum value, as a small quantum reduces throughput and increases response time, while a long quantum causes a high increase in turnaround time.
From this point, and acknowledging the importance of the starvation problem in task scheduling, we introduce in this paper a novel hybrid scheduling technique of SJF and RR with Dynamic Quantum, which we call SRDQ, applied through two queues that schedule processes for execution. The proposed algorithm is designed to be a unit-based algorithm that uses the queuing data structure effectively and optimizes the execution time as far as possible. In the proposed technique, the time quantum value is determined both statically and dynamically in order to detect the impact of quantum dynamicity on starvation and response time reduction, as will be described in detail in Section 3.3.
The remainder of this paper is organized in five main sections. In Section 2, the concepts of task scheduling and some related work are presented. In Section 3.3, the proposed technique is presented together with examples. The simulation settings are analyzed in Section 4. Section 5 comprises the results and discussion, and finally our main conclusions and future work are discussed in Section 6.

Related work
Task scheduling algorithms vary in how they schedule tasks among cloud nodes (statically, dynamically, in batches or even online), but ultimately they all try to achieve the optimal distribution of tasks over cloud nodes. In this section, different task scheduling algorithms applied in the cloud environment, with suitable verification and different aims, are presented and discussed in detail. Fang et al. [13] introduced a two-level task scheduling mechanism based on load balancing in cloud computing. In the first level, a task description of each virtual machine (VM) is created, including network resources, storage resources and other resources, based on the needs of the tasks created by the user applications. In the second level, the scheduler assigns adequate resources to each VM considering its load, to achieve load balancing among VMs. According to the authors, this task scheduling mechanism not only meets users' requirements but also achieves high resource utilization, as shown by simulation results in the CloudSim toolkit, although the model did not consider network bandwidth usage and its impact on VM load.
In [14], Lin et al. proposed a Dynamic Round-Robin (DRR) algorithm for energy-aware virtual machine scheduling and consolidation, during which VMs are moved from a retiring physical machine to other physical machines on duty. According to the authors, the proposed algorithm was compared to the GREEDY, ROUNDROBIN and POWERSAVE schedulers and showed superiority in reducing the amount of consumed power, although it did not consider the load and resources of the destination physical machines to which the VMs are migrated. The authors also did not mention the rules that determine when a physical machine should retire and force its VMs to migrate.
A year later, Ghanbari et al. [15] introduced a scheduling algorithm based on job priority named PJSC, where each job is assigned resources based on its priority; in other words, higher-priority jobs gain access to resources first. Simulation results, as reported by the authors, indicated that PJSC has reasonable complexity but suffers from increased makespan. In addition, PJSC may cause job starvation, as jobs with lower priority may never gain access to the resources they need.
In [16], Maguluri et al. considered a stochastic model for load balancing and scheduling in cloud computing clusters. Their primary contribution was the development of frame-based non-preemptive VM configuration policies. These policies can achieve nearly optimal throughput by selecting sufficiently long frame durations. Simulations indicate that long frame durations are not only good from a throughput perspective but also seem to provide good delay performance, although they may cause starvation. Gulati et al. [17] studied the effect of enhancing the performance of Round Robin with a dynamic approach by varying the vital parameters of host bandwidth, cloudlet length, VM image size and VM bandwidth. Experimental results indicated that load was optimized by setting a dynamic round robin that proportionately varies all of these parameters.
Agha and Jassbi [18] proposed an RR-based technique that obtains the quantum time in each cycle of RR using the arithmetic-harmonic mean (HARM). The harmonic mean is calculated by dividing the number of observations by the sum of the reciprocals of the numbers in the series. According to the proposed technique, if the burst time of a process is smaller than that of the previous one, the harmonic mean should be used for calculating the quantum; otherwise the arithmetic mean is used. The simulation results indicated that in some cases the proposed algorithm can provide better scheduling criteria and improve the average Turnaround Time (TT) and Average Waiting Time (AWT). According to the authors, these results may indicate an enhancement in RR performance, but the technique still lacks consideration of the arrival time of each process to verify the values of TT and AWT, as well as a real-time implementation.
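As a rough illustration of the two candidate quantum values discussed above, the following Python sketch computes the harmonic and arithmetic means of a set of burst times. The function name `cycle_quantum` and the sample burst times are hypothetical, and the rule for choosing between the two means is left to the caller rather than reproduced from [18].

```python
from statistics import harmonic_mean, mean

def cycle_quantum(bursts, use_harmonic):
    """Candidate quantum for one RR cycle (illustrative helper, not from [18])."""
    # harmonic mean = n / (1/b1 + 1/b2 + ... + 1/bn)
    # arithmetic mean = (b1 + b2 + ... + bn) / n
    return harmonic_mean(bursts) if use_harmonic else mean(bursts)

bursts = [10, 20, 40]  # hypothetical remaining burst times
print(cycle_quantum(bursts, use_harmonic=True))
print(cycle_quantum(bursts, use_harmonic=False))
```

For positive burst times the harmonic mean never exceeds the arithmetic mean, so selecting it yields the shorter of the two quanta.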
Tsai et al. [19] introduced an optimization technique named Improved Differential Evolution Algorithm (IDEA) to optimize the scheduling of a series of subtasks on multiple resources based on cost and time models in the cloud computing environment. The proposed technique benefits from the abilities of the Differential Evolution Algorithm (DEA) in global exploration of the macro-space with fewer control parameters, and from the systematic reasoning abilities of the Taguchi method in exploiting the better individuals in the micro-space as potential offspring. Experiments were conducted using five-task five-resource and ten-task ten-resource problems, indicating the effectiveness of IDEA in optimizing task scheduling and resource allocation while considering cost, compared to the original DEA, NSGA-II, SPEA2 and IBEA.
Ergu et al. [20] introduced a model for task-oriented resource allocation in which tasks are pairwise compared based on network bandwidth, completion time, task costs and task reliability. Resources are allocated to tasks based on task weights calculated using the analytic hierarchy process. Furthermore, an induced bias matrix is used to identify the inconsistent elements and improve the consistency ratio when conflicting weights are assigned to various tasks. According to the authors, testing was carried out using two theoretical, not real-time, evaluation examples, which indicates that the proposed model still needs more testing.
Karthick et al. [21] introduced a scheduling technique that dynamically schedules jobs by clustering them based on burst time in order to reduce starvation. Compared to traditional techniques such as FCFS and SJF, the proposed technique effectively utilizes the unused free space in an economical way, although it did not consider consumed energy or the increasing number of submitted jobs.
In [22], Lakra and Yadav introduced a multi-objective task scheduling algorithm for mapping tasks to VMs via non-dominated sorting after quantifying the Quality of Service values of tasks and VMs. The proposed algorithm mainly aimed at improving the throughput of the datacenter and reducing cost without violating the Service Level Agreement (SLA) for an application in a cloud SaaS environment. The experimental results indicated acceptable performance of the proposed algorithm, although it did not consider many Quality of Service factors, including awareness of VM energy.
Dash et al. [23] presented a dynamic quantum scheduling algorithm based on RR, named Dynamic Average Burst Round Robin (DABRR). The proposed algorithm was tested and compared to traditional RR, Dynamic Quantum with Re-adjusted Round Robin (DQRRR), Improved Round Robin with Varying time Quantum (IRRVQ), Self Adjustment Round Robin (SARR) and Modified Round Robin (MRR), indicating the superiority of DABRR. However, the authors did not clarify DABRR's response to newly arriving processes; in addition, sorting processes in ascending order of burst time may cause starvation of processes with long burst times.
Tunc et al. [24] presented a new metric called Value of Service, which mainly considers tasks completed within their deadline by balancing the value of completing a task within its deadline against energy consumption. They defined the proposed metric as the sum of the values of all tasks executed during a given period of time, including task arrival time. Each resource was assigned to a task based on the number of homogeneous cores and the amount of memory. The experiments were conducted on an IBM blade server using Keyboard-Video-Mouse (KVM), indicating an improvement in performance together with a noticeable reduction in energy consumption, but only when tasks are completed within their deadline. The authors did not clarify how their model reacts to an increasing number of tasks, especially tasks with the same arrival time.
In the same year, Abdul Razaque et al. [25] introduced a nonlinear programming divisible task scheduling algorithm that allocates the workflow of tasks to VMs based on the availability of network bandwidth. The problem with this algorithm is that it considers only a single network criterion for allocating tasks to VMs while neglecting VM energy consumption, which may cause these machines to retire and force tasks to terminate.
Recently, bio-inspired algorithms such as Ant Colony Optimization (ACO), Cuckoo Search (CS), Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Bees Life Algorithm (BLA) have played a major role in scheduling tasks over cloud nodes. Mizan et al. [26] developed a modified job scheduling algorithm based on BLA and a greedy algorithm for minimizing makespan in a hybrid cloud. According to the authors, the experiments indicated that the proposed algorithm outperformed both the greedy algorithm and the firefly algorithm in makespan reduction.
In [27], Ge and Wei used a genetic algorithm (GA) as an optimization technique employed by the master node to schedule waiting tasks onto computing nodes. Before scheduling takes place, all tasks in the job queue have to be evaluated. According to the authors, the simulation results indicated a reduction in makespan and a better-balanced load across all cloud nodes for the GA compared to FIFO.
Raju et al. [28] presented a hybrid algorithm combining the advantages of Ant Colony Optimization (ACO) and Cuckoo Search (CS) in order to reduce makespan, based on job execution within a specified time interval. The experimental results showed that the proposed algorithm achieved better results in terms of makespan reduction than the original ant colony optimization algorithm, but not than other related algorithms such as RR.
Ramezani et al. [29] developed the Multi-Objective Jswarm (MO-Jswarm) scheduling algorithm to determine the optimal task distribution over virtual machines (VMs), attempting to balance conflicting objectives including task execution time, task transfer time and task execution cost. According to the authors, the proposed algorithm was able to enhance QoS and to provide a balanced trade-off between the conflicting objectives.
Most of the previous studies concentrated on enhancing one or two of the QoS standards, either by minimizing or maximizing them, although in the cloud environment it is highly recommended to consider various criteria such as execution time, cost, bandwidth and energy consumption. Other studies even claimed to reach optimality in performance, as in [41] and [42], while others investigated task scheduling from a load balancing perspective, as in [43][44][45][46][47], concentrating on balancing workload against consumed energy. The experimental results of all these studies claimed improvements in waiting time, turnaround time or even throughput, but none of them gave a real solution to starvation or even came close to solving it. As the starvation problem is considered one of the major scheduling dilemmas, in this paper we try to overcome, or partially overcome, the starvation of long tasks by proposing a hybrid scheduling algorithm based on two traditional scheduling algorithms, SJF and RR. These two algorithms were intentionally picked to benefit from SJF's fast scheduling while solving its starvation problem using RR enhanced with a dynamic quantum. In the proposed algorithm, the ready queue is split into two sub-queues, Q1 for short tasks and Q2 for long tasks, as discussed in detail in the next section.
Table 1 Submitted tasks burst and arrival
Table 2 Task quantum calculations in first round

SJF and RR with dynamic quantum hybrid algorithm (SRDQ)
As mentioned before, this work tries to overcome the starvation problem by proposing a new hybrid scheduling technique, named SRDQ, based on the SJF and RR scheduling techniques. SRDQ aims to avoid the disadvantages of both SJF and RR so that the performance metrics improve while the probability of starvation is reduced as far as possible. In the RR stage of SRDQ, the time slice is chosen to avoid the traditional drawbacks that lead to high waiting times and rarely met deadlines. Setting the RR time slice, or quantum, is very challenging: if the quantum is too short, too many context switches lower CPU efficiency, while a quantum that is too long may cause poor response time and approximates the First-Come-First-Serve (FCFS) algorithm. This paper therefore concentrates on calculating the optimal quantum interval at each round of the RR algorithm while splitting the ready queue into two sub-queues, Q1 for short tasks and Q2 for long tasks, using the median task length as the threshold; in other words, tasks longer than the median are inserted into Q2, while the shorter ones are inserted into Q1. Tasks are executed alternately, two short tasks from Q1 followed by one long task from Q2, which reduces the waiting time of long tasks without disrupting SJF's preference for short tasks. SRDQ is designed to be a unit-based algorithm that uses the queuing data structure effectively and optimizes the execution time as far as possible. SRDQ involves the following main steps:
1. Arrange all submitted tasks $T_i$, $i = 1, 2, \ldots$, number of submitted tasks, according to their burst time.
2. Compute the median, $\tilde{q}$, of the burst times of all tasks.
3. If the burst time of a task $T$, $B(T)$, is less than or equal to the median, insert $T$ into Q1; otherwise insert $T$ into Q2.
4. Calculate the quantum $q_{ij}$ by Eq. (1), based on the source queue of the task currently being executed (Q1 or Q2) and the round in which it is executed. Here $q_{ij}$ is the quantum of task $i$ at iteration $j$, $i = 1, 2, \ldots, n$, $B_{ij}$ is the burst time of task $i$ at iteration $j$, $q_{i(j-1)}$ is the quantum of task $i$ at the previous iteration, and $\alpha \in \{0, 1\}$ is a binary selector. In the first round, $j = 1$, $q_{i(j-1)}$ is set to zero, as there are no previous rounds. The value of $\alpha$ depends on the source queue:
   a. If the task is taken from Q1, $\alpha$ is set to one and Eq. (1) is modified accordingly.
   b. If the task is taken from Q2, $\alpha$ is set to zero and Eq. (1) is modified accordingly.
5. Assign the first two tasks of Q1 to the resources, followed by the first task of Q2.
6. Repeat Step 4 until Q1 and Q2 are empty.
7. When a new task arrives or a task finishes, update $\tilde{q}$ dynamically as follows:
   a. A newly arrived task is inserted into Q1 or Q2 based on its burst time and $\tilde{q}$. In this case, $\tilde{q}$ is updated using $B_{new}$, the burst time of the new task.
   b. When a task finishes, $\tilde{q}$ is updated as $\tilde{q} = \tilde{q} - \tilde{q}/B_{terminated}$, where $B_{terminated}$ is the burst time of the finished task.
Table 3 Task quantum calculations in the second round
Table 4 Task quantum calculations in the third round
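The median split of Steps 1 to 3 and the two-from-Q1, one-from-Q2 selection of Step 5 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names and the sample burst times are hypothetical, all tasks are assumed to be available at once, and the quantum calculation of Step 4 is deliberately left out, so the output is only the order in which tasks would be picked.

```python
from collections import deque
from statistics import median

def split_by_median(bursts):
    """Steps 1-3: sort tasks by burst time and split them around the median."""
    ordered = sorted(bursts.items(), key=lambda kv: kv[1])
    q_med = median(bursts.values())
    q1 = deque(t for t, b in ordered if b <= q_med)  # short tasks
    q2 = deque(t for t, b in ordered if b > q_med)   # long tasks
    return q_med, q1, q2

def dispatch_order(q1, q2):
    """Step 5: pick two tasks from Q1, then one from Q2, until both are empty."""
    q1, q2 = deque(q1), deque(q2)  # work on copies
    order = []
    while q1 or q2:
        for _ in range(2):
            if q1:
                order.append(q1.popleft())
        if q2:
            order.append(q2.popleft())
    return order

# Hypothetical burst times, not taken from Table 1.
bursts = {"A": 4, "B": 30, "C": 9, "D": 20, "E": 6, "F": 25}
q_med, q1, q2 = split_by_median(bursts)
print(q_med, list(q1), list(q2))
print(dispatch_order(q1, q2))
```

With these sample values the median is 14.5, so A, E and C go to Q1 and D, F and B go to Q2, and the first long task is picked third rather than fourth as it would be under plain SJF ordering.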
For further explanation, the following illustrative example considers the execution of 6 tasks using SRDQ (Table 1). In the first round (Table 2), $\tilde{q} = 13.5$.

Round (3)
T6 and T3 finished in the second round, so $\tilde{q}$ is updated as follows:
- After finishing T6, $B_{terminated} = 15$: $\tilde{q} = 9.73 - (9.73/15) = 9.08$
- After finishing T3, $B_{terminated} = 23$: $\tilde{q} = 9.08 - (9.08/23) = 8.47$
Thus $\tilde{q} = 8.47$ in this round (Table 4).
Table 5 presents the response, waiting and turnaround times of SRDQ compared to SJF and RR. SRDQ achieved a lower response time than SJF, but with higher turnaround and waiting times. RR achieved a good response time, with a waiting time comparable to that of SRDQ. SRDQ can therefore be seen as the balancing point between SJF and RR, in which we tried to overcome, or at least reduce, the problems of RR and SJF, especially the starvation dilemma.
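The median update applied at the start of Round (3) when a task finishes can be written as a one-line helper. The sketch below is illustrative only: the function name is hypothetical, and the rule is read off the worked arithmetic for T6 above, where the median is reduced by its own value divided by the burst time of the finished task.

```python
def median_after_finish(q_med, burst_terminated):
    """Median update after a task finishes: q~ <- q~ - q~ / B_terminated."""
    return q_med - q_med / burst_terminated

# Reproduces the update after T6 finishes (B_terminated = 15).
print(round(median_after_finish(9.73, 15), 2))  # 9.08
```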

Simulation environment
The proposed hybrid algorithm was implemented and tested in the CloudSim environment toolkit 3.0.3, which provides a generalized and extensible simulation framework that enables modeling, simulation and experimentation of emerging cloud computing infrastructures and application services. It allows users to focus on the specific system design issues they want to investigate without being concerned with the low-level details of cloud-based infrastructures and services [4,48]. The simulation settings and parameters employed in the CloudSim experiments are summarized in Table 6.

Performance metrics
The following metrics were considered in the evaluation process [49]:
Waiting Time: Average time a process spends in the run queue.
Response Time: Average time elapsed from when a process is submitted until useful output is obtained.
Turnaround Time: Average time elapsed from when a process is submitted to when it has completed.
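The three metrics can be computed from a schedule expressed as a list of execution intervals, as in the Python sketch below. It rests on two simplifying assumptions that are not stated in the text: waiting time is taken as turnaround time minus burst time, and "useful output" is approximated by the moment a task is first dispatched. The function name and the sample schedule (a Round Robin run with a quantum of 4) are hypothetical.

```python
def schedule_metrics(arrival, burst, slices):
    """Return average (waiting, response, turnaround) time of a schedule.

    arrival, burst: dicts mapping task -> arrival time / burst time.
    slices: list of (task, start, end) execution intervals.
    """
    first_start, finish = {}, {}
    for task, start, end in slices:
        first_start[task] = min(start, first_start.get(task, start))
        finish[task] = max(end, finish.get(task, end))
    waiting, response, turnaround = [], [], []
    for task in arrival:
        tat = finish[task] - arrival[task]
        turnaround.append(tat)
        waiting.append(tat - burst[task])  # assumption: no I/O or idle gaps
        response.append(first_start[task] - arrival[task])  # first dispatch
    n = len(arrival)
    return sum(waiting) / n, sum(response) / n, sum(turnaround) / n

# Hypothetical example: three tasks arriving at time 0, RR with quantum 4.
arrival = {"A": 0, "B": 0, "C": 0}
burst = {"A": 10, "B": 4, "C": 6}
slices = [("A", 0, 4), ("B", 4, 8), ("C", 8, 12),
          ("A", 12, 16), ("C", 16, 18), ("A", 18, 20)]
print(schedule_metrics(arrival, burst, slices))
```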

Experimental results & discussion
For evaluation purposes, three different datasets were used to test the proposed algorithm against three different scheduling algorithms: traditional SJF, traditional RR and Time Slice Priority Based RR (TSPBRR) [50]. The proposed algorithm was tested in two versions: the first, SRSQ, uses a static task quantum within each iteration that changes from one iteration to the next, while the second, SRDQ, uses a dynamic quantum both within the same iteration and from one iteration to the next. The performance of the algorithms was evaluated based on turnaround time, waiting time and response time. Each dataset consists of ten randomly generated and dynamically shuffled tasks, denoted $T_1, T_2, \ldots, T_{10}$, and each task is characterized by its arrival time and burst time, as shown in Table 7.
To evaluate SRDQ, a series of experiments was performed in a simulated cloud computing environment consisting of a single data center, a broker and a user, constructed through the cloud-based interface provided by CloudSim. The allocation of virtual machines (VMs) to hosts uses the default FCFS algorithm, while cloudlets (tasks) are allocated to the virtual machines with the space-shared policy, so that tasks are executed sequentially in each VM. With this policy, each task unit has its own dedicated core; therefore, the number of incoming tasks or the queue size does not affect the execution time of individual task units, as the proposed algorithm is a non-preemptive technique.
In the CloudSim environment, evaluation experiments were performed in three cases, using one VM, two VMs and three VMs. The assumptions behind the proposed algorithm are: all cloudlets that have to be processed are available; no more cloudlets are added at runtime; and the environment is static, i.e., no more resources are added at runtime.
Finally, the inner code of CloudSim was modified to test the proposed algorithm and to compare it with the traditional RR and SJF. Our own classes for the scheduling algorithm were then defined to extend the basic CloudSim classes. The same datasets were used in three different cases: one VM, two VMs and three VMs. Each dataset was used in each case, as shown in the following figures. Figure 1 shows the experimental results of the implemented algorithms using dataset1 on one, two and three VMs. SRSQ and RR have the lowest response time in all phases but suffer from the highest turnaround and waiting times, while SRDQ has the lowest waiting and turnaround times, which are also nearly comparable to those of SJF.
From Fig. 2, we can also notice that RR and SRSQ achieved good, comparable response times but still suffer from higher waiting and turnaround times, while TSPBRR suffered from elevated values compared to SJF and SRDQ. Finally, SRDQ again achieved the lowest waiting and turnaround times.
In the third and final set of experiments, using dataset3 as shown in Fig. 3, we notice that with an increasing number of VMs, the performance of SRDQ improved and exceeded the rest in all evaluation metrics, while the performance of TSPBRR degraded.
A final test was carried out in the CloudSim environment using a randomized dataset of 10 cloudlets (tasks) with random arrival times and long burst times generated by the environment, given in Table 8, to detect the impact of the proposed algorithm on reducing the starvation problem.
The burst times of cloudlets 1, 6, 9, 3 and 8 are long, which means that these cloudlets will suffer from starvation if the SJF scheduler is applied and will also suffer if the RR quantum is small. In the two versions of the proposed algorithm, we tried to balance reducing cloudlet waiting time against increasing the quantum value, and also to achieve fairness in selecting cloudlets for execution by taking two short cloudlets from Q1 and one long cloudlet from Q2. Figures 4, 5 and 6 show the waiting time of each cloudlet on 1, 2 and 3 VMs, from which we can see that both versions of the proposed algorithm achieved a much better reduction in cloudlet waiting time, especially for cloudlets with long burst times. We can also notice that the performance of SRDQ exceeds that of SRSQ or is at least comparable to it. SJF achieved the worst waiting time, especially for cloudlets with long burst times, while TSPBRR achieved a better waiting time in most cases than its traditional version.
Reducing the waiting time means that the average time a cloudlet spends in the run queue is reduced, which in turn reduces cloudlet starvation. SRDQ achieved this reduction in waiting time, and thus in starvation, by using the two sub-queues Q1 and Q2, where Q1 holds the relatively short tasks and Q2 the rest, depending on the median of the tasks. Many tests and trials were carried out to find the best methodology for selecting tasks from Q1 and Q2 to be assigned to resources, and we finally found, as described in the algorithm, that taking two tasks from Q1 and one task from Q2 has a good impact on reducing task starvation.
From the simulation results, it is evident that SRSQ and SRDQ achieved good performance compared to the traditional RR and SJF, and also to TSPBRR, in response, turnaround and waiting times. It is also evident that SRDQ is superior to SRSQ in reducing waiting and turnaround times, while SRSQ is better at reducing response time. The proposed algorithm in both its versions (SRSQ and SRDQ) achieved a good reduction in the waiting time of each task and in the overall average waiting time, from which we can say that it reduces task starvation, which is one of our first priorities.
One last issue remains: the experimental results showed that dynamicity in the task quantum had a good impact on reducing task waiting time and turnaround time, while dynamicity in each task's quantum from round to round had a good impact on reducing response time, so we can see that SRSQ exceeds SRDQ in response time, as it depends only on a static quantum for all tasks that does not change from round to round. SRDQ works as the balancing point, with reductions in waiting time, turnaround time and starvation, especially for tasks with long burst times, and with a response time comparable to that of SJF, RR and TSPBRR. From all of the above, we conclude that obtaining the optimum task quantum value is nearly impossible.

Conclusions
Achieving optimality in scheduling tasks over computing nodes in cloud computing is the aim of all researchers interested in both scheduling and the cloud. Balancing throughput, waiting time and response time may provide a way to approach scheduling optimality, but on another level it may cause the starvation of long tasks. Most previous studies have concentrated on only one side, either starvation or throughput, but not both; in this study we have therefore tried to develop a hybrid algorithm based on SJF and dynamic-quantum RR, while concentrating on splitting the ready queue into two sub-queues, Q1 for short tasks and Q2 for long tasks.
Three different datasets were used for the evaluation, conducted with the CloudSim environment 3.0.3, in two different versions, SJF&RR with Dynamic Quantum (SRDQ) and SJF&RR with Static Quantum (SRSQ), with 1, 2 and 3 virtual machines. Experimental results indicated that the proposed algorithm outperformed the state of the art in minimizing turnaround and waiting times, with comparable response time, in addition to partially reducing the starvation of long tasks.
In the future, the researchers intend to continue their experiments to find a better task quantum calculation methodology that balances the static and dynamic quantum values to achieve a better reduction in waiting time and thus reduce task starvation.