Multiple objectives dynamic VM placement for application service availability in cloud networks

Ensuring application service availability is a critical aspect of delivering quality cloud computing services. However, placing virtual machines (VMs) on computing servers to provision these services can present significant challenges, particularly in terms of meeting the requirements of application service providers. In this paper, we present a framework that addresses the NP-hard dynamic VM placement problem in order to optimize application availability in the cloud computing paradigm. The problem is modeled as an integer nonlinear programming (INLP) optimization with multiple objectives and constraints. The framework comprises three major modules that use optimization methods and algorithms to determine the most effective VM placement strategy in cases of application deployment, failure, and scaling. Our primary goals are to minimize power consumption, resource waste, and server failures while also ensuring that application availability requirements are met. We compare our proposed heuristic VM placement solution with three related algorithms from the literature and find that it outperforms them in several key areas. Our solution is able to admit more applications, reduce power consumption, and increase CPU and RAM utilization of the servers. Moreover, we use a deep learning method that has high accuracy and low error loss to predict application task failures, allowing for proactive protection actions to reduce service outages. Overall, our framework provides a comprehensive solution by optimizing dynamic VM placement. Therefore, the framework can improve the quality of cloud computing services and enhance the experience for users.


Introduction
Cloud computing has emerged as a popular paradigm that offers Application Service Providers (ASPs) such as Netflix and Spotify the ability to leverage a pool of virtual infrastructure resources for hosting their applications. By accessing resources from Cloud Service Providers (CSPs) based on workload demands, ASPs are able to realize the pay-as-you-go business model where they only pay for resources they use. This cost-effectiveness has encouraged many ASPs to migrate their applications to the cloud. However, despite its many advantages, cloud computing presents quality of service (QoS) challenges that have become top priorities for ASPs. In particular, service availability has emerged as a key non-functional requirement, denoting the percentage of time a service is available to users [1]. Some users demand highly available (HA) services, with the service being available 99.999% of the time (aka five nines) or more regarded as the gold standard for service HA [1]. Other users require service continuity in order to resume the service from its last state before interruption. CSPs are responsible for ensuring application service availability in accordance with the requirements laid out in Service Level Agreements (SLAs), which detail QoS expectations. Providing availability that is lower than the demand can have a significant negative impact on service performance and quality, resulting in significant losses, with downtime in web applications costing businesses up to $300,000 per hour [2]. Conversely, providing availability above the demand can raise costs and reduce the admissibility of new applications, lowering CSP profits. Therefore, effectively managing service availability is critical for both ASPs and CSPs in order to balance service quality with profitability.
Alahmad and Agarwal, Journal of Cloud Computing (2024) 13:46
Managing service availability in the cloud is a complex task that comes with several challenges. Applications in the cloud are composed of software components hosted on virtual computing nodes such as Virtual Machines (VMs) or containers. These components depend on physical computing nodes, also known as servers, to host them. Due to this heterogeneous stack dependency, any failure in any layer can cause a service outage at any time. Detecting and quickly recovering from a resource failure requires an efficient monitoring and management mechanism. Moreover, cloud environments are highly dynamic, with resources being frequently added or removed on the fly. Failure to provide required resources at the right time can compromise service availability and quality. On the other hand, keeping extra resources for an extended period can increase operational and maintenance costs.
Ensuring high service availability in the cloud is not an easy task; it requires a careful balance between cost and performance. Numerous approaches have been proposed to address this challenge. Reactive solutions focus on addressing outages as they occur by using redundancy models to fail over to standby resources when an active resource fails. While effective, this approach can be costly due to the need for additional standby resources. Proactive solutions aim to predict and prevent service failure through prediction methods and protective actions. This approach can be cost-effective but relies heavily on the accuracy of the prediction method. Protection mechanisms, such as a strong VM placement strategy, can also increase the availability of application components. However, the proximity of VMs providing the same service can lower the overall availability level of the service, making VM placement a significant challenge. Clustering VMs together can help to reduce the total number of computing servers required to host the VMs, leading to reduced energy consumption and associated costs. However, placing active and standby VMs responsible for a specific service instance on the same server can result in a service outage in the event of server failure. Alternatively, distributing VMs across servers can improve workload balancing and server performance, but may increase the number of active computing servers and associated costs. Thus, choosing the optimal placement strategy for VMs is critical to maintaining high service availability in the cloud.
The management of virtual machine (VM) placement in cloud computing presents a challenging combinatorial problem that is NP-hard. Static VM placement is only suitable when a new VM is requested, while dynamic placement involves changes to the location of VMs, triggered at any time for any reason such as elasticity and migration. Achieving multiple objectives through VM placement can further complicate the problem, which grows exponentially with the number of VMs. In this paper, we introduce the "Multiple-Objectives Dynamic VM Placement for Application Availability in Cloud" (MoVPAAC) framework, which focuses on ensuring application service availability and optimizing resource usage. The framework comprises various modules that use a set of optimization solutions to handle dynamic VM placement during deployment, scaling, and application failure, with the aim of meeting availability requirements and achieving multiple objectives. The following are the main contributions of this research work:
• Introducing a formal definition for application service availability in cloud computing platforms, enabling better management and optimization of cloud resources.
• The proposed Multiple-Objectives Dynamic VM Placement for Application Availability in Cloud (MoVPAAC) framework is a novel approach to dynamic VM placement that integrates several optimization goals, including minimizing power consumption and resource waste, and maximizing service uptime, while ensuring high application availability.
• This work develops a prototype that showcases the proposed framework, algorithms, and methods, and provides a comparative analysis of the results against three existing VM placement solutions from the literature. The prototype provides an empirical validation of the effectiveness and efficiency of the proposed approach.
The remainder of this article is structured as follows: "Background" section provides a brief background of cloud application service availability and summarizes the problem statement. "Related work" section discusses the related work. "Multiple-objectives dynamic VM placement for application availability in cloud framework" section introduces the proposed MoVPAAC framework, which includes modules, problem formalization, and optimization solutions. "Experiments and results" section presents the results of the experiments. Finally, "Conclusion" section summarizes the conclusion.

Background
Cloud computing allows application service providers (ASPs) to request the deployment of end-to-end application services from cloud service providers (CSPs) based on specific requirements, such as availability. In response, the CSP gives the ASP online access to a set of virtual machines (VMs) where the application components can be deployed. Each application component is a software module that provides a specific type of functionality in a specific domain, such as web hosting (using an HTTP server) or networking (using network address translation or a firewall). In a data center (DC), each VM is hosted on a single physical server, and is associated with a specific application that requires a set of resources, such as CPU and RAM. Each server has its own set of properties, including availability and capacity for each resource type. The availability of a server $s_j$ can be calculated using Eq. (1), where $MTTF_j$ is the mean time between two consecutive failures of server $s_j$, and $MTTR_j$ is the mean time to repair server $s_j$.
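Eq. (1) is referenced in the text but its body does not appear here. A likely form, assuming it is the standard steady-state availability ratio built from the two quantities defined above, is:

```latex
AV_{s_j} = \frac{MTTF_j}{MTTF_j + MTTR_j} \qquad (1)
```

Under this assumption, a server with an illustrative $MTTF_j$ of 990 hours and $MTTR_j$ of 10 hours would have $AV_{s_j} = 990 / 1000 = 0.99$.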
To illustrate how application availability is formulated, consider the following example. Figure 1a presents an abstract model of application $app_1$, composed of three distinct functionalities provided by separate components located on different virtual machines ($vm_1$, $vm_2$, and $vm_3$), each hosted on a single server. The availability requirement for $app_1$ is set to 0.9, and specific resource demands are requested for each VM. To simplify the illustration, we only show the CPU demand next to each VM, along with the CPU capacity and availability next to each server. For application availability, we assume that VMs providing the same application functionality cannot be collocated on the same server, and that application availability depends on the availability of all its functionalities. We model an application as a set of functionalities provided by a set of VMs, with application availability depending on the availability of all the functionalities that together provide an end-to-end application service.

Fig. 1 Applications Deployment Model in Cloud Data Center
To compute application availability, denoted by $AV_{app_a}$, we multiply the availabilities of all the functionalities that comprise the application, as defined in Eq. (2), where $AV_{func_f}$ is the availability of the functionality $func_f$ and $F_a$ is the set of functionalities required to provide application $app_a$. The availability of $func_f$ is determined by the availability of the virtual machines that provide it, and is computed as the complement of the probability that all the VMs providing $func_f$ fail, as defined in Eq. (3), where $vm_v$ provides functionality $f$ and $V_{func_f}$ is the set of all VMs that provide functionality $f$. The failure of $vm_v$ is equivalent to the failure of the server $s_j$ that hosts it. The failure of a server $s_j$ is defined as the complement of its availability, as in Eq. (4). According to Eq. (2), the availability of the deployed application ($app_1$) depicted in Fig. 1a [...] and hosting it on server 3, as depicted in Fig. 1c, increases the availability of $app_1$ to $AV_{app_1} = \ldots * AV_{s_8} = 0.97 * (1 - (0.01 * 0.05)) * 0.98 = 0.95$. However, when a CPU scaling-up request is made for $vm_1$ with two additional units, server 1 cannot host $vm_1$ with the requested four CPU units because its CPU capacity is limited to three units. Therefore, $vm_1$ must be migrated to another server. As shown in Fig. 1d, $vm_1$ can be migrated to server 5, where $AV_{app_1} = 0.96 * (1 - (0.01 * 0.05)) * 0.98 = 0.94$.
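The availability computation described for Eqs. (2) to (4) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and the per-server availabilities are read off the two worked formulas in the text (server failure probabilities of 0.01 and 0.05 correspond to availabilities of 0.99 and 0.95).

```python
from math import prod


def functionality_availability(server_availabilities):
    """Eqs. (3)-(4): one minus the probability that every hosting server fails.

    Each entry is the availability of a server hosting one VM of the
    functionality; a server's failure probability is 1 - availability.
    """
    return 1.0 - prod(1.0 - av for av in server_availabilities)


def application_availability(functionalities):
    """Eq. (2): product of the availabilities of all functionalities."""
    return prod(functionality_availability(f) for f in functionalities)


# One inner list per functionality. The second functionality is served by two
# VMs on servers with availabilities 0.99 and 0.95 (assumed from the formulas).
fig_1c = [[0.97], [0.99, 0.95], [0.98]]
# Same application after the first VM moves to a server with availability 0.96.
fig_1d = [[0.96], [0.99, 0.95], [0.98]]

print(round(application_availability(fig_1c), 2))  # 0.95
print(round(application_availability(fig_1d), 2))  # 0.94
```

Representing each functionality as a list of hosting-server availabilities keeps the redundancy structure explicit: adding a standby VM is just appending one more availability to the relevant list.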
To illustrate the application admissibility issue, let's consider an additional scenario where a new application, $app_2$, is requested by another ASP. The application consists of three virtual machines, $vm_4$, $vm_5$, and $vm_6$, and requires an availability of 0.88. Suppose that the CSP has a placement policy that deploys VMs on servers with the highest availability. In this case, $app_2$ would be deployed as shown in Fig. 1e. The availability of $app_2$ can be calculated as $AV_{app_2} = AV_{s_6} * AV_{s_7} * AV_{s_4} = 0.99 * 0.99 * 0.99 = 0.97$, which meets the availability requirement. However, note that the CSP is providing a much higher availability than what the ASP requires for $app_2$. Now, let's assume that another ASP requests a new application, $app_3$, consisting of three virtual machines, $vm_7$, $vm_8$, and $vm_9$, with an availability requirement of 0.97. Based on the current state of the data center, as shown in Fig. 1e, the request for $app_3$ would be denied because the required availability cannot be met. This means that the CSP would lose the profit from hosting $app_3$. However, if the CSP adopts a policy of providing application availability that is close to the requested level, then $app_3$ could be admitted to the data center. Figure 1f shows the placement of all three applications in the data center, with $app_1$, $app_2$, and $app_3$ meeting their respective availability requirements of 0.94, 0.88, and 0.97. By adopting this policy, the CSP can satisfy the requirements of multiple ASPs and maximize its profits.
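The effect of a "close to the requested level" policy can be illustrated with a small brute-force sketch. It is not the paper's algorithm: the function `closest_fit_servers`, the server names, and the availability values are hypothetical, and each application is assumed to have one VM per functionality so that its availability is the product of the chosen servers' availabilities.

```python
from itertools import combinations
from math import prod


def closest_fit_servers(free_servers, n_vms, required_av):
    """Pick n_vms distinct servers whose combined availability meets the
    requirement with the smallest surplus. Returns (names, availability),
    or None when no combination qualifies."""
    best = None
    for combo in combinations(free_servers.items(), n_vms):
        av = prod(a for _, a in combo)
        if av >= required_av and (best is None or av < best[1]):
            best = ([name for name, _ in combo], av)
    return best


# Hypothetical data center: three highly available and three weaker servers.
servers = {"sA": 0.99, "sB": 0.99, "sC": 0.99,
           "sD": 0.96, "sE": 0.96, "sF": 0.96}

# An application requiring 0.88 is served by the weaker servers
# (0.96 * 0.96 * 0.96 is about 0.885) ...
first = closest_fit_servers(servers, 3, 0.88)
remaining = {s: av for s, av in servers.items() if s not in first[0]}
# ... which leaves the highly available servers for a request needing 0.97.
second = closest_fit_servers(remaining, 3, 0.97)
print(first[0], second[0])
```

Had the first application been given the three 0.99 servers instead, the second request could not be met by the remaining ones, which mirrors the rejection of the third application in the scenario above.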

Related work
The literature on cloud computing has numerous studies that focus on different aspects of virtual machine (VM) placement and application task scheduling, such as resource utilization, network performance, and operational costs. However, only a limited number of studies have explored the problem of ensuring end-to-end application service availability. In light of this, we will examine previous research that deals with VM placement and task scheduling in cloud computing, as well as approaches that ensure availability, reliability, and fault tolerance.

Availability-aware VM placement
Jammal et al. [3,4] proposed CHASE, a scheduler that takes into account high availability of application components in cloud-based systems. The authors formulated the scheduling problem as an Integer Linear Programming (ILP) model with the objective of maximizing component availability. To schedule components, CHASE selects servers with the highest availability. However, this work does not consider the problem of application admissibility. The authors used the IBM ILOG CPLEX optimization solver to find the optimal scheduling plan for the components. In another work, Zhu and Huang [5] focused on the availability of Mobile Edge Computing (MEC) applications during the component placement process. The authors proposed a stochastic model to measure the cost of availability impact when changing the placement of components. The heuristic algorithms FirstFit and BestFit were used to place the MEC application. Lera et al. [6] proposed a two-phase placement strategy based on a graph partitioning and traversal approach to address service placement in the fog computing platform for application fault tolerance. The authors optimized the placement process to improve the fault tolerance of applications. Dehury et al. [7] addressed fault tolerance for application components in the cloud. They proposed a fault tolerance strategy based on the significance of each deployed component. The ranks of components were determined based on their communication, failure rate, failure impact, and historical performance. The proposed strategy used a Markov Decision Process (MDP) to determine the number of replicas of each component.
The problem of reliability of VM placement (RVMP) has been addressed in works such as [8,9]. Yang et al. [8] proposed an INLP model to determine the minimum number of computing nodes required to host VMs, ensuring that the VM placement plan's availability meets the requirement and the communication delay between VMs is less than a certain threshold. To solve the RVMP model, the authors used CPLEX. Similarly, Liu et al. [9] also mapped VM placement as an ILP model, but with the additional goals of reducing communication traffic and network bandwidth in the DC while increasing the reliability of hosted VMs. To solve the ILP model, the authors employed a graph k-cut approach. Yang et al. [10] created a variance-based metric to assess the risk of application availability violations during the VM placement process. The authors examined the possibility of Top-of-Rack (ToR) switch and server failures in the DC and formalized VM placement as an ILP model with the goal of reducing resource power consumption while increasing application availability.
The Virtual Network Function (VNF) placement problem in the Network Function Virtualization (NFV) platform has been the subject of several research works, including [11][12][13][14][15][16][17][18][19]. Ayoubi et al. [11] proposed a framework for elastic and dependable Virtual Networks (VNs) embedding in cloud environments, aiming to meet the availability requirement of a VN throughout its lifetime and increase the admissibility of new VNs. The authors modeled a VN as a collection of connected Virtual Network Functions (VNFs), each mapped to a single VM. The approach utilized backup VNFs and a tabu-search optimization method to achieve reliable VNF placement. Alahmad et al. [12] proposed a VNF placement model that prioritizes availability and minimizes Network Service (NS) failure probability in NFV, evaluated using CPLEX. Thiruvasagam et al. [13] tackled the placement of reliable virtual monitoring functions (vMFs) by minimizing communication delay between Service Function Chains (SFCs) in the NS while also reducing the number of vMFs. The authors used CPLEX to determine the best vMF placement strategy. Yala et al. [14] employed a genetic algorithm to determine the VNF placement in a virtual Content Delivery Network (vCDN) and to balance vCDN deployment cost and availability level. Yang et al. [15] addressed stateful VNF placement for NS fault-tolerance and modeled the problem as an optimization function, aiming to increase user request availability. In [16], the authors proposed an availability-aware SFC placement scheme for the NFV substrate network, aiming to reduce the SFC's end-to-end delay. Sharma et al. [17] focused on maximizing the Telecom Service Provider's (TSP) profit by achieving high NS availability in NFV during VNF placement using redundant VNFs and a geographic placement approach. Abdelaal et al. [18] addressed the VNF Forwarding Graph (VNF-FG) deployment problem with the goals of minimizing network bandwidth, convergence time, and resource power consumption while protecting the VNF service from failures using redundant VNF-FGs. Mao et al. [19] proposed an online fault-tolerant SFC placement solution in NFV, modeled as a Markov decision process, using a deep reinforcement learning (DRL) method to maximize the number of accepted user requests.
Several works have proposed cloud fault-tolerance solutions using virtual machine (VM) placement. Li and Qian [20] focused on reducing network traffic in data centers by addressing multitenant cloud VM placement. Jammal et al. [21] addressed the issue of VM placement during live migration to reduce service downtime in the event of a failure. Zhou et al. [22,23] aimed to minimize network resource consumption and increase cloud service reliability through optimal redundant VM placement (ORVMP) using genetic algorithms. Gonzalez and Tang [24] used the FirstFit algorithm to place VM replicas for service fault tolerance. Alameddine [25] proposed a protection plan to determine the number of backup VMs and their placement to meet the availability requirements of critical cloud applications. Cost functions were also used to address VM placement. Chen and Jiang [26] proposed an adaptive selection method for fault-tolerant application service during the VM placement process. Zhang et al. [27] investigated VM placement in cloud DCs using a star topology to minimize SLA violations, power consumption, and failure rate. Tran et al. [28] proposed a proactive fault-tolerant approach for Kubernetes containerized services using Bidirectional Long Short Term Memory (LSTM) node fault prediction and a container-based stateful service migration mechanism. Finally, Saxena et al. [29] proposed the fault-tolerant elastic resource management (FTERM) framework to handle cloud outages based on an online Multi-Input and Multi-Output Evolutionary Neural Network (MIMO-ENN) to predict resource failure and take action.

Fault-tolerance task scheduling
Previous research studies have explored the impact of task scheduling on application task failures in cloud computing clusters. However, many of these studies fail to account for recovery measures for failed tasks or preventative measures for predicted failures. Moreover, they do not assess the application task's availability in meeting specific requirements. Our research sets itself apart by considering the migration of virtual machines (VMs) that host tasks predicted to fail and ensuring that the application meets its availability requirements throughout its operational lifetime.
Several studies have proposed fault tolerance solutions for cloud application task scheduling. Guo et al. [30] developed a fault-tolerant and energy-efficient primary-backup scheduling architecture for real-time tasks in a cloud environment. Marahatta et al. [31] proposed an energy-aware and fault-tolerant dynamic task scheduling scheme that reduces rejection rates by replicating tasks in case of VM failure or delay. Sun et al. [32] introduced a QoS-aware task scheduling model with fault tolerance for an edge-cloud platform, using a primary-backup redundancy approach to improve task availability while adhering to time constraints. Yao et al. [33] analyzed fault-tolerant properties of task scheduling and migrating VMs based on the Primary-Backup model and proposed a fault-tolerant elastic algorithm for task scheduling that considers host and network device faults in a cloud data center. Additionally, Yao et al. [34] presented a hybrid fault-tolerant algorithm for scheduling tasks with deadlines in a cloud platform. The algorithm selects the most suitable fault-tolerant strategy, such as task resubmission or replication, based on the characteristics of the task and available resources. Weikert et al. [35] studied node failure in IoT networks and proposed a task allocation algorithm based on multiple objective optimization. The algorithm utilizes an archive-selection mechanism to identify the most reliable assignment for the backup task in case of node failure. Overall, while previous research has examined the effect of task scheduling on application task failures in cloud computing clusters, our research goes beyond existing works by incorporating migration measures and ensuring that the application meets its availability requirements. Additionally, a range of fault tolerance strategies have been proposed for cloud application task scheduling, including energy-efficient, QoS-aware, and hybrid fault-tolerant algorithms that consider host and network device faults, as well as multiple objective optimization techniques.
Several research studies have leveraged the Google cloud trace dataset [36] to predict application job and task failures in cloud cluster systems. Chen et al. [37] explored the critical characteristics of application job and task failures and used a deep learning Recurrent Neural Network (RNN) to predict such failures. To predict task failure, Soualhia et al. [38] combined machine learning methods, including Decision Tree (DT), Boost, and Random Forest (RF). Jassas and Mahmoud [39,40] compared multiple prediction models, including DT, Logistic Regression (LR), K-Nearest Neighbors (K-NN), Naive Bayes (NB), RF, and Quadratic Discriminant Analysis (QDA), to select the most accurate method. Islam and Manivannan [41] employed a deep learning method called LSTM to predict task failure. While these works focused on predicting failures, other works proposed recovery actions for failing tasks or jobs. For instance, Rosa et al. [42] suggested terminating a job that is predicted to fail to save consumed resources, while Islam and Manivannan [43] proposed rescheduling tasks that are predicted to fail to a more reliable computing node. Soualhia et al. [44] proposed a fault-tolerant task scheduling framework (ATLAS) for Hadoop clusters, which can dynamically reschedule tasks that are predicted to fail. Our previous work [45] also utilized the Google dataset [36] to predict task failure during execution time, proposing three corrective actions to protect the task before it fails: changing the priority, scheduling class level, or task scheduling node. Chen et al. [46] proposed an advanced approach called IWC to improve the search method of the Whale Optimization Algorithm (WOA) for cloud task scheduling. The authors show that IWC finds the optimal task scheduling plan with better speed and accuracy than existing meta-heuristic algorithms. Cheng et al. [47] proposed an enhanced deep reinforcement learning (DRL) method to improve on existing studies that used DRL for job scheduling in cloud platforms. They tried to optimize job execution time while meeting the expected response time of the users. Zhang et al. [48] proposed a new method called GA-DQN that combines DRL and Genetic Algorithms (GA) for scheduling jobs in the cloud. The method benefits from the global search ability of GA and the decision-making awareness of DRL to optimize sub-task scheduling, which reduces the execution times of the jobs and hence improves response time for the end users. Notably, none of these studies computed application service availability in the cloud to meet the requirements during VM placement or task scheduling procedures. Table 1 provides a summary of related work.

Multiple-objectives dynamic VM placement for application availability in cloud framework
We introduce a novel framework for dynamic VM placement in cloud platforms that prioritizes application service availability. Our framework generates and manages a comprehensive placement plan for VMs that provide services inside data centers, adhering to specific requirements to achieve multiple objectives and meet the availability needs of each application as requested by the ASP. Additionally, our framework has the ability to swiftly modify VM placement in response to application scaling or failure events. As shown in Fig. 2, the proposed MoVPAAC (Multiple-Objectives Dynamic VM Placement for Application Availability in Cloud) framework comprises three main modules: the Availability-Aware Application Deployment module, which optimizes VM placement to maximize availability; the Proactive Application Failure Detection module, which uses deep learning algorithms to detect potential application failures and take corrective actions before they occur; and the Dynamic Application Reconfiguration module, which allows for prompt reconfiguration of VM placement in response to application failures or changes in demand. We describe the specific features of each module in the following subsections.

Availability-aware application deployment
The Availability-Aware Application Deployment module is a critical component of our proposed framework, as it is responsible for generating the VM placement plan that will deploy the requested applications on the underlying servers located in the data center (DC). The module ensures that the objectives are achieved, while also taking into consideration the specific requirements of each application, particularly their availability as requested by ASPs. Given a set of applications with their respective requirements, each application is composed of a set of VMs, and each VM provides a specific functionality towards providing end-to-end application services. The goal is to find the optimal placement plan for these VMs on the DC servers, such that power consumption, resource wastage, and server failure ratios are minimized, while ensuring that the availability requirements of the applications are maintained throughout their entire execution times. However, as noted in the Introduction, VM placement is an NP-hard problem, and here its objectives are contradictory. To address this, we have formulated the problem as an INLP optimization model with multiple objectives and constraints. Moreover, we propose a heuristic approach based on the Ant Colony optimization method, in conjunction with the VM standby protection approach, to find a solution for the model and maximize the admissibility of the requested applications. Specifically, we define and formulate the problem statement we address in this manuscript as follows: assume there is a set $A$ of applications that are requested by ASPs. Each application $app_a \in A$ is requested to be deployed at the Data Center (DC), and has an availability requirement that is denoted by $AV_{appReq_a}$.
Each application $app_a$ is composed of a set of VMs $V_a$, and each VM $vm_i \in V_a$ has a set of resource demands such as CPU, RAM, and disk. The VMs of the applications in $A$ need to be placed (hosted) on the underlying set of servers $S$ located in the DC. Each server $s_j \in S$ has a capacity for each resource type, such as CPU, RAM, and disk. The main goal is to deploy (admit) the applications in $A$ at the DC in a way that meets the availability requirement $AV_{app_a} \geq AV_{appReq_a}$ for each $app_a \in A$, and achieves the following objectives. The first objective is to minimize the total power consumption of the active servers that are used to host the VMs that compose the applications in $A$. To compute the power consumption of server $s_j$ in the DC, we adopt the linear relationship between server power consumption and its CPU utilization as described in [49]. We define the average power consumption of server $s_j$ as $P_j$ in Eq. (5), where $P^{active}_j$ and $P^{idle}_j$ are the average power consumption values when $s_j$ is active and idle, respectively, and $U^c_j$ is the CPU utilization of $s_j$, where $U^c_j \in [0, 1]$. The first objective is formulated in Eq. (6), where $V$ is the set that includes all the VMs that compose all the requested applications in $A$, and $y_j$ is a binary decision variable where a value of 1 indicates that $s_j$ is active and a value of 0 indicates that $s_j$ is idle, as defined in Eq. (10). $R^c_i$ is the CPU resource demand of $vm_i$, and $x_{ij}$ is a binary decision variable where a value of 1 indicates that $vm_i$ is placed on $s_j$ and a value of 0 otherwise, as defined in Eq. (11).
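The linear power model behind the first objective can be sketched as follows. This is a hedged illustration rather than the paper's Eqs. (5) and (6), whose bodies are not shown here: it assumes the common linear form in which power grows from the idle value to the active (fully utilized) value in proportion to CPU utilization, and the function names, wattages, and capacities are hypothetical.

```python
def server_power(p_idle, p_active, cpu_utilization):
    """Assumed linear model: idle power plus a utilization-proportional share
    of the gap between active (full-load) and idle power."""
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("CPU utilization must lie in [0, 1]")
    return p_idle + (p_active - p_idle) * cpu_utilization


def total_power(servers, placement, cpu_demand):
    """Sum the power of active servers only (those hosting at least one VM).

    servers:    name -> (p_idle, p_active, cpu_capacity)
    placement:  vm name -> server name
    cpu_demand: vm name -> CPU units requested
    """
    load = {}
    for vm, server in placement.items():
        load[server] = load.get(server, 0.0) + cpu_demand[vm]
    total = 0.0
    for server, used in load.items():
        p_idle, p_active, capacity = servers[server]
        total += server_power(p_idle, p_active, used / capacity)
    return total


# Hypothetical servers: 100 W idle, 200 W at full load, 4 CPU units each.
servers = {"s1": (100.0, 200.0, 4.0), "s2": (100.0, 200.0, 4.0)}
demand = {"vm1": 1.0, "vm2": 1.0}

print(total_power(servers, {"vm1": "s1", "vm2": "s1"}, demand))  # 150.0
print(total_power(servers, {"vm1": "s1", "vm2": "s2"}, demand))  # 250.0
```

The two placements show why consolidation lowers the power objective: spreading the same load over two servers pays the idle power twice.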
The second objective of the Availability-Aware Application Deployment module is to minimize the wastage of resources of active servers in the data center (DC). The cost of wasting resources for server $s_j$ is denoted as $W_j$ and is defined in Eq. (7). The remaining CPU, RAM, and disk resources of server $s_j$ are normalized and represented by $L^c_j$, $L^r_j$, and $L^d_j$ respectively. $U^c_j$, $U^r_j$, and $U^d_j$ represent the normalized resource usage of server $s_j$. To ensure a positive value, we set $\beta$ to a very small value of 0.00001. The second objective is formulated in Eq. (8). $T^c_j$, $T^r_j$, and $T^d_j$ represent the upper utilization thresholds of CPU, RAM, and disk of server $s_j$ respectively. These thresholds are set to the same value for all servers in the DC to prevent any server from reaching a full usage state that could negatively impact its performance. The RAM and disk resource demands of $vm_i$ are represented by $R^r_i$ and $R^d_i$ respectively. The third objective of the module is to minimize the overall failure ratio of servers in the DC. The module computes the failure of server $s_j$ as the complement of its availability, as defined in Eq. (4), where $AV_{s_j}$ is computed as defined in Eq. (1). The third objective is formalized in Eq. (9). By optimizing these objectives in a multi-objective optimization model, the module aims to find a placement plan for VMs on the DC servers that reduces power consumption, resource wastage, and failure ratio while meeting the availability requirements of the applications. To solve this problem, the module uses a heuristic approach based on the Ant Colony Optimization method in conjunction with the VM standby protection approach to maximize the admissibility of the requested applications.
Our VM placement model is governed by a set of constraints. First, each server $s_j$ is either active or idle at any given time, as specified in Eq. (10). To indicate whether a VM $vm_i$ is placed on a particular server $s_j$, we use a binary decision variable $x_{ij}$, as outlined in Eq. (11). Additionally, each VM can be placed on at most one server, as mandated by Eq. (12). To ensure that each server has adequate resources to host its VMs, we impose constraints on the CPU, RAM, and disk capacity of each server; Eqs. (13) through (15) state the resource requirements that must be met for each server. We also enforce an "anti-affinity" constraint, specified in Eq. (16), so that VMs belonging to the same application $app_a$ are not co-located on the same server, which increases the availability of the application. Our work considers the dependency between the components of the same application. For example, peer, active-standby, proxy, and proxied components of the same application should be hosted on different servers. Finally, to ensure that the requested applications are available to the application service provider (ASP) as required, the availability of each application must be greater than or equal to the level requested by the ASP. This requirement is captured in Eq. (17). Balancing these constraints lets us optimize the placement of VMs to meet the needs of both users and service providers.
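The capacity and anti-affinity constraints can be sketched as a single feasibility check. This is an illustration under stated assumptions: the `VM` and `Server` classes and the `can_host` function are hypothetical names, and one shared threshold of 0.8 is applied to CPU, RAM, and disk (the experiments report an 80% upper threshold for CPU and RAM; using the same value for disk is an assumption).

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    app: str      # application the VM belongs to
    cpu: float
    ram: float
    disk: float


@dataclass
class Server:
    name: str
    cpu: float
    ram: float
    disk: float
    hosted: list[VM] = field(default_factory=list)


def can_host(server: Server, vm: VM, threshold: float = 0.8) -> bool:
    """Check anti-affinity and thresholded capacity for placing vm on server."""
    # Anti-affinity: no two VMs of the same application on one server.
    if any(h.app == vm.app for h in server.hosted):
        return False
    # Capacity: usage after placement must stay within threshold * capacity.
    for res in ("cpu", "ram", "disk"):
        used = sum(getattr(h, res) for h in server.hosted) + getattr(vm, res)
        if used > threshold * getattr(server, res):
            return False
    return True
```

The "at most one server per VM" constraint is implicit here as long as the caller appends a VM to a single server's `hosted` list.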

To address the INLP model and determine the placement of VMs for requested applications, we introduce a heuristic algorithm called Availability-Aware Applications Deployment (AvAAD) (Algorithm 1). The AvAAD algorithm employs an Ant Colony optimization approach to achieve its VM placement objectives, and uses a standby protection technique to ensure that the availability requirements of the applications are met. AvAAD takes as input a list of requested applications, their requirements, the available servers at the data center, and the VMs. It returns a list of non-admitted applications as output. Initially, the algorithm initializes three empty variables: paretoSet, violatedAvApps, and nonAdmittedApps. It then calls the MOAntColony algorithm with the VMs and servers as input to obtain a placement plan, and adds every application whose achieved availability violates its requirement (Eq. (17)) to violatedAvApps. For each application in violatedAvApps, the algorithm tries to raise its availability to meet the requirement. Specifically, it attempts to add a new standby VM for the functionality with the minimum availability $AV_{func_f}$ among all the functionalities in the application. The algorithm adds one standby VM at a time until the availability requirement of the application is met or the number of added standby VMs reaches the threshold of $app_a$. The newly added standby VM is placed on the server with the maximum value of $\frac{1}{P_j + W_j + Fail_{s_j}}$ among all servers, without violating any of the constraints defined in Eqs. (10)-(17). This maintains consistency with the objectives of the MoVPAAC framework. After AvAAD handles all violated applications, it checks again for any applications that still violate their availability requirements. If an application still violates its requirement, AvAAD considers it rejected and adds it to the list of non-admitted applications (nonAdmittedApps), which is returned at the end of the algorithm execution. In this way, AvAAD optimizes VM placement while enforcing the application availability requirements of the INLP model.

Algorithm 1 Availability-Aware Application Deployment (AvAAD)
The time complexity of AvAAD (Algorithm 1) can be analyzed as follows. At line 2, the algorithm calls MOAntColony (Algorithm 2) to find the placement plan of the virtual machines (VMs) in $V$ on the servers in $S$. The performance of AvAAD mainly depends on the performance of MOAntColony. AntColony is a meta-heuristic algorithm that takes a polynomial execution time of $O(n^k)$ to find a solution [50]. In the context of the VM placement problem, the value of $k$ mainly depends on the number of iterations, ants, VMs, and servers that AntColony uses to find the placement solution. At lines 3-8, the algorithm takes $O(n)$ to determine the list of applications in violatedAvApps that violate their availability requirements. At lines 9-17, it takes $O(n^2)$ to satisfy the availability of each application that violates its required availability. At lines 18-22, the algorithm takes $O(n)$ to determine the list of rejected applications in nonAdmittedApps that cannot be admitted at the data center (DC) because they violate their availability requirements. Therefore, the total time complexity of Algorithm 1 can be expressed as $O(n^k) + O(n) + O(n^2) + O(n)$, which simplifies to $O(n^k)$. It is worth noting that the performance of the algorithm may vary depending on the input parameters, such as the number of VMs, servers, and applications.
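The standby-protection step of Algorithm 1 can be sketched as follows. The availability model is an assumption, since Eqs. (1)-(4) are not reproduced in this text: a functionality is taken to be available if at least one of its hosting servers is up, and an application is taken to be available only if all of its functionalities are. The function names and the `(score, availability)` candidate format are hypothetical; the score stands for $\frac{1}{P_j + W_j + Fail_{s_j}}$, and candidates are assumed to be pre-filtered for Eqs. (10)-(17).

```python
from math import prod


def func_availability(server_avs: list[float]) -> float:
    """Assumed model: redundant VMs fail only if all hosting servers fail."""
    return 1.0 - prod(1.0 - a for a in server_avs)


def app_availability(funcs: dict[str, list[float]]) -> float:
    """Assumed model: every functionality must be available (series system)."""
    return prod(func_availability(avs) for avs in funcs.values())


def add_standbys(funcs: dict[str, list[float]], required: float,
                 candidates: list[tuple[float, float]], max_standby: int):
    """Add one standby at a time to the weakest functionality.

    funcs maps a functionality to the availabilities of its hosting servers.
    candidates holds (score, availability) pairs of feasible servers.
    Returns the updated mapping and whether the requirement is now met.
    """
    funcs = {f: list(avs) for f, avs in funcs.items()}  # do not mutate input
    pool = sorted(candidates, reverse=True)             # highest score first
    added = 0
    while app_availability(funcs) < required and added < max_standby and pool:
        weakest = min(funcs, key=lambda f: func_availability(funcs[f]))
        _, av = pool.pop(0)
        funcs[weakest].append(av)
        added += 1
    return funcs, app_availability(funcs) >= required
```

An application for which the second return value is false would be moved to nonAdmittedApps in the terms of Algorithm 1.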
To achieve the objectives of application deployment, we propose a heuristic algorithm called Multiple Objectives AntColony (MOAntColony) that uses the Ant Colony Optimization (ACO) algorithm to find the placement of VMs for requested applications. Algorithm 2 outlines the steps of MOAntColony. The algorithm begins by initializing the parameters and pheromone trails. In each iteration, an ant $z$ receives a set of VMs $V$ to be placed on a set of servers $S$ located at the data center (DC). The ant $z$ then selects a server $s_j$ and starts placing the VMs in $V$ on $s_j$ using the pseudo-random-proportional rule [37]. The desirability of selecting the next $vm_i$ to place on $s_j$ depends on the pheromone concentration level and the heuristic information that guides ant $z$. After each movement (placement) step, the local pheromone concentration level is updated. Ant $z$ continues moving until it completes the placement of $V$ and builds its solution. Once all ants have built their solutions, the global pheromone is updated based on the Pareto set $PS$ that includes the best solutions found. The algorithm initializes the pheromone level $\tau_0$ using Eq. (18). Here, $n$ is the total number of VMs that require placement, $P'(sol_0)$ is the normalized power consumption of the servers listed in the initial placement solution $sol_0$ generated by the FirstFit VM placement algorithm, and $W'(sol_0)$ and $Fail'(sol_0)$ are the resource wastage and server failures of $sol_0$, respectively. Equation (19) defines $P'(sol_0)$, where $P_j^{max}$ is the maximum power consumption of server $j$, and $M$ is the total number of servers used in solution $sol_0$. $W'(sol_0)$ and $Fail'(sol_0)$ are defined in Eqs. (22) and (23), respectively.
The heuristic information $\eta_{i,j}$ indicates the desirability for an ant $z$ of placing $vm_i$ on server $s_j$. The desirability $\eta_{i,j}$ considers the partial contribution of each objective. Every ant $z$ begins with $V$ and places the VMs sequentially on the available servers in $S$, which are arranged randomly. The sequence of servers from 1 to $j$ is known during the placement of $vm_i$ on $s_j$. The partial contributions of the first, second, and third objectives are defined in Eqs. (24), (25), and (26), respectively. These contributions are combined for the heuristic placement decision, as defined in Eq. (27). Ant $z$ uses the pseudo-random-proportional rule, as defined in Eq. (28) [37], to select the next VM, $vm_i$, to be placed on server $s_j$. The rule employs the parameter $\alpha$ to control the importance of pheromone trails, and $q$ is a random number between 0 and 1. If $q$ is less than or equal to the fixed value $q_0$ (where $0 < q_0 < 1$), the ant exploits; otherwise it explores, as specified in Eq. (28). $U$ denotes the set of VMs that can be hosted on $s_j$. $\eta_{u,j}$ is the heuristic information defined in Eq. (27), while $\tau_{u,j}$ is the pheromone level, which is updated locally as defined in Eq. (30). Furthermore, $Pr$ denotes the probability distribution of the random-proportional rule, as described in Eq. (29) [37].
The pheromone is updated locally and globally. During the local update, ant $z$ assigns $vm_i$ to $s_j$ and updates the pheromone, as described in Eq. (30). Here, $\tau_0$ represents the initial pheromone level, and $0 < \rho_l < 1$ denotes the local pheromone evaporation parameter. The current iteration is denoted as $t$. The global pheromone update is performed based on the rule stated in Eq. (31), where $0 < \rho_g < 1$ is the global pheromone evaporation parameter. The coefficient defined in Eq. (32) incorporates the number of ants $Z$ and the iterations $T_g$ needed to locate the global solution $sol_g$ in the Pareto set $PS$. Furthermore, $P'(sol_g)$, $W'(sol_g)$, and $Fail'(sol_g)$ represent the normalized power consumption, resource wastage, and failures, respectively, of the servers listed in the solution $sol_g$. It is important to note that Algorithm 2 primarily relies on the Ant Colony metaheuristic optimization algorithm, which requires an execution time of $O(n^k)$ [37]. The value of $k$ depends on the number of iterations $T$, ants $Z$, VMs in $V$, and servers in $S$ used by the Ant Colony algorithm to determine the placement plan for $V$.
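The selection and local-update steps can be sketched as below. Eqs. (28)-(30) are not reproduced in this text, so two assumptions are made and should be read as such: the pheromone and heuristic values are combined as the weighted sum $\alpha \tau_{u,j} + (1-\alpha)\eta_{u,j}$, and the local update follows the standard Ant Colony System rule $\tau \leftarrow (1-\rho_l)\tau + \rho_l \tau_0$. The function names and default parameter values are hypothetical.

```python
import random


def select_vm(candidates: list[str], tau: dict[str, float], eta: dict[str, float],
              alpha: float = 0.5, q0: float = 0.8, rng=random) -> str:
    """Pick the next VM for the current server (pseudo-random-proportional rule).

    tau and eta hold the pheromone and heuristic values of each candidate VM
    for the current server; scores are assumed to be positive.
    """
    score = {u: alpha * tau[u] + (1.0 - alpha) * eta[u] for u in candidates}
    if rng.random() <= q0:
        # Exploitation: take the candidate with the highest combined score.
        return max(candidates, key=score.get)
    # Exploration: sample a candidate with probability proportional to its score.
    return rng.choices(candidates, weights=[score[u] for u in candidates])[0]


def local_update(tau_value: float, tau0: float, rho_l: float = 0.1) -> float:
    """Assumed ACS-style local pheromone update toward the initial level tau0."""
    return (1.0 - rho_l) * tau_value + rho_l * tau0
```

A larger $q_0$ makes the ants greedier, while a smaller value spreads the search over more candidate placements.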

Proactive application failure detection
The proactive application failure detection module detects application failure at an early stage, before it actually occurs. Service outages caused by application failures can significantly degrade QoS, SLA compliance, and the end-user experience. The module uses a proactive approach to predict task failure, regardless of its type, from a historical dataset. The dataset includes historical information about task failures and their types, such as network, hardware, and software failures. Note that the module does not react to instant failures of any type. Detecting failures at an early stage allows appropriate service recovery actions to be taken quickly. The module adopts a polling communication approach to obtain information about the current status of the cluster and the hosted applications from the Cloud Manager. This information is used as historical data for training and testing the prediction method, an Artificial Neural Network (ANN), to predict application failure. To validate this module, we conducted an analysis of the Google dataset [36] in our previous work [45]. This dataset consists of logs of application jobs and their associated tasks executed on a cloud cluster for 29 consecutive days in 2011. We extracted information about the resources required and used by each task, as well as the termination status (finished, failed, evicted, or killed) of the tasks. Out of 48,261,777 tasks, 38% terminated successfully, while 29% failed. Through our analysis, we identified several features that were correlated with task termination status, including the task ID, job ID, machine ID, CPU and RAM demands, mean CPU and RAM usage, and termination status. We trained a deep learning ANN on this data to predict task failure. To detect predicted failed tasks and initiate recovery actions, our proactive application failure module employs the approach outlined in Algorithm 3. The input of the algorithm is a list of tasks whose termination status needs to be predicted, and it returns a list of predicted failed tasks. It is worth noting that the ANN is trained and tested on a cleaned and prepared dataset before it is used by Algorithm 3. In terms of time complexity, Algorithm 3 takes $O(n)$ time to predict the termination status of each task in the input list. By proactively detecting and responding to application failure, we can reduce service outages and maintain high levels of QoS and SLA compliance.
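The single pass of Algorithm 3 over the task list can be sketched as follows. The trained ANN is represented by an arbitrary callable, and the `toy_predictor` shown here is a hypothetical stand-in rule used only to make the example runnable; it is not the model described above, and the dictionary keys are assumed names.

```python
from typing import Callable, Iterable


def detect_failed_tasks(tasks: Iterable[dict],
                        predict_status: Callable[[dict], str]) -> list[dict]:
    """Return the tasks whose predicted termination status is 'failed'.

    One prediction per task, so the pass is linear in the number of tasks.
    """
    return [task for task in tasks if predict_status(task) == "failed"]


def toy_predictor(task: dict) -> str:
    """Hypothetical stand-in for the trained ANN (illustration only)."""
    return "failed" if task["mean_cpu_usage"] > task["cpu_demand"] else "finished"
```

In a deployment, `predict_status` would wrap the trained network's inference call, and the returned list would be handed to the reconfiguration module.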

Dynamic application reconfiguration
The dynamic application reconfiguration module handles reconfiguration requests that arise when the availability requirements of provisioned applications are at risk of being violated. These requests can originate either from the proactive application failure module, which notifies the module of predicted failed applications, or from the cloud manager, which sends scaling requests. In the case of a proactive notification, the module adds a new VM to replace the existing VM responsible for each predicted failed task. The placement of these new VMs is crucial to the successful recovery of the application services. The proposed placement process is designed to fulfill the objectives in Eqs. (6), (8), and (9) while respecting the constraints in Eqs. (10) through (17), which aligns with the objectives of the MoVPAAC framework. Algorithm 4 outlines the placement procedure for these new VMs to recover the application services. The algorithm takes as input a list of application tasks predicted as failed, failPredTasksList, and a list of servers, $S$. It returns a map that contains the placement of the new VMs required to provision the failed tasks. For each failed task in failPredTasksList, the algorithm adds a new VM to provide the task and searches for a server $s_j \in S$ that can host the VM and has the minimum sum of power consumption, resource waste, and failure without violating any constraint defined in Eqs. (10) through (17). The algorithm then adds the record $\langle vm_i, s_j \rangle$ to the map vmsPlacementMap. Finally, the algorithm returns vmsPlacementMap. The time complexity of Algorithm 4 is $O(n^2)$ because, for each added VM, the algorithm searches for the best server $s_j$ in $S$ that can host the VM.
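The recovery placement described above can be sketched as a greedy search. This is a simplified illustration: only a CPU capacity check stands in for the constraints of Eqs. (10)-(17), the per-server cost ($P_j + W_j + Fail_{s_j}$) is treated as a fixed number rather than recomputed after each placement, and the data layout and names other than failPredTasksList and vmsPlacementMap are hypothetical.

```python
def place_recovery_vms(fail_pred_tasks: list[tuple[str, float]],
                       servers: dict[str, dict[str, float]]) -> dict[str, str]:
    """Place one new VM per predicted-failed task on the cheapest feasible server.

    fail_pred_tasks: (task_id, cpu_demand) pairs.
    servers: name -> {"free_cpu": ..., "cost": ...}; free_cpu is updated in place.
    """
    vms_placement_map: dict[str, str] = {}
    for task_id, cpu in fail_pred_tasks:
        feasible = [name for name, s in servers.items() if s["free_cpu"] >= cpu]
        if not feasible:
            continue  # no server can host this VM; the task stays unplaced
        best = min(feasible, key=lambda name: servers[name]["cost"])
        servers[best]["free_cpu"] -= cpu
        vms_placement_map[f"vm_for_{task_id}"] = best
    return vms_placement_map
```

Each task triggers one scan over the servers, which is where the quadratic bound comes from when tasks and servers grow together.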
The cloud manager at the CSP can request one of four scaling types: scaling out, scaling up, scaling in, or scaling down. A scaling-out request adds a set of new virtual machines (VMs), while a scaling-up request adds virtual resources, such as virtual central processing units (vCPUs) and virtual random-access memory (vRAM), to a set of existing VMs. A scaling-in request removes a set of existing VMs, and a scaling-down request removes virtual resources from a set of existing VMs. If the request is for application scaling out, the reconfiguration module handles the placement of the new VMs in the same way that it handles requests from the proactive application failure module. However, in some cases, scaling up may require migrating VMs to other servers that can accommodate the updated resources without violating any constraints. The migration process must be done carefully, as it can significantly affect the outage period of the application service. The problem can be summarized as finding a way to migrate all the affected VMs with minimum migration time while obeying the constraints.
To solve the problem, we propose an integer nonlinear programming (INLP) model with the objective of minimizing the migration time of the VMs that need to be migrated while obeying the constraints. The model includes a set of VMs that need to be migrated ($G$), a set of available servers at the data center ($S$), and the time to migrate a VM from a source server to a destination server ($migrationTime_{i,j,d}$). Binary decision variables ($x_{ij}$ and $z_{id}$) indicate the hosting server of each VM and whether a VM needs to be migrated to a specific server, respectively. We also propose a heuristic approach, described in Algorithm 5, to solve the INLP model and find the placement servers of the VMs that require scaling. The algorithm takes as input the set of VMs that need to be scaled (vmsScaleList), the available servers ($S$), and the scaling type (scaleType), and returns a map that contains the placement of the VMs on the servers in the data center. Note that Algorithm 5 is called for one application at a time, namely the application whose needs the scaling request is meant to fulfill. If the scaling type is out, the algorithm searches for a server that can host each added VM with the minimum sum of power consumption, resource waste, and failure, while meeting all the constraints. For scaling up, the algorithm determines which VMs need to be migrated and finds a destination server that minimizes the total migration time. For scaling in and down, the algorithm rejects any scaling action that violates the application availability requirement constraint. The time complexity of Algorithm 5 can be analyzed as follows. For a scale-out request at lines 1-15, the algorithm searches for the best server with minimum cost that can host each $vm_i$. Since this operation is performed for each $vm_i$, its time complexity is $O(n^2)$, where $n$ is the number of available servers. Similarly, for a scaling-up request at lines 16-33, the algorithm searches for the server that can host each $vm_i$ with minimum migration time. Again, this operation is performed for each $vm_i$, resulting in a time complexity of $O(n^2)$. For scaling requests of type in or down, the algorithm takes $O(n)$ to check whether the scaling action should be taken or rejected. Overall, the time complexity of the algorithm is the sum of the time complexities of each operation, which is $O(n^2) + O(n^2) + O(n)$. This simplifies to $O(n^2)$. Therefore, the time complexity of the algorithm is quadratic in the number of available servers $n$.
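The scaling-up branch can be sketched as a greedy migration planner. This is an illustration under simplifying assumptions, not Algorithm 5 itself: only CPU capacity is checked, the anti-affinity and availability constraints are omitted, a VM stays in place whenever its current server can absorb the extra demand, and choosing the fastest destination per VM does not guarantee the minimum total migration time. The function name and data layout are hypothetical.

```python
from typing import Optional


def plan_scale_up(vms: dict[str, tuple[str, float, float]],
                  free_cpu: dict[str, float],
                  migration_time: dict[tuple[str, str, str], float]
                  ) -> dict[str, Optional[str]]:
    """Decide, per VM, whether to stay or migrate after a scale-up request.

    vms: vm -> (current_server, old_cpu, new_cpu).
    free_cpu: server -> spare CPU, updated in place as decisions are made.
    migration_time: (vm, source, destination) -> time to migrate.
    Returns vm -> hosting server after scaling, or None if no server fits.
    """
    plan: dict[str, Optional[str]] = {}
    for vm, (src, old_cpu, new_cpu) in vms.items():
        extra = new_cpu - old_cpu
        if free_cpu[src] >= extra:
            free_cpu[src] -= extra          # scale in place, no migration
            plan[vm] = src
            continue
        targets = [d for d in free_cpu if d != src and free_cpu[d] >= new_cpu]
        if not targets:
            plan[vm] = None                 # request cannot be served
            continue
        dst = min(targets, key=lambda d: migration_time[(vm, src, d)])
        free_cpu[dst] -= new_cpu
        free_cpu[src] += old_cpu            # source releases the old allocation
        plan[vm] = dst
    return plan
```

A scale-in or scale-down handler would instead recompute the application availability after the proposed removal and reject the request if the requirement of Eq. (17) no longer holds.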

Experiments and results
To evaluate the effectiveness of the MoVPAAC framework, we conducted a variety of experiments testing its modules and algorithms. As a proof of concept, we developed a simulation in C++ that models the key elements of the framework, including data centers, servers, VMs, and applications. All experiments were conducted on a single 64-bit Windows 10 machine equipped with an Intel Core i7-8665U 2.11 GHz processor and 16 GB of RAM, so that results are comparable across runs.
We conducted experiments to evaluate the performance of the availability-aware application deployment module in the proposed MoVPAAC framework. The experiments were divided into two groups. The first group consisted of application deployment requests with no standby VMs. It includes four deployment requests submitted by different ASPs, each covering a different number of applications. Each application has an availability requirement (Req Availability), a number of functionalities that compose the application (Funct No), which we include to simulate real-world applications and emphasize the concept of redundancy, and a number of VMs (VMs No) that host the components providing those functionalities. We simulated one DC with 85 heterogeneous servers. Server properties were randomly generated using a uniform distribution: CPU and RAM capacities range from 8 to 15 units, and availability levels range from 0.7 to 0.99. For all servers, $P^{active}$ and $P^{idle}$ were set to 215 and 162, respectively. Table 2 describes the structure, number of VMs, and availability requirements of the applications. VM CPU and RAM demands were randomly generated using a uniform distribution with values ranging from 2 to 5 units. We submitted each request in Table 2 separately to the availability-aware application deployment module to deploy the applications and return the VM placement plan. We used the MOAntColony algorithm with a fixed number of iterations and ants for VM placement. We compared the placement results generated by the AvAAD algorithm with three baseline algorithms from the literature: CHASE [4], Convolutional Neural Network (CNN) [51], and FirstFit. We selected these algorithms based on their awareness of application availability during VM placement. CHASE is aware of application availability and is very close to our work, CNN is not aware of it but we incorporate application availability into it, and FirstFit is unaware of it. To the best of our knowledge, existing VM placement algorithms do not consider application availability as an objective. Table 3 summarizes the parameters used in the experiments. To train and test the CNN, we simulated VM placement for 1000 applications and recorded both their required availability and the availability achieved after placement at the DC.
We compared the availability of the applications deployed by the proposed AvAAD algorithm with that achieved by the three baseline VM placement algorithms, CHASE, CNN, and FirstFit, to evaluate the ability of the deployment module to deploy applications while satisfying their availability requirements. We computed the availability of all applications after deployment and compared it with the availability requested by the ASPs. Fig. 3a shows the availability achieved by the placement that each algorithm suggests for the 5 applications of request 1 in Table 2. As Fig. 3a shows, the AvAAD algorithm met the availability requirements, since the achieved availability is greater than or equal to the requested availability for all requested applications. The CHASE algorithm violated the availability requirements of 4 of the 5 requested applications, the CNN algorithm met the availability of 3 of the 5 requested applications, and FirstFit did not meet the availability requirement of any requested application. Application admissibility refers to the ability to host (place) the application at the data center and meet its requirements, including the requested availability. If the placement suggested by an algorithm meets the application's requirements, we count the application as admitted. In other words, availability plays a major role in the decision to admit or reject the application. As Fig. 3c shows, for request 1, AvAAD admitted all 5 requested applications since it met their availability, CHASE admitted 1 application and violated 4 of the 5 requested, CNN admitted 3 and violated 2 of the 5 requested, and FirstFit did not admit any application since it violated their requested availability. The AvAAD algorithm is fully aware of the requested application availability, so it searches for any possible VM placement that meets the availability requirement of the application. CHASE tries to select the servers with maximum availability to host the VMs, but it does not consider the availability of the application as a whole. As a result, CHASE can assign the VMs of an application with a low availability requirement to highly available servers, and the VMs of an application with a high availability requirement to servers with low availability. CNN learns from the historical applications hosted on the same DC, so once trained it can predict the requested availability and place the VMs of the application accordingly. Therefore, CNN achieved relatively good results, approaching those of AvAAD. FirstFit is not aware of application availability at all; it places each VM on the first available server. Therefore, FirstFit achieved the worst results in terms of availability. We selected the FirstFit algorithm to emphasize that existing VM placement algorithms do not consider application availability, which can affect the quality of the application service and the experience of the end users. For all requests in the first group, Fig. 3b shows that both AvAAD and CNN achieved a mean availability close to the mean required availability of the applications, while CHASE and FirstFit achieved a mean availability far from the required one.
To evaluate the performance of the servers in the data center under the various placement algorithms, we analyzed the mean power consumption of the servers that host the VMs of the requested applications for the first and second groups. As seen in Fig. 3d, AvAAD has a higher power consumption than the CHASE, CNN, and FirstFit algorithms. This can be attributed to the AvAAD approach of adding extra standby VMs to meet the availability requirements of those applications that would otherwise violate them. The additional standby VMs consume more power, which raises the overall power consumption. We also computed the mean CPU and RAM utilization of the servers after the deployment of the applications for each request. Figures 3e and 3f show the CPU and RAM utilization of the servers, respectively. We treat resource utilization as an indicator of resource usage: the higher the utilization, the better the usage and the lower the wastage. AvAAD achieved stable and high CPU and RAM utilization, as one of its primary objectives is to minimize resource wastage. It is worth noting that the CPU and RAM utilization of AvAAD does not exceed 80%, unlike the other algorithms, which sometimes exceed this ratio for certain requests. This is because we set an upper threshold of 80% for both CPU and RAM utilization to prevent any server from becoming fully loaded with VMs, which could harm both the availability and the performance of the server.
In the second group of experiments, we included standby virtual machines (VMs) in the applications to recover the application service in case of active VM failure. The structure of the applications in the second group is described in Table 4. We kept the same VM and server properties as in the first group of experiments, except that, for illustrative purposes, we randomly generated the server availability values using a uniform distribution with a new range of 0.6-0.9. Figure 4a displays the availability achieved by each placement algorithm for the six applications that belong to request number 6 of group 2 in Table 4. The AvAAD algorithm can satisfy the availability requirements of the applications without adding extra standby VMs, which helps reduce the overall power consumption in the data center, as shown in Fig. 4d. The CNN and CHASE algorithms could satisfy the availability requirements of most requested applications because standby VMs are present and both algorithms target availability during the VM placement process. The FirstFit algorithm still violates the availability requirements of most requested applications because its approach is not aware of the availability concept. Therefore, application admissibility is high for all algorithms except FirstFit, as shown in Fig. 4c. As shown in Fig. 4d, AvAAD and CNN result in servers consuming less power than CHASE and FirstFit, as AvAAD does not need to add extra standby VMs for the protection approach, and CNN searches for similar applications that have lower power consumption and can meet the requested availability.
We compared the performance of the four VM placement algorithms by measuring the execution time required to place large sets of VMs, ranging from 500 to 3000. The results are presented in Fig. 5a. The AvAAD algorithm outperformed CHASE in terms of execution time: it took around 1700 seconds to place 500 VMs, while CHASE required around 2215 seconds to do the same. This is because CHASE requires an optimization solver to find the solution that maximizes the availability of the hosted VMs, which usually takes extra time. AvAAD still takes a long time to find a placement solution for a large set of VMs because it uses the AntColony meta-heuristic to find the initial placement of VMs for all requested applications. CNN, on the other hand, took less time than both AvAAD and CHASE because it only needs to predict the placement of the VMs based on the historical dataset. FirstFit was the fastest algorithm, taking less than a second to place the same number of VMs, because it only looks for the first available server that can host the current VM.
To see the impact of AntColony on the performance of the AvAAD algorithm, we measured the execution time of AvAAD when placing 500 VMs with different numbers of ants and two different numbers of iterations, 10 and 15. As Fig. 5b shows, the execution time increases with the number of ants and iterations. For example, with 12 ants, AvAAD took around 1700 seconds to place 500 VMs using 10 iterations, and around 1900 seconds to place the same number of VMs using 15 iterations.
To evaluate the effectiveness of the Artificial Neural Network (ANN) prediction method for application task failure, we used the same ANN structure as [45]. Our training and testing process employed a cleaned dataset containing 1 million tasks over 100 epochs, with 500,000 tasks marked as "finished" and the other half marked as "failed". The accuracy and error loss of the ANN method were computed as illustrated in Fig. 6a and b, respectively. Accuracy denotes the percentage of correct predictions of task termination status. The ANN achieved a high accuracy of up to 93%, whereas the error loss was low, up to 14%.

Conclusion
In this research, we address the challenge of the dynamic placement of virtual machines (VMs) in the cloud, with a focus on ensuring application availability. To achieve this, we formalize the concept of application availability and model the dynamic VM placement problem as an INLP model with multiple objectives and a set of constraints. We propose a comprehensive framework that includes three modules to handle VM placement during deployment, failure, and scaling requests. The deployment module uses an AntColony optimization algorithm and a VM standby protection approach to achieve multiple objectives and satisfy the availability requirements of the applications. The results demonstrate that our proposed VM placement algorithm outperforms the CHASE, CNN, and FirstFit algorithms in terms of application service availability, the number of applications accommodated, and CPU and RAM utilization. The prediction module of our framework employs a deep learning ANN to predict application task failure, with an accuracy of up to 93% and a low error loss of up to 14%. Finally, the dynamic application reconfiguration module of the framework uses a heuristic approach to migrate VMs during scaling-up requests. The migration solution is capable of migrating VMs with a lower migration time without compromising the availability requirements of the applications.
As future work, we plan to validate the overall performance of the proposed MoVPAAC framework on a large dataset and to compare it with more existing methods from the literature. For example, the communication cost between the modules of the framework remains to be validated. In addition, we plan to incorporate the concept of application availability into existing cloud simulators such as CloudSim and to validate our work using them.

Algorithm 3 Proactive Application Failure Detection
Algorithm 4 VM Placement for Application Recovery
Algorithm 5 VM Placement for Application Scaling

Fig. 3 Evaluation of Availability-Aware Application Deployment Module - Group 1

Abbreviations
$A$: Set of requested applications for deployment
$AV_{s_j}$: Availability of server of index $j$
$AV_{func_f}$: Availability of functionality of index $f$
$s_j$: Server of index $j$

Table 1 Summary of related work

Fig. 2 Multiple-Objectives Dynamic VM Placement for Application Availability in Cloud (MoVPAAC) Framework

Table 2 Description of application requests - group 1

Table 3 Parameters used in the experiments

Table 4 Description of application requests - group 2