Combination model for resource management based on the ant colony algorithm
QoS assessment of the management portfolio
Let WS = {WSi | i = 1, 2, …, n} be a set of n subtasks that need to be completed, and let wsi = {wsij | j = 1, 2, …, mi} be the class of candidate web services registered under the UDDI specification that can complete subtask WSi, where mi is the number of services in the class. Let Ii = {ti, ci, ri, …} be the set of QoS evaluation indicators for service class wsi, where ti is the time indicator, ci is the price indicator, ri is the reliability indicator, and the ellipsis represents extensible quality indicators. Although each service class has its own indicator set, the ti, ci, and ri of the candidate web services can be dynamically combined to compute public evaluation indicators for every service class, namely QoS = {execution time, execution cost, reliability}.
Definition 1 Execution time. Let T(wsi) be the execution time of service subtask wsi; then, \({WS}_{QoS}^{Time}=\sum \limits_{i=1}^nT\left({ws}_i\right)\) is the execution time of the discovery process. When a subtask is executed sequentially over k service components, \(T\left({ws}_i\right)=\sum \limits_{j=1}^kT\left({ws}_j\right)\); when the subtask is executed in parallel over k service components, T(wsi) = max(T(wsj)), j = 1, 2, …, k.
Definition 2 Execution cost. Let C(wsi) be the execution cost of web service subtask wsi; then, \({WS}_{QoS}^{Cost}=\sum \limits_{i=1}^nC\left({ws}_i\right)\) is the execution cost of the web discovery process.
Definition 3 Service reliability. Let R(wsi) be the service reliability of service subtask wsi; then, \({WS}_{QoS}^{Reliability}=\prod \limits_{i=1}^nR\left({ws}_i\right)\) is the reliability of the discovery process.
Multi-objective ant colony algorithm
Since the goal of the dynamic web service composition problem is to select a suitable service instance from among the candidates for each discovered subtask, a pheromone value τij is associated with selecting instance ksij for subtask tki, together with heuristic information ηij for that selection. When the algorithm is initialized, the pheromones are set to the initial value τij = τ0, 1 ≤ i ≤ n, 1 ≤ j ≤ mi. Multiple QoS parameters with different characteristics are considered in the model; to perform multi-objective optimization, different types of heuristic information need to be defined.
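As a minimal sketch of this initialization, one pheromone value τij can be stored per candidate instance ksij of each subtask tki; the values of tau0, n, and m below are illustrative assumptions, and for simplicity every subtask is given the same number of candidates:

```python
# Pheromone initialization for the service-composition ant colony model:
# tau[i][j] is the pheromone on selecting candidate instance ks_i^j for
# subtask tk_i. tau0, n, m are illustrative parameters, not from the text.
tau0 = 1.0          # initial pheromone level tau_0
n, m = 4, 5         # n subtasks, m candidate instances each (assumed equal)
tau = [[tau0 for _ in range(m)] for _ in range(n)]
```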
Reliability-prioritized heuristic information
The RP heuristic information guides ants to select highly reliable web service instances. If an ant’s heuristic type is RP, the heuristic information for selecting ksij for subtask tki can be expressed as:
$${\eta}_{ij}={RP}_{ij}=\frac{{ks}_i^j\cdot r-{\mathit{\min}}_{-}{reliability}_i+1}{{\mathit{\max}}_{-}{reliability}_i-{\mathit{\min}}_{-}{reliability}_i+1}$$
(1)
Here, \({\mathit{\min}}_{-}{reliability}_i={\mathit{\min}}_{1\le j\le {m}_i}\left\{{ks}_i^j\cdot r\right\}\) and \({\mathit{\max}}_{-}{reliability}_i={\mathit{\max}}_{1\le j\le {m}_i}\left\{{ks}_i^j\cdot r\right\}\). This formula ensures that the heuristic information is normalized to the interval (0, 1] and that the higher the reliability of a web service instance is, the greater the value of its heuristic information.
Time-prioritized heuristic information
The TP heuristic information guides ants to select a web service instance with a short execution time. If an ant’s heuristic type is TP, the heuristic information for selecting ksij for subtask tki can be expressed as:
$${\eta}_{ij}={TP}_{ij}=\frac{{\mathit{\max}}_{-}{time}_i-{ks}_i^j\cdot t+1}{{\mathit{\max}}_{-}{time}_i-{\mathit{\min}}_{-}{time}_i+1}$$
(2)
Here, \({\mathit{\min}}_{-}{time}_i={\mathit{\min}}_{1\le j\le {m}_i}\left\{{ks}_i^j\cdot t\right\}\) and \({\mathit{\max}}_{-}{time}_i={\mathit{\max}}_{1\le j\le {m}_i}\left\{{ks}_i^j\cdot t\right\}\). This formula ensures that the heuristic information is normalized to the interval (0, 1] and that the shorter the execution time of a web service instance is, the greater its heuristic information value.
Cost-prioritized heuristic information
The CP heuristic information guides ants to select a web service instance with a low execution cost. If an ant’s heuristic type is CP, the heuristic information for selecting ksij for subtask tki can be expressed as:
$${\eta}_{ij}={CP}_{ij}=\frac{{\mathit{\max}}_{-}{cost}_i-{ks}_i^j\cdot c+1}{{\mathit{\max}}_{-}{cost}_i-{\mathit{\min}}_{-}{cost}_i+1}$$
(3)
Here, \({\mathit{\min}}_{-}{cost}_i={\mathit{\min}}_{1\le j\le {m}_i}\left\{{ks}_i^j\cdot c\right\}\) and \({\mathit{\max}}_{-}{cost}_i={\mathit{\max}}_{1\le j\le {m}_i}\left\{{ks}_i^j\cdot c\right\}\). This formula ensures that the heuristic information is normalized to the interval (0, 1] and that the lower the execution cost of a web service instance is, the greater the value of its heuristic information.
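The three heuristics of Eqs. (1)-(3) can be computed directly from the candidates' QoS values. The following sketch uses illustrative function names and plain lists of per-candidate values, which are not part of the original model:

```python
def rp(reliabilities, j):
    """Reliability-prioritized heuristic (Eq. 1): higher reliability -> larger value."""
    lo, hi = min(reliabilities), max(reliabilities)
    return (reliabilities[j] - lo + 1) / (hi - lo + 1)

def tp(times, j):
    """Time-prioritized heuristic (Eq. 2): shorter execution time -> larger value."""
    lo, hi = min(times), max(times)
    return (hi - times[j] + 1) / (hi - lo + 1)

def cp(costs, j):
    """Cost-prioritized heuristic (Eq. 3): lower execution cost -> larger value."""
    lo, hi = min(costs), max(costs)
    return (hi - costs[j] + 1) / (hi - lo + 1)
```

Note that the "+1" terms in numerator and denominator keep every value strictly positive, so even the worst candidate retains a nonzero selection probability.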
Resource scheduling model for big data processing
Parameter definitions
Definition 1 Assume that the finite set of physical machines in the current streaming big data processing platform is N = {N1, N2, …, Nd}, where the resource configuration of each physical machine is \({N}_d=<{total}_d^{cpu},{total}_d^{mem}>\). To determine the quantitative indicators for the combined scheduling strategy, the resource utilization of each node must be quantified. In this paper, the CPU and memory computing resources are considered separately when performing scheduling and quantifying the resource utilization rate on each node.
Definition 2 The node resource utilization Ud is calculated as the ratio of the actual amount of resources occupied on each node to the total amount of resources available at that node during operation. The CPU and memory resource utilization on a node are calculated using the following formulas:
$${\displaystyle \begin{array}{c}{U}_d^{cpu}=\frac{\sum \limits_{j=1}^n{R}_{dj}^{cpu}}{total_d^{cpu}}\\ {}{U}_d^{mem}=\frac{\sum \limits_{j=1}^n{R}_{dj}^{mem}}{\ {total}_d^{mem}}\end{array}}$$
(4)
Here, Ucpud and Umemd represent the CPU and memory resource utilization, respectively, of the physical node Nd, and \(\sum \limits_{j=1}^n{R}_{dj}^{cpu}\) and \(\sum \limits_{j=1}^n{R}_{dj}^{mem}\) represent the sums of the CPU and memory resource usage, respectively, of the computing containers running on the physical node Nd.
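Equation (4) is a ratio of summed container usage to node capacity. A minimal sketch, in which the function name and the (cpu, mem) tuple representation of each container's usage are assumptions:

```python
def node_utilization(containers, total_cpu, total_mem):
    """CPU and memory utilization of one physical node (Eq. 4).

    containers: list of (cpu, mem) resource usage for each computing
    container running on the node."""
    used_cpu = sum(c for c, _ in containers)
    used_mem = sum(m for _, m in containers)
    return used_cpu / total_cpu, used_mem / total_mem
```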
Scheduling operation timing
For a given computing container, one can first determine whether the computing container requires resource rescheduling. The judgement rule for this purpose is:
$${\displaystyle \begin{array}{l}{PR}_{i\left(n+1\right)}^{cpu}\ne {AR}_{in}^{cpu}\\ {}{PR}_{i\left(n+1\right)}^{mem}\ne {AR}_{in}^{mem}\end{array}}$$
(5)
where \({PR}_{i\left(n+1\right)}^{cpu}\) and \({PR}_{i\left(n+1\right)}^{mem}\) represent the predicted CPU and memory resources, respectively, needed by the i-th computing container in the (n + 1)-th time window, and ARcpuin and ARmemin represent the CPU and memory resources, respectively, actually assigned to the i-th computing container in the n-th time window. That is, whenever the actually allocated resource amount differs from the predicted amount, resource rescheduling must be performed on the computing container, and the container is added to the resource rescheduling queue (RSQ).
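The judgement rule of Eq. (5) and the RSQ can be sketched as follows; the function name, the tuple representation, and the sample values are illustrative assumptions:

```python
def needs_rescheduling(pr_cpu, pr_mem, ar_cpu, ar_mem):
    """Judgement rule (Eq. 5): reschedule whenever either predicted
    resource amount differs from the currently allocated amount."""
    return pr_cpu != ar_cpu or pr_mem != ar_mem

# Build the resource rescheduling queue (RSQ) from (predicted, allocated)
# pairs for each container; values here are made-up examples.
rsq = []
for cid, (pr, ar) in enumerate([((2, 8), (2, 8)), ((3, 8), (2, 8))]):
    if needs_rescheduling(pr[0], pr[1], ar[0], ar[1]):
        rsq.append(cid)
```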
Calculation of resource increase and decrease
First, the predicted resource value \({PR}_{i\left(n+1\right)}=\left({PR}_{i\left(n+1\right)}^{cpu},{PR}_{i\left(n+1\right)}^{mem}\right)\) for the i-th computing container in the (n + 1)-th time window and the actually configured resource amount \({AR}_{in}=\left({AR}_{in}^{cpu},{AR}_{in}^{mem}\right)\) for that container in the n-th time window are obtained; then, the resource adjustment ΔRi(n + 1) for container i in the (n + 1)-th time window can be calculated.
$${\displaystyle \begin{array}{l}\Delta {R}_{i\left(n+1\right)}=<\Delta {R}_{i\left(n+1\right)}^{cpu},\Delta {R}_{i\left(n+1\right)}^{mem}>\\ {}\Delta {R}_{i\left(n+1\right)}^{cpu}={PR}_{i\left(n+1\right)}^{cpu}-{AR}_{in}^{cpu}\\ {}\Delta {R}_{i\left(n+1\right)}^{mem}={PR}_{i\left(n+1\right)}^{mem}-{AR}_{in}^{mem}\end{array}}$$
(6)
Note that the predicted CPU and memory resource values for each computing container may be either smaller or greater than the currently configured resource amounts. Accordingly, when \(\Delta {R}_{i\left(n+1\right)}^{cpu}\) or \(\Delta {R}_{i\left(n+1\right)}^{mem}\) is greater than 0, the corresponding CPU or memory resources are increased; when it is less than 0, they are reduced.
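Equation (6) is a component-wise difference between prediction and allocation; a short sketch with an assumed tuple representation:

```python
def resource_delta(pr, ar):
    """Resource adjustment (Eq. 6): predicted minus allocated, per resource.

    pr = (PR_cpu, PR_mem), ar = (AR_cpu, AR_mem). A positive component
    means resources are added; a negative one means they are released."""
    return pr[0] - ar[0], pr[1] - ar[1]
```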
Theory related to cloud workflows
Scenario model
The core business process analysis of the platform is as follows:
- First, a service requester logs into the system using a legal user name and password and starts a service application in accordance with the workflow rules of the company. The application process mainly includes entering the application data, submitting the application, and waiting for the application to be accepted.
- Second, the acceptor at the acceptance centre accepts the service application data information, checks the business data, accepts the service application, issues an acceptance opinion, and reviews the workflow.
- Then, the dispatcher at the dispatching centre reviews the business data information, reviews the acceptance result, and distributes the event.
- Finally, the dispatcher feeds the audit opinion back to the acceptor. The dispatcher distributes the event to the squad leader in accordance with the business demand. The squad leader calculates the allocation and waits for the decision-maker to issue the order, and the implementation department begins the business implementation process after receiving the instruction. After that, the result of the workflow computation is fed back to the acceptor, the acceptor summarizes the information, and the processing result is fed back to the service requester. The service requester performs the next event flow, generates a workflow information table, performs the warehousing process, and completes the workflow by sending a workflow message, which allows the information maintainer and workflow supervisor to maintain and monitor the workflow information at any time.
Role models
The roles of the entities performing a workflow can be abstracted in accordance with their functions during event processing: application requester, service requester, acceptor, dispatcher, squad leader, decision-maker and implementation department. Acceptor functions include information collection, task distribution, acceptance confirmation, programming, emergency monitoring, incident reporting and comprehensive coordination. Service requester functions include service application, information retrieval and alarm issuance. Implementation department functions include information feedback, information retrieval and command reception. Squad leader functions include information collection, information reporting, task distribution, log management, command reception, and event monitoring. Decision-maker functions include situation monitoring, program validation, and event monitoring. Scheduler functions include task signing, duty management, situation monitoring, and service auditing. Workflow monitor functions include situation monitoring and message monitoring. Business manager functions include business management, user management and personal information management. Information maintainer functions include information maintenance, backup maintenance and communication management. Application requester functions include workflow template selection, workflow template configuration, application configuration and data configuration.
Dynamic resource prediction model for big data processing
Parameter definitions
This paper introduces a sliding window function. For each application, the predicted resource usage value for the i-th computing container in the (n + 1)-th time window, Wn + 1, can be expressed as:
$${PR}_{i\left(n+1\right)}=g\left({R}_i\right)$$
(7)
where g(Ri) represents a resource usage prediction model. For all computing containers CC = {CC1, CC2, ⋯, CCm} in the streaming big data processing platform, the historical resource usage data of each computing container CCi are obtained by monitoring each time window to form a data stream Ri with temporal properties, as defined below.
Definition 1 Physical resource usage sequence
For the i-th computing container, CCi (i ≤ m), the corresponding resource usage in the n-th time window is Rin, and the resource usage sequence Ri = {Ri1, Ri2, ⋯, Rin} of computing container CCi is obtained as a complete time series, where n is the number of time windows and Rin is the amount of resources used by the application’s i-th computing container in the n-th time window. Since the resource usage includes both CPU resource usage and memory resource usage, Rin can be expressed as \({R}_{in}=\left\{{R}_{in}^{cpu},{R}_{in}^{mem}\right\}\).
Definition 2 Sequence of changes in resource usage
For the i-th computing container CCi, the difference between the adjacent n-th and (n − 1)-th time windows is expressed as the change in resource usage ΔRin = Rin − Ri(n − 1), from which the sequence of changes in resource usage ΔRi = {ΔRi1, ΔRi2, …, ΔRin} can be obtained for the computing container. Since Ri includes both CPU and memory resources, the sequence of changes includes both the change in CPU usage \(\Delta {R}_{in}^{cpu}\) and the change in memory usage \(\Delta {R}_{in}^{mem}\), i.e., \(\Delta {R}_{in}=\left\{\Delta {R}_{in}^{cpu},\Delta {R}_{in}^{mem}\right\}\), where \(\Delta {R}_{in}^{cpu}\) and \(\Delta {R}_{in}^{mem}\) are calculated as follows:
$${\displaystyle \begin{array}{c}\Delta {R}_{in}^{cpu}={R}_{in}^{cpu}-{R}_{i\left(n-1\right)}^{cpu}\\ {}\Delta {R}_{in}^{\mathrm{mem}}={R}_{in}^{\mathrm{mem}}-{R}_{i\left(n-1\right)}^{\mathrm{mem}}\end{array}}$$
(8)
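The change sequence of Eq. (8) is a first difference over the usage series; a minimal sketch, with the (cpu, mem) tuple representation of each window's usage assumed for illustration:

```python
def usage_changes(usage):
    """Sequence of changes in resource usage (Eq. 8).

    usage: list of (cpu, mem) samples, one per time window, for one
    computing container. Returns the per-window (d_cpu, d_mem) differences."""
    return [(usage[n][0] - usage[n - 1][0], usage[n][1] - usage[n - 1][1])
            for n in range(1, len(usage))]
```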
Resource prediction model based on the changes in resource usage
The predicted resource usage value for the i-th computing container in the (n + 1)-th time window is calculated from the historical CPU and memory usage sequences, as expressed below:
$$g\left({R}_i\right)=f\left({R}_i^{cpu},{R}_i^{mem}\right)$$
(9)
where, as shown in Definition 1, \({R}_i^{cpu}\) is the sequence of CPU resource usage from the start time of the i-th computing container to the n-th time window and \({R}_i^{mem}\) is the corresponding sequence of memory resource usage. The resource usage sequence is volatile and continuous, so the CPU and memory resource usage of the i-th computing container in the n-th time window can either increase or decrease depending on the change in usage. To predict the resource usage value in the (n + 1)-th time window, the problem is converted into the following formula:
$$f\left({R}_i^{cpu},{R}_i^{mem}\right)=\left\{{R}_{in}^{cpu}+\Delta {R}_{i\left(n+1\right)}^{cpu^{\prime }},{R}_{in}^{mem}+\Delta {R}_{i\left(n+1\right)}^{mem^{\prime }}\right\}$$
(10)
Since the CPU resource usage \({R}_{in}^{cpu}\) and the memory resource usage \({R}_{in}^{mem}\) in the n-th time window are known, the problem translates into one of finding the changes in resource usage, \(\Delta {R}_{i\left(n+1\right)}^{cpu^{\prime }}\) and \(\Delta {R}_{i\left(n+1\right)}^{mem^{\prime }}\), in the next time window.
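The conversion in Eq. (10) can be sketched as adding a predicted change to the last known usage. The text does not specify the change predictor itself, so the `last_change` placeholder below, which simply repeats the most recent observed change, is an assumption for illustration only:

```python
def predict_next(usage, predict_change):
    """Eq. (10): next-window usage = current usage + predicted change.

    usage: list of (cpu, mem) samples up to the n-th time window.
    predict_change: any model returning the (d_cpu, d_mem) change
    expected in window n+1."""
    cpu_n, mem_n = usage[-1]
    d_cpu, d_mem = predict_change(usage)
    return cpu_n + d_cpu, mem_n + d_mem

def last_change(usage):
    """Hypothetical placeholder predictor: assume the next change
    equals the last observed change."""
    (c0, m0), (c1, m1) = usage[-2], usage[-1]
    return c1 - c0, m1 - m0
```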