Task offloading exploiting grey wolf optimization in collaborative edge computing
Journal of Cloud Computing volume 13, Article number: 23 (2024)
Abstract
The emergence of mobile edge computing (MEC) has brought cloud services to nearby edge servers, facilitating the penetration of real-time and resource-consuming applications from smart mobile devices at a high rate. The problem of task offloading from mobile devices to the edge servers has been addressed in the state-of-the-art works by introducing collaboration among the MEC servers. However, their contributions are limited to either minimization of service latency or cost reduction. In this paper, we address the problem by developing a multi-objective optimization framework that jointly optimizes the latency, energy consumption, and resource usage cost. The formulated problem is proven to be NP-hard. Thus, we develop an evolutionary metaheuristic solution for the offloading problem, namely WOLVERINE, based on a Binary Multi-objective Grey Wolf Optimization algorithm that achieves a feasible solution within polynomial time, having a computational complexity of \(O(M^3)\), where M is an integer that determines the number of segments in each dimension of the objective space. Our experimental results depict that the developed WOLVERINE system achieves as high as 33.33%, 35%, and 40% performance improvements in terms of execution latency, energy, and resource cost, respectively, compared to the state-of-the-art.
Introduction
The proliferation of seamless internet connectivity technologies, such as Wi-Fi, 4G, 5G, or LTE, as well as the availability of high processing capabilities at the mobile edge, has pushed the horizon of a new computing paradigm called mobile edge computing (MEC) [1,2,3]. In recent years, the penetration of computation-intensive real-time applications has increased with the rapid rise of massively connected heterogeneous mobile devices (MDs) [4]. According to [5], Cisco predicts that by 2030, almost 500 billion gadgets will be connected to the Internet of Things (IoT). Frequent access to cloud services results in an increase in mobile data traffic as well as backhaul latency, which in turn diminishes the Quality of Experience (QoE) of application users [1]. MEC alleviates these problems by bringing resources closer to the end users [6]. The benefits of MEC can be extended further by introducing collaboration among edge servers located in different geographical regions, called collaborative mobile edge computing (CoMEC) [7]. Not only do the edge servers participate in resource sharing, but vertical collaboration [8] also takes place among the three layers of CoMEC. Vertical collaboration in the MEC environment signifies collaboration among multiple layers of the IoT computing infrastructure, including the IoT devices at the bottom, the edge cloud servers in the middle, and the master cloud at the top, as shown in Fig. 1.
While CoMEC increases the sustainability of edge computing, service caching at the MEC layer favors the QoE of real-time application users [9]. Service caching refers to caching the information that an edge server must know to complete task execution. This information includes system settings, the heavy program code of the application, and the related databases/libraries [10]. Figure 1 illustrates some real-life use cases where caching is exploited in MEC for better QoE. One such case is where MEC can be exploited for intelligent transportation systems (ITS), such as extending the connected vehicle cloud into the mobile network [11]. As a result, roadside applications operating directly at the MEC may receive local messages from vehicles and roadside sensors, process them, and broadcast alerts (e.g., of an accident) to nearby vehicles within the shortest possible time [12]. The second case is virtual reality and face-recognition data processing in various applications that require frequent database access. Both of these applications are data-intensive and need to deliver output in real time to ensure higher QoE for users. In all of the aforementioned cases, service caching can go a long way toward ensuring fast services to users. Caching prevents the same data from being offloaded multiple times; thus, both transmission latency and energy consumption can be reduced.
Computation offloading to a CoMEC network considering service caching may improve the overall QoE by reducing the associated system costs in terms of the queuing delay of tasks, energy consumption of devices, monetary costs, and so on [13, 14]. Additionally, it is not realistic to offload all tasks of an MD to the MEC all the time, as the limited storage and computing resources of the MEC significantly affect the time delay of the offloaded tasks. Therefore, an optimal task offloading decision needs to be formulated to achieve an efficient network model while keeping the aforementioned system costs minimal. A large body of research exists on caching strategies [15, 16] and CoMEC. Content caching, computation offloading, and resource allocation problems have been jointly considered in [4] to reduce users’ overall task execution time, but that work lacks collaboration among the edge servers. An AI-based task allocation algorithm, namely iRAF, has been proposed in [17] for the CoMEC network, where the average latency and energy have been optimized. Here, either one of the objectives is optimized by associating binary weights, which creates unfairness in the result. In [18], monetary cost and execution delay have been optimized using the particle swarm optimization (PSO) algorithm for a vehicular network. However, addressing mobile energy consumption still remains an issue. Three prime objectives, that is, execution time, energy consumed, and monetary cost, have been optimized in a multi-user multi-server environment using a multi-objective evolutionary algorithm (MOEA/D) combining simple additive weighting (SAW) and multi-attribute decision making (MDM) in [19]. This work, too, lacks collaboration among servers and cache resource allocation, which can be crucial to addressing QoE.
This research endeavors to bridge notable gaps that have persisted in the existing body of knowledge on the MEC environment. In a dynamic environment where heterogeneous mobile devices and edge servers are involved in optimizing multiple objectives simultaneously, no existing solution can effectively address the problem. Several challenges are encountered while optimizing conflicting objectives together in a complex environment where multiple real-time applications operate on different user devices. Firstly, real-time applications require faster processing than others. If they are computationally expensive, frequently offloading the associated data and codes creates a significant overhead. Secondly, handling offloading decisions while executing tasks can slow down the services of edge servers, especially if the resources of the edge servers become saturated, thus degrading QoE. Thirdly, since multiple objective parameters are targeted for optimization, they can be conflicting in nature. Thus, an exhaustive exploration of potential solution combinations becomes imperative. Most of the studies done so far have opted for single-objective optimization, associating scalar weights with multiple objective parameters. Some of these depend on multiple decision criteria for selecting solutions [19]. The parameters for such decision-making variables require meticulous fine-tuning, and an environment saturated with real-time applications cannot afford such extra overhead. Finally, without service caching, every request for a particular service or content would need to travel from the user’s device to the edge server, or even further to the cloud, resulting in higher latency. This delay can be especially problematic for real-time, delay-sensitive applications.
In this paper, we investigate the problem of jointly optimizing task execution time, energy, and resource usage cost while offloading tasks in a CoMEC network. A task offloading framework based on grey WOLf optimization that exploits VERtical collaboration IN Edge computing, namely the WOLVERINE system, is devised to solve the problem. WOLVERINE stands out from other task-offloading frameworks due to its innovative features and advantages. Traditional task offloading frameworks suffer from several drawbacks, which can be categorized into three main areas: 1) lack of reproducibility of offloaded application codes, 2) lack of collaboration among the edge servers, and 3) inability to optimize multiple crucial parameters simultaneously. These limitations have negative implications for network systems, resulting in decreased QoE, underutilized resources, and suboptimal network performance. In response to these challenges, WOLVERINE introduces a novel task offloading scheme for real-life computationally intensive applications, utilizing an evolutionary algorithm. This scheme addresses collaboration among servers and leverages cached application code to minimize time, energy, and resource costs in edge computing environments. The main contributions of the WOLVERINE framework are listed below:

We design a collaborative task offloading framework that effectively utilizes cached and computational resources to enhance user QoE in a CoMEC system where real-time applications are executed.

We formulate the problem of jointly optimizing latency, energy, and resource usage cost as a Multi-objective Linear Programming (MOLP) problem.

Due to the NP-hardness of the above MOLP, we exploit Binary Multi-Objective Grey Wolf Optimization (BMOGWO), a metaheuristic evolutionary algorithm, to develop a polynomial-time solution to the problem, namely WOLVERINE.

The experimental results depict that the proposed WOLVERINE system outperforms the state-of-the-art [19] in terms of execution latency, energy, and resource cost by 33.33%, 35%, and 40%, respectively.
The rest of this paper is organized as follows. “Related works” section illustrates the major existing works. “System model” section describes the system model of WOLVERINE. “Design details of WOLVERINE” section elaborates the computational model, multiobjective problem formulation, and metaheuristic task offloading scheme. “Performance evaluation” section describes the environmental setup and results of experimental analysis. Finally, “Conclusion” section summarizes the key outcomes of our work and some future research directions.
Related works
Several works in the field of collaborative edge computing have been done, including optimal task caching and task allocation while optimizing a single objective function, tradeoffs between two or more objectives, and multiobjective optimization.
The first category of works in the literature focused on single-objective optimization in collaborative edge computing, for example, of energy, time, or resource cost. In [2], a genetic-algorithm-based, data-aware task allocation strategy has been proposed that considers network congestion control for allocating subtasks. In [20], the authors have focused on reducing the energy consumption of task assignments by considering the heterogeneity of users using a heuristic-based greedy approach. An architecture has been proposed in [21] that considers unloading resource-intensive tasks from client devices into the cooperative edge space or to the remote cloud, depending on users’ desires and resource availability. An AI-driven intelligent Resource Allocation Framework (iRAF) [17] has been designed to solve complex resource allocation problems considering the current network states and task characteristics. Another group of authors in [22] have utilized a deep reinforcement learning method to solve computation offloading and resource allocation problems in a blockchain-based, multi-UAV-assisted dynamic environment.
Computation offloading that focuses on minimizing a system cost comprising the tradeoff between energy and task execution delay in the form of a weighted sum has been proposed in [15]. Collaboration among MEC servers for (data) cache and computational resource allocation is noteworthy in [15]. However, caching the content or code of applications is not enough due to the limited computational capacity of user devices as well as the delay associated with transmitting cached data or code. Hence, the idea of joint task offloading and caching needs to be considered. In [16], a joint service caching, task offloading, and system resource allocation scheme to minimize a system cost comprising time and energy has been formulated as a MILP problem. In [23], a priority-based task offloading and caching scheme is proposed for the MEC environment, where computing a task while efficiently reducing energy cost and delay time is the main priority. A new low-complexity hyper-heuristic algorithm has been proposed in [24], where content caching is performed along with computation offloading in an MEC network to optimize the service latency for all ground IoT devices. Mobility- and user-preference-aware content caching in MEC is orchestrated in [25]. The authors in [26] introduce an enhanced binary PSO algorithm, which is designed for optimizing task offloading and content caching in MEC networks. It focuses on jointly optimizing task completion delay and energy consumption. Additionally, an enhanced binary particle swarm optimization (BPSO) algorithm is proposed for content caching in parallel task offloading scenarios. An alternating-iterative algorithm has been developed in [27] for jointly optimizing task caching and offloading in a resource-constrained environment to minimize energy consumption. Here, task caching indicates caching of a completed application and its relevant data.
Subsequently, in [4], content caching, computation offloading, and resource allocation problems have been jointly considered to reduce users’ overall task execution time. However, caching a complete application, i.e., content caching is often incompatible with user requirements. Hence, the idea of caching data codes for joint task offloading and data caching using the Lyapunov algorithm for minimizing task computation delay has been introduced in [28]. The authors have formalized joint service caching and task offloading decisions to minimize computation latency while keeping the total computation energy consumption low.
Multi-objective optimization problems are adopted for computation offloading in the edge cloud by the authors of [29], who focused on the probability of offloading tasks to the edge cloud from an MD. To optimize execution time, energy, and resource cost so as to maximize utility for resource providers in IoT networks, the energy harvesting properties of unmanned aerial vehicles (UAVs) are used in [30]. A deep reinforcement learning (DRL) based solution is used for this system network, which is managed by blockchain. Multi-objective optimization problems have multiple Pareto-optimal solutions, which are obtained by tradeoffs. Hence, evolutionary algorithms can play a significant role in reaching a single preferred solution [31]. In [32], time, energy, and cost were minimized for an edge cloud environment using the genetic algorithm NSGA-II. Minimizing average latency and energy consumption simultaneously for offloading tasks using the Cuckoo search algorithm has been proposed in [33]. In [34], Grey Wolf Optimization is used to perform a tradeoff between the minimization of energy consumption and response time in an MEC environment. An Improved Multi-Objective Grey Wolf Optimization (IMOGWO) for subtask scheduling in an edge computing environment is introduced in [35] to optimize makespan, load balance, and energy simultaneously. Computation time and cost minimization have been performed in [18] using the Particle Swarm Optimization (PSO) algorithm for a Vehicular Edge Computing (VEC) environment. In [19], a tri-objective problem has been considered in a multi-user, multi-server task offloading environment where an application is divided into multiple independent subtasks. A Multi-objective Evolutionary Algorithm based on Decomposition (MOEA/D) has been developed for optimizing the time, cost, and energy expended in the execution of a particular subtask.
MOEA/D is also used to minimize latency and energy in [36] for the MEC environment, where the ordering of subtasks exists as a constraint. It is also used to minimize latency and maximize rewards for servers and tasks in [37]. However, the direct assignment of subtasks from mobile devices to a server is costly in terms of energy and offloading decision-making. The works mentioned above that addressed multi-objective optimization do not have a system environment similar to that of CoMEC handling real-life applications.
The summary of the state-of-the-art works is listed in Tables 1 and 2. Most existing works have either performed single-objective or weighted optimization in multi-user multi-server networks, with and without caching, or have performed multi-objective optimization without caching and collaboration among servers. The problem of jointly optimizing three basic objectives, execution latency, device energy, and resource cost, has not yet been resolved in a CoMEC system incorporating service caching. The generation of Pareto-optimal solutions for optimizing multiple objectives simultaneously in a resource-constrained environment where servers collaborate and cache services is yet to be done. These observations have driven us to design a task offloading framework in the CoMEC environment that generates Pareto-optimal solutions for multi-objective optimization by exploiting service caching alongside computational resources.
System model
In this section, we describe the different entities of a CoMEC network and the interactions among them.
Entities of CoMEC network
We consider a CoMEC network consisting of a set of collaborative edge servers (CESs), \(\mathbb {E}\), and a set of mobile devices (MDs), \(\mathbb {U}\), as shown in Fig. 2. Each mobile device \(k \in \mathbb {U}\) is connected to one edge server \(j \in \mathbb {E}\), which is termed its primary edge server (PES). Let \(\tau\) be the set of M tasks that arrive at a PES from mobile devices. Each task \(i \in \tau\) is denoted by a four-parameter tuple, \(\langle b_i, \mathcal {B}_i, T^{max}_i, \delta _i \rangle\), where \(b_i\) is the input data size, \(\mathcal {B}_i\) is the size of the related data code, \(T^{max}_i\) is the task deadline, and \(\delta _i\) is the task budget. In this work, the data code is considered to consist of application-related program code, system settings, and related databases/libraries.
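For concreteness, the four-parameter task tuple can be modeled as a small data structure. A minimal Python sketch follows; the field names are ours, chosen for illustration, and are not part of the paper's notation:

```python
from dataclasses import dataclass

@dataclass
class Task:
    """Task tuple <b_i, B_i, T_i^max, delta_i> arriving at a PES."""
    input_size: float  # b_i: input data size (e.g., bits)
    code_size: float   # B_i: size of the related data code (program code, settings, libraries)
    deadline: float    # T_i^max: task deadline (seconds)
    budget: float      # delta_i: task budget (monetary units)

# Example: 2 Mb of input, 8 Mb of data code, a 0.5 s deadline, and a budget of 1.0
t = Task(input_size=2e6, code_size=8e6, deadline=0.5, budget=1.0)
```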
Each mobile device k has computational resources, and each edge server j is considered to have both computational and cached resources. Table 3 contains the major notations. A task generated from an MD can be executed either on the MD itself or at any edge server, where edge servers borrow resources from the cloud when needed.
Collaboration among entities
Upon receiving a set of task requests, \(\tau\) from the mobile devices, the PES communicates with the other CESs for taskrelated information and checks the availability of the resources, i.e., cached and computational resources required for the execution of the tasks. After getting the resource availability information, the PES runs the WOLVERINE task allocation decision algorithm and determines the appropriate resource providers to execute the tasks considering their requirements. If none of the servers has enough resources to complete a task, it is forwarded to the master cloud for execution, implementing a vertical collaborative computation environment.
Design details of WOLVERINE
In this section, we unfold different design components of WOLVERINE. First, we present a computational model of the proposed WOLVERINE system, then we formulate the task offloading problem as a multiobjective optimization problem; and finally, we devise a binary multiobjective grey wolf optimizationbased solution.
Computational model of WOLVERINE
Figure 3 depicts the functional modules of the proposed WOLVERINE system, where an individual module is responsible for performing a specific function. The main functional modules of the PES can be grouped into two categories: the PES service module and the CES service module. The PES service module handles the task requests from the MDs and determines the optimal task offloading policy with the help of the CES service module. The responsibility of the CES service module is to manage collaboration between the PES and the CESs. Note that any collaborative edge server can work as a primary server by installing the PES service module to achieve the corresponding functionalities. The functionalities of each module are described below:

Task Profiler receives the task-offloading requests from the MDs first, then checks for the required cached resources for each task using the Resource Availability Database (Path 2), and propagates the task and resource data to the Optimal Task Allocator module (Path 3) for optimal resource allocation.

Optimal Task Allocator is the core computational block of the PES service module. It collects the tasks’ descriptions from the Task Profiler, queries the Resource Availability Checker (Path 4) for the resource availability of the collaborative edge servers (CESs), whose result comes through the Resource Availability Database (Path 5-6-7-8-9), formulates the WOLVERINE task offloading problem, and communicates the associated task offloading decision vectors to the MDs.

Resource Availability Database records the availability of the computational and cached resources of the CESs that comes through the Communication Module and the Resource Availability Checker (Path 6-7-8).

Resource Availability Checker queries neighboring CESs for resources and updates the cached and computational resource information periodically or when triggered by the Optimal Task Allocator (Path 16).

Task Execution Module executes the computational tasks offloaded to it by utilizing the available computational resources (Path 14) and cached data administered by the Caching Management Module (Path 11-12-13).

Caching Management Module supplies the cached data to the Task Execution Module from the Cached Data Repository (Path 12-13) and maintains the cached data repository by performing maintenance functions.

Cached Data Repository stores the cached data code from the Computational Resource module for further use (Path 15).

Computational Resources module stores the server’s available resources, such as CPU cycle and memory, for usage by the Task Execution Module.

Communication Module establishes collaboration among multiple edge servers and acts as a communication medium between the server and the MDs to share task data and computational results.
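The interaction between the Task Profiler and the Optimal Task Allocator (Paths 2-3) described above can be sketched in Python. All names and the trivial cache-first allocator below are our own illustrative assumptions, standing in for the WOLVERINE optimizer described later, not the authors' implementation:

```python
def handle_requests(tasks, resource_db, allocator):
    """Task Profiler: annotate each task with cached-resource availability
    (Path 2), then hand the profiled tasks to the allocator (Path 3)."""
    profiled = []
    for task in tasks:
        cached = resource_db.get(task["app_id"], {}).get("cached", False)
        profiled.append({**task, "cached": cached})
    return allocator(profiled)

def cache_first_allocator(profiled_tasks):
    """Placeholder for the Optimal Task Allocator: prefer an edge server
    when the data code is already cached, otherwise fall back to the cloud."""
    return {t["id"]: ("edge" if t["cached"] else "cloud") for t in profiled_tasks}

db = {"app1": {"cached": True}}  # toy Resource Availability Database
tasks = [{"id": 1, "app_id": "app1"}, {"id": 2, "app_id": "app2"}]
decision = handle_requests(tasks, db, cache_first_allocator)
# decision -> {1: "edge", 2: "cloud"}
```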
Multiobjective problem formulation
In this section, we calculate the total latency \(T_{ij}\), energy consumption \(E_{ij}\), and monetary cost \(C_{ij}\) for offloading task \(i\in \tau\) to edge server \(j \in \mathbb {E}\) or for local computation. Finally, we formulate the task offloading problem of WOLVERINE as a multi-objective optimization problem.
Calculation of \({T_{ij}}\)
There are two different cases for calculating \({T_{ij}}\):
In the first case, the mobile device executes the task locally, thus experiencing no communication delay. So, the task computation delay, \(t^k_{ij}\), for executing task \(i \in \tau\) on the mobile device \(k\in \mathbb {U}\) locally is calculated as,
Here, \(c_{i}\) is the number of computation cycles required to compute the task, \(\mu ^{k}_{i}\) is the ratio of CPU cycles allocated by the \(k^{th}\) mobile device to complete the \(i^{th}\) task, and \(f^{k}\) is the CPU-cycle frequency of the \(k^{th}\) mobile device.
For the second case, the input data and/or data code are offloaded to the MEC servers. If the data code is cached at the offloading server, then only the input data needs to be transmitted; otherwise, the device sends the input data along with the code to the server. For wireless transmission between the mobile device and collaborative edge server that follows Orthogonal Multiple Access (OMA), we consider the Rayleigh channel, and the transmission rate is calculated as,
where \(B_{ij}\) is the allocated radio bandwidth, \(p^{k}\) is the transmission power, \(h^k\) is the channel gain (\(k \in \mathbb {U}\)), and \(N_{0}\) is the variance of the complex white Gaussian channel noise. Now, we calculate the communication latency, \(t^{c}_{ij}\), for offloading task i to edge server j as follows,
where \(\sigma _{ij} \in \{0, 1\}\). Its value is 1 when the cached resource, i.e., the data code, is available at the offloading server, and 0 otherwise. Here, \(b_{i}\) and \(\mathcal {B}_{i}\) denote the size of the input parameters and the data code, respectively. Next, we calculate the execution time of task i at the edge server j as,
where \(\lambda _{ij}\) is the share of the resources of server j allocated to task i and \({f^j}\) is the total resource of the \(j^{th}\) MEC server. Finally, we calculate the total latency for completing task i using the following equation:
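The display equations of this subsection were lost in extraction. Based on the variable definitions in the surrounding text, they plausibly take the following standard forms (a hedged reconstruction; \(r_{ij}\) and \(t^{e}_{ij}\) are our symbols for the transmission rate and the server-side execution time, and the exact notation may differ from the authors'):

```latex
t^{k}_{ij} = \frac{c_i}{\mu^{k}_{i}\, f^{k}}, \qquad
r_{ij} = B_{ij} \log_2\!\left(1 + \frac{p^{k}\, |h^{k}|^{2}}{N_0}\right),

t^{c}_{ij} = \frac{b_i + (1 - \sigma_{ij})\, \mathcal{B}_i}{r_{ij}}, \qquad
t^{e}_{ij} = \frac{c_i}{\lambda_{ij}\, f^{j}},

T_{ij} =
\begin{cases}
t^{k}_{ij}, & \text{local execution,} \\
t^{c}_{ij} + t^{e}_{ij}, & \text{offloaded to server } j.
\end{cases}
```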
When calculating execution latency for real-time computation-intensive applications in edge computing, addressing delivery or downloading latency is crucial. However, in this particular scenario, the emphasis is placed more on upload speeds and network latency than on download times. Besides, the execution result typically has a limited data size and thus a negligible impact on the resource parameters.
Calculation of \({E_{ij}}\)
For calculating the total energy consumption \({E_{ij}}\), two possible cases have been considered. In the first case, the mobile device executes the task locally. Hence, we consider only the task computation energy, calculated as follows,
where \(\kappa\) is a coefficient that depends on the device’s chip architecture [17] and \(f^{k}\) is the CPU-cycle frequency of the \(k^{th}\) mobile device.
For the second case, the task is executed at the server; hence, task computation energy is ignored. Thus, the energy the device expends in transmitting the input data and/or code to the MEC server is calculated as,
where \({p^{k}}\) is the transmission power of the \(k^{th}\) mobile device and \(t^{c}_{ij}\) is the time required to transmit the \(i^{th}\) task to the \(j^{th}\) server. Now, the total energy consumption for offloading task i to server j is calculated as,
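As in the latency subsection, the display equations here were lost in extraction. Given the stated variables, a plausible reconstruction using the standard dynamic-power model is (our notation, not necessarily the authors' exact equations):

```latex
e^{k}_{ij} = \kappa\, c_i \left(f^{k}\right)^{2}, \qquad
e^{c}_{ij} = p^{k}\, t^{c}_{ij}, \qquad
E_{ij} =
\begin{cases}
e^{k}_{ij}, & \text{local execution,} \\
e^{c}_{ij}, & \text{offloaded to server } j.
\end{cases}
```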
The overall energy consumption can include the energy consumed for transmitting the tasks to the servers. We have prioritized device energy consumption owing to the limited battery resources and computational capabilities of user devices. As a result, energy consumed for executing tasks by servers has been less emphasized.
Calculation of \({C_{ij}}\)
Similar to latency and energy, the calculation of the monetary cost of task computation also has two possible cases. If the device performs the task locally instead of offloading, then it incurs no monetary cost. In the case of offloading, the cost of computational resources, i.e., CPU cycles, and/or storage resources, i.e., memory, sums up to the total monetary cost. For executing the \(i^{th}\) task at the \(j^{th}\) server, the storage cost is calculated as follows,
Here, \(\sigma _{ij} \in \{0, 1\}\) determines the availability of the cached resource. If the value of \(\sigma _{ij}\) is 1, a storage cost is incurred for the device; otherwise, no storage cost is required. \(\eta _{j}\) is the per-bit storage cost. Next, we calculate the cost of computing the \(i^{th}\) task at the \(j^{th}\) server as follows,
where, \(\gamma _{j}\) is the unit CPU cycle cost of server j. Finally, the total monetary cost for executing task i at server j can be calculated as,
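The three elided cost equations can plausibly be reconstructed from the definitions of \(\sigma_{ij}\), \(\eta_j\), and \(\gamma_j\) as follows (a hedged reconstruction; \(C^{s}_{ij}\) and \(C^{c}_{ij}\) are our symbols for the storage and computation cost components):

```latex
C^{s}_{ij} = \sigma_{ij}\, \eta_{j}\, \mathcal{B}_i, \qquad
C^{c}_{ij} = \gamma_{j}\, c_i, \qquad
C_{ij} = C^{s}_{ij} + C^{c}_{ij}.
```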
We have not considered cloud servers in our problem formulation. Although cloud servers add significant benefits related to scalability, server-health management, backup, and service provisioning capabilities, they can create hindrances in real-time application environments, where exceptional QoE needs to be achieved, due to long-distance communication. Uploading and executing tasks in the cloud require extra latency and energy, which impedes performance. Hence, executing tasks on user mobile devices and edge servers benefits network performance. Cloud servers are typically utilized within an edge server network only when all other edge resources are overwhelmed or during network malfunctions.
Objective function formulation
Our aim is to execute each task \(i \in \tau\) at local or remote resource \(j \in \mathbb {E}\) so as to minimize the total execution latency, energy expenditure, and incurred monetary cost. Thus, WOLVERINE formulates the task execution problem as a multiobjective minimization problem as follows,
where,
Here, \(X_{ij}\) is a binary decision variable whose value is 1 if task i is allocated to edge server j, and 0 otherwise. And \(X_{ij} \in \overrightarrow{\chi }_{w}\), where \(\overrightarrow{\chi }_{w}\) is a D-dimensional vector, \(\overrightarrow{\chi }_{w} = (x_{1}, x_{2}, ..., x_{D})\). Each entry \(x_d \in \overrightarrow{\chi }_{w}\) corresponds to the aforementioned decision variable \(X_{ij}, \forall i\in \tau , \forall j\in \mathbb {E}\). T(\(\overrightarrow{\chi }_{w}\)), E(\(\overrightarrow{\chi }_{w}\)), and C(\(\overrightarrow{\chi }_{w}\)) denote the objective functions related to task execution latency, execution energy, and monetary cost, respectively. Equation (12), which is a multi-objective linear optimization problem, is subject to the following constraints:
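The elided objective formulation (Eqs. (12)-(15)) plausibly takes the following form, reconstructed from the definitions of \(X_{ij}\) and the per-assignment quantities \(T_{ij}\), \(E_{ij}\), and \(C_{ij}\) (a hedged reconstruction, not the authors' verbatim equations):

```latex
\min_{\overrightarrow{\chi}_{w}} \;
\Bigl[\, T(\overrightarrow{\chi}_{w}),\; E(\overrightarrow{\chi}_{w}),\; C(\overrightarrow{\chi}_{w}) \,\Bigr],
\quad \text{where} \quad
T(\overrightarrow{\chi}_{w}) = \sum_{i \in \tau} \sum_{j \in \mathbb{E}} X_{ij}\, T_{ij}, \quad
E(\overrightarrow{\chi}_{w}) = \sum_{i \in \tau} \sum_{j \in \mathbb{E}} X_{ij}\, E_{ij}, \quad
C(\overrightarrow{\chi}_{w}) = \sum_{i \in \tau} \sum_{j \in \mathbb{E}} X_{ij}\, C_{ij}.
```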

Assignment Constraint: Each task will be executed either on an edge server or on the user device. No partial assignment of a task to multiple servers will be done.
$$\begin{aligned} \sum \limits _{j \in \mathbb {E}} X_{ij} \le 1, \quad \forall i \in \tau \end{aligned}$$(16) 
Budget Constraint: Constraint (17) denotes that the monetary cost of executing task i at server j cannot exceed the task budget, \(\delta _i\).
$$\begin{aligned} C_{ij}\le {\delta _{i}}, \quad \forall i \in \tau , j \in \mathbb {E} \end{aligned}$$(17) 
Energy Constraint: Constraint (18) states that the energy expenditure of a device in executing a task is limited by a threshold, \({E^{max}_{i}}\).
$$\begin{aligned} E_{ij}\le {E^{max}_{i}}, \quad \forall i\in \tau , j \in \mathbb {E} \end{aligned}$$(18) 
Latency Constraint: Constraint (19) denotes that task i needs to be completed within its deadline, \(T^{max}_i\).
$$\begin{aligned} T_{ij}\le {T^{max}_{i}}, \quad \forall i\in \tau , j\in \mathbb {E} \end{aligned}$$(19)
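Constraints (17)-(19) amount to a simple per-assignment feasibility test. A minimal Python sketch follows; the function name, data layout, and toy numbers are our own illustrations:

```python
def is_feasible(i, j, T, E, C, deadline, e_max, budget):
    """Check constraints (17)-(19) for assigning task i to resource j.
    T, E, C are per-assignment latency, energy, and cost matrices;
    deadline, e_max, budget are per-task thresholds."""
    return (C[i][j] <= budget[i] and    # budget constraint (17)
            E[i][j] <= e_max[i] and     # energy constraint (18)
            T[i][j] <= deadline[i])     # latency constraint (19)

# Toy instance: one task, two candidate resources
T = [[0.2, 0.4]]   # latency (s)
E = [[1.0, 3.0]]   # device energy (J)
C = [[0.5, 0.9]]   # monetary cost
ok = is_feasible(0, 0, T, E, C, deadline=[0.5], e_max=[2.0], budget=[1.0])
# ok -> True; resource 1 would fail the energy constraint (3.0 > 2.0)
```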
Theorem 1
The WOLVERINE task offloading problem formulated in Eq. (12) is NP-hard.
Proof
The WOLVERINE task offloading problem aims at minimizing three objectives, yielding a set of Pareto-optimal solutions. The optimization problem in Eq. (12) can be regarded as an assignment problem. To prove the NP-hardness of the WOLVERINE task offloading problem, we first convert the Generalized Assignment Problem (GAP), a well-known NP-hard problem [39], into a multi-objective problem. The GAP assigns M tasks to N agents to minimize the overall assignment cost as follows:
Subject to:
Here, \(\mathbb {C}\) indicates the cost of assigning task \(m \in M\) to an agent \(n \in N\), A is the resource capacity function that indicates the resources used by task \(m \in M\), and \(\mathbb {B}\) indicates the available capacity of an agent. To convert GAP to a multi-objective assignment problem, we first consider a bi-objective assignment problem in which the resource and cost constraints of GAP are to be satisfied, converting the three objectives of WOLVERINE into a single one as follows:
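The elided GAP formulation, in its standard form consistent with the definitions of \(\mathbb{C}\), A, and \(\mathbb{B}\) above, is plausibly (a hedged reconstruction):

```latex
\min \sum_{m=1}^{M} \sum_{n=1}^{N} \mathbb{C}_{mn}\, x_{mn}
\quad \text{subject to} \quad
\sum_{m=1}^{M} A_{mn}\, x_{mn} \le \mathbb{B}_{n} \;\; \forall n, \qquad
\sum_{n=1}^{N} x_{mn} = 1 \;\; \forall m, \qquad
x_{mn} \in \{0, 1\}.
```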
where,
Subject to:
Here, the value of \(\epsilon _{i}\) is chosen such that minimizing \(\mathcal {Z}_{ij}\) yields the same result as the multi-objective functions. The function u(X) is defined as \(u(X) = 1\) if \(x \ge 0\) and 0 otherwise.
Note that we do not treat the resource limitation constraints of GAP as constraints of the multi-objective optimization problem; rather, we consider them as an objective to be optimized. If the resource limitation constraints are satisfied, then \(z_{1}\) equals zero and the assignment cost \(z_{2}\) is considered. If a better solution exists in GAP, a better solution also exists in the corresponding multi-objective problem. Consider two feasible GAP solutions \(\overrightarrow{Z_{1}}\) and \(\overrightarrow{Z_{2}}\) with costs \(z_{1}\) and \(z_{2}\), where \(z_{1}\) < \(z_{2}\). These two costs produce the solutions (0, \(z_{1}\)) and (0, \(z_{2}\)) in the multi-objective assignment problem. Under a lexicographic minimum, since \(z_{1}\) < \(z_{2}\), (0, \(z_{1}\)) is the better solution. Thus, GAP is convertible to a multi-objective assignment problem.
Therefore, GAP can be converted to a multi-objective optimization problem. Since GAP is a well-known NP-hard problem, the WOLVERINE task offloading problem is also NP-hard. \(\square\)
Metaheuristic task offloading
As the number of MDs or servers increases, the WOLVERINE system experiences exponential growth in execution time, while many 5G applications cannot tolerate even a second of delay. The proposed WOLVERINE framework optimizes multiple objectives: minimizing latency, energy consumption, and monetary cost. These objectives can conflict, meaning that improving one may degrade another. Pareto-optimal solutions form a set in which no single objective can be improved without worsening at least one other. Evolutionary algorithms are well suited to problems involving Pareto-optimality because the solution choice is based on a population approach [31]. Therefore, in this section, we develop a smart task offloading policy using Binary Multi-Objective Grey Wolf Optimization that determines a suitable set of resources for allocating the computational tasks in polynomial time.
Preliminaries
The Grey Wolf Optimization (GWO) [40] is a bio-inspired metaheuristic algorithm modeled on the social leadership and hunting techniques of grey wolves. To model the social hierarchy mathematically, the fittest solution is considered the alpha (\(\alpha\)) wolf, while the second and third best solutions are named beta (\(\beta\)) and delta (\(\delta\)) wolves, respectively. Leader selection and position updating of the remaining search agents are performed in each iteration, eventually converging to a set of Pareto-optimal solutions. Binary Multi-objective Grey Wolf Optimization (BMOGWO) is a variant of MOGWO that lets search agents move in a binary space instead of a continuous one [41]. For our problem of simultaneously optimizing execution time, device energy consumption, and monetary cost, BMOGWO demonstrates superior performance compared to other evolutionary algorithms for multi-objective problems, such as MOPSO, BAT, and WHALE optimization [42,43,44]. It also outperforms Ant-Colony Optimization (ACO) and Whale Optimization (WO) in scenarios where tasks are offloaded to edge servers [34], and surpasses other evolutionary algorithms in generating Pareto-optimal solutions owing to better exploration of the solution space and avoidance of convergence to local optima [45].
Defining the position vector
We consider a population of wolves denoted by P, where each wolf \(w \in P\) represents a candidate solution [46]. The position of a wolf w in the search space is denoted by a D-dimensional binary position vector \(\overrightarrow{\chi }_{w}\), where \(D = |\tau | \times |\mathbb {E}|\). The D-dimensional vector is denoted by \(\overrightarrow{\chi }_{w} = (x_{1}, x_{2}, ..., x_{D})\), where each entry \(x_d \in \overrightarrow{\chi }_{w}\) corresponds to a decision variable \(X_{ij}, \forall i\in \tau , \forall j\in \mathbb {E}\) such that \(d = (i-1)\times |\mathbb {E}| + j\).
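The flattening rule \(d = (i-1)\times |\mathbb {E}| + j\) and its inverse can be sketched as follows (1-based indices as in the text; the function names are illustrative, not from the paper):

```python
def flat_index(i, j, num_servers):
    """Map the 1-based pair (task i, server j) to the 1-based entry index d."""
    return (i - 1) * num_servers + j

def unflatten(d, num_servers):
    """Inverse mapping: recover the (task, server) pair from entry index d."""
    i, j = divmod(d - 1, num_servers)
    return i + 1, j + 1
```

With \(|\mathbb {E}| = 4\), for instance, task 2 on server 3 maps to entry \(d = 7\), and entry 7 maps back to (2, 3).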
Updating positions of the wolves
In GWO, the position of each \(\omega\) wolf is updated by considering the positions of \(\alpha\), \(\beta\), and \(\delta\) wolves. Let \(\overrightarrow{\chi }_{\alpha }\), \(\overrightarrow{\chi }_{\beta }\), \(\overrightarrow{\chi }_{\delta }\) and \(\overrightarrow{\chi }_\omega\) denote the position of \(\alpha\), \(\beta\), \(\delta\) and \(\omega\) wolves, respectively. Now we calculate the distance of the \(\omega\) wolf from the other three leader wolves as follows.
Here, \(\overrightarrow{C}\) is a coefficient vector with values in the range [0, 2] that assigns a weight to each prey item, in our case the three best solutions. Its value is chosen randomly to favor exploration by introducing randomness into the algorithm's behavior, controlling the influence of the three best solutions on the updating search agents: \(\mid \overrightarrow{C} \mid > 1\) emphasizes the effect of the best solutions on the \(\omega\) wolves, whereas \(\mid \overrightarrow{C} \mid < 1\) de-emphasizes it. This prevents convergence to local optima and ensures that the entire search space is covered. Moreover, the random selection of values in \(\overrightarrow{C}\) promotes exploration not only in the initial stages but also during the final iterations [40]. Its value is determined as \(\overrightarrow{C}=2\cdot \overrightarrow{r_{2}}\), where \(\overrightarrow{r_{2}} \in [0, 1]\). The updated position of the \(\omega\) wolf with respect to the alpha, beta, and delta wolves is then calculated as follows.
Here, \(\overrightarrow{A}\) is the coefficient vector that governs convergence towards or divergence from the prey (the best solutions) and has values in the range \([-1, 1]\). It is calculated as \(\overrightarrow{A} = 2\overrightarrow{a}\cdot \overrightarrow{r_{1}}-\overrightarrow{a}\), where \(\overrightarrow{r_{1}} \in [0, 1]\). The exploration of the search space is managed using the \(\overrightarrow{a}\) parameter. The final position \(\overrightarrow{\chi }_w\) of the \(\omega\) wolf is then determined by averaging \(\overrightarrow{\chi }_{\alpha ,\omega }\), \(\overrightarrow{\chi }_{\beta , \omega }\), and \(\overrightarrow{\chi }_{\delta ,\omega }\).
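As a sketch, the standard GWO update that this passage paraphrases computes, for each leader \(l \in \{\alpha , \beta , \delta \}\), a distance \(\overrightarrow{D}_{l} = \mid \overrightarrow{C}_{l}\cdot \overrightarrow{\chi }_{l} - \overrightarrow{\chi }_{\omega }\mid\) and a candidate position \(\overrightarrow{\chi }_{l,\omega } = \overrightarrow{\chi }_{l} - \overrightarrow{A}_{l}\cdot \overrightarrow{D}_{l}\), then averages the three candidates. The NumPy version below follows Mirjalili's original formulation [40]; it is illustrative, not the paper's exact implementation.

```python
import numpy as np

def gwo_update(x_omega, leaders, a, rng):
    """One continuous GWO position update for an omega wolf.

    leaders: positions of the alpha, beta, and delta wolves;
    a: exploration parameter, typically decreased linearly from 2 to 0.
    """
    candidates = []
    for x_leader in leaders:
        r1 = rng.random(x_omega.shape)
        r2 = rng.random(x_omega.shape)
        A = 2 * a * r1 - a                    # coefficient vector in [-a, a]
        C = 2 * r2                            # prey-weight vector in [0, 2]
        D = np.abs(C * x_leader - x_omega)    # distance to this leader
        candidates.append(x_leader - A * D)   # move relative to this leader
    return np.mean(candidates, axis=0)        # average over alpha, beta, delta
```

When a reaches 0 the update collapses onto the average of the three leaders, which is the pure exploitation limit of the algorithm.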
Note that each entry \(x_d \in \overrightarrow{\chi }_w\) corresponds to a binary decision variable of the MOLP problem and is only allowed to have a value of either 0 or 1 as follows.
where, \(sigmoid(x_d)\) is defined as,
Here, the rand() function provides a uniformly distributed random number in the range [0, 1], which improves search space exploration with the goal of avoiding local optima. The convergence and diversity of the Pareto front generated by MOGWO for tri-objective problems are higher than those of Multi-Objective Particle Swarm Optimization (MOPSO) [45]. Here, convergence indicates how close the obtained solutions are to the true Pareto front, while diversity demonstrates how thoroughly the search space has been explored, i.e., how wide a range of trade-offs among the objective parameters the algorithm has compared. Grey Wolf Optimization strikes a balance between the two: it converges towards the true Pareto front by iteratively refining the solutions, with the positions of the \(\alpha\), \(\beta\), and \(\delta\) wolves updated in each iteration based on their fitness values. These three best solutions found so far guide the search towards better solutions and help it converge to the Pareto front through optimal trade-offs, while the exploration and randomness of the algorithm prevent convergence to local optima and provide better coverage of a wide range of trade-offs.
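In binary GWO variants [41], the continuous position entry is squashed through a sigmoid and compared against rand(). A common transfer function is \(sigmoid(x) = 1/(1+e^{-10(x-0.5)})\); the steepness and shift used here are a conventional choice, assumed for illustration rather than taken from the paper. A minimal sketch:

```python
import math
import random

def binarize(x_d, rng=random):
    """Map a continuous position entry to {0, 1} via a sigmoid transfer function.

    The steepness (10) and midpoint (0.5) are assumed constants from common
    bGWO practice, not necessarily the paper's exact transfer function.
    """
    s = 1.0 / (1.0 + math.exp(-10.0 * (x_d - 0.5)))
    return 1 if rng.random() < s else 0
```

Entries well above 0.5 are almost always mapped to 1 and entries well below 0.5 to 0, while the randomness near the midpoint preserves exploration.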
Controlling the archive
To incorporate multi-objective optimization into GWO, an archive of fixed size is used. It is a simple storage for saving and retrieving the Pareto-dominant solutions obtained so far, as shown in Algorithm 1. In line 1, for each \(w \in P\), a set \(\Omega\) is initialized that stores the archive solutions dominated by \(\overrightarrow{\chi }_{w}\); a flag is also initialized to record whether any solution in the archive dominates \(\overrightarrow{\chi }_{w}\). Line 5 checks for archive members dominated by \(\overrightarrow{\chi }_{w}\) and adds them to \(\Omega\). Line 7 checks the opposite case and sets the flag to 1. If no archive member dominates \(\overrightarrow{\chi }_w\), i.e., flag = 0, the archive is updated using procedure \(UpdateArchive (\mathcal {A}, \Omega )\) in line 12. Lines 2-13 iterate over every member of the population, and the updated archive is returned.
Algorithm 2 summarizes the \(UpdateArchive (\mathcal {A}, \Omega )\) procedure. In lines 2-4, the dominated solutions are removed from the archive. The capacity of the archive is checked in line 5: if it is not full, the current non-dominated solution is added to the archive in line 6; otherwise, a solution from the most crowded segment is removed and the current non-dominated solution is added in line 9. In line 12, if a particular solution is an outlier, the grid is updated adaptively to cover the new solution.
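The dominance test underlying Algorithms 1 and 2 can be sketched as follows. For our minimization problem, a solution dominates another if it is no worse in every objective (latency, energy, cost) and strictly better in at least one; the crowding-based eviction of the full archive is omitted here for brevity, so this is an illustrative reduction of the two algorithms, not a faithful transcription.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (all minimized)."""
    return (all(a <= b for a, b in zip(f_a, f_b))
            and any(a < b for a, b in zip(f_a, f_b)))

def insert_nondominated(archive, fitness, capacity):
    """Add `fitness` to `archive` if no member dominates it, dropping the
    members it dominates (cf. Algorithms 1-2; crowded-segment eviction
    when the archive is full is left out of this sketch)."""
    if any(dominates(member, fitness) for member in archive):
        return archive                        # dominated: discard the candidate
    archive = [m for m in archive if not dominates(fitness, m)]
    if len(archive) < capacity:
        archive.append(fitness)
    return archive
```

Mutually non-dominated vectors, such as (2, 2, 2) and (1, 5, 5), coexist in the archive, which is exactly what makes it an approximation of the Pareto front.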
Adaptive grid mechanism
An adaptive grid made of hypercubes [47] is generated from the archive, where the dimension of each hypercube equals the number of optimization objectives. The grid mechanism divides the objective space of the problem into a grid, and each hypercube is interpreted as a geographical region containing solutions [47]. Since our WOLVERINE task offloading problem has three objectives, the adaptive grid consists of three-dimensional hypercubes. The boundary of the objective space at the tth iteration is determined by \((minT_t, minE_t, minC_t)\) and \((maxT_t, maxE_t, maxC_t)\). We then calculate the modulus of the grid using the same approach as [47], as follows,
Here, M is an integer that determines the number of segments in each dimension of the objective space. Therefore, the total number of hypercubes is \(M^3\).
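Following the adaptive grid of [47], each dimension of the objective space is split into M equal segments between the current minimum and maximum, so a solution's hypercube is found by quantizing each objective independently. The sketch below assumes that equal-width quantization; the function and variable names are illustrative.

```python
def hypercube_index(objectives, lower, upper, M):
    """Quantize a 3-objective vector into its (t, e, c) hypercube coordinates.

    lower/upper hold (minT, minE, minC) and (maxT, maxE, maxC) of the current
    archive; each of the M segments per dimension has width (upper - lower) / M,
    the grid "modulus"."""
    index = []
    for f, lo, hi in zip(objectives, lower, upper):
        width = (hi - lo) / M
        k = int((f - lo) / width) if width > 0 else 0
        index.append(min(k, M - 1))   # clamp the upper boundary into the last cell
    return tuple(index)
```

Counting how many archive members share each index tuple gives the per-segment densities \(N_i\) used by the selection probabilities below.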
We employ a strategy in which non-dominated solutions are removed from the most crowded segments of the archive, while leader selection is performed from the less crowded segments [45]. Both operations are probabilistic, so as to avoid local optima in the search space, and the solution density in each segment plays a central role in calculating these probabilities [47]. The more non-dominated solutions a segment contains, the higher the probability of removing a solution from it and the lower the probability of choosing a leader from it. The probability of choosing the ith segment for removing a solution is calculated as follows:
where \(N_i\) is the number of obtained Pareto-optimal solutions in the ith segment. Note that Eq. (40) assigns a higher probability to a crowded segment. The probability of selecting a leader from the archive is calculated in the opposite manner; the roulette-wheel approach is used for the selection based on the likelihood of each hypercube [45], as expressed by the following equation:
From Eq. (41), it is clear that a segment with fewer solutions has a higher probability of being chosen as the leader.
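A roulette-wheel selection over inverse segment densities can be sketched as follows. Per the MOPSO-style grid of [45, 47], we assume the weight of segment i is \(c/N_i\) for a constant \(c > 1\); both that weighting and the names below are assumptions for illustration.

```python
import random

def pick_leader_segment(segment_counts, c=2.0, rng=random):
    """Roulette-wheel choice of an archive segment, favoring sparse segments.

    segment_counts maps segment id -> number of non-dominated solutions N_i;
    the weight c / N_i (c assumed) gives less crowded segments a higher
    chance of supplying the next leader."""
    segments = list(segment_counts)
    weights = [c / segment_counts[s] for s in segments]
    total = sum(weights)
    r = rng.random() * total          # spin the wheel once
    acc = 0.0
    for s, w in zip(segments, weights):
        acc += w
        if r <= acc:
            return s
    return segments[-1]
```

A segment with a single solution thus gets four times the selection weight of a segment holding four solutions, matching the intent of Eq. (41).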
BMOGWObased task execution
The steps of the BMOGWO-based task execution scheme of WOLVERINE are presented in Algorithm 3. First, we initialize the archive in line 2. Next, we initialize a population of random position vectors and calculate their fitness values in lines 4 and 5. The archive is populated with a set of non-dominated solutions generated using Algorithm 1 in line 7. Line 8 selects three different leaders using the grid mechanism. For each dimension of every wolf, the positions are updated in line 13. Parameters a, A, and C are updated in line 16. Next, we calculate the fitness values of the updated position vectors in line 18 and update the archive with the updated positions using Algorithm 1 in line 20. From the updated archive, three new leaders are then selected using Eq. (41) in line 21. Lines 11-21 repeat until the maximum number of iterations \(I_{max}\) is reached. Finally, the value of each entry \(x_d\) of the best solution \(\overrightarrow{\chi }_\alpha\) is assigned to the corresponding decision variable in lines 25-26, and the decision vector X is returned.
Complexity analysis
In this section, we analyze the complexity of the three algorithms used in WOLVERINE. In Algorithm 2, line 3 is enclosed within a loop that iterates \(\mathcal {A}\) times in the worst case, line 8 requires \(M^3\) time, and the remaining statements run in constant time; thus, the overall complexity of Algorithm 2 is \(O(\mathcal {A} + M^3)\). Next, we derive the complexity of Algorithm 1. Lines 5-9 are enclosed within a loop that iterates \(\mathcal {A}\) times, and line 12 updates the archive using Algorithm 2, which takes \(O(\mathcal {A}+M^3)\). Lines 2-13 are themselves enclosed within a loop that iterates P times. Hence, the computational complexity of Algorithm 1 is \(O(P\times (\mathcal {A}+M^3))\). Finally, we analyze the complexity of Algorithm 3. Lines 4 and 5 are enclosed within a loop that iterates P times. Line 7 updates the archive, which requires \(O(P\times (\mathcal {A}+M^3))\) time. Line 13 is enclosed within a nested loop that iterates \(P\times \overrightarrow{\chi }\) times, line 18 is enclosed in another loop that iterates P times, and line 20 again calls Algorithm 2. Lines 11-22 are enclosed within a loop that iterates \(I_{max}\) times; the rest of the algorithm takes constant time. Thus, the total computational complexity of Algorithm 3 is \(O(P\times (\mathcal {A}+ M^3 + I_{max} \times (\overrightarrow{\chi } +\mathcal {A} + M^3) )) \approx O(M^3)\).
Convergence analysis
In this section, we analyze the convergence of the developed WOLVERINE system, measured using the Inverted Generational Distance (IGD). IGD is a metric for assessing the quality of a solution set produced by an optimization algorithm, particularly in multi-objective optimization: it measures the convergence and diversity of the obtained solutions relative to the true Pareto front, which represents the optimal trade-offs between conflicting objectives. The IGD metric calculates the average distance from each point of the true Pareto front to the nearest point in the obtained solution set. A lower IGD value indicates better convergence and diversity of the obtained solutions.
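IGD can be computed directly from its definition: average, over the reference (true) front, of each reference point's Euclidean distance to its nearest obtained solution. A minimal sketch:

```python
import math

def igd(obtained, reference):
    """Inverted Generational Distance of `obtained` with respect to the
    `reference` (true) Pareto front; lower is better."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    # For each reference point, take the distance to its closest obtained
    # solution, then average over the whole reference front.
    return sum(min(dist(r, s) for s in obtained) for r in reference) / len(reference)
```

Because the average runs over the reference front, IGD penalizes both poor convergence and gaps in coverage, which is why it captures diversity as well.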
If the IGD value between the obtained Pareto front \(\rho\) and the true Pareto front \(\rho ^{*}\) is \(IGD(\rho , \rho ^{*})\), then the convergence ratio (CR) \(\mathcal {C}\) can be defined as,
where, \(\rho _t\) and \(\rho _{t+1}\) denote the Pareto Front value after t and \((t+1)\) iterations, respectively.
Theorem 2
The convergence ratio \(\mathcal {C}\) of the developed BMOGWObased WOLVERINE system is bounded by \(g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\).
Proof
This proof proceeds by induction. We need to prove that the CR satisfies \(\mathcal {C} \le g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\), which can be mathematically denoted as,
Here, \(g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\) denotes the upper bound on how far the algorithm's solution can be from the true Pareto front \(\rho ^{*}\), which can be mathematically represented as follows,
where \(d(\hat{\varrho }, \varrho ^{*})\) denotes the distance between the two solutions \(\hat{\varrho }\) and \(\varrho ^{*}\) in the solution space. The IGD value of the solution \(\rho _{t}\) after iteration t can be calculated similarly to [48] as follows,
Basis Step: Let \(\rho _0\) denote the initial Pareto front approximation and \(IGD(\rho _{0}, \rho ^{*})\) the initial IGD value. Then, Eq. (43) can be written as follows,
where, \(P_0, \overrightarrow{C_0}, \overrightarrow{A_0}\) denote the initial population size, position vector, and coefficient vector, respectively. Equation (46) confirms that the induction hypothesis holds true for the base step.
Inductive Step: Assume that the theorem holds up to the tth iteration, i.e., \(IGD(\rho _{t}, \rho ^{*}) \le g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\). We now express the improvement in performance from iteration t to \(t+1\), which can be mathematically represented as,
where h(.) denotes the improvement function. Since \(IGD(\rho _{t}, \rho ^{*}) \le g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\) and h(.) does not increase the distance to the true front, it follows that \(IGD(\rho _{t+1}, \rho ^{*}) \le g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\) in Eq. (47). This confirms that Eq. (43) holds for all t, and the convergence ratio \(\mathcal {C}\) of the developed WOLVERINE system is bounded by \(g(P, \overrightarrow{C}, \overrightarrow{A}, \tau , \mathbb {U}, \mathbb {E}, t)\). \(\square\)
Performance evaluation
In this section, the performance of our proposed multi-objective task offloading with caching approach is compared with some existing strategies from the literature: MGBD [4], iRAF [17], and MOEA/D [19]. The work in [4] jointly addresses content caching, computation offloading, and resource allocation to reduce users' overall task execution time. An AI-driven resource allocation framework (iRAF) has been developed in [17] to tackle intricate resource allocation problems, considering current network conditions and parameters to optimize either execution time or energy consumption. In a multi-user, multi-server task offloading environment, a tri-objective problem is addressed in [19], where time, device energy, and cost are optimized using the Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D); however, caching of data codes is not considered in that work. The environmental setup, performance metrics, and results are discussed below.
Environmental setup
We have implemented our proposed algorithm and performed empirical evaluation using Python 3.6.0 [49]. For evaluation purposes, we consider a scenario in which a stationary edge server is centered in a \(1000 \times 1000\, m^2\) urban area. A number of collaborative edge servers are randomly located around the primary edge server, and several mobile devices are connected to the edge servers. The path loss model between the mobile devices and servers is assumed to follow a log-normal distribution. In addition, we model packet loss on each path using the Gilbert loss model [50], and the channels handle the retransmission of lost packets using the TCP protocol. Twenty channels are employed, each with a bandwidth of 2 MHz. Our study focuses on real-time, delay-sensitive, and computation-intensive applications, including interactive video gaming, AR/VR applications, medical image processing, and face recognition. The task arrival pattern follows a Poisson distribution. The whole experiment has been run 50 times, and the average of these results is used to plot each graph. The major environment setup parameters used in this paper are shown in Table 4. In our simulation environment, we have ensured that resources are allocated proportionately across the different systems; all methods from the literature were implemented, and their performance metrics collected, in a system environment consistent with ours.
Performance metrics
We have measured the performance of our algorithm based on the following metrics:

Average latency is defined as the ratio of the total delay experienced by the tasks to the number of tasks.

Average Energy Consumption is the average amount of energy consumed by each edge device.

Average Cost Savings is calculated as the difference between a device's budget and the monetary cost paid by it, divided by the number of tasks. A higher value indicates better system performance.

Task Completion Reliability (TCR) is the ratio of the number of tasks completed to the submitted ones.
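For concreteness, the four metrics can be computed from per-task records as in the sketch below. The record field names are illustrative; note the paper averages energy per edge device, whereas this sketch averages per task for simplicity.

```python
def summarize(tasks):
    """Compute the four evaluation metrics from a list of per-task records.

    Each record is a dict with keys 'latency', 'energy', 'budget', 'cost',
    and 'completed' (bool)."""
    n = len(tasks)
    return {
        "avg_latency": sum(t["latency"] for t in tasks) / n,
        "avg_energy": sum(t["energy"] for t in tasks) / n,
        # budget minus actual monetary cost, averaged over tasks
        "avg_cost_savings": sum(t["budget"] - t["cost"] for t in tasks) / n,
        # fraction of submitted tasks that finished within their deadline
        "tcr": sum(1 for t in tasks if t["completed"]) / n,
    }
```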
Result analysis
In this section, we discuss the performance of our proposed system by varying the number of tasks, the number of servers in the system, and the average computation power per task.
Impact of a varying number of tasks
In this experiment, we vary the number of tasks of the overall network system from 10 to 250 and keep the number of servers fixed at 12. The result and comparison are shown in Fig. 4.
Figure 4(a) shows that as the number of tasks increases, the average latency also increases. Initially, latency grows slowly for a small number of tasks; however, once the number of tasks exceeds 160, it increases exponentially. Latency is lower for MGBD and WOLVERINE than for iRAF because the former two implement caching. The performance of MOEA/D is close to that of WOLVERINE: a single mobile device user decomposes an application into multiple independent subtasks and offloads them to various servers depending on resource availability. However, as the subtasks are executed in parallel, the total latency for completing a task is the maximum latency among its subtasks, and a risk of high delay remains once the system reaches its saturation point. Moreover, the absence of server-to-server collaboration makes it difficult to share subtasks. Our proposed WOLVERINE exploits both collaborative edge computing and caching: if the required data for a specific task is not cached at a server, or computational resources are unavailable, the server can pass the task to a collaborative server where the task data is already cached, which decreases the service delay significantly. Therefore, WOLVERINE outperforms the state-of-the-art approaches.
The impact of a varying number of tasks on average energy consumption is depicted in Fig. 4(b). With an increasing number of tasks, energy consumption also rises because a large number of tasks must share the same bandwidth and incur higher latency to reach the edge. Both WOLVERINE and MGBD perform better than iRAF because they exploit caching, which helps users reduce backhaul latency and energy. However, the energy consumption gap between WOLVERINE and MGBD widens significantly as the number of tasks rises from 110 to 160, since MGBD must request the cloud for task processing owing to the unavailability of resources. For MOEA/D, a higher number of tasks means subtasks are executed on mobile devices more frequently, increasing overall energy consumption in the system. Moreover, when subtasks are offloaded to multiple servers, the data code must be offloaded as well; the energy expended on offloading data codes to edge servers thus becomes an overhead, and frequent offloading incurs additional communication costs as the number of subtasks grows. For WOLVERINE, all requested tasks are either already cached at different servers or only the data code for computing them needs to be transmitted to the servers; that is, no access to the cloud is necessary, which reduces energy consumption. In addition, collaboration among servers facilitates lower energy consumption.
In Fig. 4(c), we observe the impact on average cost savings as the number of tasks is varied. As the number of tasks escalates, average cost savings are reduced because more tasks are offloaded, which in turn increases the monetary costs of memory and computation. For iRAF and MGBD, the cost rises faster than for WOLVERINE with increasing tasks. The reason for the increasing cost of iRAF is its use of a DNN and Monte Carlo Tree, which incur memory and computation costs. For MGBD, with an increasing number of tasks, device budget savings decrease due to the lack of collaboration among servers and offloading to the cloud when server resources are unavailable. For MOEA/D, the cost is lower when the number of tasks is high, as many of them are executed locally; however, offloading to multiple servers from a single user device can incur higher memory and computation costs depending on the availability of server resources. The proposed WOLVERINE offers a higher percentage (50%-95%) of savings compared to MGBD and iRAF by exploiting service caching, binary offloading, and collaboration among servers.
In Fig. 4(d), we see that increasing the number of tasks reduces the Task Completion Reliability (TCR) of all methods, owing to the scarcity of resources and the delay sensitivity of tasks. For iRAF, the TCR falls steadily as the number of tasks increases from 10 to 110, but falls sharply beyond 110. Since iRAF allows partial offloading, the tendency to offload a greater portion of a task grows with the number of tasks, which in turn increases the task drop rate. Another reason for the higher task drop in iRAF is the higher training time of the DNN and Monte Carlo Tree, which creates a latency overhead that may cause many applications to exceed their deadlines. For MGBD and WOLVERINE, the TCR falls gradually with an increasing number of tasks thanks to caching; however, MGBD constructs a depth-first-search tree, which incurs some overhead and causes some tasks to miss their deadlines, so its TCR is lower than WOLVERINE's. For MOEA/D, as the number of tasks gradually rises, the drop rate of subtasks can increase due to the lack of resources and the higher queuing delay of mobile devices. Since collaboration among edge servers and caching better sustain the task completion rate, WOLVERINE outperforms MOEA/D in system environments with rapidly offloaded tasks.
Impact of a varying number of servers
The impact of varying numbers of servers on the objective parameters is represented in the graphs of Fig. 5. For this scenario, the number of tasks is fixed at 50.
For a fixed number of tasks, as the number of servers increases, the average latency decreases for all schemes, as shown in Fig. 5(a). iRAF has higher latency than both MGBD and WOLVERINE because of the higher computational time of the DNN and Monte Carlo Tree. In the case of MGBD, the construction of the search tree and the exhaustive search procedure affect the overall latency. In the case of MOEA/D, a task is decomposed into multiple subtasks, which incurs a higher latency overhead for server-to-device communication, and it sometimes struggles to find the most suitable server for executing some subtasks. On the other hand, WOLVERINE performs better with an increasing number of servers as it jointly implements edge server collaboration and caching.
WOLVERINE also performs better in terms of energy consumption, as depicted in Fig. 5(b). With an increasing number of servers, the energy consumption of MDs decreases significantly in all schemes. In WOLVERINE, more tasks are offloaded to the edge servers as the number of collaborative edge servers grows along with the increasing availability of cached data; hence, energy consumption falls up to a certain number of servers. Beyond that point, the energy level plateaus, since the amount of cached data and computational resources is already sufficient.
With an increasing number of servers, the cost of allocating tasks increases and the average cost savings decrease for all schemes, as depicted in Fig. 5(c). In WOLVERINE, the cost of allocating tasks grows owing to the memory and monetary costs of computation on the various CoMEC servers. Nevertheless, the average savings remain greater than those of MGBD, since local computation of tasks, which may incur no cost at all, also occurs here, while MGBD has a large cached resource size along with the exhaustive search cost of its DFS trees. iRAF, on the other hand, involves a DNN and Monte Carlo Tree in its optimization algorithm, which occupy extra memory; its overall computation and memory cost is therefore higher than that of our proposed method.
The impact on Task Completion Reliability (TCR) for a varying number of servers is demonstrated in Fig. 5(d). For WOLVERINE and MGBD, the TCR rises with an increasing number of servers because both exploit caching; however, the content caching in MGBD faces resource constraint issues for highly resource-intensive applications. The aforementioned issues of iRAF may cause task drops due to exceeded deadlines in this scheme. For MOEA/D, the TCR is relatively stable compared to MGBD and iRAF, as subtasks are offloaded more frequently; however, it still does not match WOLVERINE owing to the absence of server-to-server collaboration. For WOLVERINE, TCR improves by incorporating caching as well as server collaboration.
Impact of caching
Caching the data code for computation-intensive tasks, instead of transmitting the entire code each time, has a marked impact on the objective parameters to be optimized, as depicted in Fig. 6. In this experiment, we varied the average computation per task while fixing the number of tasks and servers at 50 and 12, respectively.
Figure 6(a) indicates that as the average computation cycles per task increase, the average latency increases exponentially without caching, since a considerable amount of time is required for the computation of tasks along with offloading them to collaborative servers. With caching, in contrast, less time is required, as some of the data code is already available on the caching server and only the input data needs to be transmitted. Similar behavior is observed for average energy consumption in Fig. 6(b): if the tasks are cached, less energy is wasted on communication overhead, which in turn reduces the overall average energy consumption. It is therefore clear from these experiments that service caching notably improves task completion.
Impact of geographical proximity of users
In this experiment, the geographical area is varied in an edge computing environment to measure the performance of user service latency and energy consumption. A larger area results in an augmented physical distance between edge servers and users, leading to extended transmission times and subsequently higher average latency. Additionally, expanded areas tend to experience heightened network congestion as a consequence of increased user traffic, exacerbating latency concerns. This congestion contributes to elevated communication overhead, necessitating higher transmission power and, consequently, increased energy consumption for devices. Furthermore, the scarcity of resources in an extended area often necessitates a greater execution of tasks by the devices themselves, resulting in amplified energy usage at the user end.
A gradual rise in average latency and device energy is observed for all schemes, as shown in Figs. 7(a) and 7(b), respectively. However, for WOLVERINE, the increase in average latency and energy is significantly lower than for the rest of the schemes: since this scheme exploits both collaborative edge computing and caching, service delay and energy consumption remain low, as a larger area multiplies the chances of finding appropriate edge servers and cached resources. For the other schemes, the higher energy and latency can be attributed to transmission to the cloud, task dependency, higher computation, memory overhead of the algorithms themselves, and convergence to local optima.
Ablation experiment
As a strategy to retain superior non-dominated solutions and to explore a broader search space, the WOLVERINE system incorporates an adaptive grid mechanism. This technique facilitates leader selection and enhances solution quality through probability-based elimination. To conduct the ablation experiment, we have varied the average number of computation cycles per task, keeping the numbers of tasks and servers fixed at 80 and 16, respectively. We have then analyzed the effects on latency and energy consumption with and without the adaptive grid mechanism.
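The adaptive grid idea can be sketched as follows, in the style of MOPSO/MOGWO-style archive management: the archive's objective space is split into equal segments per dimension, and leaders are drawn by a roulette wheel biased toward sparsely occupied cells. The segment count of 10 and the inverse-occupancy weights are assumptions for illustration, not the paper's exact parameters.

```python
# Sketch of adaptive-grid leader selection over an external archive.
# Each archive member is an objective vector; cells with fewer members
# get proportionally higher selection probability, spreading the search.
import random
from collections import defaultdict

def grid_cells(archive, segments=10):
    """Group archive members by their grid cell in objective space."""
    dims = len(archive[0])
    lo = [min(p[d] for p in archive) for d in range(dims)]
    hi = [max(p[d] for p in archive) for d in range(dims)]
    cells = defaultdict(list)
    for p in archive:
        cell = tuple(
            min(int((p[d] - lo[d]) / ((hi[d] - lo[d]) or 1) * segments), segments - 1)
            for d in range(dims)
        )
        cells[cell].append(p)
    return cells

def select_leader(archive, segments=10):
    """Roulette-wheel choice biased toward sparsely occupied cells."""
    cells = grid_cells(archive, segments)
    keys = list(cells)
    weights = [1.0 / len(cells[k]) for k in keys]  # fewer members -> higher weight
    cell = random.choices(keys, weights=weights)[0]
    return random.choice(cells[cell])

random.seed(0)
archive = [(1.0, 4.0), (1.1, 3.9), (2.0, 2.0), (3.0, 1.0)]  # hypothetical points
print(select_leader(archive))
```

The same cell bookkeeping supports probability-based elimination: when the archive overflows, members are evicted preferentially from the most crowded cells.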
The graphs in Fig. 8 show that, as the computation cycles per task increase, both average latency and average energy consumption rise exponentially when the adaptive grid mechanism is not utilized. Its inclusion, conversely, leads to reduced latency and energy consumption, achieved through accelerated convergence and exploitation of the most effective solutions; it also notably decreases the number of trial-and-error attempts.
Hypervolume and inverted generational distance
In this section, we evaluate the quality of the Pareto-optimal solutions obtained by the developed WOLVERINE system.
In multi-objective evolutionary algorithms (MOEAs), hypervolume is a commonly used performance metric that measures the volume of the objective space dominated by the solutions in the Pareto front approximation, relative to a reference point. It provides a single scalar value that captures the spread and diversity of the Pareto front approximation; higher hypervolume values indicate better coverage and a more comprehensive representation of the Pareto front.
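For intuition, the metric can be computed for a two-objective minimization front with a simple sweep (the paper's experiment is three-objective, where the computation generalizes to a union of boxes). The points and reference below are hypothetical.

```python
# Hypervolume of a 2D Pareto front approximation (both objectives minimized),
# measured against a reference point dominated by every solution.
# Assumes the input set is non-dominated.

def hypervolume_2d(front, ref):
    """Area dominated by `front` and bounded above by `ref`."""
    # Keep only points that strictly dominate the reference point.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:  # sorted by f1 ascending, so f2 decreases along the front
        hv += (ref[0] - f1) * (prev_f2 - f2)  # add one horizontal strip
        prev_f2 = f2
    return hv

front = [(1.0, 4.0), (2.0, 2.0), (3.0, 1.0)]  # hypothetical non-dominated set
ref = (5.0, 5.0)
print(hypervolume_2d(front, ref))  # -> 12.0
```

Pushing the front closer to the origin, or spreading it more widely, enlarges the dominated region and hence the hypervolume, which is why a higher value signals a better approximation.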
In Fig. 9(a), the hypervolume region of the Pareto front is demonstrated in a 3D graph, computed from the space covered by the non-dominated solutions relative to a predefined reference point. This reference point represents an ideal state in which no trade-offs between objectives are necessary, and it is marked in green in the graph for clarity. The shaded region, inclusive of the reference point, visually represents the hypervolume, i.e., the extent of the objective space covered by the set of non-dominated solutions. For this case, we have considered 10 servers with 60 tasks; the scalar hypervolume value is 24.59 after 50 iterations, the highest value obtained, and it remains steady thereafter.
For convergence analysis of the developed WOLVERINE system, we have calculated the inverted generational distance (IGD) as a function of the number of iterations. Figure 9(b) illustrates how the IGD values change throughout the execution, indicating the performance and convergence of the algorithm. The IGD starts at a high value and gradually decreases over the first 50 iterations; it then stabilizes, indicating that convergence has been achieved. High IGD values suggest that the solution set obtained after a given number of iterations is still far from convergence; as the optimization algorithm explores more of the search space, lower IGD values are obtained, signifying improved convergence and proximity to the true Pareto front. We have also compared the IGD values of MOEA/D with those of WOLVERINE. The IGD values for MOEA/D are higher than those of WOLVERINE at similar iteration counts; the higher IGD values of MOEA/D can be attributed to its poorer distribution of the Pareto front [45]. The IGD becomes stable after 50 iterations for WOLVERINE, whereas for MOEA/D it stabilizes after 65 iterations and at a higher value. The graphs thus point towards better and faster convergence of WOLVERINE compared to MOEA/D.
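A minimal sketch of the IGD computation, assuming Euclidean distance and small hypothetical 2D fronts (real evaluations use the sampled true Pareto front of the problem):

```python
# Inverted generational distance (IGD): the average Euclidean distance from
# each point of a reference (true) Pareto front to its nearest obtained
# solution. Lower is better; it penalizes both poor convergence and gaps
# in coverage. The fronts below are hypothetical.
import math

def igd(reference_front, obtained):
    total = 0.0
    for r in reference_front:
        total += min(math.dist(r, s) for s in obtained)
    return total / len(reference_front)

true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
obtained   = [(0.1, 1.0), (0.6, 0.6), (1.0, 0.1)]
print(igd(true_front, obtained))
```

Because every reference point is matched to its nearest obtained solution, an obtained set that clusters in one region of the front is penalized even if those clustered points are individually close to optimal.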
Conclusion
This paper introduced an efficient task offloading framework, namely WOLVERINE, that establishes collaboration among edge servers to share computational resources while supporting the penetration of real-time applications on edge devices with optimal energy consumption and resource cost. The multi-objective optimization problem was proven to be NP-hard; therefore, we formulated a Binary Multi-objective Grey Wolf Optimization-based metaheuristic solution that deduces the Pareto-optimal solutions for the time, energy, and cost objectives, i.e., the tri-objective optimization problem, in polynomial time. The performance analysis, carried out in Python, demonstrated performance improvements as high as 33.33%, 35%, and 40% in terms of execution latency, energy, and resource cost, respectively, compared to the state-of-the-art.
An improved version of GWO could be exploited in the developed system through dynamic weight association for the multiple objectives and modification of the convergence factor. New scope can be added by considering data loss, security of executed tasks, and so on. Deploying a deep-learning model to accurately predict the task arrival rate, allocate the tasks, and adjust the cache resources according to that prediction would be interesting future work. Furthermore, the current framework could be enhanced by hybridizing different evolutionary algorithms to combine their strengths in a dynamic environment. Consideration of robustness and fault tolerance in the case of points of failure would also add a new edge to the current work.
Availability of data and materials
The data will be provided upon request.
References
Liang B, Gregory MA, Li S (2022) Multi-access edge computing fundamentals, services, enablers and challenges: a complete survey. J Netw Comput Appl 199:103308
Sahni Y, Cao J, Yang L (2018) Data-aware task allocation for achieving low latency in collaborative edge computing. IEEE Internet Things J 6(2):3512–3524
Nandi PK, Reaj MRI, Sarker S, Razzaque MA, Rashid MM, Roy P (2024) Task offloading to edge cloud balancing utility and cost for energy harvesting internet of things. J Netw Comput Appl 221:103766
Zhang J, Hu X, Ning Z, Ngai ECH, Zhou L, Wei J, Cheng J, Hu B, Leung VC (2018) Joint resource allocation for latency-sensitive services over mobile edge computing networks with caching. IEEE Internet Things J 6(3):4283–4294
Shafique K, Khawaja BA, Sabir F, Qazi S, Mustaqim M (2020) Internet of things (IoT) for next-generation smart systems: a review of current challenges, future trends and prospects for emerging 5G-IoT scenarios. IEEE Access 8:23022–23040
Ullah I, Lim HK, Seok YJ, Han YH (2023) Optimizing task offloading and resource allocation in edge-cloud networks: a DRL approach. J Cloud Comput 12(1):112
Ren J, Yu G, He Y, Li GY (2019) Collaborative cloud and edge computing for latency minimization. IEEE Trans Veh Technol 68(5):5031–5044
Puthal D, Mohanty SP, Wilson S, Choppali U (2021) Collaborative edge computing for smart villages [energy and security]. IEEE Consum Electron Mag 10(3):68–71
Chien WC, Weng HY, Lai CF (2020) Q-learning based collaborative cache allocation in mobile edge computing. Futur Gener Comput Syst 102:603–610
Xu J, Chen L, Zhou P (2018) Joint service caching and task offloading for mobile edge computing in dense networks. In: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, pp 207–215
Mach P, Becvar Z (2017) Mobile edge computing: a survey on architecture and computation offloading. IEEE Commun Surv Tutor 19(3):1628–1656
Li C, Tang J, Tang H, Luo Y (2019) Collaborative cache allocation and task scheduling for data-intensive applications in edge computing environment. Futur Gener Comput Syst 95:249–264
Alfakih T, Hassan MM, Gumaei A, Savaglio C, Fortino G (2020) Task offloading and resource allocation for mobile edge computing by deep reinforcement learning based on SARSA. IEEE Access 8:54074–54084
Alam MGR, Hassan MM, Uddin MZ, Almogren A, Fortino G (2019) Autonomic computation offloading in mobile edge for IoT applications. Futur Gener Comput Syst 90:149–157
Chen Z, Chen Z, Jia Y (2019) Integrated task caching, computation offloading and resource allocation for mobile edge computing. In: IEEE Global Commun. Conf. (GLOBECOM). IEEE, Waikoloa, pp 1–6
Bi S, Huang L, Zhang YJA (2020) Joint optimization of service caching placement and computation offloading in mobile edge computing systems. IEEE Trans Wirel Commun 19(7):4947–4963
Chen J, Chen S, Wang Q, Cao B, Feng G, Hu J (2019) iRAF: A deep reinforcement learning approach for collaborative mobile edge computing IoT networks. IEEE Internet Things J 6(4):7011–7024
Luo Q, Li C, Luan T, Shi W (2022) Minimizing the delay and cost of computation offloading for vehicular edge computing. IEEE Trans Serv Comput 15(5):2897–2909
Wang P, Li K, Xiao B, Li K (2022) Multi-objective optimization for joint task offloading, power assignment, and resource allocation in mobile edge computing. IEEE Internet Things J 9(14):11737–11748
Gu B, Chen Y, Liao H, Zhou Z, Zhang D (2018) A distributed and context-aware task assignment mechanism for collaborative mobile edge computing. Sensors 18(8):2423
Mahenge MP, Li C, Sanga CA (2019) Collaborative mobile edge and cloud computing: tasks unloading for improving users’ quality of experience in resource-intensive mobile applications. In: 2019 IEEE 4th Int. Conf. Comput. and Commun. Systems (ICCCS). IEEE, Singapore, pp 322–326
Mohammed A, Nahom H, Tewodros A, Habtamu Y, Hayelom G (2020) Deep reinforcement learning for computation offloading and resource allocation in blockchain-based multi-UAV-enabled mobile edge computing. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). IEEE, Chengdu, pp 295–299
Nur FN, Islam S, Moon NN, Karim A, Azam S, Shanmugam B (2019) Priority-based offloading and caching in mobile edge cloud. J Commun Softw Syst 15(2):193–201
Hao Y, Song Z, Zheng Z, Zhang Q, Miao Z (2023) Joint communication, computing, and caching resource allocation in LEO satellite MEC networks. IEEE Access 11:6708–6716
Gul-E-Laraib, Zaman SKu, Maqsood T, Rehman F, Mustafa S, Khan MA, Gohar N, Algarni AD, Elmannai H (2023) Content caching in mobile edge computing based on user location and preferences using cosine similarity and collaborative filtering. Electronics 12(2):284
Xiao Z, Shu J, Jiang H, Lui JC, Min G, Liu J, Dustdar S (2022) Multi-objective parallel task offloading and content caching in D2D-aided MEC networks. IEEE Trans Mob Comput 22(11):6599–6615
Hao Y, Chen M, Hu L, Hossain MS, Ghoneim A (2018) Energy efficient task caching and offloading for mobile edge computing. IEEE Access 6:11365–11373
Zhang N, Guo S, Dong Y, Liu D (2020) Joint task offloading and data caching in mobile edge computing networks. Comput Netw 182:107446
Liu L, Chang Z, Guo X, Ristaniemi T (2017) Multi-objective optimization for computation offloading in mobile-edge computing. In: 2017 IEEE Symposium Comput. and Commun. (ISCC). IEEE, Heraklion, pp 832–837
Seid AM, Lu J, Abishu HN, Ayall TA (2022) Blockchain-enabled task offloading with energy harvesting in multi-UAV-assisted IoT networks: a multi-agent DRL approach. IEEE J Sel Areas Commun 40(12):3517–3532
Deb K (2001) Multi-objective Optimization Using Evolutionary Algorithms. Wiley, New York
Afrin M, Jin J, Rahman A, Tian YC, Kulkarni A (2019) Multi-objective resource allocation for edge cloud based robotic workflow in smart factory. Future Gener Comput Syst 97:119–130
Song C, Zhou H (2020) Computation offloading optimization in mobile edge computing based on multi-objective cuckoo search algorithm. In: Proceedings of the 2020 the 4th International Conference on Innovation in Artificial Intelligence, pp 189–193
Abbas A (2021) Metaheuristic-based offloading task optimization in mobile edge computing. Int J Distrib Sens Netw
Jiang K, Ni H, Han R, Wang X (2019) An improved multi-objective grey wolf optimizer for dependent task scheduling in edge computing. Int J Innov Comput Inf Control 15(6):2289–2304
Song F, Xing H, Luo S, Zhan D, Dai P, Qu R (2020) A multi-objective computation offloading algorithm for mobile-edge computing. IEEE Internet Things J 7(9):8780–8799
Gong Y, Bian K, Hao F, Sun Y, Wu Y (2023) Dependent tasks offloading in mobile edge computing: a multi-objective evolutionary optimization strategy. Futur Gener Comput Syst 148:314–325
Sardar Khaliq uz Z, Maqsood T, Rehman F, Mustafa S, Khan MA, Gohar N, Algarni AD, Elmannai H (2023) Content caching in mobile edge computing based on user location and preferences using cosine similarity and collaborative filtering. Electronics 12(2):284
Man TH (2005) An algorithm for multi-objective assignment problem. PhD thesis, The Chinese University of Hong Kong
Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
Shahjalal M, Farhana N, Roy P, Razzaque MA, Kaur K, Hassan MM (2022) A binary gray wolf optimization algorithm for deployment of virtual network functions in 5G hybrid cloud. Comput Commun 193:63–74
Hussein M (2021) Simulation-optimization for the planning of off-site construction projects: a comparative study of recent swarm intelligence metaheuristics. Sustainability 13(24):13551
Al-Imron CN (2022) An energy-efficient no-idle permutations flow shop scheduling problem using grey wolf optimizer algorithm. Jurnal Ilmiah Teknik Industri 21(1)
Wei L (2022) Multi-objective gray wolf optimization algorithm for multi-agent pathfinding problem. In: 2022 IEEE 5th International Conference on Electronics Technology (ICET). IEEE, Chengdu
Mirjalili S, Saremi S, Mirjalili SM, Coelho LdS (2016) Multi-objective grey wolf optimizer: a novel algorithm for multi-criterion optimization. Expert Syst Appl 47:106–119
Roy P, Sarker S, Razzaque MA, Hassan MM, AlQahtani SA, Aloi G, Fortino G (2020) AI-enabled mobile multimedia service instance placement scheme in mobile edge computing. Comput Netw 182:107573
Zhao F, He X, Zhang Y, Ma W, Zhang C (2019) A novel Pareto archive evolution algorithm with adaptive grid strategy for multi-objective optimization problem. In: 2019 IEEE 23rd Int. Conf. Comput. Support. Coop. Work Des. (CSCWD), pp 301–306
Liu Y, Wei J, Li X, Li M (2019) Generational distance indicator-based evolutionary algorithm with an improved niching method for many-objective optimization problems. IEEE Access 7:63881–63891
Van Rossum G, Drake FL (2009) Python 3 Reference Manual. CreateSpace, Scotts Valley
Gilbert EN (1960) Capacity of a burst-noise channel. Bell Syst Tech J 39(5):1253–1265
Funding
The authors are grateful to the UGC Research Project, University of Dhaka, Bangladesh for supporting grants. This work was also supported by King Saud University, Riyadh, Saudi Arabia, through the Researchers Supporting Project under Grant RSP2024R18.
Author information
Authors and Affiliations
Contributions
Idea Generation, Writing, and Analysis: Nawmi Nujhat, Fahmida Haque Shanta, Palash Roy, Sujan Sarker. Supervision and Evaluation: Md. Mamun-Or-Rashid, Md. Abdur Razzaque. Formulation, Editing, and Analysis: Mohammad Mehedi Hassan and Giancarlo Fortino.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Nujhat, N., Haque Shanta, F., Sarker, S. et al. Task offloading exploiting grey wolf optimization in collaborative edge computing. J Cloud Comp 13, 23 (2024). https://doi.org/10.1186/s13677-023-00570-z
DOI: https://doi.org/10.1186/s13677-023-00570-z