Task offloading exploiting grey wolf optimization in collaborative edge computing

The emergence of mobile edge computing (MEC) has brought cloud services to nearby edge servers, facilitating the penetration of real-time and resource-consuming applications from smart mobile devices at a high rate. The problem of task offloading from mobile devices to the edge servers has been addressed in the state-of-the-art works by introducing collaboration among the MEC servers. However, their contributions are limited to either minimization of service latency or cost reduction. In this paper, we address the problem by developing a multi-objective optimization framework that jointly optimizes the latency, energy consumption, and resource usage cost. The formulated problem is proven to be NP-hard. Thus, we develop an evolutionary meta-heuristic solution for the offloading problem, namely WOLVERINE, based on a Binary Multi-objective Grey Wolf Optimization algorithm that achieves a feasible solution within polynomial time, having a computational complexity of $O(M^3)$, where $M$ is an integer that determines the number of segments in each dimension of the objective space. Our experimental results show that the developed WOLVERINE system achieves as high as 33.33%, 35%, and 40% performance improvements in terms of execution latency, energy, and resource cost, respectively, compared to the state-of-the-art.


Introduction
The proliferation of seamless internet connectivity technologies, such as WiFi, 4G, 5G, or LTE, as well as the availability of high processing capabilities at the mobile edge, has pushed the horizon of a new computing paradigm called mobile edge computing (MEC) [1][2][3]. In recent years, the penetration of computation-intensive real-time applications has increased with the rapid rise of massively connected heterogeneous mobile devices (MDs) [4]. According to [5], Cisco predicts that by 2030, almost 500 billion gadgets will be associated with the Internet of Things (IoT). Frequent access to cloud services results in an increase in mobile data traffic as well as backhaul latency, which in turn diminishes the Quality of Experience (QoE) of the application users [1]. The MEC alleviates these problems by bringing the resources closer to the end users [6]. The benefits of MEC can further be extended by introducing collaboration among edge servers located in different geographical regions, called collaborative mobile edge computing (CoMEC) [7]. Not only do the edge servers participate in resource sharing, but vertical collaboration [8] also takes place among the three layers of CoMEC. Vertical collaboration in the MEC environment signifies collaboration among multiple layers of IoT computing infrastructure, including the IoT devices at the bottom, the edge cloud servers in the middle, and the master cloud at the top, as shown in Fig. 1.
While CoMEC increases the sustainability of edge computing, service caching at the MEC layer favors the QoE of the real-time application users [9]. Service caching refers to caching the information that must be known by the edge server to complete the task execution. This information includes system settings, the heavy program code of the application, and their related databases/libraries [10]. Figure 1 illustrates some real-life use cases where caching is exploited in MEC for better QoE. One such case is where the MEC can be exploited for intelligent transportation systems (ITS), such as extending the connected vehicle cloud into the mobile network [11]. As a result, roadside applications operating directly at the MEC may receive local messages from vehicles and roadside sensors, process them, and broadcast alerts (e.g., an accident) to nearby vehicles within the shortest possible time [12]. The second case is that of virtual reality and face-recognition data processing in various applications that require frequent database access. Both of these applications are data-intensive and need to deliver output in real time to ensure higher QoE to users. In all of the aforementioned cases, service caching can go a long way to ensure fast services to users. Caching prevents the same data from being offloaded multiple times; thus, both transmission latency and energy consumption can be reduced.
Computation offloading to a CoMEC network considering service caching may improve the overall QoE by reducing the associated system costs in terms of the queuing delay of tasks, energy consumption of devices, monetary costs, and so on [13,14]. Additionally, it is not realistic to offload all tasks of an MD to the MEC all the time, as the limited storage and computing resources of the MEC significantly affect the time delay of the offloaded tasks. Therefore, an optimal task offloading decision needs to be formulated to achieve an efficient network model while keeping the aforementioned system costs minimal. Considerable research has been done on caching strategies [15,16] and CoMEC. Content caching, computation offloading, and resource allocation problems have been jointly considered in [4] to reduce users' overall task execution time, but the work lacks collaboration among the edge servers. An AI-based task allocation algorithm, namely iRAF, has been proposed in [17] for the CoMEC network, where the average latency and energy have been optimized. Here, either one of the objectives is optimized by associating binary weights that create unfairness in the result. In [18], monetary cost and execution delay have been optimized using the particle swarm optimization (PSO) algorithm for a vehicular network. However, addressing mobile energy consumption still remains an issue. Three prime objectives, that is, execution time, energy consumed, and monetary cost, have been optimized in a multi-user multi-server environment using a multi-objective evolutionary algorithm (MOEA/D) combining simple additive weighting (SAW) and multi-attribute decision making (MDM) in [19]. This work too lacks collaboration among servers and cache resource allocation, which can be crucial to addressing QoE.
Fig. 1 Real-life applications of service caching in MEC
This research endeavors to bridge notable gaps that have persisted in the existing body of knowledge in the MEC environment. In a dynamic environment, where heterogeneous mobile devices and edge servers are involved in optimizing multiple objectives simultaneously, no existing solutions can effectively address the problem. Several challenges are encountered while optimizing conflicting objectives together in a complex environment where multiple real-time applications operate on different user devices. Firstly, real-time applications require faster processing than others. If they are computationally expensive, offloading associated data and codes frequently creates a significant overhead. Secondly, handling offloading decisions while executing tasks can slow down the services of edge servers, especially if the resources of the edge servers become saturated, thus degrading QoE. Thirdly, since multiple objective parameters are targeted for optimization, they can be conflicting in nature. Thus, an exhaustive exploration of potential solution combinations becomes imperative. Most of the studies done so far have opted for single-objective optimization, associating scalar weights to multiple objective parameters. Some of these depend on multiple decision criteria for selecting solutions [19]. The parameters for such decision-making variables require meticulous fine-tuning, and an environment saturated with real-time applications cannot afford such extra overhead. Finally, without service caching, every request for a particular service or content would need to travel from the user's device to the edge server or even further to the cloud, resulting in higher latency. This delay can be especially problematic for real-time delay-sensitive applications.
In this paper, we investigate a problem of joint optimization of task execution time, energy, and resource usage cost while offloading tasks in a CoMEC network. A task offloading framework based on grey WOLf optimization that exploits VERtical collaboration IN Edge computing, namely the WOLVERINE system, is devised to solve the problem. WOLVERINE stands out from other task-offloading frameworks due to its innovative features and advantages. Traditional task offloading frameworks suffer from several drawbacks, which can be categorized into three main areas: 1) lack of reproducibility of offloaded application codes, 2) lack of collaboration among the edge servers, and 3) inability to optimize multiple crucial parameters simultaneously. These limitations have negative implications for network systems, resulting in decreased QoE, underutilized resources, and suboptimal network performance. In response to these challenges, WOLVERINE introduces a novel task offloading scheme for real-life computationally intensive applications, utilizing an evolutionary algorithm. This scheme addresses the collaboration among servers and leverages cached application code to minimize time, energy, and resource costs in edge computing environments. The main contributions of the WOLVERINE framework are listed below:
• We design a collaborative task offloading framework that effectively utilizes cached and computational resources to enhance user QoE in a CoMEC system where real-time applications are executed.
The rest of this paper is organized as follows. "Related works" section illustrates the major existing works. "System model" section describes the system model of WOLVERINE. "Design details of WOLVERINE" section elaborates the computational model, multi-objective problem formulation, and meta-heuristic task offloading scheme. "Performance evaluation" section describes the environmental setup and results of experimental analysis. Finally, "Conclusion" section summarizes the key outcomes of our work and some future research directions.

Related works
Several works in the field of collaborative edge computing have been done, including optimal task caching and task allocation while optimizing a single objective function, trade-offs between two or more objectives, and multi-objective optimization.
The first category of works in the literature focused on single-objective optimization in collaborative edge computing, for example, energy, time, or resource cost allocation. In [2], a genetic algorithm based on a data-aware task allocation strategy has been proposed that considers the network congestion control for allocating sub-tasks. In [20], the authors have focused on the reduction of energy consumption for task assignments by considering the heterogeneity of users using a heuristic-based greedy approach. An architecture has been proposed in [21] that considers unloading resource-intensive tasks from client devices in the cooperative edge space or to the remote cloud depending on users' desire and resource availability. An AI-driven intelligent Resource Allocation Framework (iRAF) [17] has been designed to solve complex resource allocation problems considering the current network states and task characteristics. Another group of authors in [22] have utilized a deep reinforcement learning method to solve computation offloading and resource allocation problems in a blockchain-based multi-UAV-assisted dynamic environment.
Computation offloading that focuses on the minimization of system cost comprising the trade-off between energy and task execution delay in the form of a weighted sum has been proposed in [15]. Collaboration among MEC servers for (data) cache and computational resource allocation is noteworthy in [15]. However, caching the content or code of applications is not enough due to the limited computational capacity of user devices as well as the delay associated with transmitting cached data or code. Hence, the idea of joint task offloading and caching needs to be considered. In [16], a joint service caching, task offloading, and system resource allocation scheme to minimize system cost comprising time and energy has been formulated as a MILP problem. In [23], a priority-based task offloading and caching scheme is proposed for the MEC environment, where computing a task while reducing energy cost and delay time efficiently is the main priority. A new low-complexity hyper-heuristic algorithm has been proposed in [24], where content caching is performed along with computation offloading in an MEC network to optimize the service latency for all ground IoT devices. Mobility and user preference-aware content caching in MEC is orchestrated in [25]. The authors in [26] introduce an enhanced binary PSO algorithm, which is designed for optimizing task offloading and content caching in MEC networks. It focuses on jointly optimizing task completion delay and energy consumption. Additionally, an enhanced binary particle swarm optimization (BPSO) algorithm is proposed for content caching in parallel task offloading scenarios. An alternating-iterative algorithm has been developed in [27] for jointly optimizing task caching and offloading in a resource-constrained environment to minimize energy consumption. Here, task caching indicates caching of a completed application and relevant data. Subsequently, in [4], content caching, computation offloading, and resource allocation problems have been jointly considered to reduce users' overall task execution time. However, caching a complete application, i.e., content caching, is often incompatible with user requirements. Hence, the idea of caching data codes for joint task offloading and data caching using the Lyapunov algorithm for minimizing task computation delay has been introduced in [28]. The authors have formalized joint service caching and task offloading decisions to minimize computation latency while keeping the total computation energy consumption low.
Multi-objective optimization problems are adopted for computation offloading in the edge cloud by the authors of [29], who focused on the offloading probability of tasks to the edge cloud from an MD. To optimize execution time, energy, and resource cost to maximize utility for resource providers in IoT networks, energy harvesting properties of unmanned aerial vehicles (UAVs) are used in [30]. A deep reinforcement learning (DRL) based solution is used for this system network, which is managed by blockchain. Multi-objective optimization problems have multiple Pareto-optimal solutions, which are obtained by trade-offs. Hence, evolutionary algorithms can play a significant role in reaching a single preferred solution [31]. In [32], time, energy, and cost were minimized for an edge cloud environment using the genetic algorithm NSGA-II. Minimization of average latency and energy consumption simultaneously for offloading tasks using the Cuckoo search algorithm has been proposed in [33]. In [34], Grey Wolf Optimization is used to perform a trade-off between the minimization of energy consumption and response time in an MEC environment. An Improved Multi-Objective Grey Wolf Optimization (IMOGWO) for sub-task scheduling in an edge computing environment is introduced in [35] to optimize makespan, load balance, and energy simultaneously. Computation time and cost minimization have been performed in [18] using the Particle Swarm Optimization (PSO) algorithm for a Vehicular Edge Computing (VEC) environment. In [19], a tri-objective problem has been considered in a multi-user and multi-server task offloading environment where an application is divided into multiple independent sub-tasks. A Multi-objective Evolutionary Algorithm based on Decomposition (MOEA/D) has been developed for optimizing the time, cost, and energy expended in the execution of a particular sub-task. MOEA/D is also used to minimize latency and energy in [36] for the MEC environment, where the ordering of sub-tasks exists as a constraint. It is also used for minimization of latency and maximization of rewards for servers and tasks in [37]. However, the direct assignment of sub-tasks from mobile devices to a server is costly in terms of energy and offloading decision-making. The works mentioned above that addressed multi-objective optimization do not have a system environment similar to that of CoMEC handling real-life applications.
The summary of the state-of-the-art works has been listed in Tables 1 and 2. Most of the existing works have either performed single-objective optimization or weighted optimization in multi-user multi-server networks with and without cache, or have performed multi-objective optimization without caching and collaboration among servers. The problem of jointly optimizing three basic objectives, namely execution latency, device energy, and resource cost, has not yet been resolved in the CoMEC system incorporating service caching. The generation of Pareto-optimal solutions for optimizing multiple objectives simultaneously in a resource-constrained environment where servers collaborate and cache services is yet to be done. These observations have driven us to design a task offloading framework in the CoMEC environment for generating Pareto-optimal solutions for multi-objective optimization by exploiting service caching of computational resources.

System model
In this section, we describe the different entities of a CoMEC network and the interactions among them.

Entities of CoMEC network
We consider a CoMEC network consisting of a set of collaborative edge servers (CESs), $E$, and a set of mobile devices (MDs), $U$, as shown in Fig. 2. Each mobile device $k \in U$ is connected with one edge server $j \in E$, which is termed its primary edge server (PES). Let $\tau$ be the set of $M$ tasks that have arrived at a PES from mobile devices. Each task $i \in \tau$ is denoted by a four-parameter tuple, $\langle b_i, B_i, T_i^{max}, \delta_i \rangle$, where $b_i$ is the input data size, $B_i$ is the size of related data codes, $T_i^{max}$ is the task deadline, and $\delta_i$ is the task budget. In this work, data code is considered to consist of application-related program code, system settings, and related databases/libraries.
Each mobile device $k$ has computational resources, and each edge server $j$ is considered to consist of both computational and cached resources. Table 3 contains major notations. A task generated from an MD can be executed either on the MD itself or at any edge server, where edge servers borrow resources from the cloud when needed.

Collaboration among entities
Upon receiving a set of task requests, $\tau$, from the mobile devices, the PES communicates with the other CESs for task-related information and checks the availability of the resources, i.e., cached and computational resources required for the execution of the tasks. After getting the resource availability information, the PES runs the WOLVERINE task allocation decision algorithm and determines the appropriate resource providers to execute the tasks considering their requirements. If none of the servers has enough resources to complete a task, it is forwarded to the master cloud for execution, implementing a vertical collaborative computation environment.

Design details of WOLVERINE
In this section, we unfold different design components of WOLVERINE. First, we present a computational model of the proposed WOLVERINE system, then we formulate the task offloading problem as a multi-objective optimization problem, and finally, we devise a binary multi-objective grey wolf optimization-based solution.

Computational model of WOLVERINE
Figure 3 depicts the functional modules of the proposed WOLVERINE system, where an individual module is responsible for performing a specific function. The main functional modules of the PES can be grouped into two categories: the PES service module and the CES service module. The PES service module handles the task requests from the MDs and determines the optimal task offloading policy with the help of the CES service module. The responsibility of the CES service module is to manage collaboration between the PES and the CESs. Note that any collaborative edge server can work as a primary server by installing the PES service module to achieve the corresponding functionalities. The functionalities of each module are described below:
• Task Profiler receives the task-offloading requests from the MD first and then checks for the required ...
• ... among multiple edge servers and acts as a communication medium between the server and the MDs to share task data and computational results.
Table 3 Major notations (recoverable entries)
$B_{ij}$: Radio bandwidth allocated to task $i$ by server $j$
$B_i$: Size of the data code related to task $i \in \tau$
$\sigma_{ij}$: Cached resource availability for task $i$ at server $j$

Multi-objective problem formulation
In this section, we calculate the total latency $T_{ij}$, energy consumption $E_{ij}$, and monetary cost $C_{ij}$ for offloading task $i \in \tau$ to edge server $j \in E$ or for local computation. Finally, we formulate the task offloading problem of WOLVERINE as a multi-objective optimization problem.

Calculation of $T_{ij}$
Two cases arise when calculating $T_{ij}$. In the first case, the mobile device executes the task locally and thus experiences no communication delay. The task computation delay, $t^k_{ij}$, for executing task $i \in \tau$ locally on mobile device $k \in U$ is determined by $c_i$, the number of computation cycles required to compute the task; $\mu^k_i$, the ratio of CPU cycles allocated by the $k$th mobile device to complete the $i$th task; and $f_k$, the CPU-cycle frequency of the $k$th mobile device.
In the second case, the input data and/or data code are offloaded to the MEC servers. If the data code is cached at the offloading server, then only the input data needs to be transmitted; otherwise, the device sends the input data along with the code to the server. For wireless transmission between the mobile device and the collaborative edge server that follows Orthogonal Multiple Access (OMA), we consider the Rayleigh channel. The transmission rate is determined by $B_{ij}$, the allocated radio bandwidth; $p_k$, the transmission power; $h_k$, the channel gain ($k \in U$); and $N_0$, the variance of the complex white Gaussian channel noise. The communication latency, $t^c_{ij}$, for offloading task $i$ to edge server $j$ depends on this rate and on the indicator $\sigma_{ij} \in \{0, 1\}$, whose value is 1 when the cached resources, i.e., the data code, are available in the offloading server, and 0 otherwise. Here, $b_i$ and $B_i$ denote the size of the input parameters and data code, respectively. Next, the execution time of task $i$ at edge server $j$ is calculated from the resource of server $j$ allocated to task $i$ and $f_j$, the total resource of the $j$th MEC. Finally, the total latency for completing task $i$ is obtained from these components.
When calculating execution latency for real-time computation-intensive applications in edge computing, addressing delivery or downloading latency is crucial. However, in this particular scenario, the emphasis is placed more on upload speeds and network latency than on download times. Besides, the execution result typically has a limited data size and thus has a negligible impact on resource parameters.
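The latency terms described in this subsection can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' formulation: it assumes the local delay is the cycle count divided by the allocated share of the device frequency, a Shannon-type uplink rate, an upload that includes the data code only when it is not cached, and an edge execution time of the same form as the local one. The function names and the allocation share `phi_ij` are hypothetical.

```python
import math


def local_delay(c_i, mu_ki, f_k):
    """Assumed local delay: cycles / (allocated CPU share * device frequency)."""
    return c_i / (mu_ki * f_k)


def transmission_rate(b_ij, p_k, h_k, n_0):
    """Assumed Shannon-type uplink rate (bits/s) for bandwidth b_ij in Hz."""
    return b_ij * math.log2(1.0 + p_k * h_k / n_0)


def communication_latency(b_i, code_bits, sigma_ij, rate):
    """Input data is always uploaded; the data code only if not cached (sigma_ij == 0)."""
    return (b_i + (1 - sigma_ij) * code_bits) / rate


def edge_execution_time(c_i, phi_ij, f_j):
    """Assumed edge delay: cycles / (hypothetical allocated share phi_ij * server capacity)."""
    return c_i / (phi_ij * f_j)


def total_latency(offloaded, t_local, t_comm, t_edge):
    """Local execution has no communication delay; offloading adds upload and edge time."""
    return t_comm + t_edge if offloaded else t_local


# Illustrative values only: a cached data code shortens the upload.
rate = transmission_rate(1e6, 3.0, 1.0, 1.0)
t_cached = communication_latency(1e6, 3e6, 1, rate)
t_uncached = communication_latency(1e6, 3e6, 0, rate)
```

With these illustrative values, the cached upload is shorter than the uncached one, which is the effect service caching is meant to deliver.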

Calculation of $E_{ij}$
For calculating the total energy consumption $E_{ij}$, two possible cases are considered. In the first case, the mobile device executes the task locally. Hence, we consider only the task computation energy, which depends on $\kappa$, a coefficient determined by the device's chip architecture [17], and $f_k$, the CPU-cycle frequency of the $k$th mobile device. In the second case, the task is executed at the server; hence, task computation energy is ignored. The energy the device expends in transmitting input data and/or code to the MEC server is calculated from $p_k$, the power of the $k$th mobile device, and $t^c_{ij}$, the time required to transmit the $i$th task to the $j$th server. The total energy consumption for offloading task $i$ to server $j$ follows from these two cases. The overall energy consumption can include the energy consumed for transmitting the tasks to the servers. We have prioritized device energy consumption owing to the limited battery resources and computational capabilities of user devices. As a result, energy consumed for executing tasks by servers has been less emphasized.
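The two energy cases can be illustrated with a short sketch. It assumes the widely used dynamic-power model for local computation (coefficient times cycles times frequency squared) and a transmission energy equal to transmit power times upload time. Both forms and all names are assumptions made for illustration and are not taken from the source.

```python
def local_energy(kappa, c_i, f_k):
    """Assumed local computation energy: kappa * cycles * frequency^2 (joules)."""
    return kappa * c_i * f_k ** 2


def transmission_energy(p_k, t_c_ij):
    """Assumed upload energy: transmit power (W) * communication latency (s)."""
    return p_k * t_c_ij


def device_energy(offloaded, kappa, c_i, f_k, p_k, t_c_ij):
    """Device-side energy only; server-side execution energy is not counted."""
    if offloaded:
        return transmission_energy(p_k, t_c_ij)
    return local_energy(kappa, c_i, f_k)
```

Only the device-side energy is modeled, which mirrors the text's focus on the limited battery of user devices.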

Calculation of $C_{ij}$
Similar to latency and energy, the calculation of monetary cost for task computation can also have two possible cases. If the device performs the task locally instead of offloading, then it incurs no monetary cost. In the case of offloading, the cost of computational resources, i.e., CPU cycles, and/or storage resources, i.e., memory, sums up to the total monetary cost. For executing the $i$th task at the $j$th server, the storage cost depends on $\sigma_{ij} \in \{0, 1\}$, which determines the availability of the cached resource. If the value of $\sigma_{ij}$ is 1, a storage cost will be incurred for the device; otherwise, no storage cost is required. $\eta_j$ is the storage cost per bit of resource. Next, the cost of computing the $i$th task at the $j$th server is calculated using $\gamma_j$, the unit CPU cycle cost of server $j$. Finally, the total monetary cost for executing task $i$ at server $j$ is obtained from the storage and computing costs.
We have not considered cloud servers in our problem formulation. Although cloud servers add significant benefits related to scalability, server-health management, backup, and service provisioning capabilities, they can create hindrances in real-time application environments due to long-distance communication where exceptional QoE needs to be achieved. Uploading and executing tasks in the cloud require extra latency and energy, which impede performance. Hence, execution of tasks in user mobile devices and edge servers adds leverage to network performance. Cloud servers are typically utilized within an edge server network only when all other edge resources are overwhelmed or during network malfunctions.
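A minimal sketch of the cost model follows. It assumes that the storage cost is the per-bit price times the data code size, charged when $\sigma_{ij} = 1$ as the text states, and that the computing cost is the per-cycle price times the number of cycles. These product forms and the function names are assumptions for illustration only.

```python
def storage_cost(sigma_ij, eta_j, code_bits):
    """Assumed storage cost: charged only when the cached data code is used (sigma_ij == 1)."""
    return sigma_ij * eta_j * code_bits


def computing_cost(gamma_j, c_i):
    """Assumed computing cost: unit CPU-cycle price * required cycles."""
    return gamma_j * c_i


def monetary_cost(offloaded, sigma_ij, eta_j, code_bits, gamma_j, c_i):
    """Local execution is free; offloading pays for storage plus computation."""
    if not offloaded:
        return 0.0
    return storage_cost(sigma_ij, eta_j, code_bits) + computing_cost(gamma_j, c_i)
```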

Objective function formulation
Our aim is to execute each task $i \in \tau$ at a local or remote resource $j \in E$ so as to minimize the total execution latency, energy expenditure, and incurred monetary cost. Thus, WOLVERINE formulates the task execution problem as a multi-objective minimization problem. Here, $X_{ij}$ is a binary decision variable whose value is 1 if task $i$ is allocated to edge server $j$, and 0 otherwise. Moreover, $X_{ij} \in \vec{\chi}_w$, where $\vec{\chi}_w = (x_1, x_2, ..., x_D)$ is a $D$-dimensional vector. Each entry $x_d \in \vec{\chi}_w$ corresponds to the aforementioned decision variable $X_{ij}$, $\forall i \in \tau$, $\forall j \in E$. $T(\vec{\chi}_w)$, $E(\vec{\chi}_w)$, and $C(\vec{\chi}_w)$ denote the objective functions related to task execution latency, execution energy, and monetary cost, respectively. Equation (12), which is a multi-objective linear optimization problem, is subject to the following constraints:
• Assignment Constraint: A task will be executed either in an edge server or in the user device. No partial assignment of tasks to multiple servers will be done.
• Budget Constraint: Constraint (17) denotes that the monetary cost of executing task $i$ on server $j$ cannot exceed the task budget, $\delta_i$.
• Energy Constraint: Constraint (18) states that the energy expenditure of a device in executing a task is limited by a threshold, $E_i^{max}$.
• Latency Constraint: Constraint (19) denotes that task $i$ needs to be completed within its deadline, $T_i^{max}$.
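The four constraints listed above can be checked for a candidate assignment as sketched below. This is an illustrative feasibility test rather than the authors' procedure. It assumes that `X`, `T`, `E`, and `C` are task-by-resource matrices (lists of lists) and that the user device appears as one of the resource columns. These data layouts and all names are hypothetical.

```python
def is_feasible(X, T, E, C, deadline, energy_cap, budget):
    """Return True if the binary assignment X satisfies all four constraints."""
    for i, row in enumerate(X):
        # Assignment constraint: exactly one resource per task, no partial assignment.
        if any(x not in (0, 1) for x in row) or sum(row) != 1:
            return False
        j = row.index(1)
        if C[i][j] > budget[i]:        # budget constraint
            return False
        if E[i][j] > energy_cap[i]:    # energy constraint
            return False
        if T[i][j] > deadline[i]:      # latency (deadline) constraint
            return False
    return True


# Two tasks, two resources (column 0: local device, column 1: an edge server).
T = [[1.0, 0.4], [2.0, 0.5]]
E = [[3.0, 0.2], [4.0, 0.3]]
C = [[0.0, 0.5], [0.0, 0.9]]
ok = is_feasible([[1, 0], [0, 1]], T, E, C, [1.0, 1.0], [3.0, 1.0], [1.0, 1.0])
```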

Theorem 1
The WOLVERINE task offloading problem formulated in Eq. (12) is NP-hard.

Proof
The WOLVERINE task offloading problem aims at minimizing three objectives, yielding a set of Pareto optimal solutions. The optimization problem in Eq. (12) can be regarded as an assignment problem. To prove the NP-hardness of the WOLVERINE task offloading problem, we first convert the Generalized Assignment Problem (GAP), a well-known NP-hard problem [39], into a multi-objective problem. The GAP assigns $M$ tasks to $N$ agents so as to minimize the overall assignment cost, subject to the capacity of each agent. Here, $C$ indicates the assignment cost of task $m \in M$ to an agent $n \in N$, $A$ is the resource capacity function that indicates the resource used by task $m \in M$, and $B$ indicates the available capacity of an agent. To convert GAP to a multi-objective assignment problem, we first consider a bi-objective assignment problem where the resource and cost constraints of GAP are to be satisfied by converting the three objectives of WOLVERINE to a single one, $Z_{ij}$. Here, the value of $\epsilon_i$ is chosen in such a way that minimizing $Z_{ij}$ yields the same result as the multi-objective functions. The function $u(x)$ is defined as $u(x) = 1$ if $x \geq 0$ and 0 otherwise.
Note that we do not consider the resource limitation constraints of GAP as constraints of the multi-objective optimization problem; rather, we consider them as an objective to be optimized. If the resource limitation constraints are satisfied, then the first objective is equal to zero and only the cost of assignment is considered. If there exists a better solution in GAP, a better solution also exists in the corresponding multi-objective problem. We consider two feasible GAP solutions with costs $z_1 < z_2$. These two costs produce solutions $(0, z_1)$ and $(0, z_2)$ in the multi-objective assignment problem. If we consider the lexicographical minimum, then $z_1 < z_2$; hence $(0, z_1)$ is the better solution. Thus, GAP is convertible to a multi-objective assignment problem. Since GAP is a well-known NP-hard problem, the WOLVERINE task offloading problem is also NP-hard.

Meta-heuristic task offloading
As the number of MDs or servers increases, the WOLVERINE system experiences exponential growth in execution time. Many 5G applications cannot tolerate a single second of delay. The proposed WOLVERINE framework attempts to optimize multiple objectives, such as minimizing latency, reducing energy consumption, and minimizing monetary costs. These objectives can be conflicting, meaning that improving one objective may degrade another. Pareto optimal solutions help find a set of solutions where no single objective can be improved without worsening at least one other objective. Evolutionary algorithms help in solving problems that involve Pareto-optimality, as the solution choice is based on the population approach [31]. Therefore, in this section, we develop a smart task offloading policy using Binary Multi-Objective Grey Wolf Optimization that determines the suitable set of resources to allocate the computational tasks in polynomial time.
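The notion of Pareto optimality used here can be made concrete with a small sketch. It is illustrative only: each candidate is assumed to be a tuple of (latency, energy, cost) values to be minimized, and the function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (latency, energy, cost) triples with illustrative values.
candidates = [(1, 5, 3), (2, 2, 2), (3, 3, 3), (1, 5, 4)]
front = pareto_front(candidates)
```

Here `(3, 3, 3)` is dominated by `(2, 2, 2)` and `(1, 5, 4)` by `(1, 5, 3)`, while the two remaining candidates trade latency against energy and cost, so neither dominates the other.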

Preliminaries
The Grey Wolf Optimization (GWO) [40] is a bio-inspired meta-heuristic algorithm that is designed based on the social leadership and hunting techniques found in grey wolves. To mathematically model the social hierarchy of the wolves, the fittest solution is considered the alpha ($\alpha$) wolf. The second and third best solutions are named beta ($\beta$) and delta ($\delta$) wolves, respectively. The leader selection and position updating of the rest of the search agents are done in each iteration, eventually converging to a set of Pareto-optimal solutions. Binary Multi-objective Grey Wolf Optimization (BMOGWO) is a special variant of MOGWO that allows search agents to move in a binary space instead of a continuous spectrum [41]. In our specific case, where we aim to optimize execution time, energy consumption by devices, and monetary cost simultaneously, the BMOGWO algorithm demonstrates superior performance compared to other evolutionary algorithms such as MOPSO, BAT, and WHALE optimization algorithms [42][43][44] for tackling multi-objective problems, efficiently addressing the optimization of objectives concurrently. It also outperforms Ant-Colony Optimization (ACO) and Whale Optimization (WO) in scenarios where task offloading to edge servers is required [34]. The BMOGWO also surpasses other evolutionary algorithms in scenarios where Pareto-optimal solutions are generated, due to better performance in the exploration of the solution space and prevention of convergence to local optima [45].

Defining the position vector
We consider a population of wolves denoted by $P$, where each wolf $w \in P$ represents a candidate solution [46]. The position of a wolf $w$ in the search space is denoted by a $D$-dimensional binary position vector $\vec{\chi}_w$, where $D = |\tau| \times |E|$. The vector is written as $\vec{\chi}_w = (x_1, x_2, \ldots, x_D)$, where each entry $x_d \in \vec{\chi}_w$ corresponds to a decision variable $X_{ij}$, $\forall i \in \tau$, $\forall j \in E$, such that $d = (i - 1) \times |E| + j$.
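The index mapping above can be illustrated with a minimal sketch. The function names `to_flat_index` and `from_flat_index` are hypothetical helpers introduced here for illustration, and tasks and servers are assumed to be numbered from 1, as the formula $d = (i - 1) \times |E| + j$ suggests.

```python
def to_flat_index(i, j, num_servers):
    """Map task i and server j (both 1-based) to the 1-based entry d."""
    if i < 1 or not 1 <= j <= num_servers:
        raise ValueError("i and j must be 1-based and j must not exceed num_servers")
    return (i - 1) * num_servers + j


def from_flat_index(d, num_servers):
    """Invert the mapping: recover (i, j) from the 1-based entry d."""
    if d < 1:
        raise ValueError("d must be 1-based")
    i, j = divmod(d - 1, num_servers)
    return i + 1, j + 1


# With |E| = 4 servers, task 2 on server 3 occupies entry d = (2 - 1) * 4 + 3 = 7.
print(to_flat_index(2, 3, 4))   # 7
print(from_flat_index(7, 4))    # (2, 3)
```

Storing the decision matrix as one flat binary vector lets every wolf be updated entry by entry, while the inverse mapping recovers which task and server an entry refers to.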

Updating positions of the wolves
In GWO, the position of each $\omega$ wolf is updated by considering the positions of the $\alpha$, $\beta$, and $\delta$ wolves. Let $\vec{\chi}_\alpha$, $\vec{\chi}_\beta$, $\vec{\chi}_\delta$, and $\vec{\chi}_\omega$ denote the positions of the $\alpha$, $\beta$, $\delta$, and $\omega$ wolves, respectively. We first calculate the distance of the $\omega$ wolf from each of the three leader wolves, where each leader position is weighted by a vector $\vec{C}$.
Here, $\vec{C}$ is a position vector with values in the range [0, 2]. It associates a weight with each prey item, in our case, the three best solutions. The value of $\vec{C}$ is chosen randomly to favor exploration by introducing randomness in the algorithm's behavior. This vector controls the effect of the prey, in this case, the effect of the three best solutions on the search agents being updated. $|\vec{C}| > 1$ emphasizes the effect of the best solutions on the $\omega$ wolves, whereas $|\vec{C}| < 1$ de-emphasizes it. This prevents convergence to a local optimum and ensures that the entire search space is covered. Besides, the random selection of values in $\vec{C}$ emphasizes exploration not only in the initial stages but also during the final iterations [40]. The value of $\vec{C}$ is determined as $\vec{C} = 2\vec{r}_2$, with $\vec{r}_2$ a random vector in [0, 1], as in [40]. The position of the $\omega$ wolf is then updated with respect to the alpha, beta, and delta wolves.
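For reference, the standard GWO update of [40], which the description in this subsection appears to follow, can be written as below. This is the textbook form rather than a transcription of this paper's own numbered equations. The symbols $\vec{r}_1$ (a uniform random vector in [0, 1]), the scalar $a$ (decreased linearly from 2 to 0 over the iterations), the intermediate positions $\vec{X}_1, \vec{X}_2, \vec{X}_3$, and the element-wise product $\odot$ are taken from [40] and should be read as assumptions here.

```latex
\begin{align}
\vec{D}_\alpha &= \left| \vec{C}_1 \odot \vec{\chi}_\alpha - \vec{\chi}_\omega \right|, &
\vec{D}_\beta  &= \left| \vec{C}_2 \odot \vec{\chi}_\beta  - \vec{\chi}_\omega \right|, &
\vec{D}_\delta &= \left| \vec{C}_3 \odot \vec{\chi}_\delta - \vec{\chi}_\omega \right| \\
\vec{X}_1 &= \vec{\chi}_\alpha - \vec{A}_1 \odot \vec{D}_\alpha, &
\vec{X}_2 &= \vec{\chi}_\beta  - \vec{A}_2 \odot \vec{D}_\beta, &
\vec{X}_3 &= \vec{\chi}_\delta - \vec{A}_3 \odot \vec{D}_\delta \\
\vec{A} &= 2a\,\vec{r}_1 - a, &
\vec{\chi}_\omega(t+1) &= \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}
\end{align}
```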
Here, $\vec{A}$ is the coefficient vector that governs convergence or divergence towards the prey, or the best solutions. Note that each entry $x_d \in \vec{\chi}_w$ corresponds to a binary decision variable of the MOLP problem and is only allowed to take a value of either 0 or 1; the updated entry is therefore binarized by comparing $\mathrm{sigmoid}(x_d)$ against a random number.
Here, the rand() function provides a uniformly distributed random number in the range [0, 1], which improves search space exploration with the goal of avoiding local optima. The convergence and diversity of the Pareto front generated by MOGWO for Pareto-optimal solutions in tri-objective problems are higher than those of Multi-Objective Particle Swarm Optimization (MOPSO) [45]. Here, convergence indicates how close the obtained solutions are to the true Pareto front. Diversity indicates how thoroughly the search space has been explored, that is, how widely an algorithm compares the trade-offs and how broad a range of options it offers. Higher diversity indicates that a greater number of options have been explored through different balances between the objective parameters. Grey Wolf Optimization strikes a balance between the two. It converges toward the true Pareto front by iteratively computing the solutions. As the algorithm progresses, the positions of the $\alpha$, $\beta$, and $\delta$ wolves are updated based on their fitness values. These three best solutions found so far guide the search process toward better solutions and help it converge towards the Pareto front through optimal trade-offs.
The exploration and randomness of Grey Wolf Optimization prevent convergence to local optima and provide a better exploration of a wide range of trade-offs.
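A minimal sketch of one binary position update is given below to make the steps concrete. It assumes the standard GWO coefficients of [40] and a steep logistic function centered at 0.5, a form commonly used in binary GWO variants. The exact sigmoid used by WOLVERINE is not recoverable from the text, so `sigmoid` and `update_wolf` are illustrative names rather than the authors' implementation.

```python
import math
import random


def sigmoid(x):
    # Assumed transfer function: steep logistic centered at 0.5.
    return 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))


def update_wolf(omega, alpha, beta, delta, a, rng):
    """Return the new binary position of an omega wolf.

    omega, alpha, beta, delta: equal-length lists of 0/1 entries.
    a: scalar that standard GWO decreases from 2 to 0 over the iterations.
    rng: a random.Random instance, passed in for reproducibility.
    """
    new_position = []
    for d in range(len(omega)):
        candidates = []
        for leader in (alpha, beta, delta):
            A = 2.0 * a * rng.random() - a      # coefficient in [-a, a]
            C = 2.0 * rng.random()              # weight in [0, 2]
            distance = abs(C * leader[d] - omega[d])
            candidates.append(leader[d] - A * distance)
        x = sum(candidates) / 3.0               # average pull of the three leaders
        # Binarize: the entry becomes 1 when sigmoid(x) is at least a uniform draw.
        new_position.append(1 if sigmoid(x) >= rng.random() else 0)
    return new_position


rng = random.Random(0)
print(update_wolf([0, 1, 0, 1], [1, 1, 0, 0], [1, 0, 0, 1], [0, 1, 1, 0], a=1.0, rng=rng))
```

Because every entry is thresholded against a fresh random number, the result is always a valid binary vector, and entries on which the three leaders agree are the most likely to be copied.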

Algorithm 1 Archive controller
To incorporate multi-objective optimization in GWO, an archive of fixed size is used. It is a simple store for saving and retrieving the non-dominated (Pareto-optimal) solutions obtained so far, and it is maintained as shown in Algorithm 1.
In line 1, for each $w \in P$, a set is initialized that stores the archive solutions dominated by $\vec{\chi}_w$. A flag is also initialized to check whether any solution from the archive dominates $\vec{\chi}_w$. Line 5 checks for the archive members dominated by $\vec{\chi}_w$, and the dominated members are added to this set. Line 7 checks the opposite and sets the flag to 1. If no archive member dominates $\vec{\chi}_w$, i.e., flag = 0, the archive is updated in line 12 using the UpdateArchive procedure, which takes the archive $A$ and the set of dominated members. Lines 2-13 iterate over every member of the population, and the updated archive is returned.
Algorithm 2 summarizes the UpdateArchive procedure. In lines 2-4, the dominated solutions are removed from the archive. The capacity of the archive is checked in line 5. If it is not full, the current non-dominated solution is added to the archive in line 6; otherwise, a solution from the most crowded segment is removed and the current non-dominated solution is added to the archive in line 9. In line 12, if a particular solution is an outlier, the grid is updated adaptively to cover the new solution.
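The archive logic described above can be sketched as follows. This is a simplified illustration under stated assumptions: solutions are represented only by their objective vectors, all objectives are minimized, duplicates are skipped, and the rule for choosing a victim from the most crowded segment is passed in as a hypothetical callback `pick_crowded`. The adaptive grid update is omitted.

```python
def dominates(u, v):
    """True if objective vector u Pareto-dominates v (all objectives minimized)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def archive_controller(archive, population, capacity, pick_crowded):
    """Fold the objective vectors of a population into a bounded archive."""
    archive = list(archive)
    for cand in population:
        dominated = [s for s in archive if dominates(cand, s)]
        flag = any(dominates(s, cand) for s in archive)
        if flag or cand in archive:
            continue  # candidate is dominated (or already stored): archive unchanged
        # Drop the archive members that the candidate dominates.
        archive = [s for s in archive if s not in dominated]
        # If the archive is full, evict one member chosen by the crowding rule.
        if len(archive) >= capacity:
            archive.remove(pick_crowded(archive))
        archive.append(cand)
    return archive


# Objective vectors are (latency, energy, cost); the eviction rule here is a placeholder.
result = archive_controller([], [(3, 3, 3), (1, 1, 1), (2, 0, 2), (4, 4, 4)],
                            capacity=10, pick_crowded=lambda a: a[0])
print(result)  # [(1, 1, 1), (2, 0, 2)]
```

In the example, (1, 1, 1) displaces (3, 3, 3), (2, 0, 2) is kept because neither vector dominates the other, and (4, 4, 4) is rejected because it is dominated.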

Adaptive grid mechanism
An adaptive grid made of hypercubes [47] is generated using the archive, where the dimension of each hypercube is equal to the number of optimization objectives. The grid mechanism divides the objective space of the problem into a grid. Each hypercube is interpreted as a geographical region that contains the solutions [47]. Since our WOLVERINE task offloading problem has three objectives, the adaptive grid consists of three-dimensional hypercubes. The boundary of the objective (target) space at the $t$-th iteration is determined as $(\min T_t, \min E_t, \min C_t)$ and $(\max T_t, \max E_t, \max C_t)$. We then calculate the modulus of the grid using the same approach as [47]. Here, $M$ is an integer that determines the number of segments in each dimension of the objective space. Therefore, the total number of hypercubes is $M^3$.
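As a rough illustration of the grid mechanism, the sketch below locates the hypercube that contains a given objective vector. It assumes that each dimension is split into $M$ equal-width segments between the current minimum and maximum, which is the usual adaptive-grid construction; the function name `grid_index` and the handling of boundary points are assumptions made for this example.

```python
def grid_index(point, mins, maxs, M):
    """Return the hypercube coordinates of an objective vector.

    point, mins, maxs: sequences of equal length (time, energy, cost).
    M: number of segments per dimension, giving M ** len(point) hypercubes.
    """
    index = []
    for value, lo, hi in zip(point, mins, maxs):
        width = (hi - lo) / M                # segment width in this dimension
        if width == 0:
            k = 0                            # degenerate range: a single segment
        else:
            k = int((value - lo) / width)
            k = max(0, min(k, M - 1))        # clamp so the upper boundary stays in range
        index.append(k)
    return tuple(index)


mins, maxs = (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)
print(grid_index((0.0, 5.0, 10.0), mins, maxs, M=5))  # (0, 2, 4)
print(5 ** 3)                                          # 125 hypercubes in total
```

Counting how many archive members share the same index tuple gives the per-segment density that drives both solution removal and leader selection.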
We employ a strategy in which non-dominated solutions are removed from the most crowded segments of the archive and leader selection is performed from the less crowded segments [45]. Both of these operations are probabilistic to avoid local optima in the search space. The solution density in each segment plays an important role in calculating these probabilities [47]. The more non-dominated solutions there are in a segment, the higher the probability of removing one solution from it and the lower the probability of choosing a leader from it. The probability of choosing the $i$-th segment to remove a solution is given by Eq. (40), where $N_i$ is the number of obtained Pareto-optimal solutions in the $i$-th segment. Note that Eq. (40) assigns a higher probability to a crowded segment. On the other hand, the probability of selecting a leader from the archive is calculated in the opposite manner. The roulette-wheel approach is used for the selection based on the likelihood for each hypercube [45], as expressed by Eq. (41). From Eq. (41), it is clear that a segment with fewer solutions has a higher probability of being chosen for leader selection.
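The two selection rules can be sketched with a small roulette-wheel example. It assumes removal weights proportional to $N_i$ and leader weights proportional to $1/N_i$, which matches the qualitative behavior described above; any constant factor cancels when the weights are normalized. The function names are hypothetical.

```python
import random


def selection_probabilities(counts, for_removal):
    """Normalize per-segment weights into probabilities.

    counts: dict mapping a segment id to its number of archived solutions N_i.
    for_removal: True weights crowded segments more, False weights them less.
    """
    weights = {seg: (float(n) if for_removal else 1.0 / n)
               for seg, n in counts.items() if n > 0}
    total = sum(weights.values())
    return {seg: w / total for seg, w in weights.items()}


def roulette_pick(probs, rng):
    """Pick one segment with probability proportional to its share of the wheel."""
    if not probs:
        raise ValueError("no occupied segments to choose from")
    r = rng.random()
    cumulative = 0.0
    chosen = None
    for seg, p in probs.items():
        chosen = seg
        cumulative += p
        if r <= cumulative:
            break
    return chosen  # the last segment absorbs any rounding error


counts = {(0, 0, 0): 4, (1, 0, 0): 1}
print(selection_probabilities(counts, for_removal=True))    # crowded segment favored
print(selection_probabilities(counts, for_removal=False))   # sparse segment favored
print(roulette_pick(selection_probabilities(counts, False), random.Random(1)))
```

With four solutions in one segment and one in the other, the crowded segment is chosen for removal about 80% of the time, while the sparse segment supplies the leader about 80% of the time.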

BMOGWO-based task execution
The steps of the BMOGWO-based task execution scheme of WOLVERINE are presented in Algorithm 3. First, we initialize the archive in line 2. Next, we initialize a population of random position vectors and calculate their fitness values in lines 4 and 5. The archive is populated with a set of non-dominated solutions generated using Algorithm 1 in line 7. Line 8 selects three different leaders using the grid mechanism. For each dimension of every wolf, the positions are updated in line 13. Parameters $a$, $\vec{A}$, and $\vec{C}$ are updated in line 16.
Next, we calculate the fitness values of the updated position vectors in line 18 and update the archive with the updated positions using Algorithm 1 in line 20.
Then, from the updated archive, three new leaders are selected using Eq. (41) in line 21. Lines 11-21 repeat until the maximum number of iterations $I_{max}$ is reached. Finally, the value of each entry $x_d$ of the best solution $\vec{\chi}_\alpha$ is assigned to the corresponding decision variable in lines 25-26, and the decision vector $X$ is returned.

Complexity analysis
In this section, we analyze the complexity of the three algorithms used in WOLVERINE.

Convergence analysis
In this section, we analyze the convergence of the developed WOLVERINE system, which is measured using Inverted Generational Distance (IGD). IGD is a metric used for assessing the quality of a set of solutions produced by an optimization algorithm, particularly in the context of multi-objective optimization. It measures the convergence and diversity of the obtained solutions with respect to the true Pareto front, which represents the optimal trade-off between conflicting objectives. The IGD metric calculates the average distance from each point in the true Pareto front to the nearest point in the obtained solution set. A lower IGD value indicates better convergence and diversity of the obtained solutions.
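A minimal sketch of the IGD computation is shown below, using the usual definition in which the distances are averaged over the points of the reference (true) Pareto front. Euclidean distance in objective space is assumed, and the function names are illustrative rather than taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two objective vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def igd(obtained, reference):
    """Average distance from each reference-front point to its nearest obtained point."""
    if not obtained or not reference:
        raise ValueError("both fronts must be non-empty")
    total = sum(min(euclidean(r, s) for s in obtained) for r in reference)
    return total / len(reference)


reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
far = [(1.0, 1.0)]
near = [(0.1, 1.0), (0.5, 0.6), (1.0, 0.1)]
print(igd(far, reference))    # larger value: poor coverage of the reference front
print(igd(near, reference))   # smaller value: close to the reference front
```

Averaging over the reference points is what lets the metric capture diversity as well as convergence: a solution set that clusters near one end of the front still leaves the remaining reference points far from any obtained solution.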
If the IGD value between the obtained Pareto front $\rho$ and the true Pareto front $\rho^*$ is $\mathrm{IGD}(\rho, \rho^*)$, then the convergence ratio (CR) $C$ can be defined in terms of $\rho_t$ and $\rho_{t+1}$, which denote the Pareto front after $t$ and $(t + 1)$ iterations, respectively.

Theorem 2
The convergence ratio $C$ of the developed BMOGWO-based WOLVERINE system is bounded by $g(P, \vec{C}, \vec{A}, \tau, U, E, t)$.

Proof
The proof proceeds by induction. We need to prove that the CR satisfies $C \le g(P, \vec{C}, \vec{A}, \tau, U, E, t)$. Here, $g(P, \vec{C}, \vec{A}, \tau, U, E, t)$ indicates the upper bound of the solution, where the solution of the algorithm is the farthest from the true Pareto front $\rho^*$. It is expressed through $d(\varrho, \varrho^*)$, which denotes the distance between two solutions $\varrho$ and $\varrho^*$ in the solution space. The IGD value of solution $\rho_t$ after iteration $t$ can be calculated similarly to [48].

Basis step
Let us assume that $\rho_0$ denotes the initial Pareto front approximation and that $\mathrm{IGD}(\rho_0, \rho^*)$ is the initial IGD value. Then, Eq. (43) can be modified accordingly, where $P_0$, $\vec{C}_0$, and $\vec{A}_0$ denote the initial population size, position vector, and coefficient vector, respectively. Equation (46) confirms that the induction hypothesis holds true for the base step.

Inductive step
Assume that the theorem holds up to the $t$-th iteration, i.e., $\mathrm{IGD}(\rho_t, \rho^*) \le g(P, \vec{C}, \vec{A}, \tau, U, E, t)$. Now, we need to express the improvement in performance from iteration $t$ to $t + 1$ in terms of $h(\cdot)$, which denotes the improvement function, as given in Eq. (47). This confirms that Eq. (43) holds true for all $t$ and that the convergence ratio $C$ of the developed WOLVERINE system is bounded by $g(P, \vec{C}, \vec{A}, \tau, U, E, t)$.

Performance evaluation
In this section, the performance of our proposed multi-objective task offloading approach with caching is compared with some of the existing strategies in the literature: MGBD [4], iRAF [17], and MOEA/D [19]. The work presented in [4] focuses on jointly addressing the content caching, computation offloading, and resource allocation problem to reduce users' overall task execution time. An AI-driven resource allocation framework (iRAF) has been developed in [17] to tackle intricate resource allocation problems by considering current network conditions and parameters to optimize either execution time or energy consumption. In a multi-user and multi-server task offloading environment, a tri-objective problem is addressed in [19], where time, device energy, and cost are optimized using a Multi-Objective Evolutionary Algorithm (MOEA/D). However, caching of the data code has not been considered in that work. The environmental setup, performance metrics, and results are discussed below.

Environmental setup
We have implemented our proposed algorithm and performed empirical numerical evaluation using Python 3.6.0 [49]. For evaluation purposes, we consider a scenario where a stationary edge server is centered in a 1000 × 1000 m² urban area. A number of collaborative edge servers are randomly located around the primary edge server, and several mobile devices are connected to the edge servers. The path loss model between the mobile devices and servers is assumed to follow a lognormal distribution. In addition, we model packet loss on each path using the Gilbert loss model [50], and the channels handle the re-transmission of lost packets using the TCP protocol. 20 channels are employed, each with a bandwidth of 2 MHz. Our study is focused on real-time, delay-sensitive, and computation-intensive applications, including interactive video gaming, AR/VR applications, medical image processing, and face recognition. The task arrival pattern follows a Poisson distribution. The whole experiment has been run 50 times, and the average of all these results is taken to plot each graph. Major environment setup parameters used in this paper are shown in Table 4. In our simulation environment, we have ensured that resources are allocated proportionately across the different systems. All the methods from the literature were implemented, and their performance metrics were collected, in a system environment consistent with ours.
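Two ingredients of this setup, bursty packet loss and Poisson task arrivals, can be sketched with the standard library alone. The sketch assumes the simple two-state Gilbert model in which every packet sent in the bad state is lost. The transition probabilities, arrival rate, and time horizon below are illustrative values, not the parameters used in the paper.

```python
import random


def gilbert_losses(n_packets, p_good_to_bad, p_bad_to_good, rng):
    """Simulate a two-state Gilbert channel; True marks a lost packet."""
    bad = False                        # the channel starts in the good state
    lost = []
    for _ in range(n_packets):
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        lost.append(bad)               # packets sent in the bad state are lost
    return lost


def poisson_arrival_times(rate, horizon, rng):
    """Arrival times of a Poisson process, built from exponential inter-arrival gaps."""
    t = 0.0
    times = []
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)


rng = random.Random(42)
losses = gilbert_losses(1000, p_good_to_bad=0.05, p_bad_to_good=0.5, rng=rng)
arrivals = poisson_arrival_times(rate=2.0, horizon=10.0, rng=rng)
print(sum(losses) / len(losses))       # empirical loss ratio
print(len(arrivals))                   # number of tasks arriving within the horizon
```

Unlike independent Bernoulli loss, the Gilbert chain produces runs of consecutive losses, which is what makes re-transmission delay relevant for delay-sensitive tasks.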

Performance metrics
We have measured the performance of our algorithm based on the following metrics:
• Average latency is defined as the ratio of the total delay experienced by the tasks to the number of tasks.
• Average energy consumption is the average amount of energy consumed by each edge device.
• Average cost savings is calculated as the difference between a device's budget and the monetary cost paid by it, divided by the number of tasks. A higher value indicates higher system performance.
• Task Completion Reliability (TCR) is the ratio of the number of tasks completed to the number of tasks submitted.
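The metric definitions above translate directly into a few helper functions. The sketch below is illustrative only: the function names and the sample numbers are assumptions, and units are left to the caller.

```python
def average_latency(task_delays):
    """Total delay experienced by the tasks divided by the number of tasks."""
    return sum(task_delays) / len(task_delays)


def average_energy(device_energies):
    """Mean energy consumed per edge device."""
    return sum(device_energies) / len(device_energies)


def average_cost_savings(budget, cost_paid, num_tasks):
    """(budget - monetary cost paid) divided by the number of tasks."""
    return (budget - cost_paid) / num_tasks


def task_completion_reliability(completed, submitted):
    """Ratio of completed tasks to submitted tasks."""
    return completed / submitted


print(average_latency([0.2, 0.4, 0.6]))         # mean delay over three tasks
print(average_cost_savings(100.0, 60.0, 8))     # 5.0 saved per task
print(task_completion_reliability(45, 50))      # 0.9
```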

Result analysis
In this section, we have discussed the performance of our proposed system by varying the number of tasks, the number of servers in the system, and the average computation power per task.

Impact of a varying number of tasks
In this experiment, we vary the number of tasks of the overall network system from 10 to 250 and keep the number of servers fixed at 12. The result and comparison are shown in Fig. 4. Figure 4(a) shows that as the number of tasks increases, the average latency also increases. Initially, latency increases slowly for a smaller number of tasks. However, as the number of tasks exceeds 160, latency increases exponentially. Latency is lower for MGBD and WOLVERINE than for iRAF because the former two implement caching. In the case of MOEA/D, the performance is close to that of WOLVERINE. A single mobile device user decomposes an application into multiple independent sub-tasks and offloads them to various servers, depending on resource availability. However, as the sub-tasks are executed in parallel, the total latency considered for completing a task is the maximum latency among the sub-tasks, and a risk of high delay remains in case the system reaches its saturation point. Besides, the absence of server-to-server collaboration makes it difficult to share sub-tasks. Our proposed WOLVERINE exploits both collaborative edge computing and caching. Therefore, if the required data for a specific task is not cached at a server or computational resources are not present, the server can pass the task to another collaborative server where the task data is already cached, which decreases the service delay significantly. Therefore, our proposed WOLVERINE outperforms the state-of-the-art approaches.
The impact of varying numbers of tasks on average energy consumption is depicted in Fig. 4(b). With the increasing number of tasks, the energy toll also increases because a large number of tasks need to share the same bandwidth and incur higher latency to reach the edge. Both WOLVERINE and MGBD perform better than iRAF because they exploit caching, which helps the system's users reduce backhaul latency and energy. However, the energy consumption gap between WOLVERINE and MGBD increases significantly when the number of tasks rises from 110 to 160 in the network, as MGBD needs to request the cloud for task processing owing to the unavailability of resources. For MOEA/D, a higher number of tasks means sub-tasks are executed in mobile devices more frequently, which increases overall energy consumption in the system. Besides, when sub-tasks are offloaded to multiple servers, the data code needs to be offloaded as well. Thus, the energy expended for offloading data code to edge servers also occurs as an overhead, and offloading tasks frequently also incurs some communication costs with the increasing number of sub-tasks.

Table 4 Evaluation parameters
The required CPU cycles to complete task: [6 × 10^9 - 9 × 10^10] Hz
The CPU-cycle frequency of MD: 300 MHz
The computation capability of edge servers:
The bandwidth of one channel: 2 MHz
Size of input data: [3 MB - 50 MB]
Number of iterations ($I_{max}$): 50
Population size: 50
For WOLVERINE, either the requested tasks are already cached at different servers or the data code for computing them is transmitted to the servers; that is, no access to the cloud is necessary, which reduces energy consumption. Besides, collaboration among servers facilitates lower energy consumption. In Fig. 4(c), we can observe the impact on average cost savings when the number of tasks is varied. As the number of tasks escalates, the average cost savings is reduced because the number of offloaded tasks increases, which in turn increases monetary costs for memory and computation. For iRAF and MGBD, with increasing tasks, the cost goes up faster than for WOLVERINE. The reason behind the increasing cost in the case of iRAF is the use of DNN and Monte Carlo Tree, which incur memory and computation costs. For MGBD, with an increasing number of tasks, device budget savings decrease due to the lack of collaboration among servers and offloading to the cloud in case of unavailability of server resources. In the case of MOEA/D, the cost is lower when the number of tasks is high, as many of those are locally executed. However, offloading to multiple servers from a single user device can incur higher costs in terms of memory and computation depending on the availability of server resources. The proposed WOLVERINE offers a higher percentage (50%-95%) of savings compared to MGBD and iRAF due to exploiting service caching, binary offloading, and collaboration among servers.
In Fig. 4(d), we see that increasing the number of tasks reduces Task Completion Reliability (TCR) in all of the methods. This happens due to the scarcity of resources and the delay sensitivity of tasks. For iRAF, the TCR falls steadily when the number of tasks increases from 10 to 110 but falls sharply with increasing tasks from 110 to 260. As iRAF allows partial offloading, with the increasing number of tasks, the tendency to offload the greater portion of a task also increases, which in turn raises the task drop rate. The higher task drop in iRAF is due to the higher training time of the DNN and Monte Carlo Tree, creating a latency overhead that may cause many applications to exceed their deadlines. For MGBD and WOLVERINE, the TCR falls gradually with an increasing number of tasks due to caching. However, a depth-first-search tree is constructed in MGBD, which incurs some overhead, resulting in missed deadlines for some tasks. Hence, its TCR is lower in comparison to WOLVERINE. For MOEA/D, with the gradually rising number of tasks, the drop rate of sub-tasks can increase due to the lack of resources and the higher queuing delay of mobile devices. Since collaboration among edge servers and caching support the task completion rate better, WOLVERINE performs better than MOEA/D in system environments that contain rapidly offloaded tasks.
Fig. 4 Impacts of varying number of tasks

Impact of a varying number of servers
The impact of varying numbers of servers on the objective parameters is represented in the graphs of Fig. 5. For this scenario, the number of tasks is fixed at 50.
For a fixed number of tasks, as the number of resources increases, the average latency decreases for all schemes, as shown in Fig. 5(a). iRAF has higher latency in comparison to both MGBD and WOLVERINE because of the higher computational time of the DNN and Monte Carlo Tree. In the case of MGBD, the construction of the search tree and the exhaustive searching procedure affect the overall latency. In the case of MOEA/D, a task is split into multiple sub-tasks, which incurs higher latency overhead for server-to-device communication, and it sometimes has difficulty finding the most suitable server for executing some sub-tasks. On the other hand, WOLVERINE exhibits better performance with the increasing number of servers, as it is a joint implementation of edge server collaboration and caching.
WOLVERINE also performs better in terms of energy consumption, as depicted in Fig. 5(b). With the increasing number of servers, the energy consumption of MDs is significantly reduced in all studies. In WOLVERINE, more tasks are offloaded to the edge servers when the number of collaborative edge servers increases along with the increasing availability of cached data. That is why, up to a certain increment in the number of servers, energy consumption reduces. After that, the energy level hits a plateau or does not decrease significantly as the amount of cached data and computational resources increase with the increasing number of servers.
With the increasing number of servers for all schemes, the cost of allocating tasks increases and the average cost savings decreases, as depicted in Fig. 5(c). In WOLVERINE, the cost of allocating tasks increases owing to the memory cost and the monetary cost for computation in various CoMEC servers. Nevertheless, the average savings remain greater than those of MGBD since the local computation of tasks also occurs here, which may incur no cost at all, and the cached resource size in MGBD is high along with the exhaustive search cost of DFS trees. On the other hand, iRAF involves a DNN and Monte Carlo Tree in its optimization algorithm, which occupy some extra memory. Therefore, its overall computation and memory cost is higher than that of our proposed method.
Fig. 5 Impacts of varying number of servers
The impact on TCR (Task Completion Reliability) for varying numbers of servers is demonstrated in Fig. 5(d). For WOLVERINE and MGBD, the TCR rises with the increasing number of servers due to the exploitation of caching. However, the content caching in MGBD faces some resource constraint issues for highly resource-intensive applications. On the other hand, the aforementioned issues for iRAF may cause task drops due to exceeded deadlines in this scheme. For MOEA/D, TCR is relatively stable in comparison to MGBD and iRAF, as sub-tasks are offloaded more frequently with increasing tasks in the system. However, it is still not as good as WOLVERINE due to the absence of server-to-server collaboration. For WOLVERINE, TCR improves due to incorporating caching as well as server collaboration.

Impact of caching
Caching the data code for computation-intensive tasks instead of the entire code itself has a certain impact on the objective parameters that are to be optimized, and the impact is depicted in Fig. 6. In this experiment, we varied the average computation per task while fixing the number of tasks and servers at 50 and 12, respectively.
Figure 6(a) indicates that if the average computation cycles per task increase, then average latency will increase exponentially without caching. Here, a considerable amount of time will be required for the computation of tasks along with the offloading of the tasks to collaborative servers. On the other hand, if caching is performed, the time required is lower because some of the data code is already available on the caching server; only the input data needs to be transmitted. Similar changes are observed for average energy consumption in Fig. 6(b), i.e., if the tasks are cached, less energy is wasted in communication overhead, which in turn reduces overall average energy consumption. Therefore, the experiments make it clear that service caching notably benefits task completion.

Impact of geographical proximity of users
In this experiment, the geographical area is varied in an edge computing environment to measure user service latency and energy consumption. A larger area results in a greater physical distance between edge servers and users, leading to extended transmission times and subsequently higher average latency. Additionally, expanded areas tend to experience heightened network congestion as a consequence of increased user traffic, exacerbating latency concerns. This congestion contributes to elevated communication overhead, necessitating higher transmission power and, consequently, increased energy consumption for devices. Furthermore, the scarcity of resources in an extended area often forces the devices to execute more tasks themselves, resulting in higher energy usage at the user end.
A gradual rise in average latency and device energy is observed for all the schemes shown in Figs. 7(a) and 7(b), respectively. However, for WOLVERINE, the increases in average latency and energy are significantly lower than for the rest of the schemes. Since both collaborative edge computing and caching are exploited in this scheme, service delay and energy consumption are lowered as increased area multiplies the chances of finding appropriate edge servers and cached resources. For the rest of the schemes, the drawbacks for a higher value of energy and latency can be attributed to transmission to cloud, task dependency, higher computation, memory

Ablation experiment
As a strategy to retain superior non-dominated solutions and to expand the exploration of a broader search space, the WOLVERINE system incorporates an adaptive grid mechanism. This technique facilitates leader selection and enhances the quality of solutions through probability-based elimination methods. To conduct an ablation experiment, we have adjusted the average number of computations per task, maintaining a fixed number of tasks and servers at 80 and 16, respectively. Subsequently, we have analyzed the effects on latency and energy consumption with and without the adaptive grid mechanism.
The graphs in Fig. 8 show that as the computation cycles per task increase, both average latency and average energy consumption experience an exponential rise when the adaptive grid mechanism is not utilized. Conversely, its inclusion leads to reduced latency and energy consumption. These gains are achieved through accelerated convergence and exploitation of the most effective solutions. The mechanism has also notably decreased the number of trial-and-error attempts.

Hypervolume and inverted generational distance
In this section, we measure the performance to evaluate the quality of Pareto-optimal solutions obtained by the developed WOLVERINE system.
In multi-objective evolutionary algorithms (MOEAs), hypervolume is a commonly used performance metric, which measures the volume of the objective space that is dominated by the solutions in the Pareto front approximation. The hypervolume indicator assesses the effectiveness of a given set of solutions by calculating the volume of the objective space that it covers. It provides a single scalar value that represents the spread and diversity of the Pareto front approximation. Higher hypervolume values indicate better coverage and a more comprehensive representation of the Pareto front.
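For small solution sets, the hypervolume can be computed exactly by slicing the objective space along the coordinates of the solutions, as in the sketch below. It assumes that all objectives are minimized and that the reference point is one that every counted solution improves on in each objective, which is the common convention in hypervolume software. The function name and sample points are illustrative, and the cell enumeration is only practical for a handful of points.

```python
import itertools


def hypervolume(points, ref):
    """Exact hypervolume of a small point set with respect to a reference point.

    points: objective vectors (all objectives minimized).
    ref: reference point; only points strictly better than ref in every
         objective contribute to the volume.
    """
    pts = [p for p in points if all(a < r for a, r in zip(p, ref))]
    if not pts:
        return 0.0
    dims = len(ref)
    # Sorted cut positions per axis: every point coordinate plus the reference.
    axes = [sorted({p[k] for p in pts} | {ref[k]}) for k in range(dims)]
    total = 0.0
    for cell in itertools.product(*(range(len(ax) - 1) for ax in axes)):
        lower = [axes[k][i] for k, i in enumerate(cell)]
        # A cell is covered when some point is no worse than its lower corner.
        if any(all(p[k] <= lower[k] for k in range(dims)) for p in pts):
            volume = 1.0
            for k, i in enumerate(cell):
                volume *= axes[k][i + 1] - axes[k][i]
            total += volume
    return total


print(hypervolume([(1.0, 1.0, 1.0)], (2.0, 2.0, 2.0)))       # 1.0
print(hypervolume([(1.0, 2.0), (2.0, 1.0)], (3.0, 3.0)))     # 3.0
```

In the second call, the two rectangles dominated by the points have area 2 each and overlap in a unit square, giving 2 + 2 - 1 = 3.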
In Fig. 9(a), the hypervolume region of the Pareto front is shown in a 3D graph; it is computed from the space covered by the non-dominated solutions relative to a predefined reference point. This reference point represents an ideal state without any necessary trade-offs between objectives and is marked in green in the graph for clarity. The shaded region, inclusive of the reference point, visually represents the hypervolume region, signifying the extent of the objective space covered by the set of non-dominated solutions. For the hypervolume in this case, we have considered 10 servers with 60 tasks, and the scalar value of the hypervolume is 24.59 after 50 iterations. This is the highest value obtained, and it remained steady after 50 iterations.
Fig. 7 Impacts of geographical proximity of users
Fig. 8 Impacts of adaptive grid mechanism
For convergence analysis of the developed WOLVERINE system, we have calculated IGD as a function of the number of iterations. Figure 9(b) illustrates how the IGD values change throughout the execution, indicating the performance and convergence of the algorithm. From this graph, we can observe that the IGD initially starts with a high value and gradually decreases throughout the first 50 iterations. Subsequently, it stabilizes at a particular value, indicating that convergence has been achieved. Higher values of IGD suggest that the solution set obtained for a certain number of iterations is not close to convergence. As the optimization algorithm explores more of the search space, lower values of IGD are obtained, signifying improved convergence and proximity to the true Pareto front. We have compared the IGD values of MOEA/D with those of WOLVERINE. It is observed that the IGD values for MOEA/D are higher than those of WOLVERINE for similar iteration numbers. The higher IGD values of MOEA/D reflect a poorer distribution of its Pareto front in comparison to WOLVERINE [45]. The value of IGD becomes stable after 50 iterations for WOLVERINE, whereas for MOEA/D the value stabilizes after 65 iterations and at a higher value than that of WOLVERINE. Thus, the graphical representations point towards better convergence of WOLVERINE in comparison to MOEA/D, indicating that WOLVERINE achieves convergence faster.

Conclusion
This paper introduced an efficient task offloading framework, namely WOLVERINE, that brings about collaboration among the edge servers to share computational resources while supporting real-time applications on edge devices with optimal energy consumption and resource cost. The multi-objective optimization problem was proven to be NP-hard; therefore, we formulated a Binary Multi-objective Grey Wolf Optimization-based meta-heuristic solution that derives the Pareto-optimal solutions for the time, energy, and cost objectives, i.e., the tri-objective optimization problem, in polynomial time. The performance analysis, carried out in Python, demonstrated significant performance improvements as high as 33.33%, 35%, and 40% in terms of execution latency, energy, and resource cost, respectively, compared to the state-of-the-art.
An improved version of GWO can be exploited in the developed system through dynamic weight association to multiple objectives and modification of the convergence factor. New scopes can be added by considering data loss, security of executed tasks, and so on. Deploying a deep-learning model to accurately predict the task arrival rate, allocate the tasks, and adjust the cache resources following that prediction is an interesting direction for future work. Furthermore, we can enhance our current framework by hybridizing different evolutionary algorithms to combine the strengths of these algorithms in a dynamic environment. Consideration of robustness and fault tolerance in case of points of failure would also add a new dimension to our current work.

Fig. 2 The structure of CoMEC network over fiber-wireless connection

$\gamma_j$: Per unit CPU-cycle cost of server $j \in E$
$\eta_j$: Per unit storage cost of server $j \in E$
$\vec{\chi}_w$: Position vector of wolf $w \in P$
$x_w^d$: Position of wolf $w \in P$ at the $d$-th dimension

cached resources for each task using the Resource Availability Database (Path 2) and propagates the task and resource data to the Optimal Task Allocator module (Path 3) for optimal resource allocation.
• Optimal Task Allocator is the core computational block of the PES service module. It collects the task's descriptions from the task profiler, queries the resource availability of the Collaborative Edge Servers (CES) to the Resource Availability Checker (Path 4) whose result comes through the Resource Availability Database (Path 5-6-7-8-9), formulates the WOLVERINE task offloading problem and communicates the associated task offloading decision vectors to the MDs.
• Resource Availability Database records the availability of the computational and cached resources of the CES that comes through the Communication Module and Resource Availability Checker (Path 6-7-8).
• Resource Availability Checker queries resources to other neighboring CESs and updates the cached and computational resources periodically or when triggered by the Optimal Task Allocator (Path 16).
• Task Execution Module executes the computational tasks offloaded to it by utilizing the available computational resources (Path 14) and cached data administered by Caching Management Module (Path 11-12-13).
• Caching Management Module supplies the cached data to the Task Execution Module from the Cached Data module (Path 12-13) and maintains the cached data repository by performing maintenance functions.
• Cached Data Repository stores the cached data code from the Computational Resource module for further use (Path 15).
• Computational Resources module stores the server's available resources, such as CPU cycle and memory, for usage by the Task Execution Module.
• Communication Module establishes collaboration

Algorithm 2 Algorithm for updating archive
Algorithm 3 BMOGWO based task offloading

Fig. 6 Impacts of caching on the performance

Table 2
Summary of targeted performance parameters

Table 3
Description of notations
Notation: Description
$U$: Set of mobile devices in the system
$\tau$, $E$: Set of tasks and set of servers, respectively

In Algorithm 2, Line 3 is enclosed within a loop that iterates $|A|$ times in the worst case. Line 8 requires $M^3$ time. The rest of the statements are of constant time complexity. Thus, the overall complexity of Algorithm 2 is $O(|A| + M^3)$. Next, we define the complexity of Algorithm 1. Lines 5-9 are enclosed within a loop that iterates $|A|$ times. Line 12 updates the archive using Algorithm 2, which takes $O(|A| + M^3)$. Lines 2-13 are also enclosed within a loop that iterates $|P|$ times. Hence, the computational complexity of Algorithm 1 is $O(|P| \times (|A| + M^3))$. Finally, we analyze the complexity of Algorithm 3. Lines 4 and 5 are enclosed within a loop that iterates $|P|$ times. Line 7 updates the archive, which requires $O(|P| \times (|A| + M^3))$ time. Line 13 is enclosed within a nested loop that iterates $|P| \times |\vec{\chi}|$ times. Line 18 is enclosed in another loop that iterates $|P|$ times. Line 20 again calls Algorithm 1. Lines 11-22 are also enclosed within a loop that iterates $I_{max}$ times. The rest of the algorithm takes constant time to run. Thus, the total computational complexity of Algorithm 3 is $O(M^3)$.

Table 4
Evaluation parameters