This section begins with a description of the testbed on which the evaluation was carried out. Next, the CEP pattern used in the tests and the details of load generation are specified. The evaluation itself starts with an analysis of the impact of different network technologies on the latency experienced by the end users of the system when receiving the generated events, depending on whether a fog or a cloud computing architecture is used. Subsequently, a stress test is performed on both architectures, measuring latency as a function of the number of alerts generated, and the section closes with an analysis of resource consumption in both architectures.
Testbed description
This subsection describes the main hardware and software components of the testbed developed for carrying out the experiments. The edge level of the testbed is deployed as a Python script that emulates 20 end-points and 2 gateways (10 end-points each), namely, the Source entity in “Latency analysis” section. For the Fog Node, a Raspberry Pi 3 Model B+ microcomputer has been used, with a 4-core 64-bit 1.4GHz processor, 1GB of LPDDR2 SDRAM and the Raspbian operating system (without Graphical User Interface).
In order to keep the environment under control (i.e., network latencies), the core level has been implemented on-premise using local resources. More precisely, the core level was implemented on an Intel Core i7 computer at 2.90GHz×8 with 8GB of RAM and a 1TB hard disk. The Final User is a Huawei P20 Lite smartphone running Android version 8. A basic Android application has been developed to receive the alarms from the CEP-Broker. As noted above, all the components have been deployed at different locations in Lima (Peru) and are interconnected through the public Internet.
The CEP engine used in this work is Apache Flink (version 1.8.0), an open-source framework for stateful computations over unbounded and bounded data streams. Two types of processes are created in the Apache Flink runtime. On the one hand, the Jobmanager, which runs 50 and 175 threads in Local and Global CEP respectively, is responsible for coordinating distributed execution, task assignment, fault management, etc. On the other hand, the Taskmanager, configured with 512MB, is responsible for executing on the data flow the tasks assigned by the Jobmanager. The configuration of these two types of processes was optimised to minimise the latency of alarm generation in our case study.
CEP pattern
This subsection presents the implemented CEP pattern used to analyse the incoming data, generate the events and issue the corresponding alarm notifications, as well as the event simulation used in the latency and performance study set out in the next section.
In the tests performed, a CEP engine has been deployed to process Closer-context events with a simple pattern. An alarm is generated provided that, at time instants t1 and t2, the values Vx(t1) and Vy(t2), received from different end-points x and y, exceed a set threshold Th; that is, Vx(t1)>Th & Vy(t2)>Th with t1<t2.
Thus, Fig. 7 shows an example of the simulation up to second 120 to clarify the alarm generation process. For our simulation a threshold Th=40 has been established. For an event to be generated, two consecutive data items V1(t1) and V2(t2) must arrive such that V1(t1)>40 & V2(t2)>40 with t1<t2.
The process is as follows (in this strict order):
1. At the beginning, the first data item V1(0)=40 reaches CEP and is discarded for not fulfilling the condition.
2. Upon arrival of the second data item V2(40)=41, it is stored because it fulfils the first part of the pattern.
3. At second 80 another data item V3(80)=42 arrives, so the complete pattern is fulfilled; the event is generated and this first case of alarm generation is closed. Likewise, this data item fulfils the first part of the pattern again, so a second case of event generation is opened.
4. At second 120, V4(120)=40 arrives; it does not match the pattern, so the second case is discarded for not complying with the established rule.
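The four steps above can be sketched in a few lines of Python (a simplified, hypothetical re-implementation for illustration only; the actual engine is Apache Flink CEP):

```python
# Minimal sketch of the alarm pattern: an alarm fires when two consecutive
# readings both exceed the threshold Th; a reading <= Th discards the
# partial match. Hypothetical helper, not the deployed Flink job.
Th = 40  # threshold used in the simulation

def run_pattern(stream, threshold=Th):
    """stream: list of (timestamp, value) pairs in arrival order.
    Returns the timestamps at which alarms are generated."""
    alarms = []
    pending = None  # value that matched the first part of the pattern
    for t, v in stream:
        if v > threshold:
            if pending is not None:
                alarms.append(t)  # pattern complete: V(t1)>Th and V(t2)>Th
            pending = v           # this value also opens a new case
        else:
            pending = None        # partial match discarded
    return alarms

# The example of Fig. 7: only V3(80)=42 completes the pattern.
print(run_pattern([(0, 40), (40, 41), (80, 42), (120, 40)]))  # [80]
```

Note that, as in step 3 of the example, a matching value both closes the current case and opens the next one.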
It is important to note that the number of alarms can be increased by publishing more messages in shorter time intervals, which allows the maximum number of alarms per minute to be controlled. Therefore, all tests consisted of 10-minute simulations in which a controlled number of alerts was generated every minute in an equidistant manner; that is, 10 tests were carried out, each generating the same number of alerts every minute. For example, in a first round a consumption test was performed generating 200 alarms per minute for 10 minutes; once the services were restarted, a load of 400 alarms per minute was applied for 10 minutes, and the services were restarted again.
For this work a maximum limit of 800 alarms/min has been established, since generating more alarms created a bottleneck in the Fog Node and events began to be lost. To this end, 20 end-points are emulated and a total of 1600 data items per minute are sent, that is, 80 data items per end-point. Note that the load applied to the system is the same for all tests, varying only the number of alarms; therefore, the network bandwidth used from Source is always the same.
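The load generation just described can be sketched as follows (a hypothetical helper that only computes the send schedule, not the actual Source script):

```python
# Sketch of the load generation: 20 emulated end-points each send 80 data
# items per minute, equidistantly spaced, for a 10-minute simulation
# (1600 data items per minute in total). Illustrative only.
END_POINTS = 20
ITEMS_PER_ENDPOINT_MIN = 80   # 20 * 80 = 1600 data items/min
DURATION_MIN = 10

def schedule(end_points=END_POINTS, per_min=ITEMS_PER_ENDPOINT_MIN,
             minutes=DURATION_MIN):
    """Returns (endpoint_id, send_time_in_seconds) tuples for the run."""
    step = 60.0 / per_min  # equidistant spacing within each minute
    events = []
    for m in range(minutes):
        for ep in range(end_points):
            for i in range(per_min):
                events.append((ep, m * 60 + i * step))
    return events

events = schedule()
print(len(events))  # 16000 data items over the 10-minute simulation
```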
Finally, it should be noted that each of the following results was obtained from 30 test repetitions to ensure accuracy; mean values are reported.
Influence of network technology on the latency
A key aspect of the proposed architecture is the network technology used by the Final Users (see Fig. 6). These can be connected to the Fog Node through WAN or LAN networks, depending on the location of the Final User, and performance depends on the technology used. Thus, in this section we evaluate the impact on latency of three of the most widely used technologies: 3G, 4G and WiFi.
The testbed described in “Testbed description” section has been deployed considering 3 different Final Users, all of them subscribed to the Local Broker: (i) one is subscribed via WiFi (it is within the wireless LAN coverage area); and (ii) the other two are subscribed through the 3G and 4G telephone networks, respectively (WAN connections).
Hence, Fig. 8 shows the results of comparing the different connections to the Broker for a load following the pattern described in the previous subsection, with a total of 800 alarms/min. As expected, a user on the same LAN as the Fog Node (WiFi connection) receives the alert in less time than one connected by 3G or 4G, although 4G is very close to WiFi. One of the strengths of 4G is the speed and stability of the signal with respect to 3G, which, as can be seen, exhibits a more pronounced variance than 4G [36].
Thus, this study shows that the fog computing approach allows recipients within the coverage area of the Fog Node to receive the alarm with a significantly lower latency than recipients connected through the telephone network. It should be noted that with a cloud computing approach, recipients can only receive the alert from the core level; the additional latency incurred may be harmful for a wide range of applications.
Cloud vs. fog: latency evaluation with stress workload
Since the 4G telephone network offers stable results and good latency performance, it is the network used to send alarms to the Final User in the remaining experiments. In addition, as we will see in this section, the latency study should be extended to determine whether latency is reduced when generating Local Events (fog computing) rather than Global Events (cloud computing). Thus, in this case, on which the subsequent performance study is also based, we compare the latency of both architectures for a controlled number of generated alarms: specifically 200, 400, 600 and 800 alarms/min. Equation 1 has been used to calculate total latency (see “Latency analysis” section). In all cases, averaged latency values are shown.
In this context, Fig. 9 shows that using a fog computing architecture reduces latency considerably; that is, the notification of an event reaches the Final Users earlier than in a cloud computing architecture.
In this case, latency exceeds one second for cloud computing, and it shows a growing linear trend with a steep slope. Fog computing also presents a linear trend, although with a much gentler slope; that is, it almost maintains a constant value. Therefore, the latency in fog computing, in addition to being lower than in the cloud computing architecture, is more stable, independently of the assigned load.
In this context, the following test aims to determine which element of the architecture has the greatest impact on latency. For these results, the definition of latency and how it is obtained in each sector must be taken into account (see “Latency analysis” section). In particular, the latencies of the three sectors are shown: (i) L1, the time between the Source entity and the data being analysed by CEP (see Equation 2); (ii) TCEP, the time from when a data item is analysed until the event is generated (see Equation 3); and (iii) L2, the time from when the event becomes an alarm until it reaches the Final User (see Equation 4).
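Under this decomposition, the total latency of Equation 1 is simply the sum of the three sector latencies; a trivial sketch with illustrative, non-measured values:

```python
# Total end-to-end latency as the sum of the three sectors defined above
# (consistent with Equation 1). The numbers below are placeholders for
# illustration, not measured results.
def total_latency(l1_ms, t_cep_ms, l2_ms):
    """Latency = L1 (Source -> CEP) + TCEP (analysis) + L2 (alarm -> Final User)."""
    return l1_ms + t_cep_ms + l2_ms

print(total_latency(120.0, 5.0, 60.0))  # 185.0 ms (illustrative values)
```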
Therefore, Fig. 10 shows the average latency data, broken down by the sectors indicated above. It can be seen that in both architectures the element that contributes most to latency is the MQTT Broker, in both communication phases.
Taking into account the times obtained in the study of latency in Fig. 10 we can draw the following conclusions by sector:
- L1 (see Fig. 10a): In this sector, the cloud computing architecture shows a growing trend: the more alarms per minute, the higher the latency L1. In the case of fog computing, the latency is constant and independent of the load, once a considerable set of events has been analysed. Note that this parameter includes both the network transmission time and the work done by the MQTT Broker. In both architectures the communication latency has a low variance (see Fig. 8, 4G connection), so the variation observed in the latency values of the figure is due to the initialisation behaviour of the MQTT Broker as the first data arrives from Source.
- TCEP (see Fig. 10b): The first behaviour to note is that the time observed in fog computing is slightly longer than in cloud computing, because resources in the Fog Node are scarcer than in the cloud. In any case, the times in both architectures are very similar and practically constant in this sector and, therefore, not very significant.
- L2 (see Fig. 10c): For this sector, unlike L1, both architectures show a trend that is constant and independent of the number of alarms, because the Broker service is already initialised and only has to distribute the alarms to the Final User. On the other hand, the latency of the cloud computing architecture is more than twice that obtained by the fog computing architecture.
In summary, the growing trend in cloud computing (see Fig. 9) is due to the time spent in the L1 sector. Furthermore, an important observation is that the MQTT Broker is a critical latency point, while CEP performs the data analysis with minimal latency.
Finally, latency is not the only important aspect to evaluate in both architectures; the distribution of computational resources must also be assessed.
Cost analysis: use of resources
In this section we continue with the stress test developed for latency, but now analysing the computational consumption of a fog computing architecture with respect to a cloud computing one. See Fig. 5 to recall the workflow of both architectures, analysing the distribution of resources at the core and edge levels.
Case 1: core level
In the following results, the measurements have been taken at the core level, that is, in the central server (Cloud). The purpose of this test is to measure the computational consumption at the core level under each of the architectures being evaluated. To this end, the Perf tool [37, 38] has been used to measure the energy consumed, in Joules per millisecond (J/msec), and the Linux top tool to obtain the percentage of CPU and RAM consumed (see Fig. 11).
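The energy normalisation can be sketched as follows (an assumed workflow based on the tools named above: `perf stat -e power/energy-pkg/ -a` reports Joules over a measurement window, which is then divided by the window length in milliseconds to obtain J/msec):

```python
import re

# Hypothetical helper: extract the Joules counter from `perf stat` text
# output and normalise it to Joules per millisecond, as plotted in Fig. 11c.
def joules_per_msec(perf_output, window_seconds):
    """perf_output: text produced by `perf stat -e power/energy-pkg/`.
    window_seconds: length of the measurement window."""
    match = re.search(r"([\d.,]+)\s+Joules", perf_output)
    joules = float(match.group(1).replace(",", ""))
    return joules / (window_seconds * 1000.0)

# Illustrative (not measured) perf output for a 60-second window:
sample = "      306.00 Joules power/energy-pkg/"
print(joules_per_msec(sample, 60))  # 0.0051 J/msec
```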
At first sight, cloud computing shows a higher computational consumption in all measured values; thus, using a fog computing architecture considerably reduces resource consumption in the Cloud. The evaluated metrics are detailed below:
- As for average CPU consumption (in %), see Fig. 11a, it has not been excessive in either architecture, since the events sent do not involve complex mathematical operations that stress the CPU, but simple comparisons. When cloud computing is used, CPU consumption is at most 1% higher than in fog computing, a negligible increase.
- Regarding RAM consumption (in %), see Fig. 11b, the results are more interesting. Merely activating the CEP engine and the Broker represents a 35% increase in memory consumption, since CEP buffers the data it analyses and the Broker distributes the alarms from RAM. In contrast, in the fog computing case the value is very low, since the Broker and CEP services are not active at the core level.
- In terms of energy, see Fig. 11c, an average reduction of 69% is observed in favour of fog computing with respect to cloud computing, without reaching high values in either case. This is a consequence of the lower CPU and RAM usage.
It has thus been verified that the use of fog computing offloads work from the core level. This is an additional benefit of fog computing architectures (distributing resources across the different distributed devices), one that will be more noticeable the more sophisticated the processing to be performed on the data.
Case 2: edge level
To obtain the computational consumption at the edge level, when using a Raspberry Pi as Fog Node, only the CPU and RAM consumption (in %) could be obtained, because the Perf tool is not available for ARM processors. The measurements taken are shown in Fig. 12. Note that the scales used in the graphs differ from those of Case 1 (see Fig. 11). However, the maximum energy consumption of a Raspberry Pi board at maximum load (i.e., the worst case) is reported to be 5.1W [39], that is, 0.0051J/msec, which is negligible compared to the energy consumed at the core level.
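The worst-case figure quoted above follows from a simple unit conversion, sketched here for completeness (a watt is one Joule per second, hence W/1000 Joules per millisecond):

```python
# Unit conversion behind the 5.1W worst-case figure cited above:
# 1 W = 1 J/s = 0.001 J/msec.
def watts_to_joules_per_msec(watts):
    return watts / 1000.0

print(watts_to_joules_per_msec(5.1))  # 0.0051 J/msec, as stated in the text
```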
At first glance, with fog computing there is a greater consumption of resources in the Fog Node. This point is interesting and corroborates that the decrease in computational resource consumption at the core level implies a redistribution of resource usage towards the edge level. The evaluated metrics are detailed below:
- Regarding CPU consumption (in %), see Fig. 12a, both architectures show a linear behaviour with the number of alarms processed per minute, although the slope obtained in the fog computing architecture is much steeper, reaching a consumption of 20% compared to 6% for cloud computing at 800 alarms/min. This is due to two facts: (i) the limited performance of low-cost devices, such as the Raspberry Pi of our testbed; and (ii) the workload, since the Fog Node runs not only the CEP engine but also the Broker, which must perform a double publication. However, the result obtained at this point is key: low-cost devices (less than US$40 per device) can be used to analyse data from IoT applications with real-time requirements using a CEP engine without overloading the system.
- Regarding RAM consumption (in %), see Fig. 12b, the CEP engine and the Broker consume up to 80% of the RAM in the Fog Node, that is, almost 70% more than in the cloud computing model, since in the latter case the Fog Node is a passive element. As at the core level, CEP performs the event analysis and the Broker distributes the alarms from RAM. A key aspect that certifies the feasibility of using low-cost devices is that the percentage of memory in use is constant and independent of the number of alarms generated.
In summary, assigning tasks to the edge level with CEP and the Broker distributes work to the Fog Nodes, while the core level bears a much lower load. This is an interesting result: in addition to harnessing the computing power available at the edge level, it shortens response times to the end user, which in turn enables the deployment of a large number of IoT applications with real-time requirements.