Simcan2Cloud: a discrete-event-based simulator for modelling and simulating cloud computing infrastructures

Cañizares, Pablo C.; Núñez, Alberto; Bernal, Adrián; Cambronero, M. Emilia; Barker, Adam

doi:10.1186/s13677-023-00511-w

Research
Open access
Published: 18 September 2023

Journal of Cloud Computing volume 12, Article number: 133 (2023)

Abstract

Cloud computing is an evolving paradigm whose adoption has been increasing over the last few years. This has led to the growth of the cloud computing market, together with fierce competition for the leading market share and an increase in the number of cloud service providers. Novel techniques are continuously being proposed to increase the cloud service provider’s profitability. However, only those techniques that are proven not to hinder the service agreements are considered for production clouds. Analysing the expected behaviour and performance of the cloud infrastructure is challenging, as the repeatability and reproducibility of experiments on these systems are made difficult by the large number of users concurrently accessing the infrastructure. To this must be added the complications of using different provisioning policies, managing several workloads, and applying different resource configurations. Therefore, in order to alleviate these issues, we present Simcan2Cloud, a discrete-event-based simulator for modelling and simulating cloud computing environments. Simcan2Cloud focuses on modelling and simulating the behaviour of the cloud provider with a high level of detail, where both the cloud infrastructure and the interactions of the users with the cloud are integrated in the simulated scenarios. For this purpose, Simcan2Cloud supports different resource allocation policies, service level agreements (SLAs), and an intuitive and complete API for including new management policies. Finally, a thorough experimental study to measure the suitability and applicability of Simcan2Cloud, using both real-world traces and synthetic workloads, is presented.

Introduction

Over the last few years, cloud computing has become a reference for on-demand computing. Its high level of flexibility and security, together with its cost savings, has led companies to use this computing paradigm for the provision of the services they require. According to the RightScale 2019 State of the Cloud Report [1], 94% of enterprises use at least one cloud service, and spending on such services reached $227.8 billion. In order to satisfy this demand, there exist several cloud service providers, such as Amazon Web Services (AWS), Azure, Google Cloud, VMware Cloud, and Oracle Cloud Infrastructure, among others.

Market competition has led service providers to seek elements of differentiation, such as performance, quality of service, and cost. Thus, one of the main goals of cloud providers is to achieve a good balance between system performance and usage of computational resources while maintaining profits. However, achieving a balanced architecture that accomplishes this goal is challenging. Consider an emerging company that provides cloud services: considerable growth in the number of users accessing the services may cause system bottlenecks, which in turn may reduce profits through the loss of users. The main idea is to provide the data-centre with adequate computational resources to serve the incoming users, avoiding both overdimensioning and underdimensioning the system.

In order to obtain a good cost-performance ratio it is necessary to perform a thorough analysis of the cloud when processing different workloads, which allows the provider to properly configure the different cloud parameters, such as virtual machines (VMs), resource allocation policies, and the cost of each VM offered [2]. A misconfigured cloud environment may lead to poor overall performance, which will have a significant impact on the quality of service and, consequently, compromise the reputation of the company.

Unfortunately, carrying out an experimental analysis on production-ready environments is complex, expensive and, in some cases, not possible due to the necessity of having dedicated access to the system. Furthermore, applying configuration changes in a production system, such as adding more machines, replacing computational resources, or setting a new network topology, may affect the behaviour of the system.

In the last ten years, researchers have tackled these issues by using simulation techniques [3,4,5,6]. The main features of these techniques allow the creation of simulation tools that are appropriate for modelling, analysing and studying complex systems. In essence, a simulator uses an abstraction of the system under study, namely a model, to imitate its behaviour by representing its most relevant features. Among the most important advantages provided by simulation, we can highlight the following ones related to cloud systems: (i) The system under study is not required to execute the simulations. In general, simulators can be run on a regular computer; (ii) Experiments can be easily reproduced in a simulated environment. In most cases, there exists a large number of inter-related parameters and variables that cannot be controlled on a real-world production system, such as the users accessing the system concurrently, thus making the repeatability of the experiments impossible [7]. However, simulation allows us to reproduce the same experiment in a controllable way; (iii) Experiments can be run in parallel, improving performance without requiring specific hardware resources [8, 9]. Thus, simulations can be run on a standard desktop, using the available CPU cores, or, in order to significantly increase the number of simulations executed in parallel, on a computer cluster; and (iv) Simulation provides more flexibility when applying changes to the configuration settings. While modifying the configuration of a cloud system is a time-consuming and expensive task, simulation only requires us to modify the configuration of the model by setting up the correct parameters, such as the network topology or the resource allocation policy.
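To make the reproducibility argument in (ii) concrete, the following is a minimal sketch of a discrete-event loop in Python. It is not taken from Simcan2Cloud: the function `simulate`, its parameters, and the toy model (users arriving at random times and renting one of a fixed number of VMs for a fixed time, waiting in FIFO order otherwise) are assumptions made purely for illustration. The point it shows is that, once all randomness flows through a seeded generator, the same seed yields exactly the same run.

```python
import heapq
import random
from collections import deque


def simulate(seed, n_users=5, n_vms=2, mean_interarrival=1.0, rental_time=2.5):
    """Toy discrete-event simulation (hypothetical model, for illustration only).

    Users arrive at random times and rent one VM for `rental_time`;
    if no VM is free they wait in a FIFO queue.
    Returns a dict mapping each user to the time its rental finished.
    """
    rng = random.Random(seed)  # private generator: same seed, same run
    events = []                # heap of (time, sequence number, kind, user)
    clock = 0.0
    for user in range(n_users):
        clock += rng.expovariate(1.0 / mean_interarrival)
        heapq.heappush(events, (clock, user, "arrival", user))
    seq = n_users              # unique tie-breaker for events pushed later
    free_vms = n_vms
    waiting = deque()
    finish_times = {}

    while events:
        now, _, kind, user = heapq.heappop(events)  # next event in time order
        if kind == "arrival" and free_vms == 0:
            waiting.append(user)                    # no VM free: user waits
            continue
        if kind == "arrival":
            free_vms -= 1
            starting = user
        else:                                       # departure
            finish_times[user] = now
            if not waiting:
                free_vms += 1                       # VM becomes idle
                continue
            starting = waiting.popleft()            # hand the VM to the next user
        heapq.heappush(events, (now + rental_time, seq, "departure", starting))
        seq += 1
    return finish_times
```

Calling `simulate(7)` twice returns identical results, whereas on a production system the arrival pattern of concurrent users could not be replayed in this way. Changing a configuration setting, as in advantage (iv), amounts to passing a different argument such as `n_vms=4`.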

Currently, there exists a broad spectrum of simulation platforms for modelling cloud computing systems. However, most cloud simulators focus on representing the behaviour of the system from the users’ perspective, and do not consider the cloud provider part. For instance, DISSECT-CF [10] is considered one of the most relevant cloud computing simulators. However, different aspects related to the cloud provider, such as allocation policies, user management, and costs, are not taken into consideration. Additionally, there exist several proposals focused on different cloud provider aspects, such as pricing features [11,12,13], cloud deployments [14], modelling resources [15], and services offered by the cloud provider [16]. Nevertheless, these works are not targeted at considering the underlying hardware of a cloud platform.

To the best of our knowledge, there are few simulation platforms aimed at describing the cloud provider, with a reasonable level of detail, while considering the infrastructure support. In these terms, CloudSim [8] offers several policies for the management of the available cloud resources, supporting different host selection strategies, service deployment, and VM provisioning. However, the resources of the cloud infrastructure, and both the management and the behaviour of the users, are not particularly detailed. In order to overcome these issues, we present Simcan2Cloud, a discrete-event-based framework for modelling and simulating cloud systems. Simcan2Cloud mainly focuses on the cloud provider, supporting the modelling of cloud infrastructures and the interaction of the users with the cloud. In addition, for analysing how Simcan2Cloud is aligned with the real world, the platform includes a trace representation module that allows the execution of real-world traces collected from production-ready systems. Thus, we can compare the simulated system with the real (target) system to find potential inconsistencies. Below, we highlight the most relevant and novel features of our proposed simulation platform:

1. Flexible SLAs. Simcan2Cloud considers different SLA definitions in cloud computing environments. Hence, the requested resources are allocated to the users according to the different parameters established in the SLA: availability of the resources, rental time, and a configurable cost model that covers several aspects, such as discounts for delays, an extra cost for additional time, and compensation for unavailability.
2. A cloud provider waiting queue. In terms of user management, the platform provides a queue system to handle users upon their arrival in the cloud. This mechanism enables users to wait for the requested service, by subscribing to the system, instead of leaving the system immediately.
3. Priority users. In order to enrich the behaviour of the system, the platform supports the management of users with different priority levels. Hence, high-priority users are not required to wait in the cloud provider queue, since their requested resources are allocated on reserved machines, which are exclusively dedicated to these users.
4. A renting time extension offer. With respect to the service rentals, the platform supports extending the rental time of the VMs when some services are still running and the rental time of the requested resources has expired. This feature is designed to cover a common behaviour in cloud environments.
5. Resource usage. This platform includes a module for monitoring the usage of the computational resources at the data-centre, such as CPUs, RAM and storage. This feature enables users to analyse usage patterns and detect disruptive behaviours in these key subsystems.
6. An API to easily include new management policies. Simcan2Cloud supports different scheduling policies for resource allocation. The cloud provider can select the most appropriate algorithms for maximising both the percentage of resource usage and the cloud provider profits. In addition, the platform provides templates to facilitate the creation of custom scheduling policies and user behaviours.
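As an illustration of how features 1 to 4 could fit together, the following Python sketch encodes one possible cost model and admission rule. It is an assumption-laden toy and not the Simcan2Cloud API: the names `SlaTerms`, `invoice` and `admit`, the way each adjustment is expressed as a fraction of the base price, and the fallback of priority users to regular machines are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTerms:
    """Hypothetical SLA parameters; every field name is an assumption."""
    price_per_hour: float               # base rental price of the VM
    delay_discount: float               # fraction of the base cost discounted after a wait
    extension_surcharge: float          # fractional extra cost per hour of extended rental
    unavailability_compensation: float  # fraction of the base cost paid back if not served


def invoice(terms, rented_hours, extra_hours=0.0, waited=False, served=True):
    """Amount charged to the user (negative: the provider compensates the user)."""
    base = terms.price_per_hour * rented_hours
    if not served:
        # Resources could not be provided: compensation for unavailability.
        return -terms.unavailability_compensation * base
    total = base
    if waited:
        # The user waited in the provider queue: discount for the delay.
        total -= terms.delay_discount * base
    # Extended rental time is billed at a higher hourly rate.
    total += extra_hours * terms.price_per_hour * (1.0 + terms.extension_surcharge)
    return total


def admit(is_priority, free_regular, free_reserved):
    """Decide where an arriving user goes (assumed rule, for illustration)."""
    if is_priority and free_reserved > 0:
        return "reserved"   # priority users use the dedicated machines
    if free_regular > 0:
        return "regular"
    return "queued"         # the user subscribes and waits for resources
```

For example, with `SlaTerms(price_per_hour=2.0, delay_discount=0.1, extension_surcharge=0.5, unavailability_compensation=0.2)`, a 10-hour rental costs 20.0, the same rental after waiting in the queue costs 18.0, and adding a 2-hour extension to the undelayed rental raises the bill to 26.0.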

This paper is organised as follows. Firstly, Section “Related work” introduces and analyses the state of the art of cloud computing simulators. Section “Simcan2Cloud” presents the architecture and the implementation details of Simcan2Cloud. Then, we present an empirical study in Section “Empirical study”, in which the performance of Simcan2Cloud is analysed and discussed. Finally, Section “Conclusions and future work” contains our conclusions and some lines for future work.

Related work

In the last few decades, simulation techniques have been adopted by the research community as a valuable way to study and analyse cloud computing environments. As a result, a significant number of cloud computing simulators have appeared in the literature [6, 17,18,19,20,21]. The noticeable growth in the state-of-the-art surveys – from an average of 10 in 2012 [21] to up to 30 in 2020 [6] – is a clear indicator of the increasing interest in designing cloud simulators.

Cloud computing simulators

In the current literature, we found several simulation platforms focused on the cloud provider. The CloudNetSim++ simulator [22] is a cloud simulator, built on OMNeT++, that uses the INET Framework to model a complete network layer. This simulator allows users to describe SLA policies, scheduling algorithms, and billing costs, and offers the built-in OMNeT++ user interface. Thus, users must learn the basics of the OMNeT++ environment to create cloud scenarios. Another proposal is the Data Centre Simulator (DCSim), a simulation framework for modelling and simulating data centre infrastructures [23]. In general terms, DCSim focuses on the IaaS layer, which is used for providing services to multiple clients. It is also worth mentioning that DCSim supports the modelling of cost and SLAs. DISSECT-CF is a simulation platform focused on modelling resource sharing and the cloud infrastructure with a high level of detail [10]. This approach presents quite a detailed IaaS stack simulation and supports energy-aware techniques for cloud infrastructures, hence allowing the inclusion of new metrics for analysing different resources. SCORE [24] is a simulator based on Google Omega and written in Scala. SCORE simulates parallel scheduling, energy consumption, and synthetic workloads, and offers mechanisms for shutting down and powering on computational nodes. In the same line, SCORE-GAME is an extension of SCORE that includes an energy scheduling policy based on the Stackelberg game [25]. The model of this simulator includes two roles, namely the Scheduling Manager and the Energy-Efficiency Manager. The former processes the tasks as quickly as possible, while the latter is targeted at minimising the overall energy consumption. In this way, this proposal is based on a competition between those roles, where the main goal is to balance the trade-offs between energy consumption and performance.

iCanCloud is a simulation platform built on the OMNeT++ framework [26]. In essence, this simulator represents the behaviour of cloud systems by modelling the physical machines supporting the cloud, the configuration of the VMs provided and different resource allocation policies. Additionally, the E-mc² framework [9] has been developed to include support for measuring the energy consumption of the different hardware components of the system, such as the memories, the CPUs or the disk drives. Thus, iCanCloud can be used to estimate the trade-offs between cost and performance in a wide range of cloud scenarios.

The surveys of cloud simulation tools found in the current literature claim that CloudSim [8] is one of the leading cloud simulators [6]. CloudSim uses SimJava as the simulation core and allows the modelling of hosts in data centres, virtual machines, user tasks, and resource provisioning policies. CloudSim focuses on service broker scheduling algorithms and implements space-shared and time-shared allocation policies to manage computing resources. This simulator provides a limited network model, as it only considers transmission delays and lacks a realistic network topology. Since several researchers found limitations in carrying out experiments with CloudSim in their areas of study, the research community has extended the capabilities of the tool by implementing other simulators based on CloudSim. Among such simulators extending CloudSim functionalities, we can highlight NetworkCloudSim, CloudAnalyst, CDOSim, WorkflowSim, CloudExp, and UCloud.

NetworkCloudSim [27] improves the network layer of CloudSim by implementing switches at several aggregation levels and providing communication models with different levels of detail. These implementations allow developers to model parallel applications. CloudAnalyst [28] provides a GUI for CloudSim, presenting geographical factors that allow the configuration of user and data centre locations. Basically, the location feature enables the simulator to calculate the response and processing time of the requests. CDOSim [29] simulates the cost and performance characteristics of cloud deployment scenarios, and allows developers to model delays and SLA violations, helping them to choose a deployment strategy. Although this simulator implements VM migration, it still inherits a limited network model from CloudSim.

WorkflowSim [30] introduces the modelling of scientific workflows in a cloud environment and job clustering, which allows researchers to study the impact of job failures on workflows. This simulator is not suitable for data-intensive applications, since it does not model the performance of the storage system. The simulator CloudExp [31] improves CloudSim by including complex network models and a Map-Reduce processing model. CloudExp offers SLA definition based on measurable terms, and also supports a workload generator toolkit to model real workloads. One of the main weaknesses of CloudExp is the static model for measuring the performance of the VMs. UCloud [32] is a hybrid cloud simulator for university environments, focusing on scenarios that require the services of public clouds when the private cloud is full. In addition, UCloud implements performance monitoring, university activities, and security management, as well as considering the cost of using the public cloud, but not the cost of the data centre communications.
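The space-shared and time-shared allocation policies mentioned above for CloudSim can be contrasted with a small worked example. The Python sketch below is not CloudSim code (CloudSim is a Java framework) and makes simplifying assumptions: a single core, all tasks submitted at time zero, task lengths given in millions of instructions, and an ideal equal split of the core under time sharing. The function names are hypothetical.

```python
def space_shared_finish_times(lengths, mips):
    """Space-shared: each task gets the whole core and runs to completion,
    one task after another in submission order."""
    finish, clock = [], 0.0
    for length in lengths:
        clock += length / mips
        finish.append(clock)
    return finish


def time_shared_finish_times(lengths, mips):
    """Time-shared: all unfinished tasks share the core equally
    (idealised processor sharing, every task starting at time zero)."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    finish = [0.0] * len(lengths)
    clock = 0.0
    done = 0.0                 # work already completed by every unfinished task
    remaining = len(lengths)   # number of tasks still sharing the core
    for i in order:
        # Each unfinished task advances at mips / remaining, so the shortest
        # one needs (length - done) * remaining / mips more time units.
        clock += (lengths[i] - done) * remaining / mips
        done = lengths[i]
        finish[i] = clock
        remaining -= 1
    return finish
```

With task lengths `[100, 200, 300]` on a core of 100 MIPS, the space-shared policy finishes the tasks at times 1, 3 and 6, whereas the time-shared policy finishes them at 3, 5 and 6. Both policies complete all the work at the same instant, but they distribute the waiting very differently among the tasks, which is why the choice of policy matters when SLAs are evaluated.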

Comparison of Simcan2Cloud and SoTA solutions

In this section, we present a comprehensive comparison between Simcan2Cloud and some of the well-known cloud simulators. It is important to highlight the effort and time invested by the research community to create and maintain a broad spectrum of simulation tools, a fact that has led to the existence of a wide range of cloud simulators. In order to choose those simulators that have been widely adopted by the community, we have carefully analysed papers available in the current literature, surveys – such as those of [6] and [17] – and public repositories.

Table 1 Comparison of the main cloud computing simulators existing in the literature
