Cloud resource orchestration in the multi-cloud landscape: a systematic review of existing frameworks

The number of both service providers operating in the cloud market and customers consuming cloud-based services is constantly increasing, proving that the cloud computing paradigm has successfully delivered its potential. Nevertheless, the unceasing growth of the cloud market poses hard challenges to its participants. On the provider side, the capability of orchestrating resources in order to maximise profits without disappointing customers’ expectations is a matter of concern. On the customer side, the efficient selection of resources from a plethora of similar services advertised by a multitude of providers is an open question. In such a multi-cloud landscape, several research initiatives advocate the employment of software frameworks (namely, cloud resource orchestration frameworks - CROFs) capable of orchestrating the heterogeneous resources offered by a multitude of cloud providers in a way that best suits the customer’s needs. The objective of this paper is to provide the reader with a systematic review and comparison of the most relevant CROFs found in the literature, as well as to highlight the multi-cloud computing open issues that the research community needs to address in the near future.


Introduction
Over the last few years, cloud computing has established itself as a new model of distributed computing by offering complex hardware and software services in very different fields. As reported in the RightScale 2019 State of the Cloud Report [1], many companies and organisations have successfully adopted the cloud computing paradigm worldwide, while more and more are approaching it as they see a real opportunity to grow their business. According to that report, 94 percent of IT professionals surveyed said their companies are using cloud computing services, and 91 percent are using the public cloud. Organisations leverage almost 5 clouds on average, and companies are running about 40 percent of their workloads in the cloud. The enterprise cloud spend is growing quickly as companies plan to spend 24 percent more on public cloud in 2019 vs. 2018.
The competition between cloud providers to acquire increasing market shares is getting stronger: a key point to optimise resource usage and fully exploit the potential of cloud computing is the issue of resource orchestration [2]. Cloud resource orchestration concerns complex operations such as selection, deployment, monitoring, and run-time control of resources. The overall goal of orchestration is to guarantee full and seamless delivery of applications by meeting the Quality of Service (QoS) goals of both cloud application owners and cloud resource providers. Resource orchestration is considered a challenging activity because of the scale that resources have reached, and the proliferation of heterogeneous cloud providers offering resources at different levels of the cloud stack. Cloud Resource Orchestration Frameworks (CROFs) have emerged as systems to manage the resource lifecycle, from the selection phase to the monitoring one [2][3][4]. Today most commercial cloud providers offer a cloud orchestration platform to end-users [5]; however, these products are proprietary and, for obvious business reasons, are not portable. In addition, although modern configuration management solutions exist (e.g., Amazon OpsWorks, Ansible, Puppet, Chef) that provide support for handling resource configuration over cloud services, all potential users (ranging from professional programmers and system administrators to non-expert end-users) are often required to understand various low-level cloud service APIs and procedural programming constructs in order to create and maintain complex resource configurations.
The advent of multi-cloud computing further exacerbates the already challenging orchestration issues. The multi-cloud paradigm is a very recent technological trend within the cloud computing landscape, which revolves around the opportunity of taking advantage of services and resources provided by multiple clouds [6,7]. Multi-cloud presumes no a priori agreement between cloud providers, and a third party is responsible for the services. That is the case in cloud brokerage scenarios, where a broker intermediates between cloud providers and cloud consumers [8]. In order to enable an effective multi-cloud paradigm, it is essential to guarantee easy portability of applications among cloud providers [9,10]. This new requirement calls for more powerful resource orchestration mechanisms cross-cutting multiple cloud administrative domains, i.e., capable of dealing with the heterogeneity of the underlying cloud resources and services.
This work explores the many issues of resource orchestration in the cloud landscape. A review of existing works in the addressed field is conducted in order to identify the challenges that have mostly attracted researchers in recent years, and highlight the aspects that have not been fully covered yet. The main contribution of our work is twofold. Firstly, by deeply analysing recently appeared literature, we build a comprehensive taxonomy of desirable features and dimensions useful to characterise CROFs. Then, in accordance with the identified features, we compare several CROFs from both industry and academia. This will help the reader not only to understand the strengths of each framework, but also to identify the unsolved challenges that have to be addressed in the near future.
The remainder of the paper is organised as follows. In "Research methodology" section the methodology followed in our study is described. "Related surveys" section presents a survey of existing works related to our study.
In "Analysis framework" section we identify the CROF capabilities which have been used to carry out the review presented in "Review of cROFs" section. In "Critical dis cussion" section we summarise the results of the review, emphasising current limitations and open challenges. Finally, "Conclusion" section concludes our work.

Research methodology
The primary motivation of this study is to shed light on the recent advances that both industry and academia have made in addressing cloud resource orchestration issues in the multi-cloud landscape.
With this aim in mind, we identified the fields relevant to our study in order to clearly frame the research scope. Beyond the quite expected cloud resource orchestration topic, the following macro topics were also investigated: cloud interoperability, cloud brokerage, interconnected clouds. As outlined in "Introduction" section, cloud resource orchestration deals with the discovery, selection, allocation, and management of cloud resources. When multiple clouds are in place, cloud brokering and interoperability issues due to the simultaneous access to heterogeneous services of interconnected providers cannot be neglected in the analysis of cloud resource orchestration.
We surveyed the literature recently produced in the mentioned fields. Specifically, we searched for proposals, frameworks, prototypes, and commercial products addressing the above-discussed issues. The databases taken into consideration in this survey are the following: Scopus, ACM Digital Library, IEEE Xplore Digital Library, Elsevier ScienceDirect, and SpringerLink. We also filtered out research items dated earlier than the last decade.
We found out that many researchers have already published surveys that are relevant to our object of study. Each of these surveys lists and classifies, under different perspectives, numerous initiatives taken under the big umbrella of the cloud resource orchestration field, be they fully-fledged CROFs or minor proposals focusing just on a restricted set of orchestration features. The primary objective of the study proposed in this work is to provide a new, unified analysis of the existing initiatives, which embraces all the analysis perspectives proposed by past surveys and eventually identifies the missing ones.
Therefore, as shown in Fig. 1, the first step of our study consisted of reviewing the literature surveys with the aim of a) consolidating the list of CROFs and, in general, proposals on which to run a qualitative comparative analysis, and b) extracting the analysis dimensions addressed in each survey. Then, the second step was to elaborate an analysis framework in order to provide a more comprehensive set of features on which a new comparison step would be run. Next, following the references found in the surveys, each CROF on the list was further reviewed according to the above-mentioned comparative guidelines, and the output of the analysis was eventually gathered in a synoptic table.

Related surveys
This section presents the results of a literature survey we conducted in order to identify published studies that relate to our work to varying degrees. Specifically, we investigated the vast area of cloud computing searching for proposals and initiatives falling under the theme cloud resource orchestration in the multi-cloud landscape.
Of particular importance in the context of the discussion were the following works: Inter-cloud Challenges, Expectations and Issues Cluster position paper [11], and the Manifesto for Future Generation Cloud Computing [12]. Both works acknowledged resource provisioning and orchestration as an open challenge. In [11], Ferrer et al. recognised it as a research area with a high business impact in the medium term. Besides, in light of more and more heterogeneous cloud resources distributed across diverse cloud typologies and models, both studies stressed the importance of investigating related research areas, such as cloud interoperability and portability, service discovery and composition (i.e., cloud brokerage), and interconnected clouds. The relationships of these related research areas with the main topic of this survey are schematically depicted in Fig. 2. We depicted the multi-cloud resource orchestration research scope as a big umbrella fully covering the cloud resource orchestration research area, and partially sharing themes covered by the cloud brokerage, inter-clouds and cloud interoperability/portability research fields.
We remark that the study conducted in this first investigative step did not intend to search for actual proposals and initiatives in the fields of interest. Instead, it targeted literature works that are themselves surveys of the most relevant proposals (step 1 in Fig. 1). Here, the aim is to highlight the limits of existing literature surveys and, thus, to provide a motivation for our work. Also, by "surveying existing literature surveys" we were able to collect the pointers to the actual research proposals, which were the object of investigation in the next steps of our study.
Below, we discuss some of the most representative literature surveys broken down into the four above-mentioned cloud sub-topics. In each of the following sections the subtopic is briefly introduced, and the aspects relevant to the multi-cloud orchestration topic are pointed out.

Cloud interoperability
The cloud computing community typically uses the term interoperability to refer to the ability of easily moving workloads and data from one cloud provider to another or between private and public clouds [13]. Ten years ago, the standardisation bodies NIST [14], OMG [15] and DMTF [16] developed, among others, several use cases related to cloud interoperability. All the bodies, independently of each other, defined a common umbrella of interoperability use cases covering topics such as user authentication, workload migration, data migration and workload management.
In [17], the authors performed a comprehensive survey on cloud interoperability, with a focus on interoperability among different IaaS cloud platforms. They investigated the existing efforts on taxonomies and standardisation of cloud interoperability, and identified some open issues to advance the research topic as well. Nevertheless, the presented solutions and concepts are mainly focused on IaaS interoperability.
In [18], the authors surveyed service interoperability and portability on cloud systems with respect to cloud computing service discovery. Still, other interoperability approaches, such as Model Driven Engineering (MDE) and open solutions, were not extensively explored.
In [19], the authors described the main challenges regarding cloud federation and interoperability, as well as showcased and reviewed the potential standards to tackle these issues. Similar to [17], their work is restricted to IaaS interoperability, with no other service or deployment models being covered.

Cloud brokerage
According to the Gartner definition [20], "Cloud services brokerage is an IT role and business model in which a company or other entity adds value to one or more (public or private) cloud services on behalf of one or more consumers of that service via three primary roles including aggregation, integration and customization brokerage". As defined by NIST [21], a cloud service broker "...is an entity that manages the use, performance and delivery of cloud services and negotiates relationships between cloud providers and cloud consumers." From these definitions, it is clear that any business player which intends to act as a broker between cloud consumers and cloud providers must cope with the diversity of providers and the heterogeneity of the multitude of services the latter offer.
In [6], the authors proposed taxonomies for inter-cloud architectures and application brokering. They presented a detailed survey of both academic and industry developments for inter-cloud, cataloguing many projects and fitting them onto the introduced taxonomies. They also analysed the existing works and identified open challenges in the area of inter-cloud application brokering. Their efforts are nonetheless limited to broker-based strategies.
In [22], a systematic literature survey was conducted to compile studies related to cloud brokerage. The authors presented an understanding of the state of the art and a novel taxonomy to characterise cloud brokers, identifying the main limitations of current solutions and highlighting areas for future research. However, just like [6], their whole analysis only covers broker-based approaches.

Interconnected clouds
Interconnected clouds, also called Inter-cloud, can be viewed as a natural evolution of cloud computing. Inter-cloud has been introduced by Cisco [23] as an interconnected global "cloud of clouds" that mimics the term Internet, "network of networks". Basically, the Inter-cloud refers to a mesh of clouds that are unified based on open standard protocols to provide cloud interoperability.
A more sophisticated definition of Inter-cloud is given by the Global Inter-cloud Technology Forum (GICTF) [24]: "Inter-cloud is a cloud model that, for the purpose of guaranteeing service quality, such as the performance and availability of each service, allows on-demand reassignment of resources and transfer of workload through an interworking of cloud systems of different cloud providers based on coordination of each consumer's requirements for service quality with each provider's SLA and use of standard interfaces".
In [8,9,25], the author investigated the consumption of resources and services from multiple clouds, as well as proposed a list of requirements for interoperability solutions, highlighting the technological barriers and some well-known solutions for multi-cloud environments. The author did not present the origin of these requirements, nor did she identify the degree of fulfilment of the requirements by theoretical approaches and technical solutions.
In [26], the authors discussed all the relevant aspects motivating cloud interoperability, categorising and identifying cloud interoperability scenarios and architectures. They provided a taxonomy of the main challenges for the Inter-cloud realisation. A comprehensive review of the state of the art, including standardisation initiatives, ongoing projects and studies in the area, was also conducted.
In [27], the authors analysed the existing literature to identify how interoperability in cloud computing has been addressed. They investigated requirements and usage scenarios for interoperable applications as well as cloud interoperability solutions, presenting a limited list of open issues and directions for future research.
In [28], the authors surveyed the literature to analyse and categorise various solutions for solving the interoperability and portability issues of Interconnected clouds, referring to both user-side (Multi-clouds or Aggregated service by Broker) and provider-side (Federated clouds or Hybrid clouds) scenarios, as specified in [8,25]. They also performed a comparative analysis of the literature works falling into the same category, and discussed the challenges of Interconnected clouds along the same lines as [17] and [26].
Despite delving into Interconnected clouds, starting with motivation, scenarios, possible solutions for interoperability, and ending with open issues and future directions, all these works ( [26][27][28]) gave limited attention to cloud resource orchestration. In addition, none of them covered aspects pertaining to the application development, deployment, and lifecycle management.

Cloud resource orchestration
In a panorama where organisations get to use many types of cloud computing systems simultaneously, the complexity of the workloads devoted to the management of the life-cycle of resources (data and applications) across the systems dramatically increases. Cloud orchestration is the process of managing these multiple workloads, in an automated fashion, across several cloud solutions. Typical activities underlying such a complex process are resource description, selection, configuration, deployment, monitoring and control. Let us not forget that the orchestration problem is exacerbated by the diversity of cloud systems with regard to both technical and administrative features.
In [2], the authors characterised the cloud resource orchestration in a multi-layered stack, and highlighted the main research challenges involved in programming orchestration operations for different cloud resource types across all layers of a cloud resource stack. The scope of their analysis is nevertheless restricted to the area of cloud resource orchestration.
In [3], the authors proposed a multidimensional taxonomy for classifying and comparing cloud resource orchestration techniques from both industry and academia, identifying open research issues and offering directions for future study. Similar to [2], their work only covers the topic of cloud resource orchestration.
In [29], the authors performed a systematic literature survey to build up a taxonomy of the main research interests regarding TOSCA. Different topics were addressed, such as devising cloud orchestration methods using TOSCA, extending the language of TOSCA, and presenting tools for manipulating TOSCA models. Despite being envisioned as a topic which is expected to play an increasingly important role, interoperability received very limited attention.

Analysis framework
In this section we introduce the desired capabilities for CROFs, focusing on deployment and management aspects. From the consumers' standpoint, CROFs implement a service-oriented model which ensures successful hosting and delivery of applications by using cloud resources in order to meet their QoS requirements. Our reference architecture for CROFs is depicted in Fig. 3. Processes and services involved in cloud resource orchestration are categorised depending on their functionalities in relation to this reference model.
The Access Layer regulates interaction with the framework. Users can access services from the lower layers by means of CLIs, Web APIs, and Dashboards. The Application Management Layer concerns the handling of applications throughout their entire lifecycle, from the Development to the Execution passing through the Deployment. The Development refers to languages and models typically used to represent applications, workflows, QoS requirements, and policies. Application descriptions define application components as well as their relationships. Workflow descriptions specify the behavioural aspects of applications by means of declarative or imperative approaches. Policy descriptions provide applications with dynamic control behaviours (e.g. defining load-based policies to scale up and down applications) in order to meet QoS requirements. The Deployment refers to the actual application deployment on cloud resources, which might go through a preliminary resource discovery process. The Execution entails effective automation of complex management tasks, such as scaling and failure handling, which typically require a monitoring engine collecting system and application metrics. Based on the captured metrics, a recovery engine and a policy enforcement engine can determine the decisions to make in order to recover from failures and enforce policies, respectively.
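To make the policy-description idea concrete, the following sketch shows how a declarative load-based scaling policy and its enforcement step might look. All names (`ScalePolicy`, `evaluate_policy`) are illustrative assumptions, not drawn from any specific CROF.

```python
# Hypothetical sketch of a load-based scaling policy and the enforcement
# step a CROF policy engine might perform on each monitoring cycle.
from dataclasses import dataclass

@dataclass
class ScalePolicy:
    metric: str              # metric the policy watches, e.g. CPU utilisation
    scale_up_above: float    # threshold that triggers scaling up
    scale_down_below: float  # threshold that triggers scaling down
    min_instances: int
    max_instances: int

def evaluate_policy(policy: ScalePolicy, metrics: dict, instances: int) -> int:
    """Return the new instance count the policy engine should enforce."""
    value = metrics[policy.metric]
    if value > policy.scale_up_above:
        instances += 1
    elif value < policy.scale_down_below:
        instances -= 1
    # clamp the result to the range the policy allows
    return max(policy.min_instances, min(policy.max_instances, instances))

policy = ScalePolicy("cpu", scale_up_above=0.8, scale_down_below=0.2,
                     min_instances=1, max_instances=10)
```

For example, `evaluate_policy(policy, {"cpu": 0.9}, 3)` would ask the framework to scale from 3 to 4 instances.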
The Resource Management Layer includes services (e.g. discovery services, provisioning services, monitoring services) handling resources throughout their whole lifecycle. These services coordinate the required actions from the upper layer by leveraging operations at the Resource Provisioning Layer. The Resource Provisioning Layer encompasses a range of provisioning services offering the most basic operations regarding cloud resources.

In [32][33][34], the authors presented their vision for cloud computing, including views on future research areas, one of them being resource provisioning and orchestration. A thorough analysis of these research areas and related challenges from different perspectives was carried out. In [35], GigaSpaces Research investigated prevalent approaches for managing applications in cloud environments, namely orchestration, PaaS (Platform as a Service) and CMP (Cloud Management Platform). A number of categories serving as a common ground for comparison between the different approaches were proposed.
Based on the study of Baur et al. [4], we enriched the list of desirable capabilities pertaining to CROFs by reviewing the literature and integrating the aforementioned works. Such capabilities, summarised in Fig. 4, can be classified into two main categories as either Cloud Features or Application Features. Details about each set of features are provided in the following subsections.

Cloud features
Cloud features address cloud infrastructure aspects with a special focus on supporting deployment across multiple cloud providers. Whilst some works [4,35] investigated features such as multi/cross-cloud support and integration of external services and systems, others [31] focused on capabilities such as interoperability and access modes to CROFs. We propose a comprehensive approach which takes into account all the aforementioned aspects, discussed next.

Multi-cloud support
Supporting multiple cloud providers is one of the most crucial features for CROFs, as it makes it possible to select the best matching offer for an application from a diverse cloud landscape. Cloud providers often differ from each other regarding their APIs. For that reason CROFs should offer a cloud abstraction layer (see "Interoperability approach" section), which hides these differences and avoids the need for provider-specific customisation, a major cause of the vendor lock-in issue.

Cross-cloud support
Cross-cloud support enhances the multi-cloud feature by allowing component instances of a single application to be distributed over multiple cloud providers. The advantages of cross-cloud deployment are threefold: a) it allows a sophisticated selection of the best-fitting cloud providers on a per component instance basis, optimising costs or improving quality of service; b) it improves application availability, as it introduces resilience against the failure of individual cloud providers; and c) it helps cope with privacy issues.

Interoperability approach
In the context of cloud computing, interoperability can be defined as the ability to develop applications that combine resources that can interoperate, or work together, from multiple cloud providers, hence taking advantage of specific features provided by each provider [27]. A few research papers [9,27,28] have investigated possible approaches to achieve it. Formulating standards for cloud computing is the most obvious solution for interoperability. Even though a plethora of standards have been proposed so far (e.g., OCCI, CIMI, OVF, CDMI, TOSCA [36]), the lack of widely accepted standards necessitates investigating other solutions for interoperability. When cloud providers use different APIs and data models in order to exhibit the same features, semantic interoperability becomes involved. Semantic technologies (e.g., OWL, SPARQL, SWRL) can prove useful to provide semantic interoperability among different cloud providers. Broker-based approaches can also alleviate semantic interoperability issues by means of ontology-based interfaces concealing the differences among cloud vendors. Cloud interoperability can also be addressed by exploiting MDE techniques [10]. Another viable solution for cloud interoperability includes open libraries (e.g., Apache jclouds, Apache Libcloud) and services, which rely on abstraction layers in order to decouple application development from the proprietary technologies of cloud providers.
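The abstraction-layer idea behind such open libraries can be sketched as follows. The driver classes and method names below are hypothetical stand-ins for provider SDKs, not the actual jclouds or Libcloud APIs.

```python
# Sketch of a cloud abstraction layer: application code targets one
# provider-agnostic interface, while per-provider drivers translate calls
# into each vendor's proprietary API. All class names are hypothetical.
from abc import ABC, abstractmethod

class ComputeDriver(ABC):
    """Provider-agnostic interface the application codes against."""
    @abstractmethod
    def create_node(self, name: str, size: str) -> dict: ...

class FakeProviderA(ComputeDriver):
    def create_node(self, name, size):
        # would translate to provider A's REST API in a real driver
        return {"provider": "A", "name": name, "size": size}

class FakeProviderB(ComputeDriver):
    def create_node(self, name, size):
        # would translate to provider B's different API and data model
        return {"provider": "B", "name": name, "size": size}

def deploy(driver: ComputeDriver, name: str, size: str) -> dict:
    # application code stays unchanged when the provider is swapped
    return driver.create_node(name, size)
```

Swapping `FakeProviderA` for `FakeProviderB` requires no change in `deploy`, which is precisely how such libraries mitigate vendor lock-in.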

Integration
Support for advanced IaaS/PaaS services (e.g., DBaaS, LBaaS, FWaaS) is desirable. It reduces complexity and management efforts for the end user. On a negative note, it comes at the expense of flexibility. BYON (Bring Your Own Node) captures the ability to use already running servers for application deployment. In particular, it enables the use of servers not managed by a cloud platform or virtual machines on unsupported cloud providers.

Access
This feature captures what interfaces CROFs use to interact with cloud resources. Three types of interfaces are usually supported: command-line, web-based dashboard, and web-based API.
Command-line interfaces wrap cloud-specific API actions as commands or scripts executable through shell environments. Although command-line interfaces are easier to implement, their usage requires a deep understanding of cloud resources and the related orchestration operations.
Web-based dashboards present cloud resources as user-friendly artifacts and resource catalogues. Visual artifacts and catalogues aim at simplifying resource selection, assembly, and deployment. These features make web-based dashboards simpler and more flexible than command-line interfaces.
Web-based APIs allow other tools and systems (e.g. monitoring tools) to integrate cloud resource management operations into their functionalities. They provide the highest abstraction out of the three interface types.

Application features
Application features address development, deployment, and execution aspects of applications. To this end, unlike all previous works, we collect features according to the application phase they pertain to. For instance, with reference to the development phase, we have identified Portability and Containerisation as relevant features. Furthermore, we also propose a classification of application domains of interest for CROFs.

Application domain
Application domain refers to the types of applications that CROFs have been targeted and customised for. Academic research has been done toward the characterisation of application domains over the past few years [31][37][38]. Building on the study of Buyya et al. [37], we classified application domains into two categories: Scientific applications and Business applications (see Fig. 5).
Cloud computing systems meet the needs of different types of applications in the scientific domain: high-performance computing (HPC) applications, high-throughput computing (HTC) applications, and large-scale data analytics/Internet of Things (IoT), the latter being a matter of common interest for both the scientific and business sectors. In regard to the business domain, cloud computing is the preferred technology for a wide range of applications, from multi-tier web applications (e.g., web, mobile, online gaming applications) to media and content delivery network (CDN) applications (e.g., video encoding & transcoding, video rendering, video streaming, web/mobile content acceleration).

Portability
Portability has been defined as the capability of a program to be executed on various types of data processing systems without converting the program to a different language and with little or no modification [39]. In the context of cloud computing, portability can be classified into three categories: data portability, function or application portability, and service or platform portability [40]. In particular, application portability refers to the ability to define application functionalities in a vendor-agnostic way.
Supporting open standards such as CAMP [41] and TOSCA [36] for modelling the application topology and the component lifecycles facilitates the usage of CROFs and further increases the reusability of the topology definition, as it restricts the vendor lock-in issue to cloud provider level. Reusability can also be improved via a modularised approach regarding the application description. Methods to achieve modularity include templating, parameterisation, and inheritance. Furthermore, since the initial effort for describing applications and application components is high, model sharing by means of existing libraries or marketplaces would be beneficial.
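As a minimal illustration of the templating and parameterisation methods mentioned above, a reusable component description could be instantiated with concrete values as follows. The template fields and function names are hypothetical, not CAMP or TOSCA syntax.

```python
# Sketch of a modularised application description: a reusable component
# template is instantiated with concrete parameters, so the same topology
# fragment can be shared and reused across applications.
from string import Template

WEB_TIER_TEMPLATE = Template("""\
component: ${name}
type: web-server
instances: ${instances}
requires: ${database}""")

def render_component(name: str, instances: int, database: str) -> str:
    """Instantiate the reusable template with concrete parameter values."""
    return WEB_TIER_TEMPLATE.substitute(
        name=name, instances=instances, database=database)
```

A call like `render_component("frontend", 3, "orders-db")` yields a concrete component description; sharing such templates via libraries or marketplaces amortises the high initial modelling effort noted above.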

Containerisation
Container-based virtualisation [42] is a key approach for sharing the host operating system kernel across multiple guest instances (i.e., containers), while keeping them isolated. Containers provide a resource isolation mechanism with little overhead compared to hypervisor-based virtualisation [43]. Moreover, the isolation offered by containers allows resource consumption to be configured, controlled, and limited at the instance level. Docker is the leading Linux-based platform for developing, shipping, and running applications through container-based virtualisation.
Since managing a large number of containers inside a Docker cluster can be difficult, container-centric orchestrators such as Docker Swarm, Google Kubernetes, and Apache Mesos have appeared. They perform orchestration at the container level by automating the provisioning and management of complex containerised deployments across multiple hosts and locations.

Resource selection
Resource selection refers to the level of automation supported by CROFs with respect to the selection of hardware and software resources. It usually involves identifying and analysing alternative cloud resources based on selection criteria. Resource selection approaches can be classified into four categories.
In a manual binding users provide the concrete unique identifiers of the cloud entities. In an automatic binding they specify abstract requirements (e.g. number of cores), which CROFs are responsible for binding to a concrete offer at runtime. Automatic binding can be enhanced by offering an optimised binding, which leverages optimisation criteria based on attributes of the cloud provider (e.g., price, location) to select the best fitting offer. A dynamic binding offers a solving system that enables changes to the binding based on runtime information (e.g., metric data from the monitoring system).
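The optimised-binding step can be sketched as a small matching problem: given abstract requirements, pick the cheapest concrete offer that satisfies them. The offer attributes (cores, RAM, hourly price) and names below are illustrative assumptions.

```python
# Sketch of optimised binding: filter concrete cloud offers against the
# user's abstract requirements, then optimise on a provider attribute
# (here: price). Offer data is hypothetical.
def optimised_binding(requirements: dict, offers: list):
    """Return the lowest-price offer meeting the abstract requirements,
    or None when no offer qualifies."""
    candidates = [o for o in offers
                  if o["cores"] >= requirements["cores"]
                  and o["ram_gb"] >= requirements["ram_gb"]]
    return min(candidates, key=lambda o: o["price"]) if candidates else None

offers = [
    {"id": "a.small",  "cores": 2, "ram_gb": 4,  "price": 0.05},
    {"id": "b.medium", "cores": 4, "ram_gb": 8,  "price": 0.09},
    {"id": "a.large",  "cores": 8, "ram_gb": 16, "price": 0.20},
]
```

A dynamic binding would re-run this selection at runtime whenever monitoring data (e.g. changed prices or degraded performance) invalidates the current choice.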

Lifecycle control
Lifecycle control defines the actions that need to be executed in order to fully manage cloud applications. Existing CROFs provide varying levels of automation, typically categorised as script-based and DevOps approaches.
A script-based approach consists of a set of shell scripts, which are executed in a specific order. It has limited ability to express dependencies, react to changes, and verify configurations. Script-based approaches can be extended to support DevOps tools (e.g., Chef, Puppet, Ansible) that offer a more sophisticated approach to deployment management and ready-to-use deployment descriptions.

Wiring & workflow
Most cloud applications are distributed with components residing on different virtual machines. When application deployment takes place, an application instance consisting of one or more component instances gets created. Since dependency relationships may exist between components, the deployment functionality also has the task of wiring component instances together.
A straightforward approach to resolving those dependencies is attribute and event passing, in which case lifecycle scripts lock/wait for attributes to become available or register listeners on topology change events. An improvement is a manual workflow defined by users in order to control the deployment order. Nevertheless, the easiest way for users to deploy applications is an automatic workflow deduced from the lifecycle actions defined on components and their relationships. Additionally, CROFs may offer extensions for external services like IaaS/PaaS services (see "Integration" section) to ensure that the deployment engine is aware of such dependencies.
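Automatic workflow deduction essentially amounts to ordering component instances topologically by their dependency relationships. A minimal Python sketch (component names are illustrative):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each component maps to the components it depends on,
# i.e. those that must be deployed and wired up before it.
topology = {
    "load_balancer": {"web_app"},
    "web_app": {"database", "cache"},
    "database": set(),
    "cache": set(),
}

# A valid deployment order: dependencies always come first.
order = list(TopologicalSorter(topology).static_order())
```

The same graph also tells the engine which attributes (e.g., the database endpoint) must be passed to which dependent components during wiring.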

Monitoring
Tracking the behaviour of applications is the key to assessing the quality of the deployment and an important building block for adaptation. As a first step this comprises the collection of metrics. CROFs should offer a way to measure system metrics (e.g., CPU usage) and application metrics (e.g., number of requests). If predefined metrics are not sufficient, a well-defined way to add custom metrics should be provided. Aggregation mechanisms make it possible to compute higher-level metrics and to combine multiple metrics. Access to historical data is also desirable in order to support a higher-level evaluation of monitoring data.
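For instance, an aggregation mechanism might combine a system metric with an application metric into a higher-level indicator. The following Python sketch is purely illustrative; the `saturation` function and the 1000 req/s capacity are assumptions, not taken from any CROF.

```python
from statistics import mean

def saturation(samples):
    """Combine CPU usage (system metric) and request rate (application
    metric) into a single higher-level 'saturation' indicator in [0, 1]."""
    cpu = mean(s["cpu"] for s in samples)
    reqs = mean(s["requests"] for s in samples)
    # Assumed capacity of 1000 req/s; the busier dimension dominates.
    return max(cpu, min(reqs / 1000.0, 1.0))

window = [{"cpu": 0.4, "requests": 300}, {"cpu": 0.6, "requests": 500}]
print(saturation(window))  # prints 0.5 (the CPU average dominates here)
```

Historical data would simply widen the `window` passed to such an aggregator.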

Runtime adaptation
CROFs should automatically adapt applications in order to deal with dynamic deviations (e.g., increased load).
The operations to handle such changes are mainly scaling and migration. However, the adaptation support of many CROFs is limited to horizontal scaling with threshold-based triggers. Rule engines leveraging complex metrics and QoS goals would be an improvement.
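A threshold-based trigger of the kind just mentioned can be sketched in a few lines of Python; the thresholds and instance bounds below are made-up defaults, not values used by any specific CROF:

```python
def scaling_decision(cpu_avg, instances, low=0.2, high=0.8,
                     min_instances=1, max_instances=10):
    """Return +1 (scale out), -1 (scale in), or 0 (no action)
    for a horizontal scaler driven by average CPU usage."""
    if cpu_avg > high and instances < max_instances:
        return +1
    if cpu_avg < low and instances > min_instances:
        return -1
    return 0
```

A rule engine would replace the two fixed comparisons with user-defined rules over arbitrary metrics and QoS goals.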
Since cross-cloud deployments may experience failures, CROFs should also support recovery from undesired, erroneous states. Another feature related to adaptation is continuous integration/continuous delivery (CI/CD), which allows the topology model of deployed applications to be modified while keeping changes to a minimum.

Review of CROFs
This section presents a selection of CROFs from different landscapes. Although the current state of the art embraces a large number of frameworks, this work considers a subset of them which we deem representative of the characteristics of the majority of existing solutions. We classify the frameworks into two categories: production/commercial CROFs, and experimental/academic ones.
Production/commercial CROFs are used in a production environment by private and public cloud providers. Whereas some of them are closed-source, others are open-source and supported by a thriving community of developers and users. Experimental/academic CROFs usually originate from the research arena and advance the state of the art, even though their implementations are mostly prototypes.
We discuss next each class of CROFs, and analyse their main capabilities from both cloud and application perspectives, as extensively covered in the "Analysis framework" section. Table 1 provides a bird's-eye view of the frameworks taken under consideration. Specifically, each row represents a CROF (Name) and specifies the original authors (Organisation), basic dates for the initial and latest releases (Active), a brief introduction (Description), and the sources consulted (References).

Production/commercial CROFs
Nowadays, there is a great variety of production/commercial CROFs around [44], such as infrastructure-centric services (e.g., Heat, CloudFormation) provided by cloud providers which are also IaaS providers, platform-centric (e.g., Cloud Foundry, OpenShift) and platform-agnostic (e.g., Cloudify, Terraform) tools provisioning resources from IaaS providers. In this section we first discuss some of the most relevant solutions introduced in Table 1, and subsequently summarise their cloud and application features in Tables 2 and 3 respectively.

Heat
OpenStack Heat [45] is a service for managing the entire life-cycle of infrastructure and applications within OpenStack clouds. It implements an orchestration engine to launch multiple composite cloud applications based on either a CloudFormation compatible template format (CFN) or the native OpenStack Heat Orchestration Template format (HOT). HOT templates are defined in YAML.
A Heat template describes the infrastructure of a cloud application in a declarative fashion, enabling creation of most OpenStack resource types as well as more advanced functions (such as instance high availability, instance autoscaling, and nested stacks) through OpenStack-native REST API calls. The resources, once created, are referred to as stacks. Heat templates are consumed by the OpenStackClient, which provides a command-line interface (CLI) to OpenStack APIs for launching stacks, viewing details of running stacks, and updating and deleting stacks.
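For illustration, a minimal HOT template might look as follows; the image name and flavour below are placeholders, not defaults shipped with Heat:

```yaml
heat_template_version: 2018-08-31

parameters:
  flavor:
    type: string
    default: m1.small

resources:
  my_server:
    type: OS::Nova::Server
    properties:
      image: ubuntu-20.04        # assumed image name in the target cloud
      flavor: { get_param: flavor }

outputs:
  server_ip:
    value: { get_attr: [my_server, first_address] }
```

Launching this template (e.g., `openstack stack create -t server.yaml my_stack`) creates a stack containing a single Nova server.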
Heat only allows a single-cloud deployment on an OpenStack environment. With reference to interoperability, Heat provides neither semantics nor MDE solutions, but it provides support for TOSCA via the independent Heat-Translator project, which translates TOSCA templates to HOT.
Regarding portability, Heat partially supports model standards (TOSCA) and reusability via input parameters and template composition. It also supports containerisation by means of the OpenStack Zun service.
Cloud resources can only be selected through manual binding, whereas both manual and automatic workflows can leverage script-based or DevOps tools (such as Chef and Puppet) in order to handle the whole application life-cycle. Heat provides horizontal scaling with threshold triggers based on infrastructure metrics. It partially supports continuous delivery by updating existing stacks, resulting in some resources being updated in-place and others being replaced with brand new resources. Failure recovery capabilities are also supported by means of manual workflows and stack updates.

Cloudify
Cloudify is an open-source orchestration framework in which applications are described through YAML blueprints written in a TOSCA-inspired DSL. Typical blueprints contain declarations for various resource types, including cloud resources. Cloudify allows multi-cloud and cross-cloud deployments by means of built-in plugins. It also supports BYON, and leverages TOSCA for interoperability and portability. However, despite being aligned with the modelling standard, Cloudify's DSL does not directly reference the standard types.
Cloudify supports containerisation using Docker. Container orchestration is also available through Kubernetes. Cloud resources can only be selected through manual binding, whereas both manual and automatic workflows can leverage script-based or DevOps tools (such as Ansible, Chef, and Puppet) in order to handle the application life-cycle. Cloudify provides infrastructure, application, and custom metrics. It also enables the definition of custom aggregations and policies using Clojure and Riemann. Cloudify offers built-in workflows for application healing (by applying the uninstall and install workflows' logic) and horizontal scaling. Complex scenarios (e.g., vertical scaling, cloud bursting) are not supported out of the box. Live migration is partially fulfilled in the context of containerised applications, though. Multiple pods with containerised applications can be moved between nodes in the same Kubernetes cluster without service disruption. Continuous delivery is supported through deployment updates, which allow a running topology to be modified by adding/removing/modifying nodes. Modifying existing nodes will cause their automatic reinstallation, though.

Brooklyn
Apache Brooklyn [47] is an open-source framework for modelling, deploying, and managing distributed applications defined using declarative YAML blueprints written in Brooklyn's DSL. Brooklyn's YAML format follows the CAMP specification [41], but uses some custom extensions. Support for TOSCA is planned for the near future. Blueprints are usually consumed by the Brooklyn client CLI in order to access a running Brooklyn Server. A web console and powerful REST-APIs are available as well.
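A minimal Brooklyn blueprint might look as follows; the location and application name are illustrative, while the entity type shown comes from Brooklyn's standard catalogue:

```yaml
name: simple-tomcat
location: aws-ec2:eu-west-1    # any configured cloud or BYON location
services:
- type: org.apache.brooklyn.entity.webapp.tomcat.TomcatServer
  name: tomcat
  brooklyn.config:
    http.port: 8080
```

Submitting such a blueprint through the CLI or REST API causes Brooklyn to provision a machine in the given location and manage the Tomcat entity on it.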
Brooklyn allows multi-cloud and cross-cloud deployments on many public and private clouds. It also supports private infrastructures (BYON), and leverages Apache jclouds as cloud abstraction layer for interoperability. Portability is achieved via model reusability mechanisms (e.g., type inheritance) and model sharing (e.g., catalogued types). Brooklyn supports manual as well as basic automatic binding for resource selection, whereas it does not support workflow scenarios. Life-cycle actions (i.e., effectors) for entities can be configured through either shell scripts or Chef recipes. Brooklyn pulls metrics by either executing remote actions or accessing an external monitoring tool. Nevertheless, it is the user's responsibility to implement those actions, or to provide an interface to an external monitoring tool.
Metrics/QoS can be fed into policies, which automatically take actions such as restarting failed nodes or scaling out. By default, a threshold-based policy is available. Continuous delivery is exclusively possible at the component level, namely by redeploying single components with updated software.

Stratos
Apache Stratos [48] is an open-source PaaS framework which allows developers to build distributed applications and services. Applications are typically composed of sets of cartridges representing descriptions of abstract VMs hosting both business and infrastructure services, combined with deployment and scaling policies. Stratos defines configurations and applications in a specific JSON format, therefore they can be shared. Reusability is limited, since cartridges contain references to IDs of IaaS snapshots and hardware configuration. Applications can be managed by means of the Stratos CLI. A web console and powerful REST APIs are available as well. Stratos supports multiple providers and utilises Apache jclouds as cloud abstraction layer for interoperability. Despite using jclouds, BYON is not supported. No external services are supported either. Stratos leverages Kubernetes as a cluster orchestration framework in order to provide containerisation. Cloud resources are manually selected when configuring cartridges. In addition, while the life cycle description for managing VMs is done by Stratos itself, the software setup is delegated to Puppet. Only manual workflows are supported.
Stratos uses a cartridge agent residing within each VM in order to access system and application metrics. It is not possible to define custom metrics. Using in-flight requests, load average, and free memory metrics combined with a complex event processor and the Drools rule engine, Stratos enacts a multi-factored horizontal auto-scaling. It also includes cloud bursting, allowing applications to be seamlessly migrated between clouds. Recovery actions are supported in case some tasks within VMs of an application topology fail, by automatically destroying and recreating the affected cartridge instance. Continuous delivery is not supported, since users need to undeploy applications before changing their definitions.

Alien4Cloud
Alien4Cloud (Application LIfecycle ENabler for cloud) [49] is an open-source platform that makes application management on the cloud easy for enterprises. It leverages other existing open-source projects that help orchestrating cloud applications and focus on run-time aspects (e.g., Cloudify). In Alien4Cloud, application templates (blueprints) are modelled in TOSCA in order to allow interoperability and portability. Blueprints can also be shared across platform users via a maintained TOSCA catalog. However, Alien4Cloud supports a slightly modified version of the TOSCA Simple Profile. Application deployment is done through an orchestrator on a location configured for and managed by that orchestrator. Alien4Cloud supports a number of orchestrators (Cloudify, Puccini, and Marathon) via plugins. Locations describe a logical deployment target ranging from private/public clouds to a set of physical machines (BYON), or even Docker containers (Kubernetes and Mesos). Multi-cloud and cross-cloud deployments are supported.
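For reference, a minimal topology template in the standard TOSCA Simple Profile in YAML looks as follows (node names are illustrative; Alien4Cloud's own dialect may differ slightly from the standard):

```yaml
tosca_definitions_version: tosca_simple_yaml_1_2

topology_template:
  node_templates:
    web_app:
      type: tosca.nodes.WebApplication
      requirements:
        - host: web_server
    web_server:
      type: tosca.nodes.WebServer
      requirements:
        - host: server
    server:
      type: tosca.nodes.Compute
      capabilities:
        host:
          properties:
            num_cpus: 2
            mem_size: 4 GB
```

The `host` requirements encode the containment relationships from which a deployment workflow can be automatically deduced.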
Cloud resources can only be selected through manual binding (node substitution), whereas both manual and automatic workflows can leverage script-based or DevOps tools (such as Ansible, Chef, and Puppet) in order to handle the application life-cycle. Regarding monitoring and run-time adaptation, since Cloudify can be used as Alien4Cloud's backend orchestration solution, the same considerations apply. In particular, Alien4Cloud supports horizontal scaling as well as continuous delivery.

Terraform
Terraform [50] is an open-source infrastructure as code tool for building, changing, and versioning infrastructures in a platform-agnostic way. It uses its own high-level configuration language known as HashiCorp Configuration Language (HCL), or optionally JSON, in order to detail the infrastructure setup. Despite being non-compliant with any model standards, HCL supports reusability via modules and module composition. Reusable modules can also be shared by means of the Terraform Registry as well as other sources (e.g., GitHub, Bitbucket). Configurations are usually consumed by the Terraform CLI, but Terraform Enterprise also provides both a web-based dashboard and REST APIs.
Terraform can manage multiple cloud providers and even cross-cloud dependencies by means of special plugins called providers. Providers are available for Docker containers and container orchestration as well as external cloud services (e.g., Amazon RDS). However, no support is provided for BYON. Cloud resources are manually selected during configuration, while life-cycle actions can be configured through provisioners executing scripts or running configuration management (Chef, Puppet, Salt). Only automatic workflows are supported.
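As an illustration, a minimal Terraform configuration combining a provider, a resource, and a provisioner might look as follows; the region and AMI id are placeholders:

```hcl
provider "aws" {
  region = "eu-west-1"
}

resource "aws_instance" "web" {
  ami           = "ami-12345678"   # placeholder AMI id
  instance_type = "t3.micro"

  # Life-cycle action executed after the resource is created.
  provisioner "remote-exec" {
    inline = ["sudo apt-get update"]
  }
}
```

Running `terraform apply` on such a configuration provisions the declared resources and computes the execution order automatically from the dependency graph.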
Terraform leverages providers in order to provide autoscaling capabilities with threshold triggers on system metrics gathered by monitoring services (e.g., Azure Monitor, Amazon CloudWatch). Continuous delivery is supported by applying configuration updates, which allow resources to be added/removed/modified. When resource arguments cannot be updated in-place, the existing resource is destroyed and a new one is created in its place.

CloudFormation
AWS CloudFormation is a service for modelling and provisioning AWS resources through templates, and only allows single-cloud deployments on AWS. Containerisation is supported by means of Elastic Container Service (ECS) resources. Container orchestration is also supported by means of Elastic Kubernetes Service (EKS) resources as well. Cloud resources are selected through manual binding, whereas life-cycle actions can be configured through user-data scripts or DevOps tools (Chef, Puppet). Only automatic workflows are supported.
CloudFormation provides automatic scaling capabilities by means of AWS Auto Scaling, which uses dynamic scaling and predictive scaling to automatically scale resources based on Amazon CloudWatch metrics. Customised metrics for Application Auto Scaling can also be defined. Live migration is partially fulfilled in the context of containerised applications. For instance, it is possible to gracefully migrate existing applications from one worker node group to another. Continuous delivery is supported by stack updates. Depending on the resource and properties being updated, an update might interrupt or even replace an existing resource. Recovery actions are supported by automatically rolling back the existing stack on failure.

Experimental/academic CROFs
In this section, we initially review an ensemble of significant experimental/academic CROFs outlined in Table 1, and then summarise them according to their cloud and application features in Tables 4 and 5 respectively. Additionally, we briefly run through other research initiatives focusing only on specific aspects of CROFs.

Cloudiator
Cloudiator [52,53,62] is an open-source cross-cloud orchestration framework, which relies on Apache jclouds in order to support many public and private cloud platforms. The main orchestration component, namely Colosseum, can be accessed via a Java client, a web-based user interface, or a REST API.
The application description consists of individual components, which are assembled to form a full application. Each component provides interface operations (e.g., bash scripts) for managing the component life-cycle. Dependencies between application components are described through communication entities linking provided ports and required ports. Despite being non-compliant with any modelling standards, application components are reusable across different applications.
The resource broker is responsible for automatically selecting the correct cloud offer (previously discovered by the discovery engine), depending on the desired requirements/constraints on virtual machine configuration. The deployment engine acquires the virtual machine and forwards the component installation request to the remote life-cycle agent, namely Lance. Lance runs component instances within Docker containers by default. In addition, only automatic workflows are supported.
Automatic scaling capabilities are provided by means of AXE, a monitoring and adaptation engine embedded in Cloudiator, which implements scalability rules consisting of threshold-based conditions linked to raw or composed metrics. Migration features are partially fulfilled by supporting access to OpenStack's live migration functionality. Recovery actions are supported by the recovery engine, which detects abnormal states of system entities marking them as failed, and applies solutions based on failure categories. The same mechanism is used in order to represent changes in the models (continuous delivery).

Roboconf
Roboconf [54,63] is an open-source scalable orchestration framework for multi-cloud platforms. Many IaaS providers (e.g., OpenStack, AWS, Azure, vSphere), as well as Docker containers and local deployments for on-premise hosts, are supported by using special plugins. Roboconf partially supports interoperability by means of OCCI extensions and a generic target implementation based on Apache jclouds. In addition, it can be accessed by means of a shell-based console, a web-based user interface, or a REST API.
Roboconf provides a CSS-inspired DSL, which allows applications and their execution environments to be described in a hierarchical way. A distributed application is seen as a set of components, building an acyclic graph describing both containment and run-time relationships between components.

Automatic scaling capabilities are provided by means of autonomic management implemented by the Deployment Manager (DM) and the remote agents. Agents send notifications to the DM whenever certain threshold-based conditions linked to system metrics are met. The DM's decision engine responds to those notifications using corresponding imperative rules. Monitoring application metrics still needs to be addressed. Both application migrations and global/per-component rollbacks (continuous deployment) are part of Roboconf's roadmap, but they are not supported out of the box yet.

INDIGO-DataCloud
INDIGO-DataCloud (INtegrating Distributed data Infrastructures for Global ExplOitation) [55,56,64] is an open-source data and computing platform targeted at scientific communities, and provisioned over Cloud and Grid-based infrastructures as well as over HTC and HPC clusters. The Orchestrator delegates the deployment to the Infrastructure Manager (IM), to OpenStack Heat, or to the Mesos frameworks, based on TOSCA templates and a list of providers ranked by the CloudProviderRanker. The Monitoring component collects monitoring data from both PaaS core services and client infrastructure/services by means of specific probes. The SLAM establishes an agreement between customer and provider about capacity and quality targets. The Data Management Services provide an abstraction layer for accessing the data storage in a unified and federated way.
INDIGO-DataCloud supports multi-cloud and cross-cloud deployments, as well as interoperability by leveraging open standards (OCCI, CDMI). It also promotes portability by adopting an extension of TOSCA for describing applications and services. Cloud resources are automatically selected and optimised by the CloudProviderRanker, depending on SLAs. Containerisation relies on Apache Mesos clusters, with elasticity services to ensure the elasticity of the cluster, and on the Marathon and Chronos frameworks in order to handle Long-Running Services (LRS) and application jobs, respectively. Marathon can also migrate services if problems occur. Despite different DevOps practices being adopted for both the core services and user applications (e.g., automated builds of each application image are triggered once a new change is committed to its repository), hot changes in application deployments are not supported out of the box.

MiCADO
MiCADO (Microservices-based Cloud Application-level Dynamic Orchestrator) [57] is an open-source multi-cloud orchestration and auto-scaling framework for Docker containers, orchestrated by Kubernetes (or alternatively by Docker Swarm). The full MiCADO framework has been investigated and implemented in the COLA (Cloud Orchestration at the Level of Application) project funded by the European Commission [66].
MiCADO core services are deployed on the MiCADO Master, which is configured as the Kubernetes Master Node and provides the Docker Engine, Occopus [67] (to scale VMs), Prometheus (for monitoring), Policy Keeper (to perform decisions on scaling), and Submitter (to provide a submission endpoint) microservices. During operation, MiCADO workers are instantiated on demand and join the cluster managed by the MiCADO Master.
MiCADO supports multi-cloud and cross-cloud deployments on various public and private cloud infrastructures. It also provides interoperability and portability by means of a TOSCA-based Application Description Template (ADT), which comprises three sections: a) the definition of the individual applications making up a Kubernetes Deployment, b) the specification of the VM and c) the implementation of scaling policies for both VM and Kubernetes scaling levels. ADTs can be consumed by means of a web-based dashboard or a REST API.
Cloud resources are manually selected when configuring VMs. The application life-cycle is handled by MiCADO itself, which leverages Occopus and Kubernetes for managing VMs and containers, respectively. Only automatic workflows are supported. MiCADO allows automated scaling depending on VM and container metrics gathered by two built-in exporters on each MiCADO worker: Prometheus Node Exporter and cAdvisor. Scaling policies can be defined specifically for the applications. Lastly, continuous delivery capabilities are supported via "rolling updates" on Kubernetes Deployments.

MODAClouds
MODAClouds (MOdel-Driven Approach for the design and execution of applications on multiple Clouds) [58,59] is an open-source design-time and run-time platform for developing and operating multi-cloud applications with guaranteed QoS. The MODAClouds framework has been developed within the homonymous project funded by the European Commission [68].
The MODAClouds Toolbox consists of three main components: Creator4Clouds, Venues4Clouds, and Energizer4Clouds. Creator4Clouds is a design-time platform which allows developers to design multi-cloud applications, carry out performance and cost evaluation, and plan the deployment strategy by choosing the service providers that best suit all business and QoS requirements. Venues4Clouds is a decision support system (DSS) to choose the most suitable cloud providers depending on different aspects such as application architecture, business risk, quality and cost. Energizer4Clouds is a run-time platform to deploy, manage, monitor and assure operations of multi-cloud services. Specifically, the Tower4Clouds sub-component is responsible for collecting, analysing, and storing monitoring information, whereas the SpaceOps4Clouds sub-component enacts application self-adaptation in order to meet predefined objectives and/or constraints whenever changes happen.
MODAClouds supports multi-cloud and cross-cloud deployments on both IaaS and PaaS providers. It leverages an MDE approach in order to support interoperability between cloud providers. In particular, MODACloudML is a set of UML extensions enabling developers to model multi-cloud applications through three levels of abstraction: Cloud-enabled Computation Independent Models (CCIM), Cloud-Provider Independent Models (CPIM), and Cloud-Provider Specific Models (CPSM). These models facilitate portability, since they are mostly reusable. Cloud resources can be automatically selected and optimised via Venues4Clouds and SpaceDev4Clouds, and managed through either shell scripts or Puppet. Only automatic workflows are supported.
Within the MODAClouds runtime environment, the Models@Runtime engine is responsible for enacting adaptation actions such as application scaling and bursting, data and application migration, and continuous delivery on both infrastructure and component levels. Failure recovery is partially supported for data migration and scaling/bursting scenarios.

SeaClouds
SeaClouds (SEamless Adaptive multi-Cloud management of service-based applicationS) [60,61] is an open-source platform for deploying and managing multi-component applications over heterogeneous clouds. The SeaClouds framework has been investigated and implemented within the homonymous project funded by the European Commission [69].
The Planner produces a Deployable Application Model (DAM) containing the information needed by the Deployer (based on Apache Brooklyn [47]) to deploy, configure, and run the application. The Monitor collects infrastructure and application level metrics from the targeted cloud providers in order to verify that QoS requirements are met; if they are not, reconfiguration actions can be triggered. The SLA Service enforces business-oriented policies and business actions to apply in case of violation. SeaClouds supports multi-cloud and cross-cloud deployments on both IaaS and PaaS providers. It also promotes interoperability and portability by adopting a TOSCA-based representation for AAMs and ADPs, as well as a CAMP-based description for DAMs. Cloud resources are automatically selected and optimised by the Planner. Changes to the binding can also occur in case of reconfiguration actions. Only automatic workflows are supported.
SeaClouds allows repairing actions, such as scaling cloud resources horizontally and vertically, or restarting and replacing failed components. It also supports replanning in order to handle the cases that cannot be solved by repairing. A migration of application modules may happen in this process. Continuous delivery is not supported out of the box.

Other initiatives
In this section, we briefly review a number of other research approaches derived from related EU projects which address, to varying degrees, multi-cloud orchestration, interoperability and portability. Specifically, a few works target semantic interoperability (i.e., mOSAIC, Cloud4SOA), some explore the benefits of federated cloud networks (BEACON, ATMOSPHERE), whereas others focus on application portability via non-standard (i.e., Claudia, OPTIMIS, ASCETiC, HARNESS), partially-standard (i.e., soCloud) and fully-standard (i.e., CELAR, CloudLightning) cloud modelling.
mOSAIC [70,71] is an open-source API and platform for multiple clouds designed and developed within the homonymous project [72]. Application deployment and portability across multiple clouds are facilitated by means of a common API and a high-level abstraction of cloud resources. mOSAIC also enables application developers to specify resource requirements in terms of a cloud ontology, whereas the platform, using a brokering mechanism, performs a matchmaking process in order to find the best-fitting cloud services. In so doing, developers can postpone their decision on the procurement of cloud services until runtime. However, even though a platform-independent component-based programming model is used, applications need to be implemented by leveraging one of the supported language-dependent APIs (Java, Python).
Cloud4SOA [73,74] is a multi-cloud broker-based solution developed under the homonymous project [75], which addresses semantic interoperability and portability challenges at the PaaS layer. It supports multi-platform matchmaking, management, monitoring and migration of applications by semantically interconnecting heterogeneous PaaS offerings. Similar to mOSAIC, Cloud4SOA introduces a cloud ontology establishing a set of abstractions among different PaaS offerings while exposing a multi-PaaS standardised API for the seamless application deployment and management across different cloud platforms. Despite being independent of specific APIs offered by the underlying PaaS offerings, adapters acting as a middleware between the Cloud4SOA API and native PaaS APIs are still needed.
The main goal of the BEACON project [76] is to develop techniques to federate cloud network resources, and to enable an efficient and secure deployment of federated cloud applications. Specifically, the proposed approach is to build a homogeneous virtualisation layer on top of heterogeneous underlying physical networks, computing and storage infrastructures. By leveraging the combination of Cloud federation, Software Defined Networking (SDN), and Network Function Virtualization (NFV) technologies, the project has delivered an innovative design of a Federation Management system acting as an external service provider dealing with federated networking services among multiple federated OpenStack Clouds [77].
ATMOSPHERE [78] aims to design and implement a framework and platform relying on lightweight virtualisation, hybrid resources and Europe and Brazil federated infrastructures to develop, build, deploy, measure and evolve trustworthy, cloud-enabled applications. Orchestration and deployment of complex application topologies is achieved through the TOSCA standard. In the context of the project, partners developed a federated network architecture [79] by creating multi-tenant overlay networks across different sites. The developed framework offers services such as distribution and inter-site migration of VMs, resource management, and network management.
Claudia [80] is a service management system implementing an abstraction layer that allows for the automatic service deployment and scaling depending on both infrastructure and service status. Conversely to mOSAIC and Cloud4SOA, each service in Claudia is defined by its corresponding Service Description File (SDF) whose syntax is based on the OVF standard, thereby providing vendor and platform portability. However, special OVF extensions must be defined in order to support automatic scalability, deployment-time customisation and external connectivity specification.
OPTIMIS [81] is a toolkit which addresses and optimises the whole service lifecycle on the basis of aspects such as trust, risk, eco-efficiency and cost, taking into consideration a number of cloud scenarios, namely, cloud federation, multi-cloud, and hybrid cloud. However, regarding multi-cloud, interoperability with non-OPTIMIS providers can only be achieved by using APIs and adapters externally to the OPTIMIS components. According to the OPTIMIS programming model, each service is defined as a collection of core elements being packed along with any external software components into VM images. Similar to Claudia, these VM images are configured by means of a service manifest based on the OVF standard, but a set of OVF extensions are required in order to specify the functional and non-functional requirements of the service.

ASCETiC [82] is an open architecture and approach to multi-cloud optimising energy efficiency, designed within the homonymous EU project [83]. Analogous to OPTIMIS, the OVF specification is employed to define a complete set of VMs to be deployed at an IaaS provider. Nevertheless, OVF extensions are necessary in order to support SLA negotiation and self-adaptation rules.
The HARNESS project [84] develops a cloud computing platform incorporating non-traditional and heterogeneous computational, networking and storage resources into the data centre stack to provide high performance at low cost. HARNESS envisions an enhanced cloud PaaS software stack that not only supports existing commodity technologies, but also incorporates heterogeneous technologies such as Dataflow Engines (DFEs), programmable routers and different types of storage devices [85]. The project demonstrated its results via extensions to OpenStack.
soCloud [86] is a service-oriented component-based PaaS for managing portability, elasticity, provisioning, and high availability across multiple clouds. Application descriptors are based on the OASIS Service Component Architecture (SCA) standard [87]. However, since the SCA model does not allow the definition of non-functional requirements, special SCA extensions are required. A custom DSL is also used in order to describe elasticity. Additionally, soCloud supports only SCA-based applications, and both maintaining the mappings to various cloud providers and keeping up with recent features of the supported clouds are a concern. CELAR [88, 89] is a resource management platform able to automatically deploy, monitor and scale applications over a cloud infrastructure. Applications are described using TOSCA, which ensures the portability of application descriptions across different IaaS platforms. However, every time a new application is to be deployed, users need to issue the request to the appropriate CELAR Server instance inside the cloud they want to deploy their application to. In contrast to mOSAIC, Cloud4SOA and ASCETiC, no brokering mechanism is defined to match cloud resource requirements. Furthermore, cross-cloud deployment is not supported.
CloudLightning [90] is a heterogeneous cloud service management and delivery model developed within the homonymous EU project [91]. Based on the principles of self-organisation and self-management, CloudLightning allows users to design and deploy their applications without the need for selecting the most suitable resources. This separation of concerns is made possible using a CloudLightning-specific service description language (CL-SDL), which extends TOSCA in order to capture specific attributes. The declarative approach is enriched with resource discovery mechanisms allowing easier identification and consumption of a variety of heterogeneous resources. CloudLightning proposes a solution based on a Gateway Service, which relies on two open-source tools: Alien4Cloud, acting as the Gateway Service UI, and Brooklyn-TOSCA, acting as the deployment orchestrator. In view of the above, the same remarks made in the "Brooklyn" section are applicable to CloudLightning.

Critical discussion
Tables 2, 3, 4 and 5 summarise the CROFs presented in the "Review of CROFs" section by outlining the features discussed in the "Analysis framework" section. We discuss the main characteristics of these frameworks next.
Most of the reviewed CROFs provide different access modes, including web-based dashboards and APIs, and allow both multi-cloud and cross-cloud deployments, except for Heat and CloudFormation which, as infrastructure-centric services, only support their own IaaS providers (i.e., OpenStack and Amazon, respectively). Besides, some of them natively support deployments on BYON (e.g., Cloudify, Brooklyn, Roboconf, SeaClouds). Interoperability between cloud providers is mainly achieved by means of open standards and open libraries/abstraction layers (e.g., jclouds). Open standards appear to be gaining ground, especially in academic scenarios. As such, a number of academic CROFs provide interoperability via OCCI (e.g., Roboconf) or CDMI (e.g., INDIGO-DataCloud) support, while others do so via TOSCA (MiCADO, SeaClouds). Despite being the focus of previous research efforts (e.g., mOSAIC, Cloud4SOA), semantic approaches seem to be no longer a priority compared to the adoption of open standards. Of all the initiatives, MODAClouds is the only one to employ model-driven methodologies.
With regard to application portability, CROFs from both industry and academia are placing ever-increasing importance on modelling standards. However, while taking TOSCA (e.g., Cloudify, MiCADO) and CAMP (e.g., Brooklyn, SeaClouds) as reference models, they happen to customise and extend standard types. Thus, a further effort would be appropriate in order to ensure greater compliance with the aforementioned specifications. Aside from the adoption of standard models, model reusability is encouraged by means of modules shared either locally or remotely. Besides, since containers provide improved application encapsulation and abstraction from resources, most of the CROFs support containers as well as container orchestration.
As regards resource provisioning, there are different aspects of the matter that need to be considered, such as the selection, configuration and deployment of resources. In multi-cloud scenarios, selection is far from being a trivial task due to the diversity of cloud services' characteristics and QoS. While manual selection is supported by the majority of CROFs, automatic and optimised selection is almost exclusively supported by academic CROFs. Optimised selection leverages QoS and technical requirements, and is carried out either based on static information on the service quality provided by cloud providers or through dynamic negotiation of SLAs. A few multi-cloud projects (e.g., INDIGO-DataCloud, MODAClouds, SeaClouds) provide support for SLA management, even though multi-cloud SLAs are not covered. Limited support is currently available for dynamic selection (i.e., SeaClouds).
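The optimised selection discussed above can be sketched as a weighted scoring of candidate offers against QoS and technical requirements. The following Python fragment is purely illustrative: the attribute names, weights and provider names are assumptions, not drawn from any of the reviewed frameworks.

```python
# Hypothetical sketch of optimised cloud resource selection: each offer
# exposes normalised attributes in [0, 1] (higher is better, with cost
# pre-inverted), and the broker picks the offer with the best weighted score.

def select_offer(offers, weights):
    """Return the offer maximising the weighted sum of its attributes."""
    def score(offer):
        return sum(weight * offer[attr] for attr, weight in weights.items())
    return max(offers, key=score)

offers = [
    {"name": "provider-a", "cost": 0.6, "availability": 0.99, "latency": 0.7},
    {"name": "provider-b", "cost": 0.9, "availability": 0.95, "latency": 0.5},
]
weights = {"cost": 0.5, "availability": 0.3, "latency": 0.2}

best = select_offer(offers, weights)  # provider-b wins on the cost-heavy weighting
```

In practice the attribute values would come either from static provider catalogues or from dynamically negotiated SLAs, which is precisely where the static/dynamic distinction made above applies.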
Resource deployment can be manual or automatic. While most commercial CROFs support both manual and automatic workflows, academic CROFs exclusively support automatic ones. Using standard models such as TOSCA, where applicable, proves useful both for defining a custom workflow and for automatically generating one. However, since current standards lack support for modelling the semantics related to the instantiation of relationships between component instances, the actual wiring of component instances depends on the capabilities offered by the CROF enacting the deployment. On that note, standard extensions in support of sophisticated wiring at the instance level would be desirable. As for resource configuration, on the one hand scripts are extensively supported, but on the other hand configuration management tools are mostly supported by commercial CROFs. Nonetheless, a few academic projects (e.g., INDIGO-DataCloud and MODAClouds) exploit these tools in order to enact DevOps practices as well.
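Automatic workflow generation from a topology model essentially amounts to ordering components by their relationships. The sketch below shows this idea on a TOSCA-like "depends on" graph using a topological sort; the three-tier component names are hypothetical and the real relationship semantics (and the instance-level wiring noted above) are far richer.

```python
# Illustrative sketch: deriving a deployment order from a TOSCA-like topology
# by topologically sorting components on their "depends on" relationships.
from graphlib import TopologicalSorter  # Python 3.9+

# Each component maps to the set of components it depends on.
topology = {
    "web": {"app"},   # web tier depends on the application tier
    "app": {"db"},    # application tier depends on the database
    "db": set(),      # database has no dependencies
}

# static_order() yields dependencies before their dependants.
deployment_order = list(TopologicalSorter(topology).static_order())
```

A real CROF would additionally resolve, for each relationship, *which* concrete instance of the target component a source instance is wired to, which is exactly the gap in current standards discussed above.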
Monitoring plays a key role in keeping track of the status of applications as well as physical and virtual resources. Monitoring metrics at different abstraction levels (e.g., infrastructure and application ones) and capturing dependencies between these levels make it possible to perform root cause analysis, such that any issue at the infrastructure level can automatically lead to run-time infrastructure adaptation that best fits run-time application requirements. While infrastructure metrics are widely supported by both commercial and academic CROFs, application and custom metrics necessitate further investigation. Metric aggregation mechanisms are available in a large majority of CROFs. Nevertheless, in light of multi-cloud scenarios, where applications and resources may be largely distributed, metric collection and aggregation from heterogeneous cloud environments are necessary. As a result, standardised interfaces and formats should be investigated.
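The cross-cloud aggregation problem can be illustrated in a few lines: samples from heterogeneous providers are first normalised to a common record shape and then aggregated per metric. The record fields and provider names below are assumptions for illustration only; real CROFs would use their own (currently non-standardised) formats, which is the interoperability gap noted above.

```python
# Minimal sketch of cross-cloud metric aggregation: normalised samples from
# different providers are grouped by metric name and averaged.
from collections import defaultdict
from statistics import mean

samples = [
    {"provider": "cloud-a", "metric": "cpu_util", "value": 0.62},
    {"provider": "cloud-b", "metric": "cpu_util", "value": 0.48},
    {"provider": "cloud-a", "metric": "mem_util", "value": 0.71},
]

def aggregate(samples):
    """Average each metric across all providers reporting it."""
    buckets = defaultdict(list)
    for sample in samples:
        buckets[sample["metric"]].append(sample["value"])
    return {metric: mean(values) for metric, values in buckets.items()}

aggregated = aggregate(samples)
```

A standardised interface would fix the shape of these records at the provider boundary, so that the aggregation layer need not know provider-specific formats.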
Monitoring data serves different purposes, such as enforcing SLAs, enabling elasticity and ensuring QoS. SLAs can be used as a basis for cloud services and the respective applications to be managed during their lifecycle. Multi-cloud management requires specific mechanisms for run-time adaptation across a diversity of cloud set-ups, including scalability, migration, fault-tolerance and continuous delivery. While reactive approaches to run-time adaptation are fairly consolidated among all CROFs, predictive approaches (based on workload prediction models and machine learning optimisation) are only supported in some commercial CROFs (e.g., AWS CloudFormation) and deserve further exploration.
Both academic and commercial CROFs largely provide support for threshold-based horizontal scaling. Policy-based approaches, especially in the academic landscape, are gaining in importance as well. Migration support is still limited in both industry and academia, as it is closely linked to portability in all its facets, i.e., VM portability, application portability and data portability. Although platform-independent standards (TOSCA) and virtualisation techniques (containers) have improved application encapsulation and abstraction from resources, platform-independent data representation and the standardisation of data import and export across diverse and heterogeneous clouds need to be investigated. In this regard, MODAClouds provides a solution to the data migration issue, albeit in the context of scalable NoSQL databases.
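The threshold-based horizontal scaling that most CROFs support reduces to a simple reactive rule: adjust the replica count when a monitored metric crosses configurable bounds. The following sketch uses illustrative thresholds and limits; actual values and metrics are framework- and policy-specific.

```python
# Sketch of threshold-based horizontal scaling: scale out when utilisation
# exceeds the upper threshold, scale in when it drops below the lower one,
# and respect the configured replica bounds. All parameters are illustrative.

def scale_decision(cpu_util, replicas, upper=0.8, lower=0.3,
                   min_replicas=1, max_replicas=10):
    """Return the new replica count for the observed CPU utilisation."""
    if cpu_util > upper and replicas < max_replicas:
        return replicas + 1   # scale out
    if cpu_util < lower and replicas > min_replicas:
        return replicas - 1   # scale in
    return replicas           # within thresholds: no action
```

Policy-based approaches generalise this rule into user-defined conditions and actions, while predictive approaches replace the observed metric with a forecast of future load.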
Both academic and commercial CROFs support failure recovery mechanisms based on restarting/replacing failed components or, in a worst-case scenario, rolling back entire application stacks. Of all academic CROFs, Cloudiator, MODAClouds and SeaClouds can identify abnormal and undesirable states of the system and apply a limited set of autonomic actions. However, the emergence of decentralised multi-cloud setups connecting a wider variety of entities and resources requires autonomic management systems that consider self-organisation, self-management and self-healing across a diversity of cloud deployments. Continuous delivery is well supported in the commercial landscape, and it is also gaining ground in the academic one because of the ever-growing use of DevOps methodologies.
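The restart-or-rollback policy described above can be sketched as a retry loop with an escalation path. The component interface and retry budget below are hypothetical, intended only to make the recovery logic concrete.

```python
# Hedged sketch of restart-based failure recovery: a failed component is
# restarted up to a retry budget; if it never becomes healthy, recovery
# escalates to rolling back the entire application stack (worst case).

def recover(component, max_restarts=3):
    """Restart a failed component; signal a stack rollback if it keeps failing."""
    for _ in range(max_restarts):
        component.restart()
        if component.healthy():
            return "recovered"
    return "rollback"   # escalate: roll back the whole application stack

class FlakyComponent:
    """Toy component that becomes healthy after a given number of restarts."""
    def __init__(self, failures_left):
        self.failures_left = failures_left
    def restart(self):
        self.failures_left -= 1
    def healthy(self):
        return self.failures_left <= 0
```

Self-healing systems of the kind called for above would go further, detecting the abnormal state autonomously and choosing among a richer set of corrective actions than restart-or-rollback.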

Conclusion
Cloud computing technology has greatly evolved over the past few years, transforming the traditional infrastructure, platform and software resources into elastic and on-demand virtual components. However, heterogeneous and multi-layer resources have to be orchestrated in an effective way in order to ensure that end-users are provided with acceptable quality levels.
In this work we thoroughly analysed the cloud orchestration landscape: after presenting a taxonomy of relevant features and dimensions, we mapped and evaluated several cloud resource orchestration frameworks against it, focusing especially on multi-cloud capabilities. This systematic analysis has allowed us to identify key open research issues and to propose a set of future research directions in the cloud orchestration scenario.