Degrees of tenant isolation for cloud-hosted software services: a cross-case analysis

Ochei, Laud Charles; Bass, Julian M.; Petrovski, Andrei

doi:10.1186/s13677-018-0121-8

Research
Open access
Published: 17 December 2018

Degrees of tenant isolation for cloud-hosted software services: a cross-case analysis

Journal of Cloud Computing volume 7, Article number: 22 (2018) Cite this article

8663 Accesses
7 Citations
1 Altmetric
Metrics details

Abstract

A challenge, when implementing multi-tenancy in a cloud-hosted software service, is how to ensure that the performance and resource consumption of one tenant does not adversely affect other tenants. Software designers and architects must achieve an optimal degree of tenant isolation for their chosen application requirements. The objective of this research is to reveal the trade-offs, commonalities, and differences to be considered when implementing the required degree of tenant isolation. This research uses a cross-case analysis of selected open source cloud-hosted software engineering tools to empirically evaluate varying degrees of isolation between tenants. Our research reveals five commonalities across the case studies: disk space reduction, use of locking, low cloud resource consumption, customization and use of plug-in architecture, and choice of multi-tenancy pattern. Two of these common factors compromise tenant isolation. The degree of isolation is reduced when there is no strategy to reduce disk space and customization and plug-in architecture is not adopted. In contrast, the degree of isolation improves when careful consideration is given to how to handle a high workload, locking of data and processes is used to prevent clashes between multiple tenants and selection of appropriate multi-tenancy pattern. The research also revealed five case study differences: size of generated data, cloud resource consumption, sensitivity to workload changes, the effect of the software process, client latency and bandwidth, and type of software process. The degree of isolation is impaired, in our results, by the large size of generated data, high resource consumption by certain software processes, high or fluctuating workload, low client latency, and bandwidth when transferring multiple files between repositories. Additionally, this research provides a novel explanatory framework for (i) mapping tenant isolation to different software development processes, cloud resources and layers of the cloud stack; and (ii) explaining the different trade-offs to consider affecting tenant isolation (i.e. resource sharing, the number of users/requests, customizability, the size of generated data, the scope of control of the cloud application stack and business constraints) when implementing multi-tenant cloud-hosted software services. This research suggests that software architects have to pay attention to the trade-offs, commonalities, and differences we identify to achieve their degree of tenant isolation requirements.

Background

Cloud services are applications delivered over the Internet [1] that offer users on-demand access to shared resources such as infrastructures, hardware platforms, and software application functionality [2].

Cloud services provide a dedicated instance of applications to multiple users using multitenancy. Multitenancy allows a single instance of a service to serve multiple users while retaining dedicated configuration information, application data and user management for each tenant [3].

A challenge when implementing multi-tenant services is to ensure that the performance demands and resource consumption of one tenant does not adversely affect another tenant; this is known as tenant isolation. Specifically, software architects need to be able to control the required degree of isolation between tenants sharing components of a cloud-hosted application [4].

Varying degrees of tenant isolation is possible, depending on the type of component being shared, the process supported by the component and the location of the component on the cloud application stack (i.e., application level, platform level, or infrastructure level) [3]. For example, the degree of isolation required for a component that cannot be shared due to strict laws and regulations would be much higher than that of a component that has to be reconfigured for some tenants with specific requirements.

We previously conducted separate case studies to empirically evaluate the degree of tenant isolation in three cloud-hosted software development process tools: continuous integration (with Hudson), version control (with File SCM Plugin and Subversion), and bug/issue tracking (with Bugzilla) [5–7]. The case studies allowed us to investigate instances (i.e., evaluating the degree of tenant isolation) of a contemporary software engineering phenomenon within its real-life context using real-world Global Software Development (GSD) tools deployed in a cloud environment [8]. To build an integrated body of knowledge from these individual case studies we decided to perform a case study synthesis. The case study synthesis allows us to extend the overall evidence base beyond the existing case studies. We can thus, make a new whole out of the parts, to provide new insights regarding degrees of tenant isolation.

As a consequence of these goals, we consider three research questions:

1.
What are the commonalities and differences in the degrees of tenant isolation for our three case studies?
2.
What are the deployment trade-offs for achieving the required degree of tenant isolation for our three case studies?
3.
What are the key challenges and recommendations for implementing tenant isolation for our three case studies?

We conducted the case study synthesis by employing a cross-case analysis technique to identify commonalities and differences in tenant isolation characteristics across the previous case studies. The cross-case analysis allows us to triangulate evidence from each case study to build a more substantial body of knowledge on tenant isolation [8, 9]. The cross-case analysis study was carried out in three phases: data reduction, data display, and conclusion drawing and verification. These phases were conducted in an iterative manner during the analysis to reach the conclusion [9].

The main contribution of this article is to provide an explanatory framework and new insights for explaining the commonalities and differences in the design, implementation and deployment of cloud-hosted services, and the trade-offs to consider when implementing tenant isolation [10]. The contributions of this article are summarised as follows:

1. Providing patterns of commonalities and differences across the existing cases studies:

(i) the study revealed five case study commonalities: disk space reduction, use of locking, low cloud resource consumption, customization and use of plugin architecture, and choice of multi-tenancy pattern. Two of these factors have a negative impact on tenant isolation. The degree of isolation is reduced when there no strategy to reduce disk space and customization and plugin architecture is not adopted. In contrast, the degree of isolation improves when careful consideration is given to handling a high workload, locking of data and processes is used to prevent clashes between multiple tenants, data transfer between repositories and selection of appropriate multi-tenancy pattern. (see “Results” section).

(ii) our research reveals five areas of case study differences: size of generated data, cloud resource consumption, sensitivity to workload changes, the effect of the software process, client latency and bandwidth, and type of software process). The large size of generated data, high resource consumption processes, high or fluctuating workload, low client latency, and bandwidth when transferring multiple files between repositories reduces the degree of isolation. The type of software process is challenging because it depends on the cloud resource being optimised. (See “Results” section)

2. Providing a novel explanatory framework for:

(i) mapping the degrees of tenant isolation to different software processes, cloud resources and layers of the cloud application stack (see “Results” and “Analysis” sections).

(ii) explaining the different trade-offs which includes tenant isolation versus (resource sharing, the number of users/requests, customizability, the size of generated data, the scope of control of the cloud application stack and business constraints) to be considered for optimal deployment of components with a guarantee of the required degree of tenant isolation (see “Results” and “Analysis” sections).

3. Presenting challenges and recommendation for implementing the required degree of tenant isolation. The challenges identified in this study are related to the type and location of components to be shared, customizability of the software tool, optimization of resources to cope with changing workload, hybrid cloud deployment conditions, tagging components with the required degree of isolation, error messages and security challenges during implementation. This study suggests among other things the following recommendations: (i) allowing the software architect to have more control over layers of the cloud infrastructure for configuration and provisioning of resources; (ii) splitting complex software processes into phases and isolation implemented for each phase in turn (ii) categorizing a cloud-hosted service into different aspects that can or cannot be customised (see “Discussion” section).

The rest of the paper is organized as follows - “Related work on degrees of tenant isolation for cloud-hosted services” section gives an overview of related work on degrees of tenant isolation for cloud-hosted services. “Methods” section discusses the cross-case analysis methodology used in this paper and how it fits into the overall case study research process. “Summary of the case studies” section provides a summary of the previous case studies. In “Results” section, presents the results of the cross-case analysis. “Analysis” section is a further analysis of the data produced from the cross-case analysis. “Discussion” section is the discussion of the results of the study. The limitations and validity of the study are presented in “Threats to validity” section. “Conclusions” section concludes the paper with the direction of future work.

Related work on degrees of tenant isolation for cloud-hosted services

Research work on multitenancy has acknowledged that there are varying degrees of isolation tenants accessing a multitenant cloud-hosted service [11–14]. Chong and Carraro discuss three approaches to managing multi-tenant data, and it is stated that the distinction between the shared data and isolated data is more of a continuum, where many variations are possible between the two extremes [11]. Three multitenancy patterns have been identified which express the degree of isolation between tenants accessing a shared component of an application [3]. These patterns are referred to as shared component, tenant-isolated component and dedicated component. The shared component represents the lowest degree of isolation between tenants while the dedicated component represents the highest. The degree of isolation between tenants accessing a tenant-isolated component would be in the middle.

Wang et al. explored key implementation patterns of data tier multi-tenancy based on different aspects of isolation such as security, customization and scalability [12]. For example, under the resource tier design pattern, the authors identified the following patterns: (i) totally isolated (dedicate database pattern); (ii) partially shared (dedicate table pattern); and (iii) totally shared (share table pattern). These patterns are similar to the shared component, tenant-isolated component and dedicated component patterns at the data tier, respectively [3]. Vengurlekar describes three forms of database consolidation which offer differing degrees of inter-tenant isolation as follows: (i) multiple application schemas consolidated in a single database, multiple databases hosted on a single platform; and (iii) a combination of both [13].

Mietzner et al. described how the services (or components) in a service-oriented SaaS application can be deployed using different multi-tenancy patterns and how the chosen patterns influence the customizability, multi-tenant awareness and scalability of the application [15]. These patterns are referred to as a single instance, single configurable instance and multiple instances. Although this work describes how individual services of a SaaS application can be deployed with different degrees of customizability, we believe that these concepts are similar to different degrees of isolation between tenants.

The three main aspects of tenant isolation are performance, stored data volume and access privileges. For example, in performance isolation, other tenants should not be affected by the workload created by one of the tenants. Guo et al. evaluated different isolation capabilities related to authentication, information protection, faults, administration, etc [16]. Bauer and Adams discuss how to use virtualization to ensure that the failure of one tenant’s instance does not cascade to other tenant instances [4].

Walraven et al implemented a run-time enforcement of performance isolation to comply with tenant-specific SLAs over distributed environments for multitenant SaaS using a real-world workflow-based SaaS offering (i.e., an online B2B document processing) executing on a JBoss AS-based private cloud platform [14]. Walraven et al. has also implemented a middleware framework for enforcing performance isolation using a multitenant implementation of a hotel booking application deployed on top of a cluster for illustration [17]. Krebs et al. implemented a multitenancy performance benchmark for a web application based on the TCP-W benchmark where the authors evaluated the maximum throughput and the number of tenants that can be served by a platform [18].

Youngs et al. discussed the decomposition of application functionality into application components and later the summarization of these components to tiers to achieve isolation between tenants [19]. Varia’s research work generally motivates why applications should be split into separate components when using the Amazon Web Services on the cloud to guarantee tenant isolation [20,21]. Varia has also discussed different migration scenarios for existing applications to Amazon Web Services (AWS) or other cloud storage [22].

At the very basic degree of multitenancy, tenants share application components as much as possible which translates to increased utilisation of underlying resources. However, while some application components may benefit from a low degree of isolation between tenants, other components may need a higher degree of isolation because the component may either be too critical or needs to be configured very specifically for individual tenants because of their unique deployment requirements. Again, tenant-specific requirements, such as laws and corporate regulations, may even further increase the degree of isolation required between tenants.

A component-based approach to tenant isolation through request re-routing, (COMITRE), was developed in our previous work. This approach was then applied to three case studies that empirically evaluated the varying degrees of isolation between tenants enabled by multitenancy patterns for three different Global Software Development (GSD) tools and associated processes: continuous integration with Hudson, version control with File System SCM plugin, and bug tracking with Bugzilla [5–7]. The GSD tools were deployed to the cloud using different cloud multitenancy patterns which represent varying degrees of isolation between tenants.

The aim of this paper is, therefore, to extend the overall evidence beyond the existing case studies, by synthesizing the findings of the three primary case studies that have been conducted to evaluate the degree of isolation between tenants accessing different cloud-hosted GSD tools deployed using different multitenancy patterns. The findings will provide the commonalities and differences across the existing case studies, and an explanatory framework for mapping degrees of tenant isolation to different software processes and explaining the trade-offs to be considered for optimal deployment of components with a guarantee of the required degree of tenant isolation.

Methods

In this section, we will first present the cross-case analysis method used in this paper and thereafter discuss how the cross-case analysis phase fits into the overall research procedure from the exploratory phase to the development of the explanatory framework.

Cross-case analysis

This study uses cross-case analysis to synthesize the findings of the three primary case studies. In a cross-case analysis, evidence from each primary study is summarised and coded under broad thematic headings, and then summarised within themes across studies with a brief citation of primary evidence. A cross-case analysis was selected because it involves a highly systematic process and allows us to include diverse types of evidence [9,23].

As shown in Fig. 1, the methodology can be divided into three phases: gathering findings from primary case studies, cross-case analysis and framework development. In the first phase the findings from the three primary case studies are gathered, summarised and pushed into the second phase which is the cross-analysis. In the cross-case analysis phase, we mobilise knowledge from individual case studies, compare and contrast the cases, and in doing so, produce a new knowledge. This knowledge is further refined to form an explanatory framework explaining the commonalities and differences in the design, implementation and deployment of cloud-hosted applications and the trade-offs to consider when implementing tenant isolation [10].

This paper adopts Miles and Huberman’s approach for conducting the cross-case analysis. The approach consists of three mains steps: data reduction, data display, and conclusion drawing and verification [9].

Data reduction

This mainly involves the identification of items of evidence in the primary case studies such as the paired sample test, plots of the estimated marginal means of change, discussion of findings and recommendations from previous case studies [9].

Data display

This step involves organising and assembling information that allows the drawing of conclusions using tools such as meta-matrices/tables and cause and relationships graphs. The data display steps will be tackled from two approaches to cross-case comparisons [24]:

(i) Variable oriented approach: This approach focuses on the variables to explain why the cases vary. These variables are related to performance and resource consumption which are known to affect the varying degrees of isolation between tenants.

(ii) Case-oriented approach: This approach focuses on the case itself instead of the variables to explain in what ways the cases are alike. By knowing the aspects in which the cases are alike it is then possible to generalise our findings, for example, to identify factors that appear to lead to a high (or low) degree of tenant isolation with a corresponding effect on resource consumption.

Conclusion drawing

This step involves further refining the above steps to produce conclusions concerning a particular aspect of interest. The outcomes of this step are a summary of: (i) key conclusions from the statistical analysis, and (ii) the recommended patterns for achieving the required degree of tenant isolation.

In summary, the focus of this paper is on a qualitative cross-case analysis of three quantitative case studies. We employed this approach because the context for each of the cases is rather different (for example, the requirements of bug tracking applications are not the same as those of version control applications) and because we are not trying to synthesise results here, but rather analyse the three cases and draw out commonalities and differences

Overall case study research process

The overall research procedure used in this study from the initial selection of the Global Software Development tools and processes up to the development of an explanatory framework (after carrying out cross-analysis of the case studies) is shown in Fig. 2. The overall research process can be regarded as a multimethod method research approach which entails combining different research methods: exploratory study, case study and cross-case analysis. The different research methods were chosen based on the qualities that each research method can contribute to answering our research question. For example, the exploratory study allowed us to carry out an empirical study to find out the type of software process and the supporting tools used in Global Software development projects and also explore the different cloud deployment patterns for deploying services to the cloud. The case study provided us with an in-depth understanding of the software processes and the effect of varying degrees of tenant isolation on the performance and resource consumption of tenants. The cross-case analysis allowed us to accumulate knowledge, compare and contrast cases, and in doing so, produce new knowledge such as an explanatory framework for explaining the commonalities and differences in the case studies and the trade-offs to consider when implementing tenant isolation [10].

As shown in Fig. 2, the research process started with an exploratory study which entails reviewing related work on multitenancy and tenant isolation, carrying out empirical studies on widely used software tools, formulating research questions, and developing the scope of study. The next step involved selecting the software tools and supporting processes based on findings of the previous step, and then designing the experimental procedure, data collection and analysis of results of from the primary case studies. The third step, which is the focus of this study, involves carrying out the synthesis of findings of the primary case studies in order to provide an explanatory framework and new insights on tenant isolation.

In the next section, we will provide a summary of the case studies. The summary is important for two main reasons: (i) to summarise the primary case studies and show how the cross-case analysis fits into the overall research process; (ii) to summarise the findings of the case studies.

Summary of the case studies

In this section, we summarise the overall research process with particular reference to how the cases studies were conducted.

Selection of GSD tools and processes

This study used the most-similar technique (i.e., purposive, non-random selection procedure) to select the cases, since random sampling is inappropriate as a selection method [25]. The three GSD tools we selected were inspired by the dataset of GSD tools and processes derived from a previous exploratory study on how to create and use a taxonomy for selecting applicable deployment patterns for cloud deployment of GSD tools. In that study, we derived a set of five GSD tools - JIRA, VersionOne, Hudson, Subversion and Bugzilla [26]. Each of these tools supports different GSD processes such as continuous integration, issue tracking, bug tracking, version control, source code management and agile project management. From this dataset, we then selected three GSD processes that are widely used in Global Software Development: continuous integration with Hudson, version control with FileSystem SCM Plugin, and bug tracking. These GSD processes were chosen for three main reasons: (i) these processes are widely used in GSD; (ii)there are open-source tools and/or plugins that are specifically developed to support them; and (iii) they are flexible to be customize and extended.

We have to point out here that the emphasis of this study is not on the GSD tools or plugins themselves but on the GSD process they support. There are different tools and plugins that can be used to trigger these processes. Our intention was to choose a tool or plugin that is specifically developed to trigger the GSD processes we have selected. For example, Hudson can be used to trigger several GSD processes but it was mainly developed to support continuous integration. Therefore, we wanted to know how these processes and the data they generate impact on tenant isolation with respect to performance and resource utilization.

Conducting case studies

Case studies are well suited for software engineering research since they study contemporary phenomena in its natural context (i.e., real-world open-source GSD tools in our case) [27]. This study is based on three primary case studies (published separately [5–7]) which evaluated empirically the degree of isolation between tenants enabled by multitenancy patterns for Cloud-hosted GSD tools and processes under different cloud deployment conditions. Software tools used for Global Software Development (GSD) are increasingly being deployed on the cloud, and therefore it is essential to properly isolate the code files and processes of tenants so that the required performance, resource utilization, and access privileges of one tenant does not affect other tenants.

Design of the case study

The specific design of the case study is multiple-case design with multiple embedded units of analysis. This case study design represents a form of mixed method research which relies on more holistic data collection strategy for studying the main case but then calls upon more quantitative techniques (in this case, experimentation) to collect data about the embedded unit(s) of analysis [25,27]. The experiments within the case study will enable us to collect data for evaluating the degree by which the performances of the different deployment architecture/patterns would differ under realistic cloud deployment conditions of GSD tools.

Figure 3 shows the component of the design for the first case study. Case study two and three can be captured using the same diagram by simply replacing “The Case” with multi-tenancy deployment patterns for version control system and bug tracking systems respectively. The main context of the study is deployment patterns for cloud-hosted services. Each case study focuses on multitenancy patterns that can be used to deploy services to the cloud. For each case study, there are three units of analysis and each unit of analysis represent each of the three multitenancy patterns (i.e., shared pattern, tenant-isolated pattern, and dedicated pattern). Each multitenancy pattern captures the varying degrees of isolation between tenants.

The results of the study were analysed using relevant statistically techniques. We adopted the Repeated Measures Design and Two-way Repeated Measures (within-between) ANOVA for the experimental design and statistical analysis respectively [28,29].

COMITRE: a framework for implementing tenant isolation in the case studies

The implementation of the tenant isolation is done within the framework of COMITRE and its supporting algorithms. We refer the reader to [5,6,30] for the architecture, implementation procedure and supporting algorithms for COMITRE that are integrated into the GSD tools to support the implementation of the varying degrees of tenant isolation.

In a nutshell, the Component-based approach to Multitenancy Isolation through Request Re-routing (COMITRE) is an approach for implementing the varying degrees of multitenancy isolation for cloud-hosted services/applications. It captures the essential properties required for the successful implementation of multitenancy isolation while leaving large degrees of freedom to cloud deployment architects depending on the required degree of isolation between tenants. Figure 4 captures the architecture of COMITRE.

The actual implementation of the COMITRE is anchored on shifting the task of routing a request from the server to a separate component at the application level of the cloud-hosted GSD tool. For example, this component could be a program component (e.g., Java class file) or a software component (e.g., plugin) which can be integrated into the GSD tool. Once the request is re-routed to a component and captured, then important attributes of the request can be extracted and configured to reflect the degree of isolation required by the tenant.

Implementation of tenant isolation

In this section, we present a short description of how the three case studies were implemented. The three primary case studies have been published separately, and we refer the reader to [5–7] for details. The case studies are summarised below:

1.
Case Study One - Continuous Integration with Hudson: Hudson is a continuous integration server, written in Java for deployment in a cross-platform environment [31]. Large organisations such as Apple and Oracle use Hudson for setting up production deployments and automating the management of cloud-based infrastructure [32]. Our main interest is in capturing the isolation of a tenant’s data and process during automated build verification and testing, an essential development practice when using a continuous integration system. Tenant isolation was implemented by modifying Hudson using the Hudson’s Files-Found-Trigger plugin, which polls one or more directories and starts a build if certain files are found within those directories [33]. This involved introducing a Java class into the plugin that accepts a filename as an argument. During execution, the plugin is loaded into a separate class loader to avoid conflict with Hudson’s core functionality. As the build process is being carried out, data is logged into a database every time a change is detected in the file.
2.
Case Study Two - Version Control with File System SCM plugin: Filesystem SCM plugin can be used to simulate the file system as a source control management (SCM) system by detecting changes such as the file system’s last modified date [33]. Our interest was in simulating the process on a local development machine by pointing a build configuration to the locally checked out code and modified files on a shared repository residing on a private cloud. The File System SCM plugin was integrated into Hudson to represent a scenario where a code file is checked into a shared repository for Hudson to build. Tenant isolation was then implemented by modifying this plugin within Hudson. This involved introducing a Java class into the plugin that accepts a file path and the type of file(s) that should be included when checking out from the repository into Hudson workspace. During execution, the plugin is loaded into a separate class loader to avoid conflict with Hudson’s core functionality.
3.
Case Study Three - Bug Tracking with Bugzilla: Bugzilla was modified using the recommended Bugzilla Extension mechanism. Extensions can be used to modify either the source code or user interface of Bugzilla, which can then be distributed to other users and re-used in later versions of Bugzilla. Bugzilla maintains a list of hooks which represent areas in Bugzilla that an extension can hook into, thereby allowing the extension to perform any required action during that point in Bugzilla’s extension [34]. For our experiments, a special extension was written and then “hooked” into Bugzilla using the hook named install_before_final_checks. This hook allows the execution of custom code before the final checks are done in checksetup.pl, and so the COMITRE algorithm was implemented in this hook.

In this study, it is important to note that the use of plugins in the case study experiments is basically a mechanism for customizing, modifying and extending the GSD tools (e.g., Hudson and Bugzilla) to support the implementation of multitenancy architectures and is not a limitation to providing insights about the trade-offs across the different multitenancy patterns. Most open-source tools are easily extensible using plugins by relying on a series of extension points provided to extend its functionality. For example, the extension points provided within Hudson plugin was customised to implement the three multitenancy patterns (i.e., shared component, tenant-isolated component, dedicated component) and thus support multitenancy isolation. During execution, the plugin is loaded into a separate class loader which does not conflict with the core functionality of the GSD tool.

Evaluation of the case studies

The three case studies were evaluated using the same experimental design, setup, procedure and statistical analysis. The evaluation of the case studies is summarised in the sections that follow.

Experimental design

A set of four tenants (T1, T2, T3, and T4) are configured to access an application component deployed using three different types of multitenancy patterns (i.e., shared component, tenant-isolated component, and dedicated component). Each pattern is regarded as a group in this experiment. Treatment was created for configuring T1 so that it will experience a high workload. For each experimental run, one of the four tenants (i.e., T1) is configured to experience a demanding deployment condition (e.g., large instant loads) while accessing the application component. The performance metrics (e.g., response times) and systems resource consumption (e.g., CPU) of each tenant are measured before the treatment (pre-test) and after the treatment (post-test) was introduced. The hypothesis of the experiment is that the performance and system’s resource utilisation experienced by tenants accessing an application component deployed using each multitenancy pattern changes significantly from the pre-test to the post-test.

Based on this information, a two-way repeated measures (within-between) ANOVA was adopted as the experimental design. This experimental design is used when there are two independent variables (factors) influencing one dependent variable [29]. In our case, the first factor is multitenancy deployment pattern, and the second factor is time. Multitenancy pattern is the between factor because our interest is in looking at the differences between the groups using different multitenancy patterns for deployment. Time is the within factor because our interest is in measuring each group twice (pre-test and post-test). The data view of our experimental design is composed of a Group column that indicates which of the three groups the data belongs to, and 2 columns of actual data, one for the Pre-test and one for the Post-Test.

Experimental setup

The experimental setup consists of a private cloud setup using Ubuntu Enterprise Cloud (UEC). UEC is an open-source private cloud software that comes with Eucalyptus [35]. Figure 5 is the UEC configuration used for the experiment. It has six physical machines - one client node(i.e, the head node) and five server node (i.e., the sub-nodes). The experimental setup is based on the typical minimal Eucalyptus configuration where all user-facing and back-end controlling components (Cloud Controller(CLC), Walrus Storage Controller (WS3), Cluster Controller (CC), and Storage Controller (SC)) are grouped on the first machine, and the Node Controller (NC) components are installed on the second physical machine. The guidelines for installing UEC as outlined in [36] were followed to extend the configuration by installing the NC on all the other sub-nodes to achieve scalability. The head node was also used as the client machine since it does not have to be a dedicated machine. Installing UEC is like installing Ubuntu server; the only difference is the additional configuration screens for the UEC components. The hardware configuration of the head node and sub-nodes is summarised in Table 1.

Table 1 Hardware and Network Configuration of the UEC

Full size table

Setup of the UEC used for experiments

Experimental procedure

An abstract summary of the experimental procedure used for all the case studies is captured in Fig. 6. We refer the reader to [5,6] for specific details of how the GSD tools were modified for each case study, how the processes associated the GSD tools were simulated, and the experimental procedure, so these details will not be repeated here due to space limitations.

In a nutshell, the GSD tool used for each case study is modified to support tenant isolation. This involved developing a plugin and integrating it with the GSD tool so that it can be accessed by different tenants. The GSD tool is then bundled as a VM image and uploaded to a private cloud with a typical minimal UEC configuration.

To evaluate the degree of tenant isolation between tenants, four tenants (referred to as tenant 1, 2, 3, and 4) were configured based on access to the functionality/component of the GSD tool that is to be served to multiple tenants. Accesses to this functionality is associated with a tenant identifier that is attached to every request. Based on this identifier, a tenant-specific configuration is retrieved from the tenant configuration file and used to adjust the behaviour of the GSD tool’s functionality that is being accessed.

A remote client machine was used to access the GSD tool running on the instance via its public IP address. Apache JMeter is used as a load balancer as well as a load generator to generate workload (i.e., requests) to the instance and monitor responses [37]. To measure the effect of tenant isolation, tenant 1 was configured to simulate a large instant load by: (i) increasing the number of requests using the thread count and loop count; (ii) increasing the size of the requests by attaching a large file to it; (iii) increasing the speed at which the requests are sent by reducing the ramp-up period by one-tenth, so that all the requests are sent ten times faster; and (iv) creating a heavy load burst by adding the Synchronous Timer to the Samplers in order to add delays between requests, such that a certain number of requests are fired at the same time. This treatment type can be likened to an unpredictable (i.e., sudden increase) workload [3] and aggressive load [17].

Each tenant request is treated as a transaction composed of all types of request simulated. For example, the HTTP request triggers a build process while the JDBC request logs data into a database which represents an application component that is being shared by the different tenants. The transaction controller is used to group all the samplers in order to get the total metrics (e.g., response) for carrying out the two requests.

The setup values for the experiment are shown in Table 2. It is important to note that since different processes are being simulated using different GSD tools, the setup values (e.g., thread count, loop count, and loop controller count) will vary slightly. To have a balanced basis for comparison, the workload was carefully varied to cope with the private cloud used in such a way that: (i) the number of requests sent by tenant 1 (i.e., the tenant that experiences a very high workload or aggressive load) is two times more, five times heavier, and ten times faster than the other tenants; and (ii) all other tenants regardless of the type of request being simulated sends the same number of requests. Ten iterations for each run were performed and the values reported by JMeter used as a measure for response times, throughput and error%. For system activity, the average CPU, memory, disk I/O and system load usage at a one-second interval were recorded.

Table 2 Setup parameters used in the experiments

Full size table

The generated workload for the experiments were realistic and appropriate for each of the applications in the case studies. The number and size of requests sent to the application component during the case study experiments were within the limit of the private cloud used (i.e., Ubuntu Enterprise Cloud). This was achieved by carefully varying the setup values to get the maximum capacity of the software process triggered by the GSD tool (e.g., Hudson’s build processes) running on the private cloud before conducting the experiments.

Statistical analysis of the case studies

Based on the information about the experimental design, the two-way Repeated Measures (within-between) ANOVA was adopted for the statistical analysis because (i) it does not require many subjects since each subject would be measured twice. and (ii) it eliminates the difficulty of trying to match subjects perfectly between the different conditions in all respects [38].

The tool used for statistical analysis is SPSS v21. The two-way (within-between) ANOVA was performed first to determine if the groups had significantly different changes from Pre-test to the Post-test. After that, planned comparisons were carried out involving the following:

(i) a one-way ANOVA followed by Scheffe post hoc tests to determine which groups showed statistically significant changes relative to the other groups. The dependent variable (simply called “Change”), used in the one-way ANOVA test was determined by subtracting the Pre-test from Post-test values.

(ii) a paired sample test to determine if the subjects within any particular group changed significantly from pre-test to post-test were measured at 95% confidence interval. This would give an indication whether or not the workload created by one tenant affected the performance and resource utilisation of other tenants. The “Select Cases” feature in SPSS was used to select the three tenants (i.e., the T2, T3, T4 that did not experience large instant loads) for each pattern which gives a total of 3 cases for each metrics measured [28,29].

After the first two steps outlined above, the plots of estimated marginal means were analysed in combination with ANOVA (plus post hoc test) and paired sample test results from SPSS output. These plots are referred to as the “Estimated Marginal Means of Change (EMMC)”. Note that the word “Change” refers to the transformed variable used as the dependent variable in the one-way ANOVA. The plot of EMMC is simply a plot of the mean value for each combination of factor level.

Results of the case studies

In this section, we present a summary of results of each case study.

Results for case study 1 - continuous integration

The results of the case study are analysed based on the results of the paired sample t-test shown in Table 3, and supplemented with information from the plots of Estimated Marginal Means of Change(EMMC) ^{Footnote 1}. The key used in constructing the paired sample t-test table is as follows: YES - represents a significant change between the metrics measured from pre-test to post -test. NO - represents some level of change which cannot be regarded as significant; no significant influence on the tenants. The symbol “-” implies that the standard error of the difference is zero and hence no correlation and t-test statistics can be produced. This means that the difference between the pre-test and post-test values are nearly constant with no chance of variability. Figures 8, 9, 10, 11, 12, 13 and 14 in the Appendix A show the plots of the estimated marginal means of change for the parameters measured^{Footnote 2}.

(1) Response times and Error%: Table 3 shows that the response times and error% of tenants did not change significantly except for the dedicated component. The plot of the EMMC revealed that the magnitude of change for response times showed a much larger change for the dedicated component. This is due to the overhead incurred because of opening multiple connections to the database each time a JDBC request is made to a different database. For error%, the magnitude of change was larger for tenants deployed based on the shared component than for other patterns. A possible explanation for this is that there is resource contention since multiple connections are opened while sending requests that log all the data into the same component (i.e., database table) that is being shared. Overall, this causes delay in completion times thereby producing a negative effect on error%.

(2) Throughput: The paired sample test result showed that the throughput changed significantly, implying a low degree of isolation. In this situation, the shared component is not recommended for avoiding a situation where requests are struggling to gain access to the same application component, thereby resulting in some request either being delayed or rejected. For a tenant-isolated component and dedicated component, there would not be much change in throughput because requests are not concentrated on one application component but instead are directed to the separate components reserved for different tenants. Throughput can be likened to bandwidth, and so it means that the bandwidth was not sufficiently large to cope with the size, number and frequency of requests sent to the CI system.

(3) CPU and System Load: The paired sample test showed that the CPU consumption of tenants did not change significantly for most patterns except for the shared component. Therefore, once a reasonable CPU size (e.g., multiple CPUs or a multi-core CPU) is used, there should be no problem in performing builds. Builders are not known to consume much CPU. For example, Hudson does not consume much CPU; a build process can even be setup to run in the background without interfering with other processes [32].

One of the most significant findings of this study is that the system load did not influence any of the patterns. The paired sample test results were similar in all patterns; that is, the standard error difference was the same for tenants (or components) deployed using all the three multitenancy patterns. This result shows that the system load was nearly constant with no variability in the values from pretest to post-test. Therefore, in a real cloud deployment, the system load would not be a problem especially if CPU is reasonably large enough to allow the application to scale well.

(4) Memory: The paired sample test result showed that there was a significant change in memory consumption for all three patterns. Complex and difficult builds are those that are composed of a vast number of modular components including different frameworks, components developed by different teams or vendors, and open source libraries [39]. Compilers and builders consume a lot of memory especially if the build is difficult and complex [32]. In a large project, it is expected that multiple builds will interact with multiple components to create several dependencies and supported behaviour with each other thereby making builds difficult and complex.

(5) Disk I/O: Compilers and builders are known to consume disk I/O especially for I/O intensive builds [32]. The results show that only the shared component showed no significant change in disk I/O usage. This is understandable because multiple transactions are channelled to the same component which would either be delayed or blocked because of sharing the components. Further analysis of the plot of the EMMC confirmed that the magnitude of change for the shared component was the least, and therefore is recommended for builds that particularly involve intensive I/O activity especially when locking is enabled.

Table 3 Paired sample test analysis for case study 1

Full size table

Results for case study 2 - version control

The results of the case study are analysed based on the paired sample t-test (shown in Table 4) and supplemented with information from the plots of Estimated Marginal Means of Change (EMMC). Figures 15, 16, 17, 18, 19, 20 and 21 in the Appendix B show the plots of the estimated marginal means of change for the parameters measured.

(1) Response times and Error%: The paired sample test results showed that response times changed significantly for most of the patterns. As expected, the plot of the EMMC demonstrated that the magnitude of change for response times was much higher for the shared component and the tenant-isolated component. The results seem to show that there were no long delays that affected the error% rate. The error% showed no significant change based on the paired sample t-test. One aspect where error% (i.e., unacceptably slow response times) is known to have an impact is when committing a large number of files to a repository that is directly based on the native OS file system (e.g., FSFS). Delays usually arise when finalising a commit operation which could cause tenants requests to time out while waiting for a response.

(2) Throughput: The paired sample t-test results show that throughput changed significantly for all the patterns. Further analysis of the plots of the EMMC showed that the magnitude of change for the shared component was much higher than the other patterns. Since locking was enabled, it seems to show that it had an adverse impact on a tenant deployed based on a shared component. Therefore, the dedicated component would be recommended for tenants accessing bugs, especially if the bugs are stored in a database with locking enabled.

(3) CPU and System Load: The paired sample t-test showed that CPU changed significantly for all patterns. A possible reason for this is the overhead incurred in transferring data from the shared repository based on FSFS to the database (i.e., MySQL). The plot of the EMMC showed that the magnitude of change in CPU increased steadily across the three patterns with the dedicated component being the most influenced. Therefore, if there is need to avoid high CPU consumption, then the dedicated component is therefore not recommended for version control. This is because storing or retrieving bugs could involve locking or blocking other tenants from accessing a component that is being shared.

Table 4 shows that system load was nearly constant with no chance of variability, and so this means that system load did not influence any of the patterns. Therefore, with a reasonably high-speed network connection and CPU size, there should be no problem with system load when sending data across a shared repository residing in a company’s LAN or VPN.

(4) Memory and Disk I/O: Memory consumption changed significantly for all patterns based on the paired sample t-test result. The plot of the EMMC showed that the magnitude of change for the shared component was higher than the other patterns. Therefore, the shared component would not be recommended when there is a need for better memory utilisation. The paired sample t-test revealed that the usage of disk I/O by tenants changed significantly from pre-test to post-test for all the patterns. This is due to the intense frequency of the I/O activities in the disk because of the file upload and download operations. The dedicated component would be recommended since this would allow each tenant to have exclusive access to the component being shared, thereby reducing a possible contention for disk I/O and other resources when the number and frequency of request increase suddenly.

Table 4 Paired sample test analysis for case study 2

Full size table

Results for case study 3 - bug tracking

This section presents a summary of the experimental results for case study 3. The results of the paired sample t-test are summarised in Table 5, while the plots of the estimated marginal means of change are shown in Figs. 22, 23, 24, 25, 26, 27 and 28 in the Appendix C.

(1) Response times and Error%: From the plots of the estimated marginal means of change (EMMC), it can be seen that the dedicated component showed a lower magnitude of change in response time, and it is recommended for achieving isolation between tenants accessing bugs in a database with locking enabled. However, the plots of EMMC show that the number of requests with unacceptable response times was much higher for shared components compared to tenant-isolated and dedicated components. This is possibly due to the effect of locking on the database which causes a delay in the time it takes for requests to be committed. Using the dedicated component ensures a high degree of isolation, but with limitations of increased resource consumption (e.g., memory and disk I/O). To address this challenge, it is suggested storing large bug attachments on the disk and then storing the links to these files on the bug database to improve performance, especially when retrieving data.

(2) Throughput: The paired sample test result showed that there was no significant change in throughput for most of the patterns unlike two previous case studies. This result is similar to that of the two previous case studies where throughput was relatively stable. The implication of this is that if the component being shared is a database, then throughput should not be expected to change drastically. Based on the plot of the EMMC, the shared component would be recommended when bugs are stored in a database with locking enabled.

(3) CPU and System Load: The results of the paired sample test show that there was a significant change in CPU for all the patterns. By analysing the plots of the EMMC, the results show that the dedicated component changed the most and so would not be recommended if optimising CPU usage is a key requirement. As with other case study results, there was no influence on any of the patterns for system load. The plots of EMMC showed that system load increased steadily across the patterns from shared component to dedicated component.

(4) Memory and Disk I/O: The paired sample test for both the memory and disk I/O showed a highly significant difference from pretest to post-test both for memory and disk I/O. For memory, the plot of the EMMC similarly showed that the dedicated component had the highest significant change compared to the other patterns. This is possibly due to running Bugzilla in a mod_perl environment, and so using a dedicated component would not be a good option for optimising system resources. It is well known that running Bugzilla in mod_perl environment consumes a lot of RAM [34]. The significant change in disk I/O consumption is due to the intense frequency of read/write activities in the database. For disk I/O consumption, having enough storage space would be required, especially if a large volume of bugs with attachments is expected. If a large number of users are expected, then applying disk space saving measures such as purging unwanted error or log files regularly could reduce disk I/O consumption and improve the chance of having a higher degree of isolation.

Table 5 Paired Sample Test Analysis for Case Study 3

Full size table

Results

This section explains how the findings from the three case studies were synthesised. The case synthesis was done using cross-case analysis approach and then complemented with narrative synthesis.