Advances, Systems and Applications

QoS-based ranking and selection of SaaS applications using heterogeneous similarity metrics


The plethora of cloud application services (Apps) in the cloud business apps e-marketplace often leads to service choice overload. Meanwhile, existing SaaS e-marketplaces employ keyword-based inputs that do not consider both the quantitative and qualitative quality of service (QoS) attributes that characterise cloud-based services. Also, existing QoS-based cloud service ranking approaches rank cloud application services on the assumption that the services are characterised by quantitative QoS attributes alone, and have employed quantitative-based similarity metrics for ranking. However, the dimensions of cloud service QoS requirements are heterogeneous in nature, comprising both quantitative and qualitative QoS attributes; hence, a cloud service ranking approach that embraces core heterogeneous QoS dimensions is essential in order to engender more objective cloud selection. In this paper, we propose the use of heterogeneous similarity metrics (HSM) that combine quantitative and qualitative dimensions for QoS-based ranking of cloud-based services. Using a synthetically generated cloud services dataset, we evaluated the ranking performance of five HSM using Kendall tau rank coefficient and precision as accuracy metrics benchmarked with one HSM. The results show significant rank order correlation of Heterogeneous Euclidean-Eskin Metric, Heterogeneous Euclidean-Overlap Metric, and Heterogeneous Value Difference Metric with human similarity judgment, compared to other metrics used in the study. Our results confirm the applicability of HSM for QoS ranking of cloud services in cloud service e-marketplace with respect to users’ heterogeneous QoS requirements.


Cloud computing is a model of service provisioning in which dynamically scalable and virtualized resources, including infrastructure, platform, and software, are delivered and accessed as services over the internet [1, 2]. The popularity of the cloud attracts a variety of providers that offer a wide range of cloud-based services to users in an e-marketplace environment, culminating in an exponential increase in the number of available functionally equivalent cloud services [3, 4]. Currently, there exist a number of cloud-based digital distribution services (viz. cloud e-marketplaces), such as Appexchange.com, which host SaaS cloud services (business cloud apps) that are designed to provide specific user-oriented services when selected. The proliferation of cloud application services in the cloud e-marketplace without a systematic framework to guide the selection of the most relevant ones usually leaves the users with the problem of which service to select, a phenomenon that can be described as service choice overload [5,6,7,8]. Currently, these existing cloud service e-marketplaces elicit keyword-based search queries that do not allow users to indicate their preferences in terms of quality of service (QoS) requirements and present search results as an unordered list of icons that must be explored individually by a user before making a decision [9]. This mode of presentation does not enable the user to discriminate among services in terms of their suitability with respect to the user’s request, which complicates decision making [10]. Decision making can be simplified and service choice overload can be reduced by considering the user’s QoS requirements and ranking services based on their QoS attributes, so that users can gain quicker insight into the services that are most likely to satisfy their requirements.

QoS attributes are measurable non-functional properties that describe and distinguish services and form the basis for service selection [11, 12]. However, QoS attributes are usually heterogeneous in nature, covering both quantitative and qualitative (or categorical) attributes. The Service Measurement Index (SMI) [13] defines seven main categories to be considered when comparing QoS of cloud services, which are a combination of quantitative and qualitative measures. These are Accountability, Agility, Assurance, Financial, Performance, Security and Privacy, and Usability. Each category has multiple attributes, which are either quantitative or qualitative in nature. For example, quantitative attributes such as service response time, accuracy, availability, and cost can be measured by using relevant software and hardware monitoring tools, whereas qualitative attributes such as usability, flexibility, suitability, operability, and elasticity, which cannot be quantified, are mostly deduced from user experiences. These qualitative attributes are measured using an ordinal scale consisting of a set of predefined qualifier tags such as good, high, medium, fair, and excellent [13,14,15]. Most of the existing cloud service selection approaches hitherto reported in the literature have overlooked critical dimensions of QoS requirements that are qualitative, such as security and privacy, usability, accountability, and assurance, in formulating a basis for cloud service ranking and selection.

A number of cloud service selection approaches are based on a content-based recommendation scheme that explores the similarity between the QoS attributes of the user’s requirements and the feature descriptions of specific cloud services in order to rank them [16,17,18,19]. Most of these approaches have only considered quantitative attributes for their ranking of services, on the assumption that all QoS attributes are quantitative in nature, and have therefore used quantitative similarity metrics such as the exponential weighted difference metric or the weighted difference metric [17]. This assumption cannot adequately model the heterogeneous nature of QoS requirements, which is a precursor to creating a credible basis for comparing and ranking cloud services. Also, there are instances such as [20, 21], where steps were taken to quantify specific qualitative attributes such as security or usability in order to apply homogeneous distance metrics to them for the purpose of decision making. The drawback of this approach is that, since cloud QoS attributes are usually heterogeneous in nature, heterogeneous metrics are more likely to produce better generalization over time on heterogeneous data [22, 23]. This imposes a limitation on approaches where quantification of qualitative attributes has been undertaken for the purpose of cloud service ranking and selection.

In order to achieve an effective QoS-based ranking of cloud services in cloud service e-marketplaces, there is a need for a service selection approach that considers both the quantitative and qualitative QoS dimensions that characterise cloud services and is able to rank cloud services accurately with respect to user requirements using heterogeneous similarity metrics.

In this paper, we propose the use of heterogeneous similarity metrics that combine quantitative and qualitative dimensions to rank cloud services in the cloud e-marketplace context based on QoS attributes. An experimental study of five heterogeneous similarity metrics was conducted to ascertain their suitability for cloud service ranking and selection using a simulated dataset of cloud services. This is in contrast to previous work in the domain of cloud service selection.

The remainder of this paper is organised as follows: Section “Background and Related Work” provides background to the context of this work, and also a discussion of related work. In Section “Heterogeneous Similarity Metrics for Cloud Service Ranking and Selection” we describe the five heterogeneous similarity metrics used in this study, while the empirical results of the comparison of the ranking performance of the metrics are presented in Section “Experimental Evaluation and Results”. A discussion of the findings of this study is contained in Section “Discussion”. The paper is concluded in Section “Conclusion” with a brief note and an overview of further work.

Background and related work

The relevant concepts that underpin this study and an overview of related work are presented in this section.

Cloud service e-marketplace

The e-marketplace of cloud services provides an electronic emporium where service providers offer a wide range of services for users to select from [24,25,26]. Similar to Amazon or Alibaba, which deal in commodity products, the goal of a cloud service e-marketplace such as SaaSMax and AppExchange is to provide a facility for finding and consuming cloud services, by allowing users to search for suitable business apps that offer user-oriented services that match their QoS requirements. However, unlike commodity products, cloud services possess QoS attributes that distinguish functionally equivalent services from each other. The profitability of the cloud service e-marketplace is realised by users’ ability to easily and quickly find and select suitable services that meet their QoS requirements. However, most cloud service e-marketplaces in existence do not consider QoS information from the users but rely on keyword matching, and the results are not ranked in a manner that makes the differences among the services obvious with respect to users’ requirements. This leads to service choice overload because a large number of services are presented as an unordered list of icons that requires the user to further investigate the differences between the services by checking them one after the other. The discrimination of services based on their QoS information is a panacea towards reducing service choice overload as the cloud service QoS model encompasses Key Performance Indicators for decision making [27]. Besides, the QoS model comprises the important comparable characteristics of each service, and is suitable for matching user QoS requirements to services’ QoS attributes [28]. One of the most comprehensive International Organization for Standardization (ISO) certified QoS models for cloud services is the Service Measurement Index (SMI) [13].

Service measurement index

The Service Measurement Index (SMI) was developed by the Cloud Services Measurement Initiative Consortium (CSMIC). The SMI is a framework of critical characteristics, associated attributes, and metrics that can be used to compare and evaluate cloud-based services from different service providers [27, 29]. SMI was designed as the standard method to measure any type of cloud service (i.e. XaaS) based on the user requirements. The SMI is a hierarchical framework with seven top-level categories, which are Accountability, Agility, Assurance, Financial, Performance, Security and Privacy, and Usability. Each category is further broken into four or more attributes that underscore the category. Based on the SMI QoS model, it is obvious that some metrics are quantitative in nature while others are qualitative. Quantitative QoS metrics are those which can be measured and quantified (e.g. response time, throughput), whereas qualitative QoS metrics are subjective in nature and are only inferred from users’ feedback (e.g. security, usability etc.). Cloud services can be assessed and ranked based on both QoS metric dimensions, i.e., quantitative and qualitative, by comparing the similarity of the user’s QoS requirements and service QoS properties, thus following a content-based approach.

QoS similarity-driven cloud service ranking

Similarity is a measure of proximity between two or more objects or variables [30] and it has been applied in domains that require distance computation. Similarity can be measured on two types of data: quantitative data (also called numerical data) and qualitative data (also called categorical or nominal data) [31]. Many metrics have been proposed for computing similarity on either quantitative data or qualitative data. However, few metrics have been proposed to handle datasets containing a mixture of both quantitative and qualitative data. Such metrics usually combine quantitative and qualitative distance functions. For quantitative data, a generic method for computing distance is the Minkowski metric [32], with widely used specific instances such as the Manhattan (of order 1) and Euclidean (of order 2) distances. The computation of similarity for quantitative data is more direct, compared to qualitative data, because quantitative data can be completely ordered, while comparing two qualitative values is somewhat complex [31]. For example, the overlap metric [33] assigns a similarity value of 1 when two qualitative values are the same and 0 otherwise. In the context of selecting cloud services from the list of available services, the ranking of services based on the heterogeneous QoS model necessitates the application of similarity metrics that can handle mixed QoS data. The notion of similarity considered in this paper is between vectors with the same set of QoS properties, which might differ in their QoS values, i.e. users’ QoS requirements and service QoS descriptions.
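As a minimal illustration of the two homogeneous building blocks mentioned above, the following Python sketch computes the Minkowski distance of a given order for quantitative vectors and the overlap similarity for a pair of qualitative values. The function names are our own and are used only for illustration.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def manhattan(x, y):
    """Minkowski distance of order 1."""
    return minkowski(x, y, 1)


def euclidean(x, y):
    """Minkowski distance of order 2."""
    return minkowski(x, y, 2)


def overlap_similarity(a, b):
    """Overlap similarity: 1 when two qualitative values are the same, 0 otherwise."""
    return 1 if a == b else 0
```

Neither function alone can process a vector that mixes numeric values with qualifier tags such as High or Low, which is the gap that heterogeneous metrics address.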

Related work

The success of a cloud service e-marketplace is hinged on adequate support for satisfactory selection based on the QoS requirements of the user. So far in the literature, the approaches used for cloud service ranking and selection can be broadly classified as content-based filtering, collaborative filtering, and multi-criteria decision-making methods. Instances of collaborative filtering-based approaches include CloudRank, which is a personalised ranking prediction framework that utilises a greedy-based algorithm. It was proposed in [18] to predict QoS ranking by leveraging on similar cloud service user’s past service usage experiences of a set of cloud services. The ranking is achieved by finding the similarity between the user-provided QoS requirements and those of other users in the past. Similar users are identified based on these similarity values and services are ranked accordingly. In contrast to our work, CloudRank [18] did not consider the computation of vector similarity between cloud services and user-defined QoS requirements.

CloudAdvisor, a Recommendation-as-a-Service platform, was proposed in [34] for recommending optimal cloud offerings based on a given user’s preference requirements. Users supply preference values for each property (energy-level, budget, performance etc.) of the cloud offerings, and the platform recommends available optimal cloud offerings that match the user’s requirements. Service recommendations in [34] are determined by solving a constraint optimization model, and users can compare several offerings automatically derived by benchmarking-based approximations. However, the QoS dimensions considered in [34] are mainly quantitative and do not reflect the holistic heterogeneous QoS model of cloud services.

Selection of cloud services in the face of many QoS attributes is a type of Multi-criteria Decision Making (MCDM) [14]. Considering the multiple QoS criteria involved in selecting cloud services, [14] propose a ranking mechanism based on Analytical Hierarchical Process (AHP) to assign weights to non-functional attributes to quantitatively realise cloud services ranking. Apart from the complexity in computing the pairwise comparisons of the attributes of the cloud service alternatives, this approach is most suitable when the number of cloud services is few, which is not the case in a cloud service e-marketplace that comprises numerous services. Besides, in the approach proposed in [14], users cannot determine the desired values of the QoS service properties, and services are ranked based on quantitative QoS attributes alone.

Content-based filtering approaches include [17] in which a ranked list of services that best match user requirements is returned based on the nearness of user’s QoS requirement to the QoS properties of cloud services in the marketplace. Also, Rahman et al. [17] proposed an approach to select cloud service based on multiple criteria that select services that best match the user’s QoS requirements from a list of services by comparison. The authors introduced two methods, Weighted Difference, and Exponential weighted Difference, for computing similarity values. It is however assumed in [17] that all cloud service QoS attributes are quantitative, thereby ignoring the qualitative QoS attributes of services. In [35] a QoS-driven approach called MSSOptimiser, which supports the service selection for multi-tenant cloud-based software applications (Software as a Service - SaaS) was proposed. In the work, certain qualitative and non-numerical QoS parameters such as reputation were mapped to numerical values based on a pre-defined semantics-based hierarchical structure of all possible values of a non-numerical QoS parameter in order to quantify the qualitative parameters. Also, in [20] Multi-attribute Decision-Making framework for cloud adoption - MADMAC was proposed. The framework allows the comparison of multiple attributes with diverse units of measurements in order to select the best alternative. The work requires the definition of Attributes, Alternatives and Attribute Weights, to construct a Decision Matrix and arrive at a relative ranking to identify the optimal alternative. An adapted Likert-type scale from 1 to 10 was used by the MADMAC to convert all qualitative attributes to their quantitative equivalent, where 1 indicates very unfavourable, 5 indicates neutral, 6 indicates favourable, and 10 indicates a near perfect solution. 
However, in all of these cases, a standard cloud services measurement and comparison model such as SMI was not considered, which means that the QoS attributes used only covered a limited range of heterogeneous dimensions (qualitative and quantitative), which may not provide a sufficiently robust basis for decision making on cloud services.

In contrast to previous approaches, our approach considers the heterogeneity of cloud QoS Model that combines quantitative and qualitative QoS data, which to the best of our knowledge, represents a first attempt to use heterogeneous similarity metrics for QoS ranking and selection of services in the context of a cloud service e-marketplace.

Heterogeneous similarity metrics for cloud service ranking and selection

By giving due consideration to the heterogeneous nature of the cloud services QoS model, this paper proposes the use of heterogeneous similarity metrics (HSM) for cloud service ranking and selection. In this section, we present an overview of HSM, the rationale for the HSM selected in this study, and a description of the five selected HSM for cloud service ranking and selection.

Overview of heterogeneous similarity metrics

To measure the similarity between quantitative data, metrics such as the Minkowski metrics [32], including its derivatives (Manhattan and Euclidean), and the Chebyshev and Canberra metrics have been proposed. Metrics such as Overlap [33], Eskin [36], Lin [37] and Goodall [38] have also been proposed for qualitative similarity computations. However, these quantitative or qualitative metrics alone are insufficient for handling heterogeneity, except when combined into a unified metric that applies different similarity metrics to different types of QoS attributes [22]. The resultant combination can be referred to as a heterogeneous similarity metric (HSM) [22]. The authors in [22] proposed the Heterogeneous Euclidean-Overlap Metric (HEOM) and the Heterogeneous Value Difference Metric (HVDM) as metrics for computing similarity on heterogeneous datasets. The HEOM employs the range-normalized Euclidean metric (Eq. 4) for quantitative QoS attributes and the Overlap metric for qualitative QoS attributes, while the HVDM uses the standard-deviation-normalized Euclidean distance (Eq. 8) and the value difference metric for quantitative and qualitative QoS attributes respectively. The HEOM and HVDM have been applied for feature selection and instance-based learning in real-world classification tasks [22].

Rationale for selected qualitative metrics

A number of qualitative similarity metrics have been proposed in the literature and we selected at least one qualitative metric from each of the categories defined in [31] to create additional heterogeneous similarity metrics for QoS-based cloud service ranking and selection. The categories are as follows:

  • Metrics that fill diagonal entries only: Qualitative metrics that fall into this category include the Overlap [33] and Goodall qualitative metrics [38]. In the overlap metric, the similarity between two multivariate data points is directly proportional to the number of attributes or dimensions in which they both match. However, the overlap metric does not distinguish between the different values taken by an attribute as it treats all similarities and dissimilarities in the same manner. On the other hand, the Goodall metric takes into account the frequency distribution of different attribute values in a given dataset and computes the similarity between two qualitative attribute values by assigning higher similarity to a match when the attribute value is frequent.

  • Metrics that fill off-diagonal entries only: an example of a metric in this category is the Eskin metric [36]. The Eskin metric gives more weight to mismatches that occur on attributes that take many values. In addition, the maximum value is attained when all the attributes have unique values.

  • Metrics that fill both diagonal and off-diagonal entries: the Lin metric [37] is a typical example of such metrics. The Lin qualitative metric is applied in contexts that involve ordinal, string, word and semantic similarities. The metric assigns higher weights to matches on frequent values, and lower weight to mismatches on infrequent values.

Five heterogeneous similarity metrics for cloud service ranking and selection

Apart from HEOM and HVDM, we introduce three additional HSM by combining existing similarity metrics used for either quantitative or qualitative data alone. The new HSM are as follows: Heterogeneous Euclidean-Eskin Metric (HEEM), Heterogeneous Euclidean-Lin Metric (HELM), and Heterogeneous Euclidean-Goodall Metric (HEGM). All three employ the range-normalized Euclidean distance for quantitative QoS values. For qualitative QoS values, HEEM applies the Eskin metric [36], HELM applies the Lin metric, and HEGM applies the Goodall metric.

In all, the five HSM considered in this paper are as follows: HEOM (Eq. 1), HVDM (Eq. 5), HEEM (Eq. 9), HELM (Eq. 12) and HEGM (Eq. 15). While the components for measuring quantitative and qualitative data aspects are shown in Table 1, the underlying mathematical equations that describe each of the HSM are presented subsequently based on the assumption that X and Y are vectors representing the values of the user QoS requirements and the QoS vector of a cloud service si belonging to service list S, such that X = (x1, x2, … xm) and Y = (y1, y2, … ym); xm and ym correspond to the value of the mth QoS attribute of the user’s requirement and of the cloud service si respectively.

Table 1 Summary of Heterogeneous Similarity Metrics

Subsequently, we describe each of the proposed heterogeneous metrics in detail.

Heterogeneous Euclidean-overlap metric (HEOM)

$$ HEOM\left(x,y\right)=\sqrt{\sum \limits_{i=1}^m{h}_i{\left({x}_i,{y}_i\right)}^2} \tag{1} $$

$$ {h}_i\left(x,y\right)=\begin{cases}1, & \text{if } x \text{ or } y \text{ is unknown; else}\\ overlap\left(x,y\right), & \text{if attribute } i \text{ is nominal; else}\\ rn\_{diff}_i\left(x,y\right) & \end{cases} \tag{2} $$

where \( overlap\left(x,y\right) \) and \( rn\_{diff}_i\left(x,y\right) \) are defined as

$$ overlap\left(x,y\right)=\begin{cases}0, & \text{if } x=y\\ 1, & \text{otherwise}\end{cases} \tag{3} $$

$$ rn\_{diff}_i\left(x,y\right)=\frac{\left|x-y\right|}{{Max}_i-{Min}_i} \tag{4} $$
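The HEOM computation of Eqs. 1 to 4 can be sketched in Python as follows. This is an illustrative sketch rather than the implementation used in the experiments: the representation of unknown values as `None`, the `nominal` and `ranges` arguments, the handling of a zero-width range, and the helper `rank_services` are all assumptions introduced here.

```python
import math


def overlap(x, y):
    """Overlap distance (Eq. 3): 0 when two nominal values match, 1 otherwise."""
    return 0.0 if x == y else 1.0


def rn_diff(x, y, lo, hi):
    """Range-normalized absolute difference (Eq. 4) for one quantitative attribute."""
    if hi == lo:
        # Assumption: an attribute with zero range contributes no distance.
        return 0.0
    return abs(x - y) / (hi - lo)


def heom(x, y, nominal, ranges):
    """HEOM distance (Eqs. 1 and 2) between two QoS vectors.

    nominal[i] is True when attribute i is qualitative.
    ranges[i] is the (min, max) pair of quantitative attribute i, None otherwise.
    A value of None in x or y stands for "unknown".
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if xi is None or yi is None:
            h = 1.0
        elif nominal[i]:
            h = overlap(xi, yi)
        else:
            lo, hi = ranges[i]
            h = rn_diff(xi, yi, lo, hi)
        total += h * h
    return math.sqrt(total)


def rank_services(requirement, services, nominal, ranges):
    """Hypothetical helper: service names ordered by increasing HEOM distance."""
    return sorted(
        services,
        key=lambda name: heom(requirement, services[name], nominal, ranges),
    )
```

A smaller distance indicates a service that is closer to the user requirements, so the first element returned by `rank_services` is the best match under this metric. The attribute ranges must be supplied by the caller, for example from the minimum and maximum values observed among the services in the e-marketplace.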

Heterogeneous value difference metric

$$ HVDM\left(x,y\right)=\sqrt{\sum \limits_{i=1}^m{d}_i{\left({x}_i,{y}_i\right)}^2} \tag{5} $$

$$ {d}_i\left(x,y\right)=\begin{cases}1, & \text{if } x \text{ or } y \text{ is unknown; else}\\ {vdm}_i\left(x,y\right), & \text{if attribute } i \text{ is qualitative}\\ {diff}_i\left(x,y\right), & \text{if attribute } i \text{ is quantitative}\end{cases} \tag{6} $$

$$ {vdm}_i\left(x,y\right)=\sqrt{\sum \limits_{c=1}^C{\left|\frac{N_{q_i,x,c}}{N_{q_i,x}}-\frac{N_{q_i,y,c}}{N_{q_i,y}}\right|}^2}=\sqrt{\sum \limits_{c=1}^C{\left|{P}_{q_i,x,c}-{P}_{q_i,y,c}\right|}^2} \tag{7} $$

$$ {diff}_i\left(x,y\right)=\frac{\left|x-y\right|}{4{\sigma}_{q_i}} \tag{8} $$


  • \( {N}_{q_i,x} \) is the number of instances (cloud app services) available in the marketplace that have value x for QoS attribute \( q_i \);

  • \( {N}_{q_i,x,c} \) is the number of instances available in the marketplace that have value x for QoS attribute \( q_i \) and output class c;

  • C is the number of output classes in the problem domain (in this case, C = 3, corresponding to High, Medium and Low);

  • \( {P}_{q_i,x,c} \) is the conditional probability of output class c given that QoS attribute \( q_i \) has the value x, i.e. \( P\left(c|{q}_i=x\right) \), computed as \( \frac{N_{q_i,x,c}}{N_{q_i,x}} \). However, if \( {N}_{q_i,x}=0 \), then \( P\left(c|{q}_i=x\right) \) is also regarded as 0;

  • \( {\sigma}_{q_i} \) is the standard deviation of the values of quantitative QoS attribute \( q_i \).
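A possible Python rendering of Eqs. 5 to 8 is sketched below. It assumes that each service carries an output class label from which the conditional probabilities are estimated; the function names, the `tables` and `sigmas` arguments, and the use of `None` for unknown values are illustrative assumptions, not part of the original formulation.

```python
import math
from collections import Counter


def class_conditionals(values, classes):
    """Estimate P(c | q_i = v) for every observed value v of one qualitative attribute.

    values[k] and classes[k] are the attribute value and output class of service k.
    Returns ({value: {class: probability}}, sorted list of class labels).
    """
    n_v = Counter(values)                 # N_{q_i, x}
    n_vc = Counter(zip(values, classes))  # N_{q_i, x, c}
    labels = sorted(set(classes))
    probs = {v: {c: n_vc[(v, c)] / n_v[v] for c in labels} for v in n_v}
    return probs, labels


def vdm(x, y, probs, labels):
    """Value difference metric (Eq. 7); an unseen value has probability 0 for every class."""
    def p(v, c):
        return probs.get(v, {}).get(c, 0.0)
    return math.sqrt(sum((p(x, c) - p(y, c)) ** 2 for c in labels))


def diff(x, y, sigma):
    """Standard-deviation-normalized difference (Eq. 8)."""
    return abs(x - y) / (4 * sigma)


def hvdm(x, y, nominal, sigmas, tables):
    """HVDM distance (Eqs. 5 and 6).

    sigmas[i] is the standard deviation of quantitative attribute i.
    tables[i] is the (probs, labels) pair of qualitative attribute i.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if xi is None or yi is None:
            d = 1.0
        elif nominal[i]:
            probs, labels = tables[i]
            d = vdm(xi, yi, probs, labels)
        else:
            d = diff(xi, yi, sigmas[i])
        total += d * d
    return math.sqrt(total)
```

Unlike the overlap component of HEOM, the value difference component treats two different qualifier values as close when they are associated with similar output class distributions.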

Heterogeneous Euclidean-Eskin metric

$$ HEEM\left(x,y\right)=\sqrt{\sum \limits_{i=1}^m{e}_i{\left({x}_i,{y}_i\right)}^2} \tag{9} $$

$$ {e}_i\left(x,y\right)=\begin{cases}1, & \text{if } x \text{ or } y \text{ is unknown; else}\\ {eskin}_i\left(x,y\right), & \text{if attribute } i \text{ is nominal; else}\\ rn\_{diff}_i\left(x,y\right) & \end{cases} \tag{10} $$

$$ {eskin}_i\left(x,y\right)=\begin{cases}0, & \text{if } x=y\\ \frac{n_i^2}{n_i^2+2}, & \text{otherwise}\end{cases} \tag{11} $$
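The Eskin component of Eq. 11 depends only on the number of distinct values that a qualitative attribute can assume, written \( n_i \). A short Python sketch, with a function name chosen here for illustration, is:

```python
def eskin(x, y, n_values):
    """Eskin component (Eq. 11) for one qualitative attribute.

    n_values is the number of distinct values the attribute can assume.
    """
    if x == y:
        return 0.0
    return n_values ** 2 / (n_values ** 2 + 2)
```

For an attribute with the three values High, Medium and Low, a mismatch contributes 9/11 (about 0.82), and the contribution approaches 1 as the number of possible values grows. Within HEEM, this function takes the place that the overlap function occupies in HEOM.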

Heterogeneous Euclidean-Lin metric

$$ HELM\left(x,y\right)=\sqrt{\sum \limits_{i=1}^m{l}_i{\left({x}_i,{y}_i\right)}^2} \tag{12} $$

$$ {l}_i\left(x,y\right)=\begin{cases}1, & \text{if } x \text{ or } y \text{ is unknown; else}\\ {lin}_i\left(x,y\right), & \text{if attribute } i \text{ is nominal; else}\\ rn\_{diff}_i\left(x,y\right) & \end{cases} \tag{13} $$

$$ {lin}_i\left(x,y\right)=\begin{cases}2\log {\widehat{p}}_{q_i}(x), & \text{if } x=y\\ 2\log \left({\widehat{p}}_{q_i}(x)+{\widehat{p}}_{q_i}(y)\right), & \text{otherwise}\end{cases} \tag{14} $$
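The Lin component of Eq. 14 can be sketched in Python as shown below, where the sample probability of a value is taken to be its relative frequency among the available services. The natural logarithm is assumed because the base is not stated, and the function names are illustrative.

```python
import math
from collections import Counter


def sample_probabilities(values):
    """Relative frequency of each value of one qualitative attribute."""
    counts = Counter(values)
    n = len(values)
    return {v: c / n for v, c in counts.items()}


def lin(x, y, p_hat):
    """Lin component (Eq. 14); both values must occur in the data used for p_hat."""
    if x == y:
        return 2 * math.log(p_hat[x])
    return 2 * math.log(p_hat[x] + p_hat[y])
```

Because the probabilities are at most 1, the values returned by this sketch are never positive, and a value that does not occur in the dataset raises a `KeyError`. How such a value in the user requirements should be treated is not specified here.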

Heterogeneous Euclidean-Goodall metric

$$ HEGM\left(x,y\right)=\sqrt{\sum \limits_{i=1}^m{g}_i{\left({x}_i,{y}_i\right)}^2} \tag{15} $$

$$ {g}_i\left(x,y\right)=\begin{cases}1, & \text{if } x \text{ or } y \text{ is unknown; else}\\ {goodall}_i\left(x,y\right), & \text{if attribute } i \text{ is nominal; else}\\ rn\_{diff}_i\left(x,y\right) & \end{cases} \tag{16} $$

$$ {goodall}_i\left(x,y\right)=\begin{cases}{\widehat{p}}_{q_i}^2(x), & \text{if } x=y\\ 0, & \text{otherwise}\end{cases} \tag{17} $$

  • Where \( n_i \) is the number of values that QoS attribute \( q_i \) can assume (e.g. for the security QoS attribute denoted by \( q_{security} \), \( n_{security}=3 \), corresponding to the number of values that the security QoS attribute can assume: High, Medium and Low);

  • Where \( {\widehat{p}}_{q_i}(x) \) and \( {\widehat{p}}_{q_i}^2(x) \) are the sample probabilities of QoS attribute \( q_i \) taking the value x in the dataset (in this case, the available services on the e-marketplace), computed as \( {\widehat{p}}_{q_i}(x)=\frac{N_{q_i,x}}{N} \) and \( {\widehat{p}}_{q_i}^2(x)=\frac{N_{q_i,x}\left({N}_{q_i,x}-1\right)}{N\left(N-1\right)} \);

  • The total number of services is denoted as N.
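The Goodall component of Eq. 17 and the second-order sample probability defined above can be sketched in Python as follows. The function names are illustrative, and the dataset is assumed to contain at least two services so that the denominator N(N − 1) is non-zero.

```python
from collections import Counter


def goodall_p2(values):
    """Second-order sample probability N_x (N_x - 1) / (N (N - 1)) for each value."""
    counts = Counter(values)
    n = len(values)
    return {v: c * (c - 1) / (n * (n - 1)) for v, c in counts.items()}


def goodall(x, y, p2_hat):
    """Goodall component (Eq. 17): the estimate for x on a match, 0 on a mismatch."""
    if x != y:
        return 0.0
    # Assumption: a value absent from the dataset is given an estimate of 0.
    return p2_hat.get(x, 0.0)
```

With the values High, High, Medium and Low, the estimate for High is 2·1/(4·3) = 1/6, while a value that occurs only once receives an estimate of 0.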

Experimental evaluation and results

In this section, we present an experimental assessment of the ranking accuracy of the five selected HSM on a synthetically generated dataset for cloud services. A synthetically generated QoS dataset was used because a real QoS dataset for cloud services that perfectly fits the context of our experiment could not be found. Alkalbani et al. [39] alluded to the paucity of viable datasets for cloud services. The Blue Pages dataset in [39] is the closest dataset on cloud services that we found, but it is not based on the QoS of cloud services. Rather, it provides data on different service offerings such as service name, the date the service was founded, service category, free trial (yes/no), mobile app (yes/no), starting price, service description, service type, and provider link, as extracted from two cloud services review sites, which does not fit the purpose of this study perfectly. However, we found some previous studies on cloud services that relied on synthetically generated or simulated datasets to perform experiments on cloud services [40,41,42,43], which motivated our decision to use a synthetically generated dataset. In order to synthesise the dataset, 6 attributes were selected from 6 categories of the SMI (see Table 2). The SMI was used as the basis for data synthesis because it provides a standardised method for measuring and comparing cloud-based business services [14]. The 6 selected attributes, comprising 3 quantitative and 3 qualitative attributes, were those considered to be relevant to the context of SaaS. They are service response time, availability, cost, security, usability, and flexibility.

Table 2 Definition and Description of the Six QoS Attributes

The goal of the experiment is to investigate the ranking accuracy of the HSM compared to a gold standard obtained by human similarity judgment.

Dataset preparation

The data values for the selected SMI attributes were synthesised based on examples from previous evaluation studies [44,45,46,47], and related papers on cloud service selection such as [14, 28, 41, 47] that revealed acceptable data formats for quantitative attributes such as response time, cost, and availability. We generated random qualifier values for the qualitative attributes, which are usability, security, and flexibility. Consequently, we used a total of six QoS attributes with a typical data format as shown in Table 3. For simplicity, we limited the qualifier values for usability, security, and flexibility to high, medium and low. We simulated multiple instances of the adopted format for the six attributes in order to obtain a dataset comprising a total of 63 services after sorting by response time in ascending order. It must be said that in order to deploy our approach in a real case scenario, the QoS attributes of a service will have to be specified by the service provider and made accessible to the user as part of the service documentation that a user needs to consider in order to make a decision on which service to select. One of the available means to do this is to leverage relevant SMI measurement templates provided by the Cloud Services Measurement Initiative Consortium (CSMIC) [48].

Table 3 Perfect Match of services and user requirements

Furthermore, the initial set of SMI templates by CSMIC has been extended by Scott Feuless in [49] to evolve metrics and SMI scored frameworks that enable specific SMI attributes to be scored by an organisation. The purpose of the SMI scored framework [50] is to enable a customer to evaluate a cloud service in order to make the right choice. By using the SMI scored framework or a similar model, the cumulative scores for specific SMI categories, and the scores for individual SMI attributes of a cloud service, can be obtained. However, determining the cumulative scores for each SMI attribute is a manual process that is qualitatively driven by experts within an organization. Thus, having the SMI scored frameworks (or similar scoring models) for several cloud services creates the basis for the application of the HSM that this paper proposes. The HSM can be applied for automated ranking and selection of cloud services in real time in order to determine the best cloud service offerings among several alternatives. This will offer a major advantage over the use of manually generated SMI scored frameworks [50] for ranking and selection of cloud services.

Evaluation metrics

Kendall tau coefficient

Kendall’s tau coefficient, denoted τ, is used to measure the ordinal association between two variables. Here the two variables are the top-k list produced by each of the five HSM and the gold standard: τ takes a value of 1 when the two rankings agree completely, and a value of − 1 when one ranking is the reverse of the other. The Kendall tau coefficient is computed as follows:

$$ \tau =\frac{\left(C-D\right)}{\frac{k\left(k-1\right)}{2}} $$

Where C = Concordant pairs; D = Discordant pairs; k is the number of top-k items produced by the methods.
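The formula above can be illustrated with a minimal Java sketch. It assumes each ranking is supplied as an array of rank positions over the same k items, and it counts tied pairs as neither concordant nor discordant; both choices, along with the class and method names, are assumptions of this sketch rather than details given in the paper.

```java
public class KendallTau {
    /**
     * Computes tau = (C - D) / (k(k-1)/2) for two rankings of the same k items.
     * rankA[i] and rankB[i] hold the rank position of item i in each ranking.
     * Assumption of this sketch: tied pairs count as neither C nor D.
     */
    public static double tau(int[] rankA, int[] rankB) {
        if (rankA.length != rankB.length || rankA.length < 2) {
            throw new IllegalArgumentException("Need two rankings of the same k >= 2 items");
        }
        int k = rankA.length;
        int concordant = 0;
        int discordant = 0;
        // Visit every unordered pair of items exactly once.
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                int sign = Integer.signum(rankA[i] - rankA[j])
                         * Integer.signum(rankB[i] - rankB[j]);
                if (sign > 0) {
                    concordant++;   // both rankings order the pair the same way
                } else if (sign < 0) {
                    discordant++;   // the rankings disagree on this pair
                }
            }
        }
        return (concordant - discordant) / (k * (k - 1) / 2.0);
    }

    public static void main(String[] args) {
        // Hypothetical rankings of five services; only one pair is swapped.
        int[] human = {1, 2, 3, 4, 5};
        int[] metric = {1, 3, 2, 4, 5};
        System.out.println(tau(human, metric)); // prints 0.8 (C = 9, D = 1, 10 pairs)
    }
}
```

With five items there are ten pairs, so a single swapped pair gives C = 9 and D = 1, and τ = 8/10.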

Precision metric

Precision, a measure used in information retrieval domains, was adapted here to evaluate the relevance of the output obtained from each metric with respect to the content of the gold standard. Precision is the fraction of cloud services obtained from the HSM that is contained in the gold standard. The gold standard output was used as the benchmark to determine the precision of each metric as we determined how many of the top-k services returned by the metrics include the services contained in the gold standard. We computed the precision of each metric as we varied the number of k. We define Precision as:

$$ \mathrm{Precision}=\frac{\left|\mathbf{TKS}\kern0.4em \bigcap \kern0.4em \mathbf{GS}\right|}{\left|\mathbf{TKS}\right|} $$

Where TKS = the set of top-k cloud services returned by an HSM and GS = the set of services in the gold standard.
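A minimal Java sketch of this precision computation follows. It assumes that, for a given k, the gold-standard set is the top-k prefix of the gold-standard ranking; the paper does not spell this out, so treat it as one plausible reading. The service identifiers and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PrecisionAtK {
    /**
     * Fraction of the top-k services returned by a metric that also appear
     * in the top-k of the gold-standard ranking (an assumption of this sketch).
     */
    public static double precision(List<String> ranked, List<String> goldRanked, int k) {
        if (k <= 0 || k > ranked.size() || k > goldRanked.size()) {
            throw new IllegalArgumentException("k must be between 1 and the list sizes");
        }
        Set<String> gold = new HashSet<>(goldRanked.subList(0, k));
        int hits = 0;
        for (String serviceId : ranked.subList(0, k)) {
            if (gold.contains(serviceId)) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    public static void main(String[] args) {
        // Hypothetical service identifiers, not taken from the paper's dataset.
        List<String> metricRanking = Arrays.asList("S1", "S2", "S3", "S4", "S5");
        List<String> goldRanking = Arrays.asList("S2", "S1", "S6", "S3", "S7");
        for (int k : new int[]{3, 5}) {
            System.out.println("k=" + k + " precision=" + precision(metricRanking, goldRanking, k));
        }
    }
}
```

In the hypothetical data, two of the metric's top three services appear in the gold top three, and three of its top five appear in the gold top five.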

Experiment design and protocol

We recruited 12 undergraduate students in Computing and Engineering fields (male = 9, female = 3), on the basis that 12 participants offer an acceptably tight confidence interval [51]. We used one of the services from the dataset as the user requirements and asked participants to rank the remaining 63 services according to their similarity to the user requirements. The user requirements vector R selected is {302.75, 126, 99.99, Medium, Low, Low}, corresponding respectively to values for Response Time, Cost, Availability, Usability, Security Management, and Flexibility.

To simplify the similarity judgement exercise, we converted the QoS values of the services in the dataset into line graphs, such that the user requirements are plotted against each of the remaining 63 services; the qualitative values High, Medium and Low were mapped to numerical values of 50, 30 and 10 respectively for illustration purposes. For example, Fig. 1 shows the line graphs of the user requirements with another service, based on the QoS information contained in Table 4.

Fig. 1

Line Graph showing Cloud Service QoS Vs. User QoS Requirements. The line graph graphically depicts the similarity of the QoS properties of the cloud services and the QoS requirements of the users. Panel (a) shows that there is a perfect match between the User’s QoS requirement and the QoS properties of the cloud service; while Panel (b) shows a variance between the QoS properties of the cloud service and the QoS requirement of the user

Table 4 Difference in Service and User Requirements

The participants were taken through a 15-minute tutorial that explained the purpose of the experiment and provided basic training on the similarity evaluation exercise. After the training, the participants were shown the 63 line graphs and were asked to agree or disagree (on a 1 to 7 Likert scale) with the proposition: ‘The two line graphs are similar.’ The questionnaire contained 63 items corresponding to the 63 services being ranked. The responses from the 12 participants were analysed, and we determined the mean of the responses to each item, which indicates the service that participants collectively judged most similar to the user requirements. We aggregated the responses from all participants by finding the median response for each of the 63 items presented in the questionnaire. The median scores were sorted in descending order to indicate the degree of similarity of the 63 services to the user requirements. Higher median scores indicate higher similarity and vice versa.
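The aggregation step described above (median per questionnaire item, then a descending sort) can be sketched in Java as follows. The response matrix layout, the tie handling (ties keep questionnaire order) and all names are assumptions of this sketch, and the sample scores are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LikertAggregation {
    /** Median of a set of Likert scores; averages the two middle values for even counts. */
    public static double median(int[] scores) {
        int[] sorted = scores.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /**
     * responses[p][i] is the score participant p gave to item (service) i.
     * Returns item indices ordered by descending median score.
     */
    public static List<Integer> rankByMedian(int[][] responses) {
        int items = responses[0].length;
        double[] medians = new double[items];
        for (int i = 0; i < items; i++) {
            int[] column = new int[responses.length];
            for (int p = 0; p < responses.length; p++) {
                column[p] = responses[p][i];
            }
            medians[i] = median(column);
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < items; i++) {
            order.add(i);
        }
        // List.sort is stable, so items with equal medians keep their original order.
        order.sort((a, b) -> Double.compare(medians[b], medians[a]));
        return order;
    }

    public static void main(String[] args) {
        // Three hypothetical participants scoring three hypothetical services.
        int[][] responses = {
            {7, 2, 5},
            {6, 3, 5},
            {7, 1, 4}
        };
        System.out.println(rankByMedian(responses)); // prints [0, 2, 1]
    }
}
```

The medians for the three hypothetical items are 7, 2 and 5, so the first item ranks as most similar and the second as least similar.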

The HSM were implemented in Java and used to rank the 63 services used in this experiment with respect to the user requirements. The simulation was conducted on an HP Pavilion with an Intel Core(TM) i3-3217U CPU at 1.80 GHz and 4.00 GB RAM, running 64-bit Windows 8.1 on an x64-based processor. The ranking produced by the HSM was compared with that produced by the human subjects using the Kendall tau coefficient, while the accuracy of the ranking produced was measured against the gold standard using the precision metric.
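As a rough illustration of how one of these metrics can rank services, the sketch below implements HEOM in the general form given by Wilson and Martinez [23]: a range-normalised absolute difference for each quantitative attribute, an overlap distance (0 if equal, 1 otherwise) for each qualitative attribute, and the square root of the sum of squared per-attribute distances. This is not the authors' Java implementation. The attribute ranges and the candidate service are hypothetical, and missing-value handling is omitted.

```java
public class HeomSketch {
    /**
     * HEOM distance between two services described by quantitative values
     * (e.g. response time, cost, availability) and qualitative values
     * (e.g. usability, security, flexibility).
     * ranges[i] is the (max - min) of quantitative attribute i in the dataset.
     */
    public static double heom(double[] numA, String[] nomA,
                              double[] numB, String[] nomB, double[] ranges) {
        double sum = 0.0;
        for (int i = 0; i < numA.length; i++) {
            // Range-normalised difference; a zero range means the attribute is constant.
            double d = ranges[i] == 0.0 ? 0.0 : Math.abs(numA[i] - numB[i]) / ranges[i];
            sum += d * d;
        }
        for (int i = 0; i < nomA.length; i++) {
            // Overlap distance: 0 when the qualifiers match, 1 otherwise.
            double d = nomA[i].equals(nomB[i]) ? 0.0 : 1.0;
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // User requirements vector R as given in the text.
        double[] reqNum = {302.75, 126, 99.99};
        String[] reqNom = {"Medium", "Low", "Low"};

        // Hypothetical candidate service and hypothetical attribute ranges.
        double[] svcNum = {350.0, 150, 99.5};
        String[] svcNom = {"Medium", "High", "Low"};
        double[] ranges = {500.0, 200.0, 1.0};

        System.out.println("Distance to itself: " + heom(reqNum, reqNom, reqNum, reqNom, ranges));
        System.out.println("Distance to candidate: " + heom(reqNum, reqNom, svcNum, svcNom, ranges));
    }
}
```

Ranking then amounts to sorting the candidate services by ascending distance to R, so that the most similar service comes first.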


Rank correlation coefficient

We applied the Kendall tau rank correlation coefficient to compare the rankings obtained from the HSM. Table 5 shows the rank order correlations among all five HSM, as well as with the ranking obtained from human similarity judgment. The results show that 10 of the 15 correlations were statistically significant (two-tailed, p < 0.01). The strongest correlations occur for HEEM-HEOM (τ = 0.929, p < 0.01), HVDM-HEOM (τ = 0.515, p < 0.01), HVDM-HEEM (τ = 0.573, p < 0.01), and HEGM-HELM (τ = 0.436, p < 0.01). The weaker correlations occur among the following: HELM with HEOM, HVDM, and HEEM; HEGM with HEOM, HVDM, and HEEM. However, there are positive correlations between the ranking from the human similarity judgements and those from HEOM, HVDM and HEEM, and negative correlations with HELM and HEGM. The ranking produced by HEEM (τ = 0.449, p < 0.01) correlates most highly with the human similarity judgements, followed by HEOM (τ = 0.229, p < 0.01). HVDM and HELM have a weak rank correlation with human similarity judgements, whereas HEGM had a significant negative correlation with human similarity judgements.

Table 5 Kendall Tau Rank Correlation Coefficients


High precision indicates that a heterogeneous similarity metric ranked and returned more of the relevant services contained in the gold standard. We used the ranking produced by HEOM as the gold standard, which served as the benchmark to measure the precision of the rankings produced by the other HSM used in the evaluation. The value of k was set to 5, 10, 15, 20 and 25. Based on the analysis shown in Fig. 2, we observed that HEEM consistently gave the highest precision across the values of k, followed closely by HVDM, while HELM had the lowest.

Fig. 2

Precision Score of the heterogeneous similarity metrics (HEEM, HEGM, HVDM, HELM). Precision of the heterogeneous similarity metrics (HSM) measures how many relevant cloud services were ranked and returned by HSM as contained in the gold standard. The gold standard contained the ranking of services produced by HEOM, and it served as the benchmark to measure the precision of the rankings produced by other HSM including HEEM, HEGM, HVDM and HELM. The value of top-k ranged from 5, 10, 15, 20 and 25. HEEM had the highest precision score on all values of k compared to other HSM


Based on the results of the rank order correlation and the ranking accuracy measured by precision, HEEM performed relatively well in comparison to HVDM vis-à-vis the ranking produced by HEOM. Although HEOM and HVDM are known heterogeneous similarity metrics and have been employed for similarity computations [22, 52], this paper is the first to apply these metrics, together with the three proposed in this paper, to rank cloud services by considering the heterogeneous nature of the cloud service QoS model. The application of HSM in ranking cloud services provides a more credible basis for cloud service ranking and selection. In this paper, we have been able to consider the heterogeneous dimensions of the QoS model that defines cloud services, which have hitherto been overlooked by previous cloud ranking and selection approaches. Based on the results of the experimental evaluations, we showed that not only is HEEM a promising metric for ranking heterogeneous datasets, it can also be applied to accurately rank cloud services in cloud service e-marketplace contexts with respect to user requirements. Generally, the results of the experimental evaluation show the suitability of HSM for ranking cloud services in a cloud service e-marketplace context. More specifically, HEOM, HEEM, and HVDM show considerable ranking accuracy compared to HEGM and HELM. Therefore, a cloud service selection approach that uses HSM to rank cloud services is more suitable than approaches that consider only quantitative QoS attributes.


The emergence of cloud service e-marketplaces such as AppExchange, SaaSMax, and Google Play Store as one-stop shops for the demand and supply of SaaS applications further contributes to the popularity of cloud computing as a preferred means of provisioning and purchasing cloud-based services. However, existing cloud e-marketplaces do not consider users’ QoS requirements, and search results are presented as an unordered list of icons, making it difficult for users to discriminate among the services shown. Moreover, existing cloud service ranking approaches assume that cloud services are characterised only by quantitative QoS attributes. The main objective of this paper is to extend existing approaches by ranking cloud services in accordance with user requirements while considering the heterogeneous nature of QoS attributes. We demonstrated the plausibility of applying heterogeneous similarity metrics in ranking cloud services and evaluated the performance of five heterogeneous similarity metrics (two known metrics and three new metrics) using rankings produced by human judgement as a benchmark. The experimental results show that the QoS rankings obtained from HEOM, HEEM and HVDM correlate more closely with human similarity assessments than those of the other heterogeneous similarity metrics used in this study. This confirms the suitability of heterogeneous similarity metrics for QoS-based ranking of cloud services with respect to the user’s QoS requirements in the context of a cloud service e-marketplace. Although we have used only one user’s QoS requirements as an example to describe the scenario of QoS-based ranking of cloud services, similar studies can be performed using a variety of user QoS requirements and QoS datasets to further validate the results obtained in this paper.
In the near future, the proposed heterogeneous similarity metrics will be integrated into a holistic framework for cloud service selection, and more experimental evaluations will be performed to ascertain the user experience of the metrics proposed to rank and select cloud services in a cloud service e-marketplace.








HEEM: Heterogeneous Euclidean-Eskin Metric

HEGM: Heterogeneous Euclidean Goodall Metric

HELM: Heterogeneous Euclidean-Lin Metric

HEOM: Heterogeneous Euclidean-Overlap Metric

HSM: Heterogeneous Similarity Metric

HVDM: Heterogeneous Value Difference Metric

QoS: Quality of Service

SMI: Service Measurement Index


  1. Rimal BP, Jukan A, Katsaros D, Goeleven Y (2011) Architectural requirements for cloud computing systems: an Enterprise cloud approach. J Grid Comput 9:3–26.

  2. Ezenwoke A, Omoregbe N, Ayo CK, Sanjay M (2013) NIGEDU CLOUD: model of a national e-education cloud for developing countries. IERI Procedia 4:74–80.

  3. Buyya R, Yeo CS, Venugopal S (2008) Market-oriented cloud computing. IEEE, pp 5–13

  4. Fortiş TF, Munteanu VI, Negru V (2012) Towards a service friendly cloud ecosystem. In: Proceedings - 2012 11th international symposium on parallel and distributed computing, ISPDC 2012, pp 172–179

  5. Townsend C, Kahn BE (2014) The “visual preference heuristic”: the influence of visual versus verbal depiction on assortment processing, perceived variety, and choice overload. J Consum Res 40:993–1015.

  6. Chernev A, Böckenholt U, Goodman J (2012) Choice overload: a conceptual review and meta-analysis. J Consum Psychol 25:333–358.

  7. Toffler A (1970) The future shock. Amereon Ltd. ISBN: 0553277375, New York

  8. Alrifai M, Skoutas D, Risse T (2010) Selecting skyline services for QoS-based web service composition. In: Proceedings of the 19th international conference on world wide web - WWW’10. ACM, p 11

  9. Ezenwoke A, Daramola O, Adigun M (2017) Towards a visualization framework for service selection in cloud E-marketplaces. In: Proceedings - 2017 IEEE 24th international conference on web services, ICWS 2017

  10. Ezenwoke AA (2018) Design of a QoS-based framework for service ranking and selection in cloud e-marketplaces. Asian J Sci Res 11:1–11

  11. Chen X, Zheng Z, Liu X et al (2013) Personalized QoS-aware web service recommendation and visualization. IEEE Trans Serv Comput 6:35–47.

  12. Abdelmaboud A, Jawawi DNA, Ghani I et al (2015) Quality of service approaches in cloud computing: a systematic mapping study. J Syst Softw 101:159–179.

  13. CSMIC (2014) Service measurement index framework version 2.1 introducing the service measurement index (SMI). Accessed 3 Feb 2018

  14. Garg SK, Versteeg S, Buyya R (2011) SMICloud: a framework for comparing and ranking cloud services. In: Proceedings - 2011 4th IEEE international conference on utility and cloud computing, UCC 2011. IEEE, pp 210–218

  15. Soltani S, Asadi M, Gašević D et al (2012) Automated planning for feature model configuration based on functional and non-functional requirements. Proc 16th Int Softw Prod Line Conf:56–65.

  16. Gui Z, Yang C, Xia J et al (2014) A service brokering and recommendation mechanism for better selecting cloud services. PLoS One 9.

  17. Mirmotalebi R, Ding C, Chi CH (2012) Modeling user’s non-functional preferences for personalized service ranking. In: Liu C, Ludwig H, Toumani F, Yu Q (eds) Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics). Springer, Berlin Heidelberg, pp 359–373

  18. ur Rehman Z, Hussain FK, Hussain OK (2011) Towards Multi-criteria Cloud Service Selection. In: 2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing. IEEE, pp 44–48

  19. Zheng Z, Wu X, Zhang Y et al (2013) QoS ranking prediction for cloud services. IEEE Trans Parallel Distrib Syst 24:1213–1222.

  20. Qu L, Wang Y, Orgun MA et al (2014) Context-aware cloud service selection based on comparison and aggregation of user subjective assessment and objective performance assessment. In: Proceedings - 2014 IEEE international conference on web services, ICWS 2014, pp 81–88

  21. Saripalli P, Pingali G (2011) MADMAC: multiple attribute decision methodology for adoption of clouds. In: Proceedings - 2011 IEEE 4th international conference on CLOUD computing, CLOUD 2011. IEEE, pp 316–323

  22. He Q, Han J, Yang Y et al (2012) QoS-driven service selection for multi-tenant SaaS. In: Proceedings - 2012 IEEE 5th international conference on CLOUD computing, CLOUD 2012. IEEE, pp 566–573

  23. Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34.

  24. Zhai X, Peng Y, Xiao J (2013) Heterogeneous metric learning with joint graph regularization for cross-media retrieval. Twenty-Seventh AAAI Conf Artif Intell Heterog:1198–1204

  25. Menychtas A, Vogel J, Giessmann A et al (2014) 4CaaSt marketplace: an advanced business environment for trading cloud services. Futur Gener Comput Syst 41:104–120.

  26. Khadka R, Saeidi A, Jansen S, et al (2011) An evaluation of service frameworks for the management of service ecosystems. In: Pacific Asia conference on information systems (PACIS) 2011 proceedings. P paper 93

  27. Vigne R, Mach W, Schikuta E (2013) Towards a smart web service marketplace. In: Proceedings - 2013 IEEE international conference on business informatics, IEEE CBI 2013. IEEE, pp 208–215

  28. Garg SK, Versteeg S, Buyya R (2013) A framework for ranking of cloud computing services. Futur Gener Comput Syst 29:1012–1023.

  29. Tajvidi M, Ranjan R, Kolodziej J, Wang L (2014) Fuzzy cloud service selection framework. In: 2014 IEEE 3rd international conference on cloud networking, CloudNet 2014. IEEE, pp 443–448

  30. Ayeldeen H, Shaker O, Hegazy O (2015) Distance similarity as a CBR technique for early detection of breast Cancer: an Egyptian case study similarity measure. In: Information Systems Design and Intelligent Applications, pp 449–456

  31. Boriah S, Chandola V, Kumar V (2008) Similarity measures for categorical data: a comparative evaluation. In: Proceedings of the 2008 SIAM international conference on data mining, pp 243–254

  32. Batchelor BG (1977) Pattern recognition: ideas in practice. Springer US

  33. Stanfill C, Waltz D (1986) Toward memory-based reasoning. Commun ACM 29:1213–1228.

  34. Jung G, Mukherjee T, Kunde S et al (2013) CloudAdvisor: A recommendation-as-a-service platform for cloud configuration and pricing. In: Proceedings - 2013 IEEE 9th world congress on SERVICES, SERVICES 2013. IEEE, pp 456–463

  35. He Q, Han J, Yang Y et al (2012) QoS-driven service selection for multi-tenant SaaS. In: Proceedings - 2012 IEEE 5th international conference on CLOUD computing, CLOUD 2012, pp 566–573

  36. Eskin E, Arnold A, Prerau M et al (2002) A geometric framework for unsupervised anomaly detection. Springer, Boston, pp 77–101

  37. Lin D (1998) An information-theoretic definition of similarity. In: Proceedings of the 15th international conference on machine learning

  38. Goodall DW (1966) A new similarity index based on probability. Biometrics 22:882.

  39. Alkalbani AM, Ghamry AM, Hussain FK, Hussain OK (2015) Blue pages: software as a service data set. In: 2015 10th international conference on broadband and wireless computing, Communication and Applications (BWCCA). IEEE, pp 269–274

  40. Sundareswaran S, Squicciarini A, Lin D (2012) A brokerage-based approach for cloud service selection. In: CLOUD computing (CLOUD), 2012 IEEE 5th international conference on. IEEE, pp 558–565

  41. Le S, Dong H, Hussain FK et al (2014) Multicriteria decision making with fuzziness and criteria interdependence in cloud service selection. In: IEEE International Conference on Fuzzy Systems, pp 1929–1936

  42. Karim R, Ding C, Miri A (2013) An End-to-End QoS Mapping Approach for Cloud Service Selection. In: 2013 IEEE ninth world congress on services. IEEE, pp 341–348

  43. Sun L, Ma J, Zhang Y et al (2016) Cloud-FuSeR: fuzzy ontology and MCDM based cloud service selection. Futur Gener Comput Syst 57:42–55.

  44. Li A, Yang X, Kandula S, Zhang M (2010) CloudCmp: comparing public cloud providers. Proc 10th Annu Conf internet Meas - IMC’10 1.

  45. Schad J, Dittrich J, Quiané-Ruiz J-A (2010) Runtime measurements in the cloud. Proc VLDB Endow 3:460–471.

  46. Iosup A, Yigitbasi N, Epema D (2011) On the performance variability of production cloud services. In: 2011 11th IEEE/ACM international symposium on cluster, Cloud and Grid Computing. IEEE, pp 104–113

  47. Rehman ZU, Hussain OK, Hussain FK (2014) Parallel cloud service selection and ranking based on QoS history. Int J Parallel Prog 42:820–852.

  48. Service Measurement Index (SMI) Measures Definitions (2014) Available at:

  49. Feuless S (2016) The cloud service evaluation handbook: how to choose the right service, CreateSpace Independent Publishing Platform

  50. Feuless S (2016) Sample Scored Framework (2014): available at

  51. Nielsen J (2006) Quantitative Studies: How Many Users to Test? Available at:

  52. Tiihonen J, Felfernig A (2010) Towards recommending configurable offerings. Int J Mass Cust 3:389.


This research was funded by the Landmark University Centre for Research and Development (LUCERD), and Covenant University Centre for Research, Innovation and Discovery (CUCRID). The Cape Peninsula University of Technology provided support during the revision stage of this paper.

Availability of data and materials

The simulated QoS dataset used for this study can be found at

Author information

Authors and Affiliations



AE designed and conducted the experiments, performed the statistical analysis, and prepared the initial draft of the manuscript. OD significantly revised and rewrote fundamental portions of the manuscript, as well as contributed to the data collection methodology of the simulation experiment. MA formulated the research hypothesis, provided the guideline for the design of the experiments and contributed in writing to the revised manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Azubuike Ezenwoke.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article


Cite this article

Ezenwoke, A., Daramola, O. & Adigun, M. QoS-based ranking and selection of SaaS applications using heterogeneous similarity metrics. J Cloud Comp 7, 15 (2018).



  • Cloud service selection
  • E-marketplace
  • QoS
  • SaaS
  • Similarity metrics