Journal of Cloud Computing

Advances, Systems and Applications

Open Access

A survey on securing the virtual cloud

Journal of Cloud Computing: Advances, Systems and Applications 2013, 2:17

https://doi.org/10.1186/2192-113X-2-17

Received: 3 June 2013

Accepted: 30 October 2013

Published: 6 November 2013

Abstract

The paper presents a survey and analysis of the current security measures implemented in cloud computing and the hypervisors that support it. The viability of an efficient virtualization layer has led to explosive growth in the cloud computing industry, exemplified by Amazon’s Elastic Compute Cloud, Apple’s iCloud, and Google’s Cloud Platform. However, the growth of any sector in computing often leads to increased security risks. This paper explores these risks and the evolution of mitigation techniques in open source cloud computing. Unlike uniprocessor security, the use of a large number of nearly identical processors acts as a vulnerability amplifier: a single vulnerability is replicated thousands of times throughout the computing infrastructure. Currently, the community is employing a diverse set of techniques in response to the perceived risk. These include malware prevention and detection, secure virtual machine managers, and cloud resilience. Unfortunately, this approach results in a disjoint response based more on detection of known threats than on mitigation of new or zero-day threats, which are often left undetected. An alternative way forward is to combine the strengths of each technique with a focus on increasing attacker workload. This approach would make malicious operation time consuming and deny persistence on mission time-scales. It could be accomplished by incorporating migration, non-determinism, and resilience into the fabric of virtualization.

Keywords

Vulnerability amplifier; Malware prevention and detection; Secure virtual machine managers; Cloud resilience; Zero-day; Increasing attacker workload; Virtual machine; View comparison-based malware detection

Introduction

Virtualization of servers in the cloud operates by adding a new layer to the software stack known as the hypervisor [1] or Virtual Machine Monitor (VMM) [2]. The hypervisor encapsulates the hardware, allowing it to be used by multiple operating system instances concurrently. This flexibility, coupled with the cost and performance advantages of sharing the underlying hardware, has revolutionized the computing industry: large numbers (i.e. hundreds of thousands) of generic hardware platforms, using multi-core blade technology, are now coupled through high-performance networking to produce a generic computing surface. Any subset of this collection can be combined to operate in tandem for a particular application using a multitude of operating systems.

Conceptually, the hypervisor presents a virtual machine abstraction that restricts malicious code, executing within one instance of an operating system, from affecting a different instance. Unfortunately, hypervisors have introduced their own security challenges: adversaries now actively attempt to detect the presence of an operating hypervisor in order to tailor attacks accordingly [3]. A wide range of hypervisor detection techniques has already appeared against popular systems such as VMware, VirtualPC, Bochs, Hydra, Xen, and QEMU [4]. Often, these techniques operate by exploiting timing differences between virtualized and non-virtualized operations [5]. Alternatively, they detect unusual memory locations associated with key operating system data structures [6]. For example, the Red Pill technique uses the x86 SIDT instruction to determine the location in memory of the interrupt descriptor table; a machine running above a hypervisor will return a location much higher in memory than one that is not [7]. Following hypervisor detection, the adversary then attacks the operating system, the virtual switch (vSwitch) sharing network connectivity between virtual machines, or the hypervisor itself [8].
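The decision step of a Red Pill style check can be sketched as follows. This is a minimal illustration, not the original tool: Python cannot execute SIDT, so the 6-byte result is passed in as data, the sample values are made up, and the threshold on the top byte of the table address is an assumption chosen for illustration that is known to be unreliable on multi-core and 64-bit systems.

```python
import struct

# Assumed threshold on the top byte of the IDT base address. It is used here
# purely for illustration and is not reliable on modern hardware.
RELOCATED_IDT_TOP_BYTE = 0xD0


def idt_base_from_sidt(sidt_bytes: bytes) -> int:
    """Decode a 32-bit SIDT result: a 2-byte limit followed by a 4-byte base."""
    if len(sidt_bytes) != 6:
        raise ValueError("a 32-bit SIDT result is exactly 6 bytes")
    _limit, base = struct.unpack("<HI", sidt_bytes)
    return base


def looks_virtualized(sidt_bytes: bytes) -> bool:
    """Return True when the IDT base sits unusually high in memory."""
    top_byte = idt_base_from_sidt(sidt_bytes) >> 24
    return top_byte > RELOCATED_IDT_TOP_BYTE


# Made-up sample values, not measurements from any real system.
native_sample = struct.pack("<HI", 0x07FF, 0x8003F400)
guest_sample = struct.pack("<HI", 0x07FF, 0xFFC18000)

print(looks_virtualized(native_sample), looks_virtualized(guest_sample))
```

The sketch also shows why the signature-based defenses discussed later in this survey can target the technique: the whole check hinges on one unprivileged instruction.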

The presence of a hypervisor has no impact on the vulnerabilities associated with the operating system. As a result, any exploit that leverages a known vulnerability will still operate successfully [9]. Although a remote exploit gives the adversary control of only a single virtual machine, packaging the exploit in a virus could compromise the entire cloud. It is this vulnerability amplification that poses the most significant threat to the future of cloud computing.

Direct attacks against a vSwitch may undermine the operation of multiple virtual machines on a single host by denying connectivity to all of them simultaneously. The vSwitch provides the same functionality as a physical switch and in consequence exhibits the same vulnerabilities, enabling the same exploits [10]. For example, Address Resolution Protocol (ARP) spoofing involves the interception of valid network packets by sending fake ARP packets to a switch [11].
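The weakness that ARP spoofing exploits can be illustrated with a toy address table. The sketch below is a simplified model under stated assumptions: `ArpCache`, its methods, and the addresses are hypothetical names invented for the example, and no real switch or network stack is involved.

```python
class ArpCache:
    """Toy IP-to-MAC table that, like classic ARP, trusts any reply it sees."""

    def __init__(self):
        self.table = {}
        self.conflicts = []

    def on_reply(self, ip: str, mac: str) -> None:
        previous = self.table.get(ip)
        if previous is not None and previous != mac:
            # A changed mapping is a common symptom of spoofing worth logging.
            self.conflicts.append((ip, previous, mac))
        self.table[ip] = mac  # unauthenticated overwrite: the core weakness

    def next_hop(self, ip: str) -> str:
        return self.table[ip]


cache = ArpCache()
cache.on_reply("10.0.0.1", "aa:aa:aa:aa:aa:01")  # legitimate gateway
cache.on_reply("10.0.0.1", "ee:ee:ee:ee:ee:99")  # forged reply from an attacker

print(cache.next_hop("10.0.0.1"))  # traffic now goes to the attacker's address
print(cache.conflicts)
```

Because the forged reply simply overwrites the earlier entry, every virtual machine that resolves the gateway through this table would send its packets to the attacker.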

Hypervisor attacks involve the direct exploitation of vulnerabilities in the hypervisor. All virtual machines executing on a hypervisor have distinct data structures, separated in hardware. This separation forms a semantic gap [12] that prevents virtual machines from having visibility or impact upon each other’s data structures [13]. Direct Kernel Structure Manipulation (DKSM) bridges the semantic gap by patching virtual machine data structures and redirecting hypervisor accesses to shadow copies. This allows the virtual machine to present false information to the hypervisor regarding the virtual machine state, allowing implants, such as rootkits [14], to persist without detection.

Virtualization provides inherent redundancy and appears to provide robust, large-scale, cost-effective availability of shared resources [15]. However, this perception is tempered by the known risk of vulnerability amplification and the paucity of knowledge regarding zero-day exploitation in clouds: history has shown that lack of detection does not imply lack of infection. Current mitigation techniques reviewed by this paper have already evolved based on malware detection and prevention, secure virtual machine managers, and cloud resilience. These three categories and their roles in preventing an attacker from gaining access to the cloud are illustrated in Figure 1. Omitted from Figure 1 are cloud services that provide authentication, such as Lightweight Directory Access Protocol (LDAP) servers, and trusted computing techniques, as they are outside the scope of this survey. Initially, the attacker has to overcome or bypass the intrusion detection and prevention systems typically employed at the cloud boundary. They are then faced with a secure hypervisor, usually installed on a single host, whose purpose is to restrict access to kernel and hypervisor data structures. Finally, cloud resilience is used by a host to restore a single compromised or failed virtual machine to a known good state. Although not currently prevalent throughout the industry, hypervisors offer the opportunity to restrict the attacker’s access to the base of the software stack. Because the number of vulnerabilities is typically related directly to the number of source lines of code [16], a small hypervisor would allow tight control of the hardware and allow operating system designers to build successive layers on a secure base of trust. The small size of the hypervisor also opens the door to formal reasoning concerning its security properties [17]. Unfortunately, these ideas have yet to be cohesively integrated and their impact upon security quantified.
In the sections that follow we explore the building blocks that are available for improving cloud security and assess them on the basis of their performance impact, ability to reduce the attack surface, detect known and zero-day threats, resolve detected threats, and increase attacker workload by denying either surveillance or persistence.
Figure 1

The three cloud security techniques reviewed by this paper: intrusion detection & prevention, secure hypervisors, and virtual machines.

Threat model

The security implementations analyzed in this survey address the threat model for intrusions employing remote control outlined in Figure 2. An intrusion may involve several steps, including surveillance to determine if a vulnerability exists [18], use of an appropriate exploit or other access method [18], privilege escalation [19], removal of exploit artifacts, and hiding of behavior [14]. Surveillance may involve obtaining a copy of the binary code and using reverse engineering [20, 21] or fuzzing [22] to facilitate a broad range of attack vectors, including return oriented programming [23]. The implant then persists for a time sufficient to carry out some malicious effect, obtain useful information, or propagate the intrusion to other systems.
Figure 2

The threat model, detailing the process from surveillance to exploitation in the cloud.

Unlike the time to execute an exploit, the time spent in surveillance and persistence may range from minutes to months or even years depending upon the intended effect. Moreover, the presence of an intrusion may never be detected by network defenses; instead, it may be recognized indirectly, either through a deviation from expected behavior or through intelligence sources.

Nevertheless, each cloud security technique represents an integral building block in the multilayered defense of the cloud. Malware detection and prevention systems are the initial line of defense in preventing an attacker from gaining a foothold on a cloud. Secure hypervisors present a hardened code base that restricts access to hardware to all but the most privileged operations. Lastly, cloud resilience solutions protect against unknown exploits, which may otherwise allow an attacker to operate on a cloud indefinitely.

Malware detection and prevention

Malware detection was one of the first techniques implemented after the introduction of hypervisors. To achieve this, researchers paired the proven technology of Intrusion Detection Systems (IDS) with the ability to hide in a virtual machine. In this scenario, the IDS still performs the same function of identifying patterns of malicious behavior on a system that may be compromised [24]; for example, a proof of concept based on the Snort IDS successfully prevented a Distributed Denial of Service (DDoS) attack [25]. This implementation installed a virtual machine that ran Snort on top of the VMware hypervisor to monitor network traffic to all guest virtual machines attached to a virtual switch. Once running, the IDS dealt with DDoS attacks in two steps: initially, attacking computers were blocked by Snort; subsequently, the virtual server automatically moved the application under attack to a new location in the cloud. This demonstrated that an IDS can function inside the cloud; however, the implementation was just as vulnerable to zero-day attacks as non-virtualized IDSs [26]: attacks were missed due to IDS configuration and the failure of signatures to detect new attacks.

The Hybrid Virtual IDS is a solution that leverages the strengths of the cloud and improves upon the previous Snort implementation [27]. The approach combines the resilience of a virtual IDS with the versatility offered by a host-based IDS. This is possible through the use of integrity checking [28] and system call trace analysis [29]. Integrity checking is a static detection process in which a changed file is compared to a gold standard to determine if the change is malicious. System call trace analysis dynamically flags anomalous system call behavior as potentially dangerous. These two approaches are implemented inside a virtual machine to provide an isolated environment. A custom hypervisor is then used to ensure the isolation between all virtual machines. To provide functionality to the IDS, the hypervisor has hooks that allow the inspection of other guest virtual machines running on the hypervisor. This allows the hybrid virtual IDS to remain isolated from other running virtual machines, while still allowing it to access data from the virtual machines it is monitoring. This technique performed well in testing conducted by the authors of the Hybrid Virtual IDS, but returned unexpected performance results: as the interval between inspections of the monitored virtual machine decreased, the workload processing time did not increase linearly as expected and instead became erratic. The cause of this erratic performance remains open to additional research.
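The two detection styles described above can be sketched in a few lines. This is a minimal illustration and not the Hybrid Virtual IDS implementation: the use of SHA-256, the fixed window length of three system calls, and all function names and sample data are assumptions made for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def changed_files(gold: dict, current: dict) -> list:
    """Static check: paths whose content hash differs from the gold standard."""
    return sorted(
        path for path, data in current.items()
        if gold.get(path) != digest(data)
    )


def windows(trace: list, size: int) -> set:
    """All contiguous system call sequences of the given length."""
    return {tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)}


def anomalous_windows(normal_traces: list, observed: list, size: int = 3) -> set:
    """Dynamic check: sequences never seen in the recorded normal behavior."""
    known = set()
    for trace in normal_traces:
        known |= windows(trace, size)
    return windows(observed, size) - known


# Made-up sample data for illustration only.
gold = {"/bin/ls": digest(b"original ls"), "/bin/ps": digest(b"original ps")}
current = {"/bin/ls": b"original ls", "/bin/ps": b"trojaned ps"}
normal = [["open", "read", "close", "open", "read", "close"]]
observed = ["open", "read", "execve", "close"]

print(changed_files(gold, current))
print(anomalous_windows(normal, observed))
```

In the hybrid design both checks would run from the isolated monitoring virtual machine, reading the monitored guest through the hypervisor hooks rather than from inside the guest itself.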

With the introduction of a hypervisor and a virtualized IDS, it was only a matter of time before firewalls were moved into the cloud. One of these virtual firewall implementations is VMwall [30], which runs in the privileged virtual machine that controls the Xen hypervisor and uses virtual machine introspection [31]. This is the process of inspecting the data structures of a separate virtual machine. To enable this functionality, the Xen hypervisor has added hooks that capture all network connections created by a process. The data pertaining to these connections is then passed to VMwall for analysis. The connection is either allowed or blocked by using a whitelist (a list of approved processes and connection types). To deter false data during introspection, kernel integrity checking [32] is used to verify the state of kernel data structures in the guest virtual machines. This is necessary, as the primary method of inspecting traffic is through these data structures; malicious modification may compromise the monitoring of traffic. However, VMwall may be vulnerable to hijacking of a whitelisted process or an already established connection. The only method of detection against the compromise of an approved process is through the checksumming of the in-memory image of that process. This is performed by ensuring that the hash of a process has not changed from that of one contained in the whitelist. Due to the performance impact of hash analysis, this method is generally not implemented. Hijacking an established connection can be partially prevented through time outs associated with kernel rules contained in the whitelist. To fully prevent this type of compromise, deep packet inspection could be used, but is not currently employed by VMwall. Importantly, the employed introspection techniques cause a minimal performance impact: the additional overhead is 7% for file transfers from hypervisor to guest and 1% for file transfers from a guest to the hypervisor. 
Added overhead for Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) connections is negligible; the increases are measured in microseconds.
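The whitelist decision, and the hijacking weakness that the optional hash check addresses, can be sketched as follows. The classes, field names, and rule format are hypothetical and chosen only for illustration; VMwall's actual data structures are not described at this level of detail in the survey.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    process: str      # process name recovered through introspection
    protocol: str     # "tcp" or "udp"
    remote_port: int
    image: bytes      # toy stand-in for the in-memory image of the process


def image_hash(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()


class Whitelist:
    def __init__(self):
        # (process, protocol, port) -> hash of the approved process image
        self.rules = {}

    def approve(self, process, protocol, remote_port, image):
        self.rules[(process, protocol, remote_port)] = image_hash(image)

    def decide(self, conn: Connection, verify_hash: bool = False) -> str:
        expected = self.rules.get((conn.process, conn.protocol, conn.remote_port))
        if expected is None:
            return "block"
        if verify_hash and image_hash(conn.image) != expected:
            return "block"  # approved name, but the code in memory has changed
        return "allow"


wl = Whitelist()
wl.approve("sshd", "tcp", 22, b"sshd-code")
hijacked = Connection("sshd", "tcp", 22, b"sshd-code+implant")

print(wl.decide(hijacked))                    # name-only check lets it through
print(wl.decide(hijacked, verify_hash=True))  # hash check catches the change
```

The `verify_hash` flag mirrors the trade-off noted above: the stronger check exists, but its cost means it is generally left disabled.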

An alternative to detection techniques, such as VMwall and the hybrid IDS, is prevention. One security appliance that performs prevention is Malaware, which is designed to stop malware that tailors attacks upon detection of a hypervisor [33]. To deter this initial identification of a virtual environment, a signature based method is used. In this instance, a signature is an instruction that should not be executed by an unprivileged process. As an example, when a process such as Red Pill attempts to run the SIDT instruction, it will be flagged as malicious. However, as the authors of Malaware have stated, a signature based approach is only effective against known types of malware. To combat zero-day threats, two behavior based approaches that utilize dynamic analysis are proposed [34]. The first verifies that execution remains within a process’ own code. This could be accomplished by first learning the current process and its page table base address. With this, it is possible to check if the current instruction register belongs to the process’ code pages. If this mapping does not exist, Malaware could flag the process as malicious. The second dynamic analysis method suggested is taint tracking. Changes to the system, otherwise known as taint, are created when a process modifies any code or memory location. Accordingly, when taint is created in monitored locations, the offending process is immediately flagged as malicious. An added benefit of taint tracking is that it defeats malicious code that has been transformed to look harmless, a practice known as code obfuscation [35]. Once loaded into a monitored region, the obfuscated code is immediately marked as tainted and the associated process is flagged as malicious. Unfortunately, only the signature based piece of the detection has been implemented and no data relating to added overhead has been collected. However, the initial detection results were promising, with a malware detection rate of 76%.
Lastly, it is important to note that techniques that alter known memory states, such as address space layout randomization (ASLR), may increase the difficulty of this type of taint tracking [36].
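The signature rule and the taint rule described for Malaware can be sketched together in a toy monitor. Everything here is an assumption made for illustration: the `Monitor` class, the event callbacks, the address ranges, and the signature list (only SIDT is named in the text; the other two instructions are added as plausible examples).

```python
# Illustrative signature list; only SIDT is taken from the text above.
SIGNATURES = {"SIDT", "SGDT", "SLDT"}


class Monitor:
    def __init__(self, monitored_regions):
        self.monitored = list(monitored_regions)  # half-open (start, end) ranges
        self.flagged = {}                         # pid -> first reason recorded

    def on_instruction(self, pid: int, mnemonic: str, privileged: bool) -> None:
        """Signature rule: flag listed instructions run by unprivileged code."""
        if not privileged and mnemonic.upper() in SIGNATURES:
            self.flagged.setdefault(pid, f"signature:{mnemonic.upper()}")

    def on_write(self, pid: int, address: int) -> None:
        """Taint rule: flag any write that lands in a monitored region."""
        for start, end in self.monitored:
            if start <= address < end:
                self.flagged.setdefault(pid, f"taint:{hex(address)}")
                return


monitor = Monitor([(0x1000, 0x2000)])
monitor.on_instruction(7, "sidt", privileged=False)  # Red Pill style probe
monitor.on_instruction(8, "mov", privileged=False)   # ordinary instruction
monitor.on_write(9, 0x1800)                          # write into monitored range
monitor.on_write(10, 0x3000)                         # write elsewhere

print(monitor.flagged)
```

The taint rule does not inspect what was written, which is why obfuscated code is flagged just as readily as recognizable code. It also shows the ASLR caveat: the rule only works if the monitored ranges are known in advance.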

Another prevention method, guest view casting [37], moves malware prevention from the guest virtual machine to the hypervisor. This approach reconstructs the data structures of the guest for analysis at the hypervisor level. This is achieved by translating guest virtual memory addresses to physical memory addresses, then reading the raw data from the guest’s virtual hard drive. The reassembled state in the hypervisor can then be compared to the guest’s state as reported by viewing tools inside the guest, such as Windows Task Manager and a memory dump, that display all processes in memory. The presence of discrepancies between the two states may indicate the existence of malware in the guest. The authors have labeled this method of searching for discrepancies between states view comparison-based malware detection. An outgrowth of this method is to use anti-virus software to scan the guest’s state from inside the hypervisor. Anti-virus software run outside of the guest was shown to identify malware more effectively than anti-virus software running inside the virtual machine. Additionally, the performance of anti-virus software is improved outside of the virtual machine. The primary drawback to this approach is the assumption that the hypervisor has not been compromised. The authors acknowledge that malicious code that targets the hypervisor [38] can compromise their approach.
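At its core, view comparison reduces to a set difference between two listings of the same state. The sketch below assumes that both views have already been reduced to sets of process names; the function names and sample data are hypothetical.

```python
def compare_views(hypervisor_view, guest_view) -> dict:
    """Report entries that appear in only one of the two views."""
    hypervisor_view, guest_view = set(hypervisor_view), set(guest_view)
    return {
        # Present in reconstructed memory but not reported inside the guest.
        "hidden_from_guest": sorted(hypervisor_view - guest_view),
        # Reported inside the guest but absent from reconstructed memory.
        "missing_from_memory": sorted(guest_view - hypervisor_view),
    }


def is_suspicious(report: dict) -> bool:
    return any(report.values())


# Made-up process lists for illustration.
seen_by_hypervisor = {"init", "sshd", "rk.exe"}
seen_by_guest_tools = {"init", "sshd"}

report = compare_views(seen_by_hypervisor, seen_by_guest_tools)
print(report, is_suspicious(report))
```

A process that hides itself from in-guest tools shows up under `hidden_from_guest`, which is the discrepancy the method relies on.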

Although detection and prevention are important, the last two decades have demonstrated that it is unlikely that malware can be eliminated completely [39]. Security researchers, in an attempt to understand these attacks, have to rely on system logs that lack integrity [40] and are often incomplete [41]. The ReVirt IDS [42], which runs on UMLinux [43], was created in an attempt to improve upon these inadequacies. This is accomplished by logging all of the relevant system level information needed to replay, instruction by instruction, what transpired on a specific virtual machine. This allows administrators to determine all the relevant information pertaining to an attack. The overhead of performing these functions is 13-58% for kernel tasks and up to 8% for logging tasks.
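The record-and-replay idea can be illustrated with a toy workload. This sketch rests on a simplifying assumption that is not stated in the survey: that the guest is deterministic apart from its external inputs, so logging those inputs is enough to reproduce a run. The class names and the workload are hypothetical.

```python
import random


class Recorder:
    """Passes inputs through to the guest while appending them to a log."""

    def __init__(self, source):
        self.source = source
        self.log = []

    def next_input(self):
        value = self.source()
        self.log.append(value)
        return value


class Replayer:
    """Feeds a previously recorded log back in the original order."""

    def __init__(self, log):
        self._events = iter(log)

    def next_input(self):
        return next(self._events)


def guest_workload(io, steps: int = 5) -> int:
    """Deterministic computation whose result depends only on its inputs."""
    state = 0
    for _ in range(steps):
        state = (state * 31 + io.next_input()) % 1_000_003
    return state


recorder = Recorder(lambda: random.randint(0, 255))
live_result = guest_workload(recorder)
replayed_result = guest_workload(Replayer(recorder.log))

print(live_result == replayed_result)
```

Replaying the log reproduces the live run exactly, which is what lets an administrator examine an attack after the fact without relying on logs kept inside the compromised guest.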

Secure virtual machine managers

Hypervisors have afforded researchers new security capabilities. However, the hypervisor itself has come under attack as a way of gaining control of a system [44]. This has led to the introduction of secure hypervisors that reduce the attack surface and increase reliability by reducing the number of lines of code [16]. sHype [45], designed by IBM, increases security by taking the idea of control flow enforcement first seen in SELinux [46] and applying those controls to information flows between virtual machines through a mandatory access control model. SELinux relies on intricate security policies; unfortunately, these make it difficult to guarantee security and can be over 50,000 lines of code [47]. To remove this level of complexity, sHype affords the same control flow protections, but at the hypervisor level and without the need for a policy administrator. These information flows are maintained through the use of a reference monitor that decides which connections between virtual machines to accept and which to deny. The sHype approach creates a flexible architecture, which allows it to support many different security modules [48]. This is accomplished in around 11,000 lines of code; SELinux alone is over 85,000 lines of code.

The performance impact of sHype enforcement policies is less than 1% [45]. However, sHype’s primary shortfall is that it does not completely protect against unauthorized transfer of information between two virtual machines that are not allowed to share information. Figure 3 illustrates the problem: nodes A, B, and C represent three different virtual machines and all are connected to a reference monitor. Virtual machines A and B are not allowed to share information, but both are allowed to share information with virtual machine C. A covert channel is created when virtual machine C acts as an intermediary and passes information between A and B. In this case the reference monitor would not intervene, as it only sees information being transferred from A to C and from C to B. Fortunately, a Chinese wall policy (a set of communication rules) can be added to sHype to protect against this covert channel [49]. In this case, the rule would only allow two of the three virtual machines to run at any one time. However, this method has the drawback of causing a decrease in performance of up to 9.1% [50]. This performance impact can be mitigated by performing Chinese wall policy checks at virtual machine creation and then caching these decisions. Since policy changes are infrequent, this configuration reduces the performance impact to less than 1% [51].
Figure 3

An example of a covert channel, where node A transfers information to node B, through the intermediary node C.
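The covert channel in Figure 3 and the effect of the Chinese wall rule can be sketched with a small flow graph. This is an illustrative model only: the policy encoding, the function names, and the breadth-first search are assumptions made for the example and do not reflect sHype's internal design.

```python
from collections import deque

# Direct flows the policy permits. There is deliberately no A <-> B entry.
ALLOWED = {("A", "C"), ("C", "A"), ("B", "C"), ("C", "B")}


def reference_monitor(src: str, dst: str) -> bool:
    """Pairwise check: this is all the reference monitor sees per transfer."""
    return (src, dst) in ALLOWED


def indirect_path(src: str, dst: str, running: set):
    """Breadth-first search over permitted flows among the running VMs."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        for a, b in sorted(ALLOWED):
            if a == path[-1] and b in running and b not in seen:
                if b == dst:
                    return path + [b]
                seen.add(b)
                queue.append(path + [b])
    return None


def chinese_wall_permits(running: set, conflict_set=frozenset("ABC"), limit=2) -> bool:
    """Rule from the text: at most two of the three VMs may run at once."""
    return len(set(running) & conflict_set) <= limit


print(reference_monitor("A", "B"))               # direct flow is denied
print(indirect_path("A", "B", {"A", "B", "C"}))  # but a path through C exists
print(chinese_wall_permits({"A", "B", "C"}))     # the wall refuses this set
print(indirect_path("A", "B", {"A", "B"}))       # no path once C cannot run
```

Every hop on the indirect path passes the pairwise check, so the channel is invisible to the reference monitor. The Chinese wall closes it by refusing to run all three machines together, and because the check depends only on the set of running machines, its result can be computed at virtual machine creation and cached.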

A different direction from control flow enforcement is taken by the noHype hypervisor [52]. This minimalist approach removes as much as possible from the hypervisor; unfortunately, no published numbers for lines of code are available. However, the first prototype was based on a stripped-down version of Xen 4.0, implying that it falls somewhere below 1.6 million lines of code [53]. The code count was reduced by following four rules. First, noHype pre-allocates processor cores and memory to virtual machines. This allows the virtual machine to control its own hardware, which improves performance. Second, each virtual machine is assigned its own I/O device. Being in the cloud, it is assumed that these virtual machines only need network interface cards (NICs). The issue here is that servers have a limited number of NICs. Fortunately, newer NICs take advantage of Single-Root I/O Virtualization [54], which allows them to present themselves as multiple NICs. Thus, each virtual machine on a server is able to receive its own NIC, even if there are more virtual machines than NICs. Third, noHype provides the user with a predefined guest virtual machine in order to control the discovery of hardware. This also prevents a user from uploading a malicious guest virtual machine, which could attack the hypervisor. Lastly, noHype avoids the indirection that occurs through the creation of virtual cores and memory, since cores and memory are assigned directly to each virtual machine. These four principles were tested against a standard Xen 4.0 install, and startup time was reduced by 1% in the noHype implementation. However, noHype loses the ability to perform any introspection of the guest virtual machines, as the hypervisor is limited in functionality. Thus, a virtual machine in the noHype cloud could become infected without noHype being aware of the infection.

Another popular feature of the cloud is live migration of virtual machines [55]. This can be seamlessly accomplished with little downtime thanks to virtualization. However, migrations lose the states maintained by stateful firewalls [56] and IDSs [57]. These states can be maintained using a network security enabled hypervisor (NSE-H) designed on top of the Xen hypervisor [58]. This builds on the concepts used in secure hypervisors, but adds support for secure file transfers. The performance impact of this method is measured in downtime, which is the time a virtual machine is not available during transfer. The cost of securing these migrations is up to a 15% increase in downtime compared with non-secure transfers [58]. This additional downtime occurs for two reasons when maintaining the security context of the virtual machines being migrated. The first is the additional time needed to securely copy a virtual machine’s memory space from one host to another. The second is the NSE-H security additions themselves, which consume additional resources on the system.

Cloud resilience

An often overlooked aspect of cloud computing is resilience, defined as the ability of a system to recover and continue to provide services when a loss of hardware or software occurs [59]. One such system, Cloud Resilience for Windows (CReW) [60], expands the idea of resilience to the security domain through the use of strong security in guest virtual machines [61] and introspection [62]. It is implemented on top of the more than 270,000 lines of code that comprise the kernel-based virtual machine (KVM) hypervisor [63]. This has enabled CReW to effectively prevent attacks from some rootkits and repair any damage they may have caused, but at a cost to performance as the number of virtual machines increases or the security level is raised. At the strict level with three virtual machines, CReW adds a ~48% increase in the time needed for CPU tasks and a ~279% increase in the time required for I/O related tasks. At the paranoid setting, CReW adds a ~116% increase in time for CPU related tasks and a ~347% increase in time for I/O related tasks [60].

A technique that builds upon the ideas presented in CReW and supports other operating systems is hypervisor-based efficient proactive recovery [64]. This approach makes the assumption that no matter what defense is implemented on the cloud, a machine will eventually be maliciously compromised or taken offline. Thus, after particular failure conditions are met, the guest virtual machine is refreshed from a gold standard. A prototype of these concepts was developed using a modified Xen hypervisor [65]. Testing has shown there is a trade-off between throughput and availability. Thus, a user of this method can choose between lower throughput with higher availability or higher throughput with lower availability when faults occur.

The Bear operating system is a minimalist implementation that builds resiliency on top of a secure hypervisor [66]. A key design choice is the strong enforcement of separating core functionality into four layers, which is typical of modern microkernels, like the MINIX operating system [67]. Importantly, the attack surface is reduced: the Bear hypervisor and kernel total 10,903 lines of code, over 50% of which is shared between the two. The size is attributable to a small custom hypervisor and a small custom kernel. Resiliency is derived from non-deterministically refreshing the virtual machines on the hypervisor to a gold standard after a period of time. This refresh is done by starting a second virtual machine from the known valid state and then transferring functionality to it, all while simultaneously tearing down the first virtual machine. By using this method, control is seamlessly transferred between virtual machines without an impact on performance. Also, any known or zero-day malware present on the torn-down virtual machine will not be present on the newly started virtual machine.
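The refresh cycle described above can be sketched as a small simulation. This is not Bear's implementation: the classes, the uniform random delay, and its bounds are hypothetical choices made for illustration, and the hand-over of live service state is reduced to swapping a reference.

```python
import random
from dataclasses import dataclass, field
from itertools import count

_vm_ids = count(1)


@dataclass
class VirtualMachine:
    state: dict
    vm_id: int = field(default_factory=lambda: next(_vm_ids))
    running: bool = True


class RefreshingHost:
    def __init__(self, gold_state: dict, rng: random.Random,
                 min_wait: float = 5.0, max_wait: float = 30.0):
        self.gold_state = dict(gold_state)
        self.rng = rng
        self.min_wait, self.max_wait = min_wait, max_wait
        self.active = VirtualMachine(dict(self.gold_state))

    def next_refresh_delay(self) -> float:
        # Drawn at random so an observer cannot predict when a refresh occurs.
        return self.rng.uniform(self.min_wait, self.max_wait)

    def refresh(self) -> VirtualMachine:
        replacement = VirtualMachine(dict(self.gold_state))  # known valid state
        old, self.active = self.active, replacement          # transfer service
        old.running = False                                  # tear down old VM
        return old


host = RefreshingHost({"kernel": "clean"}, random.Random(0))
host.active.state["implant"] = "rootkit"  # simulate an undetected compromise
retired = host.refresh()

print("implant" in host.active.state, retired.running)
```

No detection step is involved: the implant disappears because the replacement is built from the gold standard, which is why the approach applies equally to known and zero-day malware.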

Comparative analysis

Table 1 presents a summary comparison of the approaches based on reduction of the attack surface, prevention of zero-day threats, and overhead. The “Reduces attack surface” category shows that all of the technologies other than sHype and Bear rely on a large code base. This poses a concern, as demonstrated by the authors of “Reliability Issues in Open Source Software”, who have shown that errors occur at a rate of 0.09 defects per thousand lines of open source code. The problem is worse for closed source systems, with 0.57 defects per thousand lines of code. Although the numbers will naturally vary with the code base, this result indicates that Xen will have 144 defects, KVM 25, and UMLinux 162, while sHype and Bear will each present a single defect. The same rates allow a comparison between open source and closed source software: following the partial unintended release of 300,000 lines of VMware kernel code, that code could contain up to 171 defects, which is more defects than a full install of UMLinux. Admittedly, the sHype and Bear systems are bare minimum installs and have less functionality than the other hypervisors. This has led to the sHype architecture being ported to the Xen hypervisor by the authors of “Building a MAC-Based Security Architecture for the Xen Open-Source Hypervisor”, which has the net effect of increasing both functionality and the potential number of defects. The key takeaway is that a small code size and open source distribution are desirable when attempting to prove a system reliable and secure. However, closed source systems, which are outside the purview of this article, do exist and provide similar security features. Two such commercial hypervisors not reviewed are Citrix XenClient and HyTrust.
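The defect estimates quoted above follow from multiplying code size by the defect rate. The sketch below reproduces that arithmetic; rounding up to whole defects is an assumption, chosen because it matches the quoted figures, and the code sizes are the approximate values used in this survey.

```python
import math
from fractions import Fraction

# Defect rates per thousand lines of code, as quoted in the text.
OPEN_SOURCE_RATE = Fraction(9, 100)     # 0.09
CLOSED_SOURCE_RATE = Fraction(57, 100)  # 0.57

# Approximate code sizes in thousands of lines, as used in this survey.
OPEN_SOURCE_KLOC = {"Xen": 1600, "KVM": 270, "UMLinux": 1800, "sHype": 11, "Bear": 11}
VMWARE_RELEASED_KLOC = 300


def estimated_defects(kloc: int, rate: Fraction) -> int:
    """Expected defects, rounded up (the rounding rule is an assumption)."""
    return math.ceil(Fraction(kloc) * rate)


estimates = {name: estimated_defects(kloc, OPEN_SOURCE_RATE)
             for name, kloc in OPEN_SOURCE_KLOC.items()}
vmware_estimate = estimated_defects(VMWARE_RELEASED_KLOC, CLOSED_SOURCE_RATE)

print(estimates)
print(vmware_estimate)
```

Exact fractions are used instead of floating point so that products which should be whole numbers, such as 1,600 × 0.09, are not pushed over a rounding boundary.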
Table 1

Comparison summary of surveyed systems

| Cloud security implementation | Reduces attack surface (lines of code) | Malware detection | Mitigates zero-day threats | Added overhead (%) |
|---|---|---|---|---|
| Malaware | > 725 K | Yes | No | No data |
| Guest view casting | > 1,600 K | Yes | No | Reduced up to 70% |
| Virtual Snort | > 300 K | Yes | No | No data |
| Hybrid IDS | > 300 K | Yes | No | ~4-36% |
| VMwall | ~ 1,600 K | Yes | No | 1-7% |
| ReVirt | ~ 1,800 K | No | Yes | 8-58% |
| NSE-H | > 1,600 K | No | No | 15% |
| sHype | ~ 11 K | No | No | < 1% |
| sHype with Chinese wall in critical path | > 1,600 K | No | No | 9.1% |
| sHype with Chinese wall outside critical path | > 1,600 K | No | No | < 1% |
| noHype | < 1,600 K | No | No | Reduced up to 1% |
| CReW | > 270 K | Yes | Yes | ~48-347% |
| Hypervisor-based proactive recovery | ~ 1,600 K | Yes | Yes | ~8-12.7% |
| Bear | ~ 11 K | Not applicable | Yes | < 1% |

Evaluating each system under the “Malware detection” and “Mitigates zero-day threats” categories reveals two clear outliers. Malware detection and prevention methods primarily protect against known threats, because of their use of whitelists and signatures. However, ReVirt is the outlier in this category, as it provides capabilities to remove zero-days; unlike its counterparts, it has no ability to detect malware. Secure hypervisors restrict access to the hypervisor but generally provide no malware detection abilities or zero-day prevention. Lastly, resilient systems such as CReW and hypervisor-based proactive recovery have shown promising results in both categories. The model of whitelists and signatures is replaced with restoration upon detection of anomalous system behavior. Thus, both known malware and zero-days are removed from the system when it is restored to a valid state. Unlike malware prevention and detection systems, resilient systems do not prevent the initial compromise from known threats. The outlier in this group is Bear, which makes no attempt to check for anomalous behavior. Instead, it assumes the system will eventually be compromised and therefore refreshes the system non-deterministically. This has the same end result of removing any known or zero-day attacks that may be present, but also invalidates surveillance and prevents persistence. Nevertheless, the effectiveness of resilient systems warrants further research.

The final category of “Added overhead” is important, as no technique should overly impact system performance. Both secure hypervisors and malware prevention and detection schemes can have a minimal impact on performance and in some cases improve it. The larger resilient prototypes such as CReW and hypervisor-based proactive recovery have not yet reached this level of performance. Bear, however, has had a negligible impact on performance when refreshing virtual machines. Research into future resilient system implementations should aim to maintain the performance levels set by intrusion detection and prevention systems, secure hypervisors, and the Bear operating system. This can be achieved by leveraging the proven practices of either adding functionality to the hypervisor, as seen in guest view casting, or reducing the hypervisor footprint, as accomplished by noHype and Bear. Once this performance requirement is met, further capabilities can be added to resilient systems, allowing for the creation of a new cloud security architecture.

Related fields of work

One field of study not included in this survey is trust [68] with regard to unauthorized access to data. One approach to handling trust in data security is the use of security labels in the cloud [69]. The goal of this approach is to isolate customer virtual machines from each other to prevent data leakage across virtual machines. This work is an enhancement of a trusted hypervisor that extends trust to network storage [70]. With regard to privacy, customers are concerned that their personal information will be leaked to those who should not have access to it. A current solution to this problem is the use of encryption with access control [71]. By using public key cryptography in the cloud, users gain assurance that their data is protected and that only they can access it.
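A minimal sketch of label-based isolation is given below. It is not the policy engine of the cited systems; it assumes, for illustration only, that each virtual machine carries a single security label and that two machines may exchange data only when their labels match. The class, function names, and label values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    name: str
    label: str  # hypothetical per-customer security label


def may_communicate(a, b):
    """Allow data exchange only between machines with the same label."""
    return a.label == b.label


def isolation_violations(vms, links):
    """Return the requested (source, destination) links that cross labels.

    `links` is a list of (source_name, destination_name) pairs.
    """
    by_name = {vm.name: vm for vm in vms}
    return [
        (src, dst)
        for src, dst in links
        if not may_communicate(by_name[src], by_name[dst])
    ]


if __name__ == "__main__":
    vms = [
        VirtualMachine("web-a", "tenant-a"),
        VirtualMachine("db-a", "tenant-a"),
        VirtualMachine("web-b", "tenant-b"),
    ]
    requested = [("web-a", "db-a"), ("web-a", "web-b")]
    print(isolation_violations(vms, requested))
```

In this sketch the check is a simple equality test; a real deployment would enforce the labels in the hypervisor and extend them to storage and network resources, as the paragraph above describes.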

Conclusion

All of the techniques reviewed in this paper have produced gains in making cloud computing more secure. Most of the solutions race to the bottom of the software stack to combat known risks rather than unknown zero-day risks. Moreover, it is currently left to the cloud provider to pick from a grab bag of techniques to secure its infrastructure. This has led to a diverse set of approaches in cloud security, each with its own goals. The most successful approaches could be combined to build new cloud infrastructure. A starting point would be the idea of resilience as discussed in this paper. Non-determinism could then be added through process-specific virtual machines. Multiple copies of these machines could refresh some processes in a non-deterministic manner. Lastly, secure migration of processes and whole virtual machines could be added. Combining all these techniques could provide a cloud computing environment that drastically increases attacker workload.

Notice

The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Defense Advanced Research Projects Agency (DARPA) or the U.S. Government.

Declarations

Acknowledgements

This material is based on research sponsored by the Defense Advanced Research Projects Agency (DARPA) under agreement number: FA8750-11-2-0257.

Authors’ Affiliations

(1)
Thayer School of Engineering at Dartmouth College

References

  1. Barham P, Dragovic B, Fraser K, Hand S, Harris T, Ho A, Neugebauer R, Pratt I, Warfield A: “Xen and the Art of Virtualization”. Proceedings of the nineteenth ACM symposium on Operating systems principles 2003, 164–177.
  2. Goldberg RP: A survey of virtual machine research. Proceedings of Computer 7th edition. 1974, 34–45.
  3. Paleari R, Martignoni L, Roglia GF, Bruschi D: A fistful of red-pills: how to automatically generate procedures to detect CPU emulators. Proceedings of the 3rd USENIX conference on Offensive technologies 2009.
  4. Ferrie P: “Attacks on Virtual Machine Emulators”. Symantec Advanced Threat Research; 2006.
  5. Fitzgibbon N, Wood M: Conficker.C: A Technical Analysis. SophosLabs, Sophos Inc; 2009.
  6. Quist D, Smith V: “Detecting the Presence of Virtual Machines Using the Local Data Table”. 2006. http://www.offensivecomputing.net/files/active/0/vm.pdf
  7. Rutkowska J: “Red Pill”. 2006. http://www.hackerzvoice.net/ouah/Red_%20Pill.html
  8. Ibrahim AS, Hamlyn-Harris J, Grundy J: “Emerging Security Challenges of Cloud Virtual Infrastructure”. Proceedings of APSEC Cloud Workshop 2010.
  9. Corregedor M, Von Solms S: “Implementing Rootkits to Address Operating System Vulnerabilities”. Proceeding of Information Security South Africa; 2011:1–8.
  10. Cabuk S, Dalton CI, Edwards A, Fischer A: “A Comparative Study on Secure Network Virtualization”. HP Laboratories: Technical Report HPL-2008–57; 2008.
  11. De Vivo M, De Vivo GO, Isern G: “Internet Security Attacks at the Basic Levels”. 32nd edition. ACM SIGOPS Operating Systems Review; 1998:4–15.
  12. Chen PM, Noble BD: “When virtual is better than real”. Proceedings of the Eighth Workshop on Hot Topics in Operating Systems; 2001:133–138.
  13. Bahram S, Jiang X, Zi W, Grace M, Li J, Srinivasan D, Rhee J, Xu D: “DKSM: subverting virtual machine introspection for fun and profit”. 29th IEEE International Symposium on Reliable Distributed Systems; 2010:82–91.
  14. Hoglund G, Butler J: Rootkits: subverting the Windows kernel. USA: Addison-Wesley Professional; 2005.
  15. Leavitt N: Is Cloud Computing Really Ready for Prime Time? Computer 2009, 42: 15–20.
  16. Pandey RK, Tiwari V: “Reliability issues in open source software”. Proceedings of the International Journal of Computer Applications 34th edition. 2011, 1.
  17. Klein G, Elphinstone K, Heiser G, Andronick J, Cock D, Derrin P, Elkaduwe D, Engelhardt K, Kolanski R, Norrish M, Sewell T, Tuch H, Winwood S: “seL4: formal verification of an OS Kernel”. Proceedings of 22nd ACM Symposium on Operating Systems Principles 2009.
  18. Kennedy D, O’Gorman J, Kearns D, Aharoni M: Metasploit: the penetration tester’s guide. No Starch Press; 2011.
  19. Davi L, Dmitrienko A, Sadeghi AR, Winandy M: Privilege escalation attacks on Android. Information Security, Springer; 2011.
  20. Eagle C: The IDA Pro Book. San Francisco, USA: No Starch Press; 2011.
  21. Eilam E: Reversing. New York, USA: Wiley; 2005.
  22. Forrester JE, Miller BP: “An Empirical Study of the Robustness of Windows NT Applications Using Random Testing”. Seattle: 4th USENIX Windows Systems Symposium; 2000. Appears (in German translation) as “Empirische Studie zur Stabilität von NT-Anwendungen”, iX, September 2000.
  23. Checkoway S, Halderman JA, Feldman AJ, Felten EW, Kantor B, Shacham H: “Can DREs provide long-lasting security? The case of return-oriented programming and the AVC advantage”. Proceedings of the USENIX/ACCURATE/IAVoSS Electronic Voting Technology Workshop 2009. August 2009.
  24. Denning D: “An intrusion-detection model”. Proc IEEE Trans Softw Eng 1987, SE-13(2):222–232.
  25. Bakshi A, Yogesh B: “Securing cloud from DDOS Attacks using Intrusion Detection System in Virtual Machine”. Second International Conference on Communication Software and Networks; 2010:26–264.
  26. Lippmann R, Haines JW, Fried DJ, Korba J, Das K: “The 1999 DARPA Off-Line Intrusion Detection Evaluation”. Proceedings The International Journal of Computer and Telecommunications Networking - Special issue on recent advances in intrusion detection systems 34th edition. 2000, 4: 579–595.
  27. Garfinkel T, Rosenblum M: “A Virtual Machine Introspection Based Architecture for Intrusion Detection”. Proceedings of Network and Distributed Systems Security Symposium 2003.
  28. Kim GH, Spafford EH: “The design and implementation of tripwire: a file system integrity checker”. Proceedings of the 2nd ACM Conference on Computer and communications security 1994, 18–29.
  29. Hofmeyr SA, Forrest S, Somayaji A: Intrusion detection using sequences of system calls. J Comput Secur 1998, 6: 151–180.
  30. Srivastava A, Giffin J: “Tamper-Resistant, Application-Aware Blocking of Malicious Network Connections”. Proceedings of the 11th international symposium on Recent Advances in Intrusion Detection 2008, 39–58.
  31. Pfoh J, Schneider C, Eckert C: “A Formal Model for Virtual Machine Introspection”. Proceedings of the 1st ACM workshop on Virtual machine security 2009, 1–10.
  32. Loscocco PA, Wilson PW, Pendergrass JA, McDonell CD: “Linux Kernel Integrity Measurement Using Contextual Inspection”. Proceedings of the ACM workshop on Scalable trusted computing 2007, 21–29.
  33. Zhu D, Chin E: “Detection of vm-aware malware”. 2007.
  34. Egele M, Kruegel C, Kirda E, Yin H, Song D: “Dynamic Spyware Analysis”. Proceedings of the USENIX Annual Technical Conference 18th edition. 2007.
  35. You I, Yim K: “Malware obfuscation techniques: a brief survey”. Proceedings of International Conference on Broadband, Wireless Computing, Communication and Application 2010.
  36. Livshits B: “Dynamic taint tracking in managed runtimes”. Microsoft Research Technical Report MSR-TR-2012-114; 2012.
  37. Jiang X, Wang X, Xu D: “Stealthy Malware Detection Through VMM-Based “Out-of-the-Box” Semantic View Reconstruction”. Proceedings of the 14th ACM conference on Computer and communications security 2007, 128–138.
  38. Klein T: “Scoopy Doo-VMware Fingerprint Suite”. 2003. http://www.trapkit.de/research/vmm/scoopydoo/index.html
  39. Giffin J: “The Next Malware Battleground Recovery after Unknown Infection”. Proceedings of IEEE Journal on Security and Privacy 2010, 74–76.
  40. CERT Coordination Center: “CERT/CC security improvement modules: analyze all available information to characterize an intrusion”. 2001.
  41. Dittrich D: “Report on the Linux Honeypot Compromise”. 2000. http://project.honeynet.org/challenge/results/dittrich/evidence.txt
  42. Dunlap GW, King ST, Cinar S, Basrai MA, Chen PM: “ReVirt: enabling intrusion analysis through virtual-machine logging and replay”. Proceedings of the 5th symposium on Operating systems design and implementation 2002, 211–224.
  43. Buchacker K, Sieh V: “Framework for testing the fault-tolerance of systems including OS and network aspects”. Proceedings IEEE High-Assurance System Engineering Symposium 2001.
  44. Rutkowska J, Tereshkin A: “Bluepilling the Xen Hypervisor”. USA: Black Hat; 2008.
  45. Sailer R, Valdez E, Jaeger T, Perez R, Van Doorn L, Griffin JL, Berger S: “sHype: Secure Hypervisor Approach to Trusted Virtualized Systems”. IBM Research Report RC23511; 2005.
  46. Loscocco P, Smalley S: “Integrating Flexible Support for Security Policies into the Linux Operating System”. Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference 2001, 29–42.
  47. Jaeger T, Sailer R, Zhang X: “Analyzing integrity protection in the SELinux example policy”. Proceedings of the 12th conference on USENIX Security Symposium 12th edition. 2003, 59–74.
  48. Vogl S: “Secure hypervisors”. Proceedings of 12th International Conference on Enterprise Information System 2010.
  49. Cheng G, Jin H, Zou D, Ohoussou AK, Zhao F: “A Prioritized Chinese Wall Model for Managing the Covert Information Flows in Virtual Machine Systems”. Proceedings of The 9th International Conference for Young Computer Scientists 2008, 1481–1487.
  50. Wang G, Li M, Weng C: “Chinese Wall Isolation Mechanism and Its Implementation on VMM”. Proceedings of Systems and Virtualization Management: standards and the cloud 71st edition. 2010, 13–18.
  51. Sailer R, Jaeger T, Valdez E, Cáceres R, Perez R, Berger S, Griffin JL, Van Doorn L: “Building a MAC-Based Security Architecture for the Xen Open-Source Hypervisor”. Proceedings of Computer Security Applications Conference, 21st Annual 2005.
  52. Szefer J, Keller E, Lee RB, Rexford J: “Eliminating the Hypervisor Attack Surface for a More Secure Cloud”. Proceedings of the 18th ACM conference on Computer and communications security 2011, 401–412.
  53. Murray D, Milos G, Hand S: “Improving Xen Security through Disaggregation”. Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments 2008, 151–160.
  54. Dong Y, Yu Z, Rose G: “SR-IOV networking in Xen: architecture, design and implementation”. Proceedings of the First conference on I/O virtualization 2008.
  55. Clark C, Fraser K, Hand S, Hansen JG, July E, Limpach C, Pratt I, Warfield A: “Live Migration of Virtual Machines”. Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation 2nd edition. 2005, 273–286.
  56. Gouda MG, Liu AX: “A Model of Stateful Firewalls and its Properties”. Proceedings of the IEEE International Conference on Dependable Systems and Networks 2005.
  57. Kruegel C, Valeur F, Vigna G, Kemmerer R: “Stateful Intrusion Detection for High-Speed Networks”. Proceedings of the 2002 IEEE Symposium on Security and Privacy 2002.
  58. Xianqin C, Han W, Sumei W, Xiang L: “Seamless Virtual Machine Live Migration on Network Security Enhanced Hypervisor”. Proceedings of Broadband Network & Multimedia Technology 2009, 847–853.
  59. Laprie J, T LAAS-CNRS: “Resilience for the scalability of dependability”. Proceedings of ISNCA 2005, 5–6.
  60. Lombardi F, Di Pietro R, Soriente C: “CReW: cloud resilience for Windows guests through monitored virtualization”. Proceedings of the 29th IEEE Symposium on Reliable Distributed Systems 2010, 338–342.
  61. Lombardi F, Di Pietro R: “Kvmsec: a security extension for linux kernel virtual machines”. Proceedings of the ACM symposium on Applied Computing 2009, 2029–2034.
  62. Lombardi F, Di Pietro R: “Secure virtualization for cloud computing”. Proceedings of the Journal of Network and Computer Applications 2011, 1113–1122.
  63. Russell R: “lguest: implementing the little Linux hypervisor”. Proceedings of the Linux Symposium 2nd edition. 2007, 173–178.
  64. Reiser HP, Kapitza R: “Hypervisor-Based Efficient Proactive Recovery”. Proceedings of the 26th IEEE International Symposium on Reliable Distributed Systems 2007, 83–92.
  65. Reiser HP, Kapitza R: “VM-FIT: supporting intrusion tolerance with virtualisation technology”. Proceedings of the 1st Workshop on Recent Advances on Intrusion-Tolerant Systems 2007, 18–22.
  66. Taylor S, Henson M, Kanter M, Kuhn S, McGill K, Nichols C: “Bear–a resilient operating system for scalable multi-processors”. Submitted for publication in IEEE Security and Privacy, Nov/Dec 2011; 2011.
  67. Tanenbaum A, Woodhull A: “Operating systems: design and implementation”. Upper Saddle River, USA: Prentice Hall; 2006.
  68. Lehtinen R, Russell D, Gangemi GT Sr: “Computer security basics”. 2nd edition. Sebastopol, USA: O’Reilly Media; 2006.
  69. Berger S, Cáceres R, Goldman K, Pendarakis D, Perez R, Rao JR, Rom E, Sailer R, Schildhauer W, Srinivasan D, Tal S, Valdez E: “Security for the cloud infrastructure: trusted virtual data center implementation”. Proceedings of the IBM Journal of Research and Development 53rd edition. 2009, 560–571.
  70. Berger S, Cáceres R, Pendarakis D, Sailer R, Valdez E: “TVDc: managing security in the trusted virtual datacenter”. Proceedings of ACM SIGOPS Operating Systems Review 42nd edition. 2008, 40–47.
  71. Yu S, Wang C, Ren K, Lou W: “Achieving Secure, Scalable, and Fine-grained Data Access Control in Cloud Computing”. Proceedings of the 29th conference on Information communications 2010, 534–542.

Copyright

© Denz and Taylor; licensee Springer. 2013

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.