Fortifying Multi-tenant Cloud Environments via Improved CPU Management

Date

2020

Abstract

Cloud computing technology has significantly changed the computing paradigm. Since Amazon Web Services (AWS) began offering IT infrastructure services in 2006, cloud computing has continued to mature, develop, and offer more benefits. A cloud computing environment allows multiple tenants to share the same physical or virtual host and is therefore called a multi-tenant environment. A cloud tenant can be an individual user or a group of users sharing cloud resources. A multi-tenant cloud environment is supposed to be secure and fair. However, because resource sharing among tenants is inevitable, one tenant’s behavior may affect other tenants running at the same time, which often leads to performance anomalies and security concerns.

In this dissertation, I investigate these security and performance problems in the multi-tenant cloud environment and propose novel solutions to address them. Because these issues are rooted in resource sharing (i.e., CPU sharing) among multiple tenants, my solutions are centered on CPU management, as discussed below.

First, I demonstrate that a virtual machine (VM) is vulnerable to side-channel attacks when a malicious neighboring VM runs on the same host. To secure the host against such attacks, I design new schemes based on a different CPU scheduling strategy for VMs. These schemes effectively defeat such side-channel attacks with negligible performance overhead.

Second, cloud computing often employs a pay-as-you-go pricing model, which relies on precise resource accounting to allocate the configured amount of resources to VMs. However, I reveal that existing hypervisors often fail to account for resources accurately and consequently allocate incorrect amounts of CPU resources to VMs. To address this issue, I propose to redefine the resource scope of a VM using its virtual CPU (vCPU). Through experiments, I show that CPU consumption can then be correctly accounted for and managed.

Third, cloud platforms increasingly adopt container technology, and more and more containerized applications are being deployed. This trend requires even stronger isolation between applications. However, I show that the performance isolation expected of containerized applications is not yet achieved. The primary reason is that containers are inadequately managed because of complex interactions among the scheduling mechanisms in the current OS scheduler design. To address this problem, I propose to augment the underlying host’s scheduling mechanism to support the container orchestration system. Extensive evaluations show that the new scheme significantly improves resource management and performance.