Detection of Cross-Instance Cloud Data Remanence via Sector-Level Differential Analysis and Fragment Source Attribution



Modern cloud providers provision virtual machines for different customers from a common infrastructure of persistent storage, volatile memory, and processors. The hard disk space, RAM, and processor resources allocated to a new instance were previously in use by one or more other instances, which may have belonged to other customers. If the cloud provider does not adequately sanitize data resident in these resources between allocations, then resources allocated to a new instance may include data from a previous instance. Such leakage across cloud virtual machine instances is an example of data remanence and may reveal personal or sensitive information to unauthorized parties. Detection of this kind of data remanence on hard disk cloud resources is the subject of this work. Data remanence concerns date back at least to the United States government-issued “Rainbow Series” books of the early 1990s, and concerns about cross-instance data remanence in cloud environments were raised as far back as 2009 (Mather, Kumaraswamy, & Latif, 2009). To date, efforts to detect cross-instance cloud remanence have consisted of searching the unallocated space of a current instance for fragments easily attributable to a prior user or instance, so results necessarily depended on the specific instances tested and the search terms chosen by the investigator. In contrast, this work developed, tested, and applied a general method to detect cross-instance cloud remanence that does not depend on specific instances or search terms. The method collects unallocated space from multiple instances based on the same cloud provider template. Empty sectors, and sectors that also appear in the allocated space of the same instance, are removed from the candidate remanence list. The remaining sectors are compared to sectors from instances based on other templates from the same provider; a matching sector indicates likely cross-instance remanence.
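The filtering and comparison steps described above can be illustrated with a minimal Python sketch. This is not the tool developed in this work; it is an illustration under stated assumptions: sectors are 512 bytes, an "empty" sector is taken to mean all zero bytes, sectors are compared by SHA-256 digest, and the function names (`iter_sectors`, `candidate_sectors`, `cross_instance_matches`) are hypothetical.

```python
import hashlib

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def iter_sectors(data, size=SECTOR_SIZE):
    """Yield (index, sector_bytes) for each complete sector in data."""
    for offset in range(0, len(data) - size + 1, size):
        yield offset // size, data[offset:offset + size]


def digest(sector):
    """Return a hex digest used to compare sectors for equality."""
    return hashlib.sha256(sector).hexdigest()


def candidate_sectors(unallocated, allocated, size=SECTOR_SIZE):
    """Return {sector_index: digest} for unallocated sectors that are
    neither empty nor also present in the same instance's allocated space."""
    allocated_digests = {digest(s) for _, s in iter_sectors(allocated, size)}
    empty = bytes(size)  # assumption: "empty" means all zero bytes
    candidates = {}
    for index, sector in iter_sectors(unallocated, size):
        if sector == empty:
            continue  # empty sector, not a remanence candidate
        d = digest(sector)
        if d in allocated_digests:
            continue  # explained by this instance's own files
        candidates[index] = d
    return candidates


def cross_instance_matches(candidates, other_template_digests):
    """Keep candidates whose digest also occurs in an instance built
    from a different template of the same provider."""
    return {i: d for i, d in candidates.items() if d in other_template_digests}
```

In this sketch, `other_template_digests` would be a set of digests computed over sectors of instances created from other templates; any surviving entry corresponds to a sector indicating likely cross-instance remanence.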
Matching sectors are further evaluated by considering contiguous sectors and mapping them back to the source file in the other instance template, providing additional evidence that the recovered fragments are in fact from another instance. This work first found that unallocated space from multiple cloud instances based on the same template is neither empty, random, nor identical across instances, which is in itself an indicator of possible cross-instance remanence. This work also found sectors in the unallocated space of multiple instances that mapped directly to contiguous portions of files from instances created from other templates, definitively proving cross-instance remanence. In addition, this work identified multiple sectors in unallocated space that could not be mapped to other known instances; such sectors could be from instance templates not included in these tests, from infrastructure operations, or from user data in another instance. This work contributes a general method to detect cross-instance cloud data remanence that does not depend on a specific provider or infrastructure, on instance details, or on the presence of specific user-attributable remnant fragments. The method was based on the known operation of cloud environments, and a tool to implement the method was developed, validated, and then run on two enterprise cloud environments: a university cloud and Amazon Web Services (AWS). Cross-instance remanence was found in both cases.
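The contiguous-sector evaluation mentioned above can be sketched in the same way. The following Python fragment is an illustrative assumption rather than the implementation used in this work: it supposes that each matched unallocated sector has already been attributed to a single source file and a sector offset within that file (in practice a sector may occur in several files), and it groups matches into runs that are contiguous both on disk and within the source file. The function name `contiguous_runs` and the input layout are hypothetical.

```python
def contiguous_runs(matches):
    """Group matched sectors into contiguous runs.

    matches: {unallocated_sector_index: (source_file, file_sector_index)}
             (hypothetical layout; one source per sector is assumed)
    Returns a list of tuples
        (start_unallocated_index, source_file, start_file_sector, length)
    where consecutive unallocated sectors map to consecutive sectors
    of the same source file.
    """
    runs = []
    for index in sorted(matches):
        source, file_sector = matches[index]
        if runs:
            start, run_source, run_file_start, length = runs[-1]
            extends_run = (
                run_source == source
                and index == start + length
                and file_sector == run_file_start + length
            )
            if extends_run:
                runs[-1] = (start, run_source, run_file_start, length + 1)
                continue
        runs.append((index, source, file_sector, 1))
    return runs
```

Longer runs are stronger evidence: a single matching sector might be a coincidental common block, whereas many consecutive unallocated sectors that reproduce a consecutive portion of one file from another template are much harder to explain without cross-instance remanence.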