Towards Efficient and Scalable SDN-based Data Center Monitoring and Management







Recent years have witnessed the great success of cloud computing, which is powered by data center systems. Because data centers are an essential building block of much of our cyber infrastructure, the monitoring, security, and management of data center traffic are of utmost importance. However, the ever-increasing scale and speed of data center networks (DCNs) pose various challenges to data center traffic monitoring and management. Traditional monitoring approaches are neither flexible nor efficient enough to meet the security and performance needs of DCNs. Leveraging the flexibility and programmability of Software-Defined Networking (SDN), this dissertation presents efficient solutions to DCN monitoring and management challenges.
First, we investigate the feasibility and efficiency of building software-defined monitoring (SDM) functionality into network edges. For this purpose, we design and implement four measurement schemes in Open vSwitch (OVS), either integrating forwarding and measurement functions into a single pipeline or decoupling them into parallel operations. We further include an eBPF-based approach in our comparative study. We then quantify the trade-offs between network performance and system overhead that the different schemes strike, and demonstrate the feasibility of instrumenting OVS with monitoring capabilities.
Second, we leverage the flexibility and efficiency offered by SDM to enhance the security of DCNs. Accordingly, we design BotSifter, an SDN-based framework for scalable, accurate, runtime bot detection in DCNs. To improve detection scalability, BotSifter combines centralized learning with distributed detection, distributing detection tasks across the network edges. Furthermore, it employs multiple novel mechanisms for parallel detection of Command and Control (C&C) channels and botnet activities, which greatly enhances detection robustness.
Third, traffic surges are common in DCNs, caused either by attack traffic (e.g., from botnets) or by growth in legitimate traffic. Such surges can easily degrade the performance of today's containerized clouds. To address this challenge, we propose to adaptively offload DCN traffic and optimize network virtualization performance. For this purpose, we present EZPath, a novel system that seamlessly expedites container traffic by leveraging the programmable Top-of-Rack (ToR) switches in clouds. EZPath dynamically offloads selected (e.g., performance-critical) network flows to in-network programmable hardware, which not only optimizes network performance but also effectively mitigates the negative impact of attack traffic.