Performance of Open vSwitch-based Kubernetes Cluster in Pathological Cases

Šraier, Václav

Výkon síťového clusteru s Open vSwitch a Kubernetes v patologických situacích

diploma thesis (DEFENDED)

Permanent link

http://hdl.handle.net/20.500.11956/184131

Identifiers

Study Information System: 251396

Consultant

Tůma, Petr

Referee

Yaghob, Jakub

Faculty / Institute

Faculty of Mathematics and Physics

Discipline

Computer Science - Software Systems

Department

Department of Distributed and Dependable Systems

Date of defense

6. 9. 2023

Publisher

Univerzita Karlova, Matematicko-fyzikální fakulta

Language

English

Grade

Excellent

Keywords (Czech)

kubernetes|ovs|výkonnost

Keywords (English)

kubernetes|ovs|performance

S nástupem cloud computingu, kontejnerů a horizontálně škálovatelné infrastruktury, se nedílnou součástí datových center staly softwarově definované sítě (SDN). Jedním z běžně nasazovaných řešení je Kubernetes a Open vSwitch (OVS). V této diplomové práci hledáme možná výkonnostní omezení OVS při použití v rámci Kubernetes. Zaměřujeme se na problémy způsobené neobvyklým síťovým provozem. Výsledkem je objev několika typů paketů způsobujících nadměrné zatížení uzlů clusteru. Jako hlavní příčinu jsme identifikovali řadu filtračních pravidel v OpenFlow a chybu v návrhu OVS, která brání jejich efektivnímu vyhodnocování. Při specifické konfiguraci systému toto potenciálním útočníkům umožňuje využít objevenou neefektivitu k praktickému Denial-of-Service útoku na místní uzel clusteru, který způsobí kompletní síťový výpadek pro všechny kontejnery.

Abstract (English)

With the adoption of cloud computing, horizontally scalable infrastructure, and containerized deployments, Software Defined Networking (SDN) became an integral part of data centers, with Kubernetes and Open vSwitch (OVS) being one of the commonly deployed solutions. Our work explores the possible performance limitations of OVS under Kubernetes, focusing on pathological traffic patterns. We discovered several types of packets that cause excessive system load on the cluster nodes. We identified the root cause as a series of drop rules in OpenFlow and a design flaw in OVS that prevents their efficient evaluation. We investigated the impact of this problem, and our research revealed a specific system configuration under which an adversary can use the discovered inefficiencies for a practical denial-of-service attack on the local cluster node, bringing the whole networking stack down for all neighbouring containers.
