ReNo

Resilient Integration of Machine Learning for Enhanced Network Operation

Abstract

Due to the important role that communication networks, and the Internet in particular, play in our society, network failures are seen as significant threats to our daily lives. The main reason behind these failures is that networks are becoming more and more complex. While network operators are already struggling to control and configure well-established protocols in order to patch failures or to prepare the Internet for the future, new technologies such as reconfigurable, softwarized, and programmable networks increase the number of available configuration knobs.

Network operators, however, are left behind with their traditional network operation tools and solutions. Machine learning (ML)-based network management solutions offer a way to handle this complexity. Yet ML can itself harm network resilience, both by making wrong decisions and by acting as a black box whose decisions cannot be fully understood.

The goal of this project is to improve network resilience by identifying situations in which machine learning helps in making networks more resilient. To realize this goal, this project aims at deriving a methodology for performing risk assessment of (ML-driven) network management approaches. The influence of a given network management approach (e.g., ML-based) on network resilience will be captured in a resilience function that is derived by our methodology.

In a second step, we capture the non-trivial interaction effects that can occur when ML-based network management functions are deployed independently in different networks. To exemplify our methods, we target a wide area networking use case with a focus on two problems: i) Internet routing and ii) attack detection and mitigation. We argue that ML can be suitable for managing complex networks when it can be sufficiently controlled. The application of ML should increase, not harm, network resilience. Our framework helps to realize the vision of resilience-by-design when new ML-based solutions are proposed: the framework should become an integral part of analyzing ML solutions before they are put into effect.