As traffic patterns and network topologies become more and more complicated in current enterprise data centers and TOP500 supercomputers, the probability of network congestion increases. If no countermeasures are taken, network congestion causes long communication delays and degrades network performance. A congestion control mechanism is often provided to reduce the consequences of congestion. However, it is usually difficult to configure and activate a congestion control mechanism in production clusters and supercomputers due to concerns that it may negatively impact jobs if the mechanism is not appropriately configured. Therefore, simulations for these situations are necessary to identify congestion points and sources, and more importantly, to determine optimal settings that can be utilized to reduce congestion in those complicated networks. In this paper, we use OMNeT++ to implement the IEEE 802.1Qbb Priority-based Flow Control (PFC) and RoCEv2 Congestion Management (RCM) in order to simulate clusters with RoCEv2 interconnects.
- Pub Date:
- September 2015
- Computer Science - Networking and Internet Architecture;
- Computer Science - Performance
- Published in: A. F\"orster, C. Minkenberg, G. R. Herrera, M. Kirsche (Eds.), Proc. of the 2nd OMNeT++ Community Summit, IBM Research - Zurich, Switzerland, September 3-4, 2015, arXiv:1509.03284, 2015