Authors: J. Rocher-Gonzalez, E. G. Gran, S. Reinemo, T. Skeie, J. Escudero-Sahuquillo, P. J. García, and F. J. Quiles
Title: Adaptive Routing in InfiniBand Hardware
Affiliation: Communication Systems
Project(s): Simula Metropolitan Center for Digital Engineering, Department of High Performance Computing
Status: Accepted
Publication Type: Proceedings, refereed
Year of Publication: 2022
Conference Name: The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing
Publisher: IEEE
Abstract

Interconnection networks are the communication backbone of modern high-performance computing systems, and an optimised interconnection network is crucial for the performance and utilisation of the system as a whole. One element of the interconnection network is the routing algorithm, which directly influences how well the physical network topology can be utilised. InfiniBand is one of the most common network architectures used in high-performance computing, and traditionally it supported only static routing. For multi-path networks such as Fat-trees, static routing is inefficient because it can neither balance traffic in real time nor utilise multiple paths efficiently under adversarial traffic. This in turn can lead to unnecessary contention and an underutilised network, which has prompted numerous proposals to avoid these problems by using adaptive routing. Adaptive routing has recently been introduced in InfiniBand, and in this paper we evaluate to what extent its expected benefits hold for InfiniBand. Through a set of experiments on HDR InfiniBand equipment, we describe the basic behaviour of adaptive routing in InfiniBand, its benefits in Fat-tree topologies, and the unfortunate side effects related to unfairness that adaptive routing in general might introduce, including phenomena such as the reverse parking lot problem and congestion spreading.

Citation Key: 34107
