Projects
ERAC: Efficient and Robust Architecture for the Big Data Cloud

The ERAC research project, Efficient and Robust Architecture for the Big Data Cloud, is a collaboration between Oracle, Lyse, the University of Oslo (UiO), the University of Stavanger (UiS), and Simula, funded by the Research Council of Norway and the industrial partners Lyse and Oracle. In ERAC, we focus on selected problem areas where current technologies are not well suited to meet the promises of cloud computing: rapid elasticity, massive scale, resilience, virtualization, ease-of-use, trust and energy efficiency. These areas are of vital importance for realizing the potential of cloud computing and big data, where there are significant architectural and technological challenges that must be resolved.
More specifically, in the ERAC project we conduct research along the following three axes:
- A self-adaptive network architecture for clouds (Simula/UiO): To make the cloud better suited for big data and high performance computing (HPC) workloads, the network architecture of the cloud needs to be elastic and ensure fair and efficient dynamic provisioning of network resources, adapting to the demanding needs of the current set of users and applications. In ERAC we develop algorithms for optimized routing, path selection, and traffic prioritization, based on constant feedback from the network. The mechanisms need to include real-time monitoring, feedback procedures and on-the-fly network reconfiguration, where the tradeoff between the reconfiguration costs and the post-reconfiguration increase in performance and utilization is considered when making reconfiguration decisions.
- A fully virtualized architecture for clouds (Simula/UiO): Virtualization is a key technology to realize a cost-effective, highly elastic, and flexible cloud. In ERAC we seek to combine efficient virtualization techniques and live migration, with highly optimized HPC interconnection network technologies to make the cloud architecture suited for big data. Furthermore, the resulting virtualized cloud architecture should incorporate methods for both self-detection and prediction of faults, resource shortage and free capacity to ensure resilience and facilitate rapid elasticity, i.e. quickly scaling up and down depending on the users’ needs.
- Trust and security in the cloud (UiS): For private, public and commercial users to realize the potential of cloud computing and embrace its services, it is essential that users can rest assured that no unauthorized access to their data or services is granted, particularly as the data and the services are hosted at remote locations. To achieve the needed level of trust, in ERAC we devise a new framework for trust and security, including novel algorithms for authentication and privacy, in the context of cloud computing.
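The reconfiguration tradeoff described in the first research axis can be illustrated with a minimal sketch. The cost model, the fixed benefit horizon, and the names `ReconfigCandidate`, `net_benefit`, and `should_reconfigure` are assumptions made for illustration; they are not the project's actual decision procedure.

```python
from dataclasses import dataclass


@dataclass
class ReconfigCandidate:
    """Hypothetical description of one possible network reconfiguration."""
    name: str
    cost_seconds: float      # estimated disruption while reconfiguring
    throughput_gain: float   # estimated relative gain afterwards (0.10 = +10%)


def net_benefit(c: ReconfigCandidate, horizon_seconds: float) -> float:
    """Work gained over the horizon minus work lost during reconfiguration.

    Assumes no useful work is done while reconfiguring, and that the gain
    applies for the remaining (horizon - cost) seconds.
    """
    remaining = max(horizon_seconds - c.cost_seconds, 0.0)
    return c.throughput_gain * remaining - c.cost_seconds


def should_reconfigure(candidates, horizon_seconds):
    """Return the candidate with the largest positive net benefit, or None."""
    best = max(candidates, key=lambda c: net_benefit(c, horizon_seconds), default=None)
    if best is None or net_benefit(best, horizon_seconds) <= 0:
        return None
    return best
```

With a long horizon a cheap reconfiguration with a modest gain pays off; with a short horizon the same candidate is rejected because the disruption outweighs the gain.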
In order to demonstrate our solutions in a real world context, we will implement them in prototypes based on well-established open-source platforms such as OpenStack, Linux and the OpenFabrics Enterprise Distribution.
Funding source:
- The Research Council of Norway
- Oracle Corporation
- Lyse
All partners:
- Oracle Corporation
- Lyse
- University of Stavanger
- University of Oslo
- Simula Research Laboratory
Publications for ERAC: Efficient and Robust Architecture for the Big Data Cloud
Journal Article
A Self-Adaptive Network for HPC Clouds: Architecture, Framework, and Implementation
IEEE Transactions on Parallel and Distributed Systems 29, no. 12 (2018): 2658-2671. Status: Published
Clouds offer flexible and economically attractive compute and storage solutions for enterprises. However, the effectiveness of cloud computing for high-performance computing (HPC) systems still remains questionable. When clouds are deployed on lossless interconnection networks, like InfiniBand (IB), challenges related to load-balancing, low-overhead virtualization, and performance isolation hinder full potential utilization of the underlying interconnect. Moreover, cloud data centers incorporate a highly dynamic environment rendering static network reconfigurations, typically used in IB systems, infeasible. In this paper, we present a framework for a self-adaptive network architecture for HPC clouds based on lossless interconnection networks, demonstrated by means of our implemented IB prototype. Our solution, based on a feedback control and optimization loop, enables the lossless HPC network to dynamically adapt to the varying traffic patterns, current resource availability, workload distributions, and also in accordance with the service provider-defined policies. Furthermore, we present IBAdapt, a simplified rule-based language for the service providers to specify adaptation strategies used by the framework. Our developed self-adaptive IB network prototype is demonstrated using state-of-the-art industry software. The results obtained on a test cluster demonstrate the feasibility and effectiveness of the framework when it comes to improving Quality-of-Service compliance in HPC clouds.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2018 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 29 |
Issue | 12 |
Pagination | 2658-2671 |
Publisher | IEEE |
DOI | 10.1109/TPDS.2018.2842224 |
Efficient Routing and Reconfiguration in Virtualized HPC Environments with vSwitch-enabled Lossless Networks
Concurrency and Computation: Practice and Experience 31, no. 2 (2018). Status: Published
To meet the demands of communication-intensive workloads in the cloud, virtual machines (VMs) should utilize low overhead network communication paradigms. In general, such paradigms enable VMs to directly communicate with the hardware by means of a passthrough technology like Single-Root I/O Virtualization (SR-IOV). However, when passthrough-based virtualization is coupled with lossless interconnection networks, live-migrations introduce scalability challenges due to the substantial network reconfiguration overhead. With these challenges in mind we proposed a virtual switch (vSwitch) SR-IOV architecture for InfiniBand in (33). In this paper, we first suggest solutions to rectify the space-domain scalability issues that are present in vSwitch-enabled subnets as a result of the VMs using dedicated layer-two addresses. Then we discuss routing strategies for virtualized environments using vSwitches, and present a routing algorithm for Fat-Trees. We also present a reconfiguration method that minimizes imposed reconfiguration overhead on Fat-Trees. We perform an extensive evaluation of our prototype algorithms, and as vSwitch-enabled hardware does not yet exist, we deduce from empirical observations by emulating vSwitches with existing hardware, as well as large-scale simulations. Our results show significant reduction in the reconfiguration times as route recalculations can be eliminated, and for certain scenarios, the number of reconfiguration subnet management packets sent to switches is reduced from several hundred thousand down to a single one without degrading the routing quality.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2018 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 31 |
Issue | 2 |
Date Published | 02/2018 |
Publisher | John Wiley & Sons |
Keywords | InfiniBand, Lossless Interconnection Networks, Network Reconfiguration, Network Routing, SR-IOV, Virtualization |
Journal Article
A Fault-Tolerant Routing Strategy for KNS Topologies Based on Intermediate Nodes
Concurrency and Computation: Practice and Experience 29, no. 13 (2017). Status: Published
Exascale computing systems are being built with thousands of nodes. The high number of components of these systems significantly increases the probability of failure. A key component for them is the interconnection network. If failures occur in the interconnection network, they may isolate a large fraction of the machine. For this reason, an efficient fault-tolerant mechanism is needed to keep the system interconnected, even in the presence of faults. A recently proposed topology for these large systems is the hybrid k-ary n-direct s-indirect (KNS) family that provides optimal performance and connectivity at a reduced hardware cost. This paper presents a fault-tolerant routing methodology for the KNS topology that degrades performance gracefully in the presence of faults and tolerates a large number of faults without disabling any healthy computing node. In order to tolerate network failures, the methodology uses a simple mechanism. For any source-destination pair, if necessary, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network) with the aim of circumventing faults. The evaluation results show that the proposed methodology tolerates a large number of faults. For instance, it is able to tolerate more than 99.5% of fault combinations when there are ten faults in a 3-D network with 1,000 nodes using only one intermediate node and more than 99.98% if two intermediate nodes are used. Furthermore, the methodology offers graceful performance degradation. As an example, performance degrades only by 1% for a 2-D network with 1,024 nodes and 1% faulty links.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2017 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 29 |
Issue | 13 |
Publisher | John Wiley & Sons, Ltd. |
Keywords | exascale computing, fault-tolerant routing, hybrid topology, KNS topology |
DOI | 10.1002/cpe.4065 |
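The intermediate-node idea in the abstract above can be sketched on a much simpler network. This example assumes a plain 2-D mesh with dimension-order (X then Y) routing as a stand-in for the KNS topology, and a single intermediate node chosen by exhaustive search; the function names are hypothetical and the paper's actual selection strategy is not reproduced here.

```python
from itertools import product


def dor_path(src, dst):
    """Dimension-order (X then Y) path between two nodes of a 2-D mesh."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def path_ok(path, faulty_links):
    """True if no hop of the path crosses a faulty link."""
    return all(frozenset(hop) not in faulty_links for hop in zip(path, path[1:]))


def route(src, dst, size, faulty_links):
    """Try the direct path first, then fall back to one intermediate node."""
    direct = dor_path(src, dst)
    if path_ok(direct, faulty_links):
        return direct
    # Assumption: take the first intermediate node whose two legs are fault-free.
    for mid in product(range(size), repeat=2):
        if mid in (src, dst):
            continue
        first, second = dor_path(src, mid), dor_path(mid, dst)
        if path_ok(first, faulty_links) and path_ok(second, faulty_links):
            return first + second[1:]
    return None
```

Links are represented as `frozenset` pairs of node coordinates, so a faulty link blocks traffic in both directions. The sketch returns the first workable detour rather than the shortest one.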
PhD Thesis
Network Optimization for High Performance Cloud Computing
PhD thesis, University of Oslo, 2017. Status: Published
Cloud computing has seen tremendous popularity in the last several years. A scalable and efficient data center network is essential for a high-performance cloud computing infrastructure. This thesis provides practical solutions to enable an efficient, flexible, multi-tenant network architecture suitable for high-performance cloud computing, using InfiniBand (IB) as a demonstration technology. The work is motivated by the needs of future data centers to provide efficient cloud solutions for the increasing uptake of cloud technology for both big data and traditional High-Performance Computing (HPC) applications.
Research contributions of this thesis lie within three main categories. First, we propose a set of improvements to the fat-tree routing algorithm to make it suitable for HPC workloads in the cloud. Fat-Tree is a popular network topology in HPC systems. Our proposed improvements to the fat-tree routing make it more efficient, provide performance isolation among tenants in multi-tenant systems, and enable routing of both physical end nodes and virtualized end nodes according to the policies set by the provider. Second, we design new network reconfiguration methods to significantly reduce the time it takes to reroute the IB network. Reduced network reconfiguration time means that the interconnection network in an HPC cloud can optimize itself quickly to adapt to changing tenant configurations, faults, running workloads, and current network conditions. Last, we demonstrate a self-adaptive network prototype for IB-based HPC clouds, fully equipped with autonomous monitoring and adaptation, and configurable through a high-level condition-action language for the service providers.
The research conducted in this thesis has potential impacts on both private cloud infrastructures, such as medium-sized clusters used for enterprise HPC, and public clouds offering innovative HPC solutions to customers at scale. The industrial application of the thesis is reflected by the eight patent applications that resulted from this work.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | PhD Thesis |
Year of Publication | 2017 |
Degree awarding institution | University of Oslo |
Degree | PhD |
Date Published | 12/2017 |
Publisher | University of Oslo |
Place Published | University of Oslo |
URL | http://urn.nb.no/URN:NBN:no-62076 |
Patent
System and method for efficient network reconfiguration in fat-trees
2016. Status: Published
Systems and methods are provided for supporting efficient reconfiguration of an interconnection network having a pre-existing routing. An exemplary method can provide a plurality of switches, the plurality of switches comprising at least one leaf switch, wherein each of the one or more switches comprises a plurality of ports, and a plurality of end nodes, wherein the plurality of end nodes are interconnected via the one or more switches. The method can detect, by a subnet manager, a reconfiguration triggering event. The method can compute, by the subnet manager, a new routing for the interconnection network, wherein the computing by the subnet manager of the new routing for the interconnection network takes into consideration the pre-existing routing and selects the new routing for the interconnection network that is closest to the pre-existing routing. The method can reconfigure the interconnection network according to the new routing.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US15/073,022 |
Date Published | 03/2016 |
Patent Type | Pending |
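Selecting the new routing that is "closest" to the pre-existing one, as the abstract above puts it, can be sketched as follows. The abstract does not define the distance measure, so this example assumes it is the number of forwarding entries that change; the dictionary layout and function names are hypothetical.

```python
def changed_entries(old, new):
    """Count (switch, destination) forwarding entries whose output port differs.

    Each routing is a dict mapping (switch, destination) -> output port.
    Entries present in only one routing also count as changes.
    """
    keys = set(old) | set(new)
    return sum(1 for k in keys if old.get(k) != new.get(k))


def closest_routing(old, candidates):
    """Pick the candidate routing that differs least from the pre-existing one."""
    return min(candidates, key=lambda new: changed_entries(old, new))
```

Fewer changed entries means fewer forwarding-table updates to push to the switches, which is the intuition behind preferring a routing close to the existing one.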
System and Method for Providing an InfiniBand SR-IOV vSwitch Architecture for a High Performance Cloud Computing Environment
2016. Status: Published
Systems and methods are provided for implementing a Virtual Switch (vSwitch) architecture that supports transparent virtualization and live migration. One embodiment provides for a vSwitch with prepopulated Local Identifiers (LIDs). Another embodiment provides for a vSwitch with dynamic LID assignment. Another embodiment provides for a vSwitch with prepopulated LIDs and dynamic LID assignment. Moreover, embodiments of the present invention provide scalable dynamic network reconfiguration methods which enable live migrations of VMs in network environments.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US15/050,901 |
Date Published | 02/2016 |
Patent Type | Pending |
Systems and Method for Providing a Dynamic Cloud with Subnet Administration (SA) Query Caching
2016. Status: Published
A system and method can support subnet management in a cloud environment. During a virtual machine migration in a cloud environment, a subnet manager can become a bottleneck point that delays efficient service. A system and method can alleviate this bottleneck point by ensuring a virtual machine retains a plurality of addresses after migration. The system and method can further allow for each host node within the cloud environment to be associated with a local cache that virtual machines can utilize when re-establishing communication with a migrated virtual machine.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14924281 |
Date Published | 05/2016 |
Patent Type | Pending |
Patent Number | 20160127495 |
System and method for supporting partition-aware routing in a multi-tenant cluster environment
2016. Status: Published
A system and method can support partition-aware routing in a multi-tenant cluster environment. An exemplary method can support one or more tenants within the multi-tenant cluster environment. The method can associate each of the one or more tenants with a partition of a plurality of partitions. The method can then associate each of the plurality of partitions with one or more nodes of a plurality of nodes, each of the plurality of nodes being associated with a leaf switch of a plurality of switches, the plurality of switches comprising a plurality of leaf switches and a plurality of root switches. Finally, the method can generate one or more linear forwarding tables, the one or more linear forwarding tables providing isolation between the plurality of partitions, wherein each of the plurality of nodes is associated with a partitioning order.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14927085 |
Date Published | 05/2016 |
Patent Type | Pending |
Notes | Date Filed: July 6, 2015, Published Online: May 5, 2016. |
System and method for supporting efficient load-balancing in a high performance computing (HPC) environment
2016. Status: Published
Methods and systems are provided for supporting efficient load balancing among a plurality of switches and a plurality of end nodes arranged in a tree topology in a network environment. The methods and systems can sort the plurality of end nodes, wherein the plurality of end nodes are sorted in a decreasing order of a receive weight. The methods and systems may further route, in the decreasing order of receive weights, the plurality of end nodes, wherein the routing comprises selecting at least one down-going port and at least one up-going port. Further, the methods and systems can increase an accumulated downward weight on each selected down-going port by the receive weight of the routed end node, and increase an accumulated upward weight on each selected up-going port by the receive weight of the routed end node.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14792070 |
Date Published | 01/2016 |
Patent Type | Pending |
Notes | Date Filed: October 19, 2015, Published Online: Jan 14, 2016. |
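The weight-ordered routing steps in the load-balancing abstract above can be sketched in a few lines. This is a simplified illustration for a single switch with interchangeable candidate ports; choosing the port with the least accumulated weight is an assumption, since the abstract does not state the selection rule, and the function name is hypothetical.

```python
def route_by_weight(receive_weights, down_ports, up_ports):
    """Assign each end node one down-going and one up-going port.

    receive_weights: dict mapping end node -> receive weight
    down_ports, up_ports: lists of candidate port names
    """
    down_load = {p: 0 for p in down_ports}
    up_load = {p: 0 for p in up_ports}
    assignment = {}
    # Route end nodes in decreasing order of receive weight.
    for node in sorted(receive_weights, key=receive_weights.get, reverse=True):
        w = receive_weights[node]
        # Assumption: pick the port with the least accumulated weight so far.
        down = min(down_ports, key=down_load.get)
        up = min(up_ports, key=up_load.get)
        # Accumulate the routed node's receive weight on the selected ports.
        down_load[down] += w
        up_load[up] += w
        assignment[node] = (down, up)
    return assignment, down_load, up_load
```

Handling the heaviest receivers first lets the lighter ones fill in the remaining imbalance, which is why the nodes are sorted before routing.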
Proceedings, refereed
Fast Hybrid Network Reconfiguration for Large-Scale Lossless Interconnection Networks
In 15th IEEE International Symposium on Network Computing and Applications (NCA 2016). IEEE, 2016. Status: Published
Reconfiguration of high performance lossless interconnection networks is a cumbersome and time-consuming task. For that reason, reconfiguration in large networks is typically limited to situations where it is absolutely necessary, for instance when severe faults occur. In contrast, due to the shared and dynamic nature of modern cloud infrastructures, performance-driven reconfigurations are necessary to ensure efficient utilization of resources. In this work we present a scheme that allows for fast reconfigurations by limiting the task to sub-parts of the network that can benefit from a local reconfiguration. Moreover, our method is able to use different routing algorithms for different sub-parts within the same subnet. We also present a Fat-Tree routing algorithm that reconfigures a network given a user-provided node ordering. Hardware experiments and large scale simulation results show that we are able to significantly reduce reconfiguration times, by 50% to as much as 98.7% for very large topologies, while improving performance.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | 15th IEEE International Symposium on Network Computing and Applications (NCA 2016) |
Pagination | 101-108 |
Publisher | IEEE |
Keywords | Cloud computing, Fat-Tree, HPC, InfiniBand, Network Reconfiguration, scalability |
Publications
Proceedings, refereed
Adaptive Routing in InfiniBand Hardware
In The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing. IEEE, 2022. Status: Published
Interconnection networks are the communication backbone of modern high-performance computing systems and an optimised interconnection network is crucial for the performance and utilisation of the system as a whole. One element of the interconnection network is the routing algorithm, which directly influences how we are able to utilise the physical network topology. InfiniBand is one of the most common network architectures used in high-performance computing and traditionally it only supported static routing. For multi-path networks such as Fat-trees, static routing is inefficient because it cannot balance traffic in real-time nor utilise multiple paths efficiently under adversarial traffic. This in turn potentially leads to unnecessary contention and an underutilised network, which has led to numerous proposals on how to avoid this by using adaptive routing. Adaptive routing has recently been introduced in InfiniBand and in this paper we evaluate to what extent the expected benefits of adaptive routing hold for InfiniBand. Through a set of experiments on HDR InfiniBand equipment we describe the basic behaviour of adaptive routing in InfiniBand, its benefits in Fat-tree topologies and the unfortunate side effects related to unfairness that adaptive routing in general might introduce, including such phenomena as the reverse parking lot problem and congestion spreading.
Affiliation | Communication Systems |
Project(s) | Simula Metropolitan Center for Digital Engineering, Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2022 |
Conference Name | The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing |
Pagination | 463-472 |
Publisher | IEEE |
Journal Article
DistTune: Distributed Fine-Grained Adaptive Traffic Speed Prediction for Growing Transportation Networks
Transportation Research Record: Journal of the Transportation Research Board 2675, no. 10 (2021): 211-227. Status: Published
Over the past decade, many approaches have been introduced for traffic speed prediction. However, providing fine-grained, accurate, time-efficient, and adaptive traffic speed prediction for a growing transportation network where the size of the network keeps increasing and new traffic detectors are constantly deployed has not been well studied. To address this issue, this paper presents DistTune based on long short-term memory (LSTM) and the Nelder-Mead method. When encountering an unprocessed detector, DistTune decides if it should customize an LSTM model for this detector by comparing the detector with other processed detectors in the normalized traffic speed patterns they have observed. If a similarity is found, DistTune directly shares an existing LSTM model with this detector to achieve time-efficient processing. Otherwise, DistTune customizes an LSTM model for the detector to achieve fine-grained prediction. To make DistTune even more time-efficient, DistTune runs in parallel on a cluster of computing nodes. To achieve adaptive traffic speed prediction, DistTune also provides LSTM re-customization for detectors that suffer from unsatisfactory prediction accuracy due to, for instance, changes in traffic speed patterns. Extensive experiments based on traffic data collected from freeway I5-N in California are conducted to evaluate the performance of DistTune. The results demonstrate that DistTune provides fine-grained, accurate, time-efficient, and adaptive traffic speed prediction for a growing transportation network.
Affiliation | Communication Systems |
Project(s) | Department of High Performance Computing |
Publication Type | Journal Article |
Year of Publication | 2021 |
Journal | Transportation Research Record: Journal of the Transportation Research Board |
Volume | 2675 |
Issue | 10 |
Pagination | 211-227 |
Date Published | 05/2021 |
Publisher | US National Research Council |
ISSN | 0361-1981 |
DOI | 10.1177/03611981211011170 |
Proceedings, refereed
How Far Should We Look Back to Achieve Effective Real-Time Time-Series Anomaly Detection?
In Advanced Information Networking and Applications. Springer International Publishing, 2021. Status: Published
Anomaly detection is the process of identifying unexpected events or abnormalities in data, and it has been applied in many different areas such as system monitoring, fraud detection, healthcare, and intrusion detection. Providing real-time, lightweight, and proactive anomaly detection for time series with neither human intervention nor domain knowledge could be highly valuable since it reduces human effort and enables appropriate countermeasures to be undertaken before a disastrous event occurs. To our knowledge, RePAD (Real-time Proactive Anomaly Detection algorithm) is a generic approach with all the above-mentioned features. To achieve real-time and lightweight detection, RePAD utilizes Long Short-Term Memory (LSTM) to detect whether or not each upcoming data point is anomalous based on short-term historical data points. However, it is unclear how different amounts of historical data points affect the performance of RePAD. Therefore, in this paper, we investigate the impact of different amounts of historical data on RePAD by introducing a set of performance metrics that cover novel detection accuracy measures, time efficiency, readiness, and resource consumption. Empirical experiments based on real-world time series datasets are conducted to evaluate RePAD in different scenarios, and the experimental results are presented and discussed.
Affiliation | Communication Systems |
Project(s) | Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2021 |
Conference Name | Advanced Information Networking and Applications |
Pagination | 136–148 |
Publisher | Springer International Publishing |
ISBN Number | 978-3-030-75100-5 |
DOI | 10.1007/978-3-030-75100-5_13 |
SALAD: Self-Adaptive Lightweight Anomaly Detection for Real-time Recurrent Time Series
In 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC). IEEE, 2021. Status: Published
Providing a lightweight self-adaptive approach that does not need offline training in advance and is still able to detect anomalies in real time could be highly beneficial. Such an approach could be immediately applied and deployed on any commodity machine to provide timely anomaly alerts. To facilitate such an approach, this paper introduces SALAD, which is a Self-Adaptive Lightweight Anomaly Detection approach based on a special type of recurrent neural network called Long Short-Term Memory (LSTM). Instead of using offline training, SALAD converts a target time series into a series of average absolute relative error (AARE) values on the fly and predicts an AARE value for every upcoming data point based on short-term historical AARE values. If the difference between a calculated AARE value and its corresponding forecast AARE value is higher than a self-adaptive detection threshold, the corresponding data point is considered anomalous. Otherwise, the data point is considered normal. Experiments based on a real-world time series dataset demonstrate that SALAD outperforms five other state-of-the-art anomaly detection approaches in terms of detection accuracy. In addition, the results also show that SALAD is lightweight and can be deployed on a commodity machine.
Affiliation | Communication Systems |
Project(s) | Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2021 |
Conference Name | 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC) |
Pagination | 344-349 |
Date Published | 07/2021 |
Publisher | IEEE |
DOI | 10.1109/COMPSAC51774.2021.00056 |
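The AARE comparison mentioned in the SALAD abstract above can be sketched as follows. The formula used here is the usual reading of "average absolute relative error" (the mean of |actual - predicted| / |actual|); whether it matches the paper's exact definition, and the use of an absolute difference in the decision step, are assumptions. The LSTM forecasting and the threshold adaptation are not modelled, and the function names are hypothetical.

```python
def aare(actual, predicted):
    """Average absolute relative error over paired observations.

    Assumes every actual value is non-zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def is_anomalous(calculated_aare, forecast_aare, threshold):
    """Flag a point when calculated and forecast AARE differ by more than the threshold."""
    return abs(calculated_aare - forecast_aare) > threshold
```

For example, `aare([10, 20], [9, 22])` averages two relative errors of 10% each.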
Proceedings, refereed
Distributed Fine-Grained Traffic Speed Prediction for Large-Scale Transportation Networks Based on Automatic LSTM Customization and Sharing
In Euro-Par 2020: 26th International Conference on Parallel and Distributed Computing. Cham: Springer International Publishing, 2020. Status: Published
Short-term traffic speed prediction has been an important research topic in the past decade, and many approaches have been introduced. However, providing fine-grained, accurate, and efficient traffic-speed prediction for large-scale transportation networks where numerous traffic detectors are deployed has not been well studied. In this paper, we propose DistPre, which is a distributed fine-grained traffic speed prediction scheme for large-scale transportation networks. To achieve fine-grained and accurate traffic-speed prediction, DistPre customizes a Long Short-Term Memory (LSTM) model with an appropriate hyperparameter configuration for a detector. To make such a customization process efficient and applicable for large-scale transportation networks, DistPre conducts LSTM customization on a cluster of computation nodes and allows any trained LSTM model to be shared between different detectors. If a detector observes a similar traffic pattern to another one, DistPre directly shares the existing LSTM model between the two detectors rather than customizing an LSTM model per detector. Experiments based on traffic data collected from freeway I5-N in California are conducted to evaluate the performance of DistPre. The results demonstrate that DistPre provides time-efficient LSTM customization and accurate fine-grained traffic-speed prediction for large-scale transportation networks.
Affiliation | Communication Systems |
Project(s) | Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2020 |
Conference Name | Euro-Par 2020: 26th International Conference on Parallel and Distributed Computing |
Pagination | 234–247 |
Date Published | 08/2020 |
Publisher | Springer International Publishing |
Place Published | Cham |
ISBN Number | 978-3-030-57675-2 |
DOI | 10.1007/978-3-030-57675-2_15 |
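The share-or-customize decision described in the DistPre abstract above can be sketched without any machine-learning library. The min-max normalization, the Euclidean distance, the `max_distance` cut-off, and the `train_fn` callback are all assumptions for illustration; the abstract does not specify how similarity between traffic patterns is measured.

```python
import math


def normalize(pattern):
    """Min-max scale a speed pattern to the range [0, 1]."""
    lo, hi = min(pattern), max(pattern)
    if hi == lo:
        return [0.0] * len(pattern)
    return [(v - lo) / (hi - lo) for v in pattern]


def distance(a, b):
    """Euclidean distance between two equally long normalized patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_model(new_pattern, trained, max_distance, train_fn):
    """Share a trained model when a similar pattern exists, otherwise train one.

    trained: list of (normalized_pattern, model) pairs from processed detectors
    train_fn: hypothetical callback that customizes a model for a pattern
    Returns (model, shared) where shared is True if an existing model was reused.
    """
    norm = normalize(new_pattern)
    for pattern, model in trained:
        if distance(norm, pattern) <= max_distance:
            return model, True
    model = train_fn(new_pattern)
    trained.append((norm, model))
    return model, False
```

Because patterns are normalized before comparison, two detectors with the same shape of speed curve but different absolute speeds can share one model, which avoids a costly customization step.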
RePAD: Real-Time Proactive Anomaly Detection for Time Series
In Advanced Information Networking and Applications. Cham: Springer International Publishing, 2020. Status: Published
During the past decade, many anomaly detection approaches have been introduced in different fields such as network monitoring, fraud detection, and intrusion detection. However, they require an understanding of data patterns and often need a long off-line period to build a model or network for the target data. Providing real-time and proactive anomaly detection for streaming time series without human intervention and domain knowledge is highly valuable since it greatly reduces human effort and enables appropriate countermeasures to be undertaken before disastrous damage, failure, or another harmful event occurs. However, this issue has not been well studied yet. To address it, this paper proposes RePAD, which is a Real-time Proactive Anomaly Detection algorithm for streaming time series based on Long Short-Term Memory (LSTM). RePAD utilizes short-term historical data points to predict and determine whether or not the upcoming data point is a sign that an anomaly is likely to happen in the near future. By dynamically adjusting the detection threshold over time, RePAD is able to tolerate minor pattern changes in time series and detect anomalies either proactively or on time. Experiments based on two time series datasets collected from the Numenta Anomaly Benchmark demonstrate that RePAD is able to proactively detect anomalies and provide early warnings in real time without human intervention and domain knowledge.
Affiliation | Communication Systems |
Project(s) | Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2020 |
Conference Name | Advanced Information Networking and Applications |
Pagination | 1291–1302 |
Date Published | 03/2020 |
Publisher | Springer International Publishing |
Place Published | Cham |
ISBN Number | 978-3-030-44041-1 |
DOI | 10.1007/978-3-030-44041-1_110 |
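As a rough illustration of the dynamic-threshold idea described in the RePAD abstract above, the following Python sketch flags a data point when its prediction error exceeds a threshold that is recomputed from all errors observed so far. It is not the authors' implementation: the LSTM predictor is replaced by a plain mean of the last few points, and the threshold rule (mean plus `k` standard deviations of past errors), the parameters `window`, `k`, and `warmup`, and the function name are all assumptions made for this example.

```python
from statistics import mean, pstdev


def detect_anomalies(series, window=3, k=3.0, warmup=5):
    """Return the indices of points flagged as anomalous.

    Hypothetical stand-in for an LSTM predictor: each point is predicted
    as the mean of the previous `window` points.
    """
    errors = []   # relative prediction errors observed so far
    flagged = []
    for i in range(window, len(series)):
        predicted = mean(series[i - window:i])
        actual = series[i]
        # Relative error; the small constant avoids division by zero.
        err = abs(actual - predicted) / (abs(actual) + 1e-9)
        if len(errors) >= warmup:
            # Assumed rule: the threshold adapts as more errors are seen.
            threshold = mean(errors) + k * pstdev(errors)
            if err > threshold:
                flagged.append(i)
        errors.append(err)
    return flagged


if __name__ == "__main__":
    data = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 10.1, 10.0, 25.0, 10.1, 10.0]
    print(detect_anomalies(data))
```

Because every error, including the one caused by a spike, feeds back into the threshold, the points immediately after a spike are judged against a looser threshold. This loosely mirrors the tolerance to pattern changes mentioned in the abstract, under the assumptions stated above.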
ReRe: A Lightweight Real-time Ready-to-Go Anomaly Detection Approach for Time Series
In IEEE 44th Annual Computers, Software, and Applications Conference. IEEE, 2020. Status: Published
Anomaly detection is an active research topic in many different fields such as intrusion detection, network monitoring, system health monitoring, and IoT healthcare. However, many existing anomaly detection approaches require either human intervention or domain knowledge, and may suffer from high computational complexity, consequently hindering their applicability in real-world scenarios. Therefore, a lightweight and ready-to-go approach that is able to detect anomalies in real time is highly sought-after. Such an approach could be easily and immediately applied to perform time series anomaly detection on any commodity machine. The approach could provide timely anomaly alerts and thereby enable appropriate countermeasures to be undertaken as early as possible. With these goals in mind, this paper introduces ReRe, which is a Real-time Ready-to-go proactive Anomaly Detection algorithm for streaming time series. ReRe employs two lightweight Long Short-Term Memory (LSTM) models to predict and jointly determine whether or not an upcoming data point is anomalous based on short-term historical data points and two long-term self-adaptive thresholds. Experiments based on real-world time-series datasets demonstrate the good performance of ReRe in real-time anomaly detection without requiring human intervention or domain knowledge.
Affiliation | Communication Systems |
Project(s) | Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2020 |
Conference Name | IEEE 44th Annual Computers, Software, and Applications Conference |
Pagination | 322-327 |
Publisher | IEEE |
ISBN Number | 978-1-7281-7303-0 |
ISSN Number | 0730-3157 |
DOI | 10.1109/COMPSAC48688.2020.0-226 |
Proceedings, refereed
Mobile Edge as Part of the Multi-Cloud Ecosystem: A Performance Study
In Proceedings of the 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). Pavia, Lombardia/Italy: IEEE Computer Society, 2019. Status: Published
Cloud computing has revolutionized the way applications are used and deployed: applications run cost-effectively in remote data centers. With the increasing need for mobility and micro-services, particularly with the upcoming 5G mobile broadband networks, there is also a strong demand for mobile edge computing (MEC): applications run in small cloud systems in close proximity to the user, in order to minimize latencies. Both cloud and MEC have their advantages and disadvantages. Combining the two approaches in a unified multi-cloud, consisting of both traditional cloud services provisioned over heterogeneous cloud platforms and MEC systems, has the potential of obtaining the best of both worlds. However, a comprehensive study is needed to evaluate the performance gains and the overheads involved for real-world cloud applications. In this paper, we introduce a baseline performance evaluation in order to identify the fallacies and pitfalls of combining multiple cloud systems and MEC into a unified MEC-multi-cloud platform. For this purpose, we analyze the basic, application-independent performance metrics of average round-trip time (RTT) and average application payload throughput in a setup consisting of two private cloud systems and one public cloud system. This baseline performance analysis confirms the feasibility of MEC-multi-cloud, and provides guidelines for designing autonomic resource provisioning solutions, in terms of an extension proposed to our existing Melodic middleware platform for multi-cloud applications.
Affiliation | Communication Systems |
Project(s) | MELODIC: Multi-cloud Execution-ware for Large-scale Optimised Data-Intensive Computing, NorNet, The Center for Resilient Networks and Applications, Simula Metropolitan Center for Digital Engineering |
Publication Type | Proceedings, refereed |
Year of Publication | 2019 |
Conference Name | Proceedings of the 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) |
Pagination | 59-66 |
Date Published | 02/2019 |
Publisher | IEEE Computer Society |
Place Published | Pavia, Lombardia/Italy |
ISBN Number | 978-1-7281-1644-0 |
Keywords | Cloud computing, latency, Mobile edge computing, Multi-Cloud, Performance |
DOI | 10.1109/PDP.2019.00017 |
Journal Article
A Self-Adaptive Network for HPC Clouds: Architecture, Framework, and Implementation
IEEE Transactions on Parallel and Distributed Systems 29, no. 12 (2018): 2658-2671. Status: Published
Clouds offer flexible and economically attractive compute and storage solutions for enterprises. However, the effectiveness of cloud computing for high-performance computing (HPC) systems still remains questionable. When clouds are deployed on lossless interconnection networks, like InfiniBand (IB), challenges related to load-balancing, low-overhead virtualization, and performance isolation hinder full potential utilization of the underlying interconnect. Moreover, cloud data centers incorporate a highly dynamic environment rendering static network reconfigurations, typically used in IB systems, infeasible. In this paper, we present a framework for a self-adaptive network architecture for HPC clouds based on lossless interconnection networks, demonstrated by means of our implemented IB prototype. Our solution, based on a feedback control and optimization loop, enables the lossless HPC network to dynamically adapt to the varying traffic patterns, current resource availability, and workload distributions, in accordance with the service provider-defined policies. Furthermore, we present IBAdapt, a simplified rule-based language for the service providers to specify adaptation strategies used by the framework. Our developed self-adaptive IB network prototype is demonstrated using state-of-the-art industry software. The results obtained on a test cluster demonstrate the feasibility and effectiveness of the framework when it comes to improving Quality-of-Service compliance in HPC clouds.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2018 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 29 |
Issue | 12 |
Pagination | 2658-2671 |
Publisher | IEEE |
DOI | 10.1109/TPDS.2018.2842224 |
Efficient Routing and Reconfiguration in Virtualized HPC Environments with vSwitch-enabled Lossless Networks
Concurrency and Computation: Practice and Experience 31, no. 2 (2018). Status: Published
To meet the demands of communication-intensive workloads in the cloud, virtual machines (VMs) should utilize low overhead network communication paradigms. In general, such paradigms enable VMs to directly communicate with the hardware by means of a passthrough technology like Single-Root I/O Virtualization (SR-IOV). However, when passthrough-based virtualization is coupled with lossless interconnection networks, live migrations introduce scalability challenges due to the substantial network reconfiguration overhead. With these challenges in mind, we proposed a virtual switch (vSwitch) SR-IOV architecture for InfiniBand in (33). In this paper, we first suggest solutions to rectify the space-domain scalability issues that are present in vSwitch-enabled subnets as a result of the VMs using dedicated layer-two addresses. Then we discuss routing strategies for virtualized environments using vSwitches, and present a routing algorithm for Fat-Trees. We also present a reconfiguration method that minimizes imposed reconfiguration overhead on Fat-Trees. We perform an extensive evaluation of our prototype algorithms, and as vSwitch-enabled hardware does not yet exist, we deduce from empirical observations by emulating vSwitches with existing hardware, as well as large-scale simulations. Our results show a significant reduction in the reconfiguration times as route recalculations can be eliminated, and for certain scenarios, the number of reconfiguration subnet management packets sent to switches is reduced from several hundred thousand down to a single one without degrading the routing quality.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2018 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 31 |
Issue | 2 |
Date Published | 02/2018 |
Publisher | John Wiley & Sons |
Keywords | InfiniBand, Lossless Interconnection Networks, Network Reconfiguration, Network Routing, SR-IOV, Virtualization |
Journal Article
A Fault-Tolerant Routing Strategy for KNS Topologies Based on Intermediate Nodes
Concurrency and Computation: Practice and Experience 29, no. 13 (2017). Status: Published
Exascale computing systems are being built with thousands of nodes. The high number of components of these systems significantly increases the probability of failure. A key component for them is the interconnection network. If failures occur in the interconnection network, they may isolate a large fraction of the machine. For this reason, an efficient fault-tolerant mechanism is needed to keep the system interconnected, even in the presence of faults. A recently proposed topology for these large systems is the hybrid k-ary n-direct s-indirect (KNS) family that provides optimal performance and connectivity at a reduced hardware cost. This paper presents a fault-tolerant routing methodology for the KNS topology that degrades performance gracefully in the presence of faults and tolerates a large number of faults without disabling any healthy computing node. In order to tolerate network failures, the methodology uses a simple mechanism. For any source-destination pair, if necessary, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network) with the aim of circumventing faults. The evaluation results show that the proposed methodology tolerates a large number of faults. For instance, it is able to tolerate more than 99.5% of fault combinations when there are ten faults in a 3-D network with 1,000 nodes using only one intermediate node and more than 99.98% if two intermediate nodes are used. Furthermore, the methodology offers graceful performance degradation. As an example, performance degrades by only 1% for a 2-D network with 1,024 nodes and 1% faulty links.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2017 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 29 |
Issue | 13 |
Publisher | John Wiley & Sons, Ltd. |
Keywords | exascale computing, fault-tolerant routing, hybrid topology, KNS topology |
DOI | 10.1002/cpe.4065 |
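The intermediate-node mechanism described in the abstract above can be illustrated with a small Python sketch. It is a simplification and not the published methodology: a plain 2-D mesh with dimension-order routing stands in for the hybrid KNS topology, only one intermediate node is tried, and deadlock avoidance is not modelled. The function names and the representation of faulty links as a set of `frozenset` node pairs are assumptions made for this example.

```python
from itertools import product


def dor_path(src, dst):
    """Dimension-order (X first, then Y) path between two mesh nodes."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def is_fault_free(path, faulty_links):
    """True if no hop of the path uses a faulty (undirected) link."""
    return all(frozenset(hop) not in faulty_links
               for hop in zip(path, path[1:]))


def route(src, dst, size, faulty_links):
    """Return a fault-free path using at most one intermediate node.

    Returns None if neither the direct path nor any single detour works.
    """
    direct = dor_path(src, dst)
    if is_fault_free(direct, faulty_links):
        return direct
    # Only if necessary: look for an intermediate node whose two
    # sub-paths both avoid the faulty links.
    for mid in product(range(size), repeat=2):
        if mid in (src, dst):
            continue
        first, second = dor_path(src, mid), dor_path(mid, dst)
        if is_fault_free(first, faulty_links) and is_fault_free(second, faulty_links):
            return first + second[1:]
    return None


if __name__ == "__main__":
    faults = {frozenset({(1, 0), (2, 0)})}
    print(route((0, 0), (2, 0), size=4, faulty_links=faults))
```

In this sketch the detour is taken only when the direct path is broken, so fault-free source-destination pairs keep their original route.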
Proceedings, refereed
A New Fault-Tolerant Routing Methodology for KNS Topologies
In 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB). IEEE, 2016. Status: Published
Exascale computing systems are being built with thousands of nodes. A key component of these systems is the interconnection network. The high number of components significantly increases the probability of failure. If failures occur in the interconnection network, they may isolate a large fraction of the machine. For this reason, an efficient fault-tolerant mechanism is needed to keep the system interconnected, even in the presence of faults. A topology recently proposed for these large systems is the hybrid KNS family that provides good performance and connectivity at a reduced hardware cost. This paper presents a fault-tolerant routing methodology for the KNS topology that degrades performance gracefully in the presence of faults and tolerates a reasonably large number of faults without disabling any healthy node. In order to tolerate network failures, the methodology uses a simple mechanism: for some source-destination pairs, only if necessary, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network) that allow faults to be avoided. The evaluation results show that the methodology tolerates a large number of faults. Furthermore, the methodology offers graceful performance degradation. For instance, performance degrades by only 1% for a 2-D network with 1,024 nodes and 1% faulty links.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB) |
Pagination | 1-8 |
Date Published | 03/2016 |
Publisher | IEEE |
ISBN Number | 978-1-5090-2121-5 |
DOI | 10.1109/HIPINEB.2016.9 |
Fast Hybrid Network Reconfiguration for Large-Scale Lossless Interconnection Networks
In 15th IEEE International Symposium on Network Computing and Applications (NCA 2016). IEEE, 2016. Status: Published
Reconfiguration of high performance lossless interconnection networks is a cumbersome and time-consuming task. For that reason, reconfiguration in large networks is typically limited to situations where it is absolutely necessary, for instance when severe faults occur. In contrast, due to the shared and dynamic nature of modern cloud infrastructures, performance-driven reconfigurations are necessary to ensure efficient utilization of resources. In this work we present a scheme that allows for fast reconfigurations by limiting the task to sub-parts of the network that can benefit from a local reconfiguration. Moreover, our method is able to use different routing algorithms for different sub-parts within the same subnet. We also present a Fat-Tree routing algorithm that reconfigures a network given a user-provided node ordering. Hardware experiments and large scale simulation results show that we are able to significantly reduce reconfiguration times by 50% to as much as 98.7% for very large topologies, while improving performance.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | 15th IEEE International Symposium on Network Computing and Applications (NCA 2016) |
Pagination | 101-108 |
Publisher | IEEE |
Keywords | Cloud computing, Fat-Tree, HPC, InfiniBand, Network Reconfiguration, scalability |
Improvements to the InfiniBand Congestion Control Mechanism
In 24th Annual Symposium on High-Performance Interconnects (HotI), 2016. Status: Published
The InfiniBand Congestion Control mechanism (IB CC) is able to reduce the negative consequences of congestion in many situations. However, its effectiveness depends on a set of parameters that must be set by administrators. If the parameters are not appropriately configured, IB CC could negatively impact network performance. Additionally, there is no universal parameter setting that can fit all situations. These difficulties prevent IB CC from being widely used.
In this paper we propose several enhancements to the existing IB CC. First, our improved IB CC significantly reduces parameter configuration. Second, the congestion will be removed quickly. Third, a new utilization-driven approach and a new Link Bandwidth Availability Report (LBAR) approach are implemented to guide sending interfaces on how and when to adjust their injection rates. This adjustment is aware of the actual network condition, rather than relying on pre-configured parameters, as in the existing IB CC. Simulation results have demonstrated that our improved IB CC is able to reduce the congestion consequences efficiently and can adapt to various network topologies and traffic patterns.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | 24th Annual Symposium on High-Performance Interconnects (HotI) |
Keywords | congestion control, InfiniBand, Injection Rate |
Realizing a Self-Adaptive Network Architecture for HPC Clouds
In The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase, 2016. Status: Published
Clouds offer significant advantages over traditional cluster computing architectures including ease of deployment, rapid elasticity, and an economically attractive pay-as-you-go business model. However, the effectiveness of cloud computing for HPC systems still remains questionable. When clouds are deployed on lossless interconnection networks, challenges related to load balancing, low-overhead virtualization, and performance isolation hinder full potential utilization of the underlying interconnect. In this work, we attack these challenges and propose a novel holistic framework of a self-adaptive IB subnet for HPC clouds. Our solution consists of a feedback control loop that effectively incorporates optimizations based on the multidimensional objective function using current resource configuration and provider-defined policies. We build our system using a bottom-up approach, starting by prototyping solutions that tackle the individual research challenges, and later combining our novel solutions into a working self-adaptive cloud prototype. All our results are demonstrated using state-of-the-art industry software to enable easy integration into running systems.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase |
Talks, invited
About Management of Exascale Systems
In ExaComm 2016, Frankfurt, 2016. Status: Published
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Talks, invited |
Year of Publication | 2016 |
Location of Talk | ExaComm 2016, Frankfurt |
Type of Talk | Invited talk |
Journal Article
Compact Network Reconfiguration in Fat-Trees
The Journal of Supercomputing 72, no. 12 (2016): 4438-4467. Status: Published
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2016 |
Journal | The Journal of Supercomputing |
Volume | 72 |
Issue | 12 |
Pagination | 4438–4467 |
Publisher | Springer |
Efficient Network Isolation and Load Balancing in Multi-Tenant HPC Clusters
Journal of Future Generation Computer Systems (2016). Status: Published
Affiliation | Communication Systems |
Project(s) | No Simula project |
Publication Type | Journal Article |
Year of Publication | 2016 |
Journal | Journal of Future Generation Computer Systems |
Date Published | 04/2016 |
Publisher | Elsevier |
DOI | 10.1016/j.future.2016.04.003 |
Patent
System and method for efficient network reconfiguration in fat-trees
2016. Status: Published
Systems and methods are provided for supporting efficient reconfiguration of an interconnection network having a pre-existing routing. An exemplary method can provide a plurality of switches, a plurality of end nodes, and one or more subnet managers, including a master subnet manager. The method can calculate, via the master subnet manager, a first set of one or more leaf-switch to leaf-switch multipaths. The method can store this first set of one or more leaf-switch to leaf-switch multipaths at a metabase. The method can detect a reconfiguration triggering event, and call a new routing for the interconnection network. Finally, the method can reconfigure the network according to the new routing for the interconnection network.
Affiliation | Communication Systems |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14927085 |
Date Published | 05/2016 |
Patent Type | Pending |
System and method for efficient network reconfiguration in fat-trees
2016. Status: Published
Systems and methods are provided for supporting efficient reconfiguration of an interconnection network having a pre-existing routing. An exemplary method can provide a plurality of switches, the plurality of switches comprising at least one leaf switch, wherein each of the one or more switches comprises a plurality of ports, and a plurality of end nodes, wherein the plurality of end nodes are interconnected via the one or more switches. The method can detect, by a subnet manager, a reconfiguration triggering event. The method can compute, by the subnet manager, a new routing for the interconnection network, wherein the computing by the subnet manager of the new routing for the interconnection network takes into consideration the pre-existing routing and selects the new routing for the interconnection network that is closest to the pre-existing routing. The method can reconfigure the interconnection network according to the new routing.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US15/073,022 |
Date Published | 03/2016 |
Patent Type | Pending |
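One way to read "selects the new routing that is closest to the pre-existing routing" in the abstract above is as a minimum-difference choice among candidate routings. The Python sketch below assumes this reading and is not taken from the patent: a routing is modelled as a dictionary from hypothetical `(switch, destination)` pairs to output ports, the distance between two routings is the number of entries that differ, and the candidate routings are assumed to be given.

```python
def routing_distance(old, new):
    """Count forwarding entries that differ between two routings."""
    keys = old.keys() | new.keys()
    return sum(1 for key in keys if old.get(key) != new.get(key))


def closest_routing(old, candidates):
    """Pick the candidate routing that changes the fewest entries."""
    return min(candidates, key=lambda cand: routing_distance(old, cand))


if __name__ == "__main__":
    # Hypothetical forwarding entries: (switch, destination) -> port.
    old = {("s1", "a"): 1, ("s1", "b"): 2, ("s2", "a"): 1}
    cand_two_changes = {("s1", "a"): 2, ("s1", "b"): 1, ("s2", "a"): 1}
    cand_one_change = {("s1", "a"): 1, ("s1", "b"): 3, ("s2", "a"): 1}
    best = closest_routing(old, [cand_two_changes, cand_one_change])
    print(routing_distance(old, best))
```

Under this assumed metric, fewer changed entries means fewer forwarding-table updates have to be pushed to the switches during reconfiguration.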
System and Method for Providing an InfiniBand SR-IOV vSwitch Architecture for a High Performance Cloud Computing Environment
2016. Status: Published
Systems and methods are provided for implementing a Virtual Switch (vSwitch) architecture that supports transparent virtualization and live migration. One embodiment provides a vSwitch with prepopulated Local Identifiers (LIDs). Another embodiment provides a vSwitch with dynamic LID assignment. Another embodiment provides a vSwitch with prepopulated LIDs and dynamic LID assignment. Moreover, embodiments of the present invention provide scalable dynamic network reconfiguration methods which enable live migrations of VMs in network environments.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US15/050,901 |
Date Published | 02/2016 |
Patent Type | Pending |
System and method for supporting efficient load-balancing in a high performance computing (HPC) environment
2016. Status: Published
Methods and systems are provided for supporting efficient load balancing among a plurality of switches and a plurality of end nodes arranged in a tree topology in a network environment. The methods and systems can sort the plurality of end nodes, wherein the plurality of end nodes are sorted in a decreasing order of a receive weight. The methods and systems may further route, in the decreasing order of receive weights, the plurality of end nodes, wherein the routing comprises selecting at least one down-going port and at least one up-going port. Further, the methods and systems can increase an accumulated downward weight on each selected down-going port by the receive weight of the routed end node, and increase an accumulated upward weight on each selected up-going port by the receive weight of the routed end node.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14792070 |
Date Published | 01/2016 |
Patent Type | Pending |
Notes | Date Filed: October 19, 2015, Published Online: Jan 14, 2016. |
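The sort-then-accumulate procedure in the abstract above resembles a greedy weighted assignment, which can be sketched in Python as follows. This is only an illustration under stated assumptions: a single switch with one set of ports stands in for the tree topology (so up-going and down-going ports are not distinguished), and the rule for choosing a port (the one with the least accumulated weight so far) is assumed, since the abstract does not state it. The function name and data layout are hypothetical.

```python
def assign_ports(receive_weights, num_ports):
    """Greedily assign end nodes to ports by decreasing receive weight.

    receive_weights: dict mapping node name -> receive weight.
    Returns (assignment, accumulated): assignment maps node -> port index,
    accumulated holds the total weight routed through each port.
    """
    accumulated = [0] * num_ports
    assignment = {}
    # Heaviest receivers are routed first; ties are broken by node name.
    ordered = sorted(receive_weights, key=lambda n: (-receive_weights[n], n))
    for node in ordered:
        # Assumed selection rule: the port with the least accumulated weight.
        port = min(range(num_ports), key=lambda p: accumulated[p])
        assignment[node] = port
        # Add the routed node's receive weight to the selected port.
        accumulated[port] += receive_weights[node]
    return assignment, accumulated


if __name__ == "__main__":
    weights = {"a": 10, "b": 1, "c": 1, "d": 8}
    print(assign_ports(weights, num_ports=2))
```

Routing the heaviest receivers first lets the lighter nodes fill in the remaining imbalance, which is the usual motivation for processing items in decreasing weight order in greedy balancing schemes.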
System and method for supporting partition-aware routing in a multi-tenant cluster environment
2016. Status: Published
A system and method can support partition-aware routing in a multi-tenant cluster environment. An exemplary method can support one or more tenants within the multi-tenant cluster environment. The method can associate each of the one or more tenants with a partition of a plurality of partitions. The method can then associate each of the plurality of partitions with one or more nodes of a plurality of nodes, each of the plurality of nodes being associated with a leaf switch of a plurality of switches, the plurality of switches comprising a plurality of leaf switches and a plurality of root switches. Finally, the method can generate one or more linear forwarding tables, the one or more linear forwarding tables providing isolation between the plurality of partitions, wherein each of the plurality of nodes is associated with a partitioning order.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14927085 |
Date Published | 05/2016 |
Patent Type | Pending |
Notes | Date Filed: July 6, 2015, Published Online: May 5, 2016. |
Systems and Method for Providing a Dynamic Cloud with Subnet Administration (SA) Query Caching
2016. Status: Published
A system and method can support subnet management in a cloud environment. During a virtual machine migration in a cloud environment, a subnet manager can become a bottleneck point that delays efficient service. A system and method can alleviate this bottleneck point by ensuring a virtual machine retains a plurality of addresses after migration. The system and method can further allow for each host node within the cloud environment to be associated with a local cache that virtual machines can utilize when re-establishing communication with a migrated virtual machine.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Patent |
Year of Publication | 2016 |
Application Number | US14924281 |
Date Published | 05/2016 |
Patent Type | Pending |
Patent Number | 20160127495 |
Proceedings, refereed
A Novel Query Caching Scheme for Dynamic InfiniBand Subnets
In 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). ACM/IEEE, 2015. Status: Published
In large InfiniBand subnets the Subnet Manager (SM) is a potential bottleneck. When an InfiniBand subnet grows in size, the number of paths between hosts increases polynomially and the SM may not be able to serve the network in a timely manner when many concurrent path resolution requests are received. This scalability challenge is further amplified in a dynamic virtualized cloud environment. When a Virtual Machine (VM) with InfiniBand interconnect live migrates, the VM addresses change. These address changes result in additional load to the SM as communicating peers send Subnet Administration (SA) path record queries to the SM to resolve new path characteristics.
In this paper we benchmark OpenSM to empirically demonstrate the SM scalability problems. Then we show that our novel SA Path Record Query caching scheme significantly reduces the load towards the SM. In particular, we show by using the Reliable Datagram Socket protocol that only a single initial SA path query is needed per communicating peer, independent of any subsequent (re)connection attempts.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) |
Publisher | ACM/IEEE |
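The caching idea in the abstract above, where only a single initial SA path query is needed per communicating peer, can be sketched as a small per-host cache in Python. This is an illustrative model rather than the paper's implementation: the class name, the `query_sm` callable that stands in for an SA path record query to the SM, and the shape of the returned record are all hypothetical. The sketch also assumes that the peer identifier used as the cache key remains valid across reconnections.

```python
class PathRecordCache:
    """Per-host cache of path records keyed by a peer identifier."""

    def __init__(self, query_sm):
        # query_sm: hypothetical callable, peer id -> path record.
        self._query_sm = query_sm
        self._records = {}
        self.sm_queries = 0   # number of queries actually sent to the SM

    def resolve(self, peer):
        """Return the path record, querying the SM only on a cache miss."""
        if peer not in self._records:
            self.sm_queries += 1
            self._records[peer] = self._query_sm(peer)
        return self._records[peer]


if __name__ == "__main__":
    # Stand-in for the SM: returns a made-up record for any peer.
    cache = PathRecordCache(lambda peer: {"peer": peer, "sl": 0})
    for _ in range(3):            # three (re)connection attempts
        cache.resolve("vm-42")
    print(cache.sm_queries)
```

In this model, repeated connection attempts to the same peer are served locally, so the load on the SM grows with the number of distinct peers rather than with the number of connection attempts.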
A weighted fat-tree routing algorithm for efficient load-balancing in InfiniBand enterprise clusters
In Proceedings of the 23rd Euromicro International Conference on Parallel, Distributed Network-based Processing (PDP 2015). Turku, Finland: IEEE, 2015. Status: Published
InfiniBand (IB) has become a popular network interconnect for high-performance computing (HPC) systems. Many of the large IB-based HPC systems use some variant of the fat-tree topology to take advantage of the useful properties fat-trees offer. The fat-tree routing algorithm is one of the most efficient deterministic routing algorithms for fat-tree topologies. The algorithm ensures that the number of routes assigned to each link is balanced across the fabric. However, one problem with its load-balancing technique is that it assumes uniform traffic distribution in the network. When routes towards nodes that mainly consume large amounts of data are assigned to share links in the fabric while alternative links are underutilized, sub-optimal network throughput is obtained. Also, as the fat-tree algorithm routes nodes according to the indexing order, the performance may differ for two systems cabled in the exact same way.
In this paper, we propose wFatTree, a novel fat-tree routing algorithm, which considers node traffic characteristics to balance load across the network links more evenly, and with predictable network performance. Our experiments and simulations show an improvement of up to 60% in total network throughput on large fat-tree installations when using wFatTree routing. Furthermore, wFatTree can also be used to prioritize traffic flowing towards the critical nodes in the network.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | Proceedings of the 23rd Euromicro International Conference on Parallel, Distributed Network-based Processing (PDP 2015) |
Pagination | 35-42 |
Date Published | 03/2015 |
Publisher | IEEE |
Place Published | Turku, Finland |
ISSN Number | 1066-6192 |
Accession Number | 15090056 |
Keywords | fat-tree networks, InfiniBand, Load-balancing, Routing algorithms |
DOI | 10.1109/PDP.2015.111 |
Partition-aware routing to improve network isolation in InfiniBand based multi-tenant clusters
In 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). Shenzhen, China: ACM/IEEE, 2015. Status: Published
InfiniBand (IB) is a widely used network interconnect for modern high-performance computing systems. In large IB fabrics, network isolation is provided through partitioning. However, routing is oblivious to the partitions in the network. Hence, flows from different partitions share physical links. This sharing of the intermediate links creates interference, which is particularly critical to avoid in multi-tenant environments, like cloud computing. In such systems, each tenant needs predictable network performance, unaffected by the workload of the other tenants. In addition, under the current routing schemes, links connecting nodes outside partitions are routed the same way as the other functional links, even though they are never used. This may result in degraded load-balancing.
In this paper, we present an implementation of a partition-aware fat-tree routing algorithm, pFTree. pFTree uses a multifold mechanism to provide performance isolation among partitions belonging to the different tenant groups. Given the available network resources, pFTree starts isolating partitions at the physical link level, and then it moves on to utilize virtual lanes when needed. Our experiments and simulations show that pFTree is able to significantly reduce the effect of inter-partition interference without any additional functional overhead. Furthermore, pFTree also provides improved load-balancing over the state-of-the-art fat-tree routing algorithm.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) |
Pagination | 189-198 |
Date Published | 07/2015 |
Publisher | ACM/IEEE |
Place Published | Shenzhen, China |
ISBN Number | 978-1-4799-8006-2 |
DOI | 10.1109/CCGrid.2015.96 |
SlimUpdate: Minimal Routing Update for Performance-based Reconfigurations in Fat-Trees
In 1st IEEE International Workshop on High-Performance Interconnection Networks Towards the Exascale and Big-Data Era (HiPINEB 2015). IEEE Computer Society, 2015. Status: Published
As the size of high-performance computing systems grows, the number of events requiring a network reconfiguration, as well as the complexity of each reconfiguration, is likely to increase. In large systems, the probability of component failure is high. At the same time, with more network components, ensuring high utilization of network resources becomes challenging. Reconfiguration in interconnection networks, like InfiniBand (IB), typically involves computation and distribution of a new set of routes in order to maintain connectivity and performance. In general, current routing algorithms do not consider the existing routes in a network when calculating new ones. Such configuration-oblivious routing might result in substantial modifications to the existing paths, and the reconfiguration becomes costly as it potentially involves a large number of source-destination pairs.
In this paper, we propose a novel routing algorithm for IB based fat-tree topologies, SlimUpdate. SlimUpdate employs techniques to preserve existing forwarding entries in switches to ensure a minimal routing update, without any performance penalty, and with minimal computational overhead. We present an implementation of SlimUpdate in OpenSM, and compare it with the current de facto fat-tree routing algorithm. Our experiments and simulations show a decrease of up to 80% in the number of total path modifications when using SlimUpdate routing, while achieving similar or even better performance than the fat-tree routing in most reconfiguration scenarios.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | 1st IEEE International Workshop on High-Performance Interconnection Networks Towards the Exascale and Big-Data Era (HiPINEB 2015) |
Pagination | 849-856 |
Date Published | 10/2015 |
Publisher | IEEE Computer Society |
ISBN Number | 978-1-4673-6598-7 |
Accession Number | 15570970 |
DOI | 10.1109/CLUSTER.2015.142 |
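To make the reconfiguration cost discussed in the SlimUpdate abstract above concrete, the sketch below counts how many forwarding entries differ between two routing configurations. It is an assumption on our part that a difference in a forwarding entry is a reasonable stand-in for a "path modification"; the paper may define the metric differently. The data layout (a dictionary from switch name to a destination-to-port table) and the function name are hypothetical.

```python
def count_forwarding_changes(old_tables, new_tables):
    """Count forwarding entries that differ between two configurations.

    Each argument maps a switch name to a table {destination: output_port}.
    An entry counts as changed if it was added, removed, or now points to a
    different output port.
    """
    changed = 0
    for switch in old_tables.keys() | new_tables.keys():
        old = old_tables.get(switch, {})
        new = new_tables.get(switch, {})
        for destination in old.keys() | new.keys():
            if old.get(destination) != new.get(destination):
                changed += 1
    return changed


# Hypothetical two-switch example: only two entries move to another port.
before = {"sw1": {1: 1, 2: 2, 3: 1}, "sw2": {1: 3, 2: 3}}
after = {"sw1": {1: 1, 2: 1, 3: 1}, "sw2": {1: 3, 2: 4}}
print(count_forwarding_changes(before, after))  # 2
```

A routing engine that recomputes all routes from scratch may change many such entries even when the topology barely changed, which is the cost a minimal routing update tries to avoid.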
Towards the InfiniBand SR-IOV vSwitch Architecture
In IEEE Cluster 2015. IEEE Cluster 2015: IEEE, 2015. Status: Published
To meet the demands of the Exascale era and facilitate Big Data analytics in the cloud while maintaining flexibility, cloud providers will have to offer efficient virtualized High Performance Computing clusters in a pay-as-you-go model. As a consequence, high performance network interconnect solutions, like InfiniBand (IB), will be beneficial. Currently, the only way to provide IB connectivity on Virtual Machines (VMs) is by utilizing direct device assignment. At the same time, to be scalable, Single-Root I/O Virtualization (SR-IOV) is used. However, the current SR-IOV model employed by IB adapters is a Shared Port implementation with limited flexibility, as it does not allow transparent virtualization and live-migration of VMs.
In this paper, we explore an alternative SR-IOV model for IB, the virtual switch (vSwitch), and propose and analyze two vSwitch implementations with different scalability characteristics. Furthermore, as network reconfiguration time is critical to make live-migration a practical option, we accompany our proposed architecture with a scalable and topology agnostic dynamic reconfiguration method, implemented and tested using OpenSM. Our results show that we are able to significantly reduce the reconfiguration time as route recalculations are no longer needed, and in large IB subnets, for certain scenarios, the number of reconfiguration subnet management packets (SMPs) sent is reduced from several hundred thousand down to a single one.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | IEEE Cluster 2015 |
Date Published | 09/2015 |
Publisher | IEEE |
Place Published | IEEE Cluster 2015 |
Keywords | Dynamic Network Reconfiguration, InfiniBand, Live Migration, SR-IOV, Virtualization |
Public outreach
A Self-adaptive network architecture for InfiniBand based HPC clouds
In Talk at 7th Cloud Control Workshop. Nässlingen, Sweden: 7th Cloud Control Workshop, 2015. Status: Published
Research on network optimization in InfiniBand (IB) networks has evolved in several directions, e.g. increasing network utilization, fault-tolerance, congestion control, and energy-aware systems. However, for efficient HPC clouds based on IB, the optimization problem becomes both complex and multi-dimensional, while individually proposed solutions often yield contradictory management decisions. We believe that a holistic closed-loop control system is required to effectively incorporate a multidimensional objective function in future IB systems. Based on control theory, a self-adaptive model for the IB subnet system may help achieve better network utilization while effectively keeping user-level SLAs in HPC clouds.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Public outreach |
Year of Publication | 2015 |
Secondary Title | Talk at 7th Cloud Control Workshop |
Publisher | 7th Cloud Control Workshop |
Place Published | Nässlingen, Sweden |
Type of Work | Discussion Session |
Journal Article
Efficient and Cost-Effective Hybrid Congestion Control for HPC Interconnection Networks
IEEE Transactions on Parallel and Distributed Systems 26, no. 1 (2015): 107-119. Status: Published
Interconnection networks are key components in High-Performance Computing (HPC) systems, and their performance has a strong influence on that of the overall system. However, at high load, congestion and its negative effects (e.g. head-of-line blocking) threaten the performance of the network, and thus that of the entire system. Congestion control (CC) is crucial to ensure efficient utilization of the interconnection network during congestion situations. As one major trend is to reduce the effective wiring in interconnection networks to reduce cost and power consumption, the network will operate very close to its capacity. Thus, congestion control becomes essential. Existing CC techniques can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. However, the two approaches have different, non-overlapping weaknesses: injection throttling techniques react slowly to congestion, while isolating traffic in special resources may lead the system to run out of those resources. In this paper we propose EcoCC, a new Efficient and Cost-Effective CC technique that combines injection throttling and congested-flow isolation to minimize their respective drawbacks and maximize overall system performance. This new strategy is suitable for current commercial switch architectures, where it could be implemented without requiring significant complexity. Experimental results, using simulations under synthetic and real trace-based traffic patterns, show that this technique improves by up to 55% over some of the most successful congestion control techniques.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2015 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 26 |
Issue | 1 |
Pagination | 107-119 |
Publisher | IEEE |
DOI | 10.1109/TPDS.2014.2307851 |
Poster
A Demonstration of the NorNet Core Research Testbed for Multi-Homed Systems
2014. Status: Published
This short abstract describes a demonstration proposal for the NorNet Core testbed for multi-homed systems.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Poster |
Year of Publication | 2014 |
Date Published | December |
Keywords | Conference |
Notes | Demo presentation at the IEEE GLOBECOM 2014 |
PhD Thesis
Congestion Management in Lossless Interconnection Networks
Faculty of Mathematics and Natural Sciences, University of Oslo, 2014. Status: Published
In supercomputers and modern data center clusters, lossless interconnection networks are frequently used to achieve high throughput and low latency. It has been known for three decades, however, that congestion and congestion spreading in such networks can lead to severe performance degradation if no countermeasure is taken. Nevertheless, for a long time, the challenges related to network-wide congestion in lossless interconnection networks received little attention. A combination of tuning and tailoring of network characteristics for a given application, together with overprovisioning of network resources, kept congestion and congestion spreading from occurring in practice. Dynamic congestion management was therefore not really needed. During the last decade, however, we have seen a renewed interest in congestion management for lossless interconnection networks. The use of virtualization, together with an increased focus on cost-efficient green computing, has spawned a desire to operate networks with dynamic and unpredictable traffic patterns closer to saturation. As such, proper congestion management is needed. In this thesis, we study congestion management in lossless interconnection networks in general, while giving special attention to the congestion control mechanism specified for InfiniBand, currently one of the most popular interconnection network standards.
The contributions of the thesis include guidelines on how to implement congestion detection in switches facilitating injection throttling at the source nodes to avoid unfair treatment of contributors to congestion; an exploration of the rich InfiniBand congestion control parameter space and the corresponding influence on the performance of the congestion control mechanism; a study of the scope of an injection throttling based congestion management mechanism, like the one specified for InfiniBand; an abstract classification scheme for congestion trees of varying degree of dynamics; and finally, two novel congestion management mechanisms for input buffered switches and switches utilizing virtual output queuing, respectively, to overcome the weaknesses of current congestion management mechanisms based on injection throttling or hot-flow dynamic isolation.
Affiliation | Communication Systems, Communication Systems |
Publication Type | PhD Thesis |
Year of Publication | 2014 |
Date Published | March |
Publisher | Faculty of Mathematics and Natural Sciences, University of Oslo |
Thesis Type | PhD |
Talks, invited
Efficient and Robust Architecture for Big Data Cloud
In Hemavan, Sweden, 2014. Status: Published
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, invited |
Year of Publication | 2014 |
Location of Talk | Hemavan, Sweden |
Public outreach
Grønnere Cloud Computing
In MediaPlanet addition to Finansavisen. Finansavisen: MediaPlanet, 2014. Status: Published
Affiliation | Communication Systems, Communication Systems |
Publication Type | Public outreach |
Year of Publication | 2014 |
Secondary Title | MediaPlanet addition to Finansavisen |
Date Published | 12/2014 |
Publisher | MediaPlanet |
Place Published | Finansavisen |
Type of Work | Finansavisen |
Journal Article
NorNet Core - A Multi-Homed Research Testbed
Computer Networks Special Issue: Future Internet Testbeds (2014): 75-87. Status: Published
Over the last decade, the Internet has grown at a tremendous speed in both size and complexity. Nowadays, a large number of important services, for instance e-commerce, healthcare and many others, depend on the availability of the underlying network. Clearly, service interruptions due to network problems may have a severe impact. On the long way towards the Future Internet, the complexity will grow even further. Therefore, new ideas and concepts must be evaluated thoroughly, and particularly in realistic, real-world Internet scenarios, before they can be deployed in production networks. For this purpose, various testbeds, for instance PlanetLab, GpENI or G-Lab, have been established and are intensively used for research. However, all of these testbeds lack support for so-called multi-homing. Multi-homing denotes the connection of a site to multiple Internet service providers in order to achieve redundancy. Clearly, with the need for network availability, there is a steadily growing demand for multi-homing. The idea of the NorNet Core project is to establish a Future Internet research testbed with multi-homed sites, in order to allow researchers to perform experiments with multi-homed systems. Particular use cases for this testbed include realistic experiments in the areas of multi-path routing, load balancing, multi-path transport protocols, overlay networks and network resilience. In this paper, we introduce the NorNet Core testbed as well as its architecture.
Affiliation | Communication Systems, Communication Systems, Communication Systems |
Project(s) | NorNet, The Center for Resilient Networks and Applications |
Publication Type | Journal Article |
Year of Publication | 2014 |
Journal | Computer Networks |
Volume | Special Issue: Future Internet Testbeds |
Number | 61 |
Pagination | 75-87 |
Publisher | Elsevier |
DOI | 10.1016/j.bjp.2013.12.035 |
Proceedings, refereed
Design and Implementation of the NorNet Core Research Testbed for Multi-Homed Systems
In Proceedings of the 3rd International Workshop on Protocols and Applications with Multi-Homing Support (PAMS). Barcelona, Catalonia/Spain: IEEE, 2013. Status: Published
The Internet has made it possible to communicate and to use services over large geographical distances. While it was originally built for less critical services like e-mail and file transfer, it is nowadays increasingly used for availability-critical services such as e-commerce or healthcare. Clearly, the reachability of such services must be ensured by so-called multi-homing of endpoints. That is, endpoints are simultaneously connected to multiple Internet Service Providers (ISPs) to provide redundancy. If one ISP has problems, it is intended that the connection to another one still works. However, such assumptions have never been verified in real, large-scale setups. The intention of the NorNet project is to build up a realistic Internet testbed for multi-homing. In this paper, we describe the design of NorNet with focus on the implementation of its fixed-line part: NorNet Core. This paper is intended to give researchers an overview of its mode of operation, its capabilities, and the realisation of its more interesting features. Knowledge of these items is useful for planning one's own experiments in the NorNet testbed.
Affiliation | Communication Systems, Communication Systems, Communication Systems |
Project(s) | The Center for Resilient Networks and Applications |
Publication Type | Proceedings, refereed |
Year of Publication | 2013 |
Conference Name | Proceedings of the 3rd International Workshop on Protocols and Applications with Multi-Homing Support (PAMS) |
Date Published | 03/2013 |
Publisher | IEEE |
Place Published | Barcelona, Catalonia/Spain |
Keywords | Workshop |
Proceedings, refereed
Exploring the Scope of the InfiniBand Congestion Control Mechanism
In 2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). IEEE Computer Society, 2012. Status: Published
In a lossless interconnection network, network congestion needs to be detected and resolved to ensure high performance and good utilization of network resources at high network load. If no countermeasure is taken, congestion at a node in the network will stimulate the growth of a congestion tree that not only affects contributors to congestion, but also other traffic flows in the network. Left untouched, the congestion tree will block traffic flows, lead to underutilization of network resources and result in a severe drop in network performance. The InfiniBand standard specifies a congestion control (CC) mechanism to detect and resolve congestion before a congestion tree is able to grow and thereby hamper the network performance. The InfiniBand CC mechanism includes a rich set of parameters that can be tuned in order to achieve effective CC. Even though it has been shown that the CC mechanism, properly tuned, is able to improve both throughput and fairness in an interconnection network, it has been questioned whether the mechanism is fast enough to keep up with dynamic network traffic, and whether a given set of parameter values for a topology is robust across different traffic patterns or the parameters need to be tuned depending on the applications in use. In this paper we address both these questions. Using the three-stage fat-tree topology from the Sun Datacenter InfiniBand Switch 648 as a basis, and a simulator tuned against CC capable InfiniBand hardware, we conduct a systematic study of the efficiency of the InfiniBand CC mechanism as the network traffic becomes increasingly dynamic. Our studies show that the InfiniBand CC, even when using a single set of parameter values, performs very well as the traffic patterns become increasingly dynamic, outperforming a network without CC in all cases. Our results show throughput increases varying from a few percent to a seventeen-fold increase.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2012 |
Conference Name | 2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) |
Pagination | 1131-1143 |
Publisher | IEEE Computer Society |
DOI | 10.1109/IPDPS.2012.104 |
Journal Article
SFtree: a Fully Connected and Deadlock Free Switch-to-Switch Routing Algorithm for Fat-Trees
ACM Transactions on Architecture and Code Optimization 8 (2012). Status: Published
Existing fat-tree routing algorithms fully exploit the path diversity of a fat-tree topology in the context of compute node traffic, but they lack support for deadlock free and fully connected switch-to-switch communication. Such support is crucial for efficient system management, for example in InfiniBand (IB) systems. With the general increase in system management capabilities found in modern InfiniBand switches, the lack of deadlock free switch-to-switch communication is a problem for fat-tree based IB installations because management traffic might cause routing deadlocks that bring the whole system down. This lack of deadlock free communication affects all system management and diagnostic tools using LID routing. In this paper, we propose the sFtree routing algorithm that guarantees deadlock free and fully connected switch-to-switch communication in fat-trees while maintaining the properties of the current fat-tree algorithm. We prove that the algorithm is deadlock free and we implement it in OpenSM for evaluation. We evaluate the performance of the sFtree algorithm experimentally on a small cluster and we do a large-scale evaluation through simulations. The results confirm that the sFtree routing algorithm is deadlock free and show that the impact of switch-to-switch management traffic on the end-node traffic is negligible.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2012 |
Journal | ACM Transactions on Architecture and Code Optimization |
Volume | 8 |
Number | 4 |
Date Published | January |
Publisher | ACM |
DOI | 10.1145/2086696.208673 |
Proceedings, refereed
Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks
In International Conference on Parallel Processing, ICPP 2011. IEEE Computer Society, 2011. Status: Published
Existing congestion control mechanisms in interconnects can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. These two approaches have different, but non-overlapping weaknesses. In this paper we present in detail a method that combines injection throttling and congested-flow isolation. Through simulation studies we first demonstrate the respective flaws of the injection throttling and of flow isolation. Thereafter we show that our combined method extracts the best of both approaches in the sense that it gives fast reaction to congestion, it is scalable and it has good fairness properties with respect to the congested flows.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | International Conference on Parallel Processing, ICPP 2011 |
Pagination | 662-672 |
Date Published | September |
Publisher | IEEE Computer Society |
ISBN Number | 978-1-4577-1336-1 |
DOI | 10.1109/ICPP.2011.80 |
InfiniBand Congestion Control, Modelling and Validation
In 4th International ICST Conference on Simulation Tools and Techniques (SIMUTools2011, OMNeT++ 2011 Workshop). SIMUTools '11. Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, 2011. Status: Published
In a lossless interconnection network, congestion may result in performance degradation if no countermeasure is taken. To relieve the consequences of congestion, and thereby achieve good utilization of network resources even at high network load, congestion control (CC) has been added to the InfiniBand specification. The behavior of the InfiniBand CC is, however, governed by a set of CC parameters. Exactly how to set these parameters to ensure an overall efficient network is still not well understood. It is time consuming, costly and hard to explore the CC parameter space in a large scale cluster. Therefore, a simulation platform is needed. In this paper we present our CC capable IB model implemented in the OMNeT++ environment. We explain the basics of our model, and validate it against CC capable hardware to show its high accuracy.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | 4th International ICST Conference on Simulation Tools and Techniques (SIMUTools2011, OMNeT++ 2011 Workshop) |
Pagination | 390-397 |
Date Published | March |
Publisher | Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering |
ISBN Number | 978-1-936968-00-8 |
On the Relation Between Congestion Control, Switch Arbitration and Fairness
In 11th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2011). IEEE, 2011. Status: Published
In lossless interconnection networks such as InfiniBand, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. The InfiniBand standard describes CC functionality for detecting and resolving congestion, but the design decisions on how to implement this functionality are left to the hardware designer. One must be cautious when making these design decisions not to introduce fairness problems, as our study shows. In this paper we study the relationship between congestion control, switch arbitration, and fairness. Specifically, we look at fairness among different traffic flows arriving at a hot spot switch on different input ports, as CC is turned on. In addition, we study the fairness among traffic flows at a switch where some flows are exclusive users of their input ports while other flows are sharing an input port (the parking lot problem). Our results show that the implementation of congestion control in a switch is vulnerable to unfairness if care is not taken. In detail, we found that a threshold hysteresis of more than one MTU is needed to resolve arbitration unfairness. Furthermore, to fully solve the parking lot problem, proper configuration of the CC parameters is required.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | 11th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2011) |
Pagination | 342-351 |
Date Published | May |
Publisher | IEEE |
ISBN Number | 978-1-4577-0129-0 |
DOI | 10.1109/CCGrid.2011.67 |
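The parking lot problem mentioned in the abstract above can be illustrated with a short worked calculation. The sketch assumes an idealized chain of merge points towards one destination, where every flow is saturated and each merge point splits its output bandwidth equally between its local input port and the upstream link. This textbook model and the function name are our assumptions; they are not taken from the paper's experiments.

```python
from fractions import Fraction


def parking_lot_shares(num_flows):
    """Bandwidth share of each flow, ordered from nearest to farthest source.

    Each merge point gives half of its output to the locally injected flow
    and half to the upstream link, so flows that enter farther away from the
    destination see their share halved once per merge point they cross.
    """
    if num_flows < 1:
        raise ValueError("need at least one flow")
    shares = []
    remaining = Fraction(1)
    for _ in range(num_flows - 1):
        remaining /= 2
        shares.append(remaining)  # flow injected at this merge point
    shares.append(remaining)      # farthest flow gets what is left upstream
    return shares


print([str(s) for s in parking_lot_shares(4)])  # ['1/2', '1/4', '1/8', '1/8']
```

With four flows a fair allocation would give each flow 1/4 of the bottleneck, whereas per-port arbitration alone gives the nearest flow four times the bandwidth of the two farthest ones.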
Talks, contributed
InfiniBand Congestion Control
In Contributed talk at the 2011 OpenFabrics International Workshop, Monterey, USA, 2011. Status: Published
In InfiniBand networks congestion control (CC) can be an effective mechanism to achieve high performance and high utilisation of network resources. Without CC, congestion in one node may severely degrade overall performance. In this talk we introduce the problem of congestion and how it can be avoided with InfiniBand congestion control.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2011 |
Location of Talk | Contributed talk at the 2011 OpenFabrics International Workshop, Monterey, USA |
Talks, contributed
First Experiences With Congestion Control in InfiniBand Hardware
In Invited talk at the HPC Advisory Council Switzerland Workshop 2010, 2010. Status: Published
In InfiniBand networks congestion control (CC) can be an effective mechanism to achieve high performance and high utilisation of network resources. Without CC, congestion in one node may severely degrade overall performance. In this talk we introduce the problem of congestion and how it can be avoided with InfiniBand congestion control.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2010 |
Location of Talk | Invited talk at the HPC Advisory Council Switzerland Workshop 2010 |
Proceedings, refereed
First Experiences With Congestion Control in InfiniBand Hardware
In 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). IEEE, 2010. Status: Published
In lossless interconnection networks, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. Without CC, congestion on one link may grow into a congestion tree that can degrade the performance severely. This degradation not only affects contributors to the congestion, but also throttles innocent traffic flows in the network. The InfiniBand standard describes CC functionality for detecting and resolving congestion. The InfiniBand CC concept is rich in the sense that it specifies a set of parameters that can be tuned in order to achieve effective CC. There is, however, limited experience with the InfiniBand CC mechanism. To the best of our knowledge, only a few simulation studies exist. Recently, InfiniBand CC has been implemented in hardware, and in this paper we present the first experiences with such equipment. We show that the implemented InfiniBand CC mechanism effectively resolves congestion and improves fairness by solving the parking lot problem, if the CC parameters are appropriately set. By conducting extensive testing on a selection of the CC parameters, we have explored the parameter space and found a subset of parameter values that leads to efficient CC for our test scenarios. Furthermore, we show that the InfiniBand CC increases the performance of the well known HPC Challenge benchmark in a congested network.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2010 |
Conference Name | 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) |
Pagination | 1-12 |
Publisher | IEEE |
ISBN Number | 978-1-4244-6442-5 |
DOI | 10.1109/IPDPS.2010.5470419 |
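As a rough illustration of the injection throttling that the tunable CC parameters mentioned above control, the sketch below models only the source side: each congestion notification moves an index further into a table of injection delays, and a periodic timer moves it back. The class, its parameter names (modelled loosely on the index, increase and timer parameters of InfiniBand CC) and all values are illustrative assumptions; they should be checked against the InfiniBand specification and are not the configuration used in the paper.

```python
class SourceThrottle:
    """Toy model of source-side injection throttling for a single flow."""

    def __init__(self, delay_table, index_increase=1, index_min=0):
        # delay_table[i] is the delay inserted between packets at index i;
        # entry 0 is normally 0, i.e. no throttling. Values are hypothetical.
        self.delay_table = list(delay_table)
        self.index_increase = index_increase
        self.index_min = index_min
        self.index = index_min

    def on_congestion_notification(self):
        """Throttle harder, saturating at the last table entry."""
        limit = len(self.delay_table) - 1
        self.index = min(self.index + self.index_increase, limit)

    def on_timer_expired(self):
        """Recover one step each time the recovery timer fires."""
        self.index = max(self.index - 1, self.index_min)

    @property
    def injection_delay(self):
        return self.delay_table[self.index]


throttle = SourceThrottle([0, 2, 4, 8], index_increase=2)
throttle.on_congestion_notification()
print(throttle.injection_delay)  # 4
throttle.on_timer_expired()
print(throttle.injection_delay)  # 2
```

Even in this toy model the trade-off is visible: a large increase step or a slow timer reacts strongly to congestion but recovers slowly, which is one reason the parameter space needs to be explored experimentally.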
Talks, contributed
First Experiences With Congestion Control in InfiniBand Hardware
In Invited talk at the Mellanox booth at Supercomputing 2009, Portland, USA, 2009. Status: Published
Invited talk about early experiences with InfiniBand Congestion Control implemented in Mellanox hardware. Held at the Mellanox booth at Supercomputing 2009, Portland, USA.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2009 |
Location of Talk | Invited talk at the Mellanox booth at Supercomputing 2009, Portland, USA |
Proceedings, refereed
Dragon Kill Points: Loot Distribution in Massive Multiplayer Online Role Playing Games
In Proceedings of the 7th ACM SIGCOMM workshop on Network and system support for games (NetGames 2008). Association for Computing Machinery (ACM), 2008. Status: Published
One of the major reasons for playing Massive Multiplayer Online Role Playing Games (MMORPGs) is the possibility to show off your abilities to other players. The rarer your equipment is, the higher the show-off value of your character. Because rare items are hard to find, cooperation between several players is often required. This introduces a conflict between the players, and a way to distribute loot is necessary. We introduce the problem of loot distribution in MMORPGs, and we suggest and give a preliminary evaluation of a new and improved Dragon Kill Points system.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2008 |
Conference Name | Proceedings of the 7th ACM SIGCOMM workshop on Network and system support for games (NetGames 2008) |
Pagination | 100-101 |
Date Published | October |
Publisher | Association for Computing Machinery (ACM) |
ISBN Number | 978-1-60558-132-3 |
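For readers unfamiliar with Dragon Kill Points (DKP), the sketch below shows one common fixed-price variant of the basic idea: players earn points for taking part in a kill and spend them on loot, and the eligible player with the most points wins the item. This is a plain baseline under our own assumptions, not the improved system proposed in the paper; the class name, the tie-breaking rule and the numbers are hypothetical.

```python
class DkpLedger:
    """Minimal fixed-price DKP bookkeeping."""

    def __init__(self):
        self.points = {}

    def award(self, players, amount):
        """Credit every participating player with the same number of points."""
        for player in players:
            self.points[player] = self.points.get(player, 0) + amount

    def distribute(self, item_cost, interested):
        """Give the item to the interested player with the most points.

        Only players who can afford the item are eligible. Ties are broken
        alphabetically so the outcome is deterministic. Returns the winner,
        or None if nobody can afford the item.
        """
        eligible = sorted(
            p for p in interested if self.points.get(p, 0) >= item_cost
        )
        if not eligible:
            return None
        winner = max(eligible, key=lambda p: self.points[p])
        self.points[winner] -= item_cost
        return winner


ledger = DkpLedger()
ledger.award(["ann", "bob"], 10)  # both took part in the first kill
ledger.award(["ann"], 5)          # only ann took part in the second
print(ledger.distribute(8, ["ann", "bob"]))  # ann
print(ledger.distribute(8, ["ann", "bob"]))  # bob
```

After winning, ann drops to 7 points and can no longer afford the second item, so it goes to bob; this spending rule is what spreads loot across the group over time.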