Publications
Journal Article
Detailed Modeling of Heterogeneous and Contention-Constrained Point-to-Point MPI Communication
IEEE Transactions on Parallel and Distributed Systems 34, no. 5 (2023): 1580-1593. Status: Published
The network topology of modern parallel computing systems is inherently heterogeneous, with a variety of latency and bandwidth values. Moreover, contention for the bandwidth can exist on different levels when many processes communicate with each other. Many-pair, point-to-point MPI communication is thus characterized by heterogeneity and contention, even on a cluster of homogeneous multicore CPU nodes. To get a detailed understanding of the individual communication cost per MPI process, we propose a new modeling methodology that incorporates both heterogeneity and contention. First, we improve the standard max-rate model to better quantify the actually achievable bandwidth depending on the number of MPI processes in competition. Then, we make a further extension that models the bandwidth contention in more detail when the competing MPI processes have different numbers of neighbors and non-uniform message sizes. Thereafter, we include more flexibility by considering interactions between intra-socket and inter-socket messaging. Through a series of experiments done on different processor architectures, we show that the new heterogeneous and contention-constrained performance models can adequately explain the individual communication cost associated with each MPI process. The largest test of realistic point-to-point MPI communication involves 8,192 processes and in total 2,744,632 simultaneous messages over 64 dual-socket AMD Epyc Rome compute nodes connected by InfiniBand, for which the overall prediction accuracy achieved is 84%.
Affiliation | Scientific Computing |
Project(s) | Department of High Performance Computing, SparCity: An Optimization and Co-design Framework for Sparse Computation |
Publication Type | Journal Article |
Year of Publication | 2023 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 34 |
Issue | 5 |
Pagination | 1580-1593 |
Date Published | 03/2023 |
Publisher | IEEE |
ISSN | 1045-9219 |
URL | https://ieeexplore.ieee.org/document/10064025 |
DOI | 10.1109/TPDS.2023.3253881 |
Proceedings, refereed
Adaptive Routing in InfiniBand Hardware
In The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing. IEEE, 2022. Status: Published
Interconnection networks are the communication backbone of modern high-performance computing systems and an optimised interconnection network is crucial for the performance and utilisation of the system as a whole. One element of the interconnection network is the routing algorithm, which directly influences how we are able to utilise the physical network topology. InfiniBand is one of the most common network architectures used in high-performance computing and traditionally it only supported static routing. For multi-path networks such as Fat-trees, static routing is inefficient because it can neither balance traffic in real time nor utilise multiple paths efficiently under adversarial traffic. This in turn potentially leads to unnecessary contention and an underutilised network, which has led to numerous proposals on how to avoid this by using adaptive routing. Adaptive routing has recently been introduced in InfiniBand, and in this paper we evaluate to what extent the expected benefits of adaptive routing hold for InfiniBand. Through a set of experiments on HDR InfiniBand equipment we describe the basic behaviour of adaptive routing in InfiniBand, its benefits in Fat-tree topologies and the unfortunate side effects related to unfairness that adaptive routing in general might introduce, including such phenomena as the reverse parking lot problem and congestion spreading.
Affiliation | Communication Systems |
Project(s) | Simula Metropolitan Center for Digital Engineering, Department of High Performance Computing |
Publication Type | Proceedings, refereed |
Year of Publication | 2022 |
Conference Name | The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing |
Pagination | 463-472 |
Publisher | IEEE |
Patent
System and method of computing ethernet routing paths
In US Patent. H04L45/02 ed, 2020. Status: Published
Affiliation | Communication Systems |
Project(s) | Fabriscale |
Publication Type | Patent |
Year of Publication | 2020 |
Published Source | US Patent |
International Patent Classification | H04L45/02 |
International Patent Number | 10855581 |
Application Number | 16/138,366 |
Date Published | 12/2020 |
URL | https://patents.google.com/patent/US20190149461A1 |
Patent
Method of computing balanced routing paths in fat-trees
In US Patent. H04L45/14 ed, 2019. Status: Published
A device and method for providing balanced routing paths in a computational grid including determining a type of topology of the computational grid having a plurality of levels, wherein each level includes a plurality of switches, determining whether the type of topology of the computational grid is a fat-tree, determining whether the fat-tree is odd, determining whether the fat-tree is a regular fat-tree, computing a first set of routing paths for the computational grid based on the determining of whether the fat-tree is odd and is a regular fat-tree, computing a second set of routing paths for the computational grid using a topology agnostic routing technique, and configuring forwarding tables in said switches with the first set of computed routing paths when the topology is determined to be a fat-tree and with the second set of computed routing paths when the topology is determined to not be a fat-tree.
Affiliation | Communication Systems |
Project(s) | Fabriscale |
Publication Type | Patent |
Year of Publication | 2019 |
Published Source | US Patent |
International Patent Classification | H04L45/14 |
International Patent Number | US10425324B2 |
Application Number | 15/679,974 |
Date Published | 09/2019 |
URL | https://patents.google.com/patent/US10425324B2/en?oq=US10425324B2 |
Journal Article
A Self-Adaptive Network for HPC Clouds: Architecture, Framework, and Implementation
IEEE Transactions on Parallel and Distributed Systems 29, no. 12 (2018): 2658-2671. Status: Published
Clouds offer flexible and economically attractive compute and storage solutions for enterprises. However, the effectiveness of cloud computing for high-performance computing (HPC) systems remains questionable. When clouds are deployed on lossless interconnection networks, like InfiniBand (IB), challenges related to load-balancing, low-overhead virtualization, and performance isolation hinder full utilization of the underlying interconnect. Moreover, cloud data centers are highly dynamic environments, which renders the static network reconfigurations typically used in IB systems infeasible. In this paper, we present a framework for a self-adaptive network architecture for HPC clouds based on lossless interconnection networks, demonstrated by means of our implemented IB prototype. Our solution, based on a feedback control and optimization loop, enables the lossless HPC network to dynamically adapt to varying traffic patterns, current resource availability, and workload distributions, in accordance with the service provider-defined policies. Furthermore, we present IBAdapt, a simplified rule-based language for the service providers to specify adaptation strategies used by the framework. Our developed self-adaptive IB network prototype is demonstrated using state-of-the-art industry software. The results obtained on a test cluster demonstrate the feasibility and effectiveness of the framework when it comes to improving Quality-of-Service compliance in HPC clouds.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2018 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 29 |
Issue | 12 |
Pagination | 2658-2671 |
Publisher | IEEE |
DOI | 10.1109/TPDS.2018.2842224 |
Efficient Routing and Reconfiguration in Virtualized HPC Environments with vSwitch-enabled Lossless Networks
Concurrency and Computation: Practice and Experience 31, no. 2 (2018). Status: Published
To meet the demands of communication-intensive workloads in the cloud, virtual machines (VMs) should utilize low overhead network communication paradigms. In general, such paradigms enable VMs to directly communicate with the hardware by means of a passthrough technology like Single-Root I/O Virtualization (SR-IOV). However, when passthrough-based virtualization is coupled with lossless interconnection networks, live-migrations introduce scalability challenges due to the substantial network reconfiguration overhead. With these challenges in mind we proposed a virtual switch (vSwitch) SR-IOV architecture for InfiniBand in (33). In this paper, we first suggest solutions to rectify the space-domain scalability issues that are present in vSwitch-enabled subnets as a result of the VMs using dedicated layer-two addresses. Then we discuss routing strategies for virtualized environments using vSwitches, and present a routing algorithm for Fat-Trees. We also present a reconfiguration method that minimizes imposed reconfiguration overhead on Fat-Trees. We perform an extensive evaluation of our prototype algorithms, and as vSwitch-enabled hardware does not yet exist, we deduce from empirical observations by emulating vSwitches with existing hardware, as well as large-scale simulations. Our results show significant reduction in the reconfiguration times as route recalculations can be eliminated, and for certain scenarios, the number of reconfiguration subnet management packets sent to switches is reduced from several hundred thousand down to a single one without degrading the routing quality.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2018 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 31 |
Issue | 2 |
Date Published | 02/2018 |
Publisher | John Wiley & Sons |
Keywords | InfiniBand, Lossless Interconnection Networks, Network Reconfiguration, Network Routing, SR-IOV, Virtualization |
Journal Article
A Fault-Tolerant Routing Strategy for KNS Topologies Based on Intermediate Nodes
Concurrency and Computation: Practice and Experience 29, no. 13 (2017). Status: Published
Exascale computing systems are being built with thousands of nodes. The high number of components of these systems significantly increases the probability of failure. A key component for them is the interconnection network. If failures occur in the interconnection network, they may isolate a large fraction of the machine. For this reason, an efficient fault-tolerant mechanism is needed to keep the system interconnected, even in the presence of faults. A recently proposed topology for these large systems is the hybrid k-ary n-direct s-indirect (KNS) family that provides optimal performance and connectivity at a reduced hardware cost. This paper presents a fault-tolerant routing methodology for the KNS topology that degrades performance gracefully in the presence of faults and tolerates a large number of faults without disabling any healthy computing node. In order to tolerate network failures, the methodology uses a simple mechanism. For any source-destination pair, if necessary, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network) with the aim of circumventing faults. The evaluation results show that the proposed methodology tolerates a large number of faults. For instance, it is able to tolerate more than 99.5% of fault combinations when there are ten faults in a 3-D network with 1,000 nodes using only one intermediate node and more than 99.98% if two intermediate nodes are used. Furthermore, the methodology offers graceful performance degradation. As an example, performance degrades by only 1% for a 2-D network with 1,024 nodes and 1% faulty links.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Journal Article |
Year of Publication | 2017 |
Journal | Concurrency and Computation: Practice and Experience |
Volume | 29 |
Issue | 13 |
Publisher | John Wiley & Sons, Ltd. |
Keywords | exascale computing, fault-tolerant routing, hybrid topology, KNS topology |
DOI | 10.1002/cpe.4065 |
Mobile Edge Computing: A Survey
IEEE Internet of Things Journal 5, no. 1 (2017). Status: Published
Mobile Edge Computing (MEC) is an emergent architecture where cloud computing services are extended to the edge of networks leveraging mobile base stations. As a promising edge technology, it can be applied to mobile, wireless and wireline scenarios, using software and hardware platforms located at the network edge in the vicinity of end-users. MEC provides seamless integration of multiple application service providers and vendors towards mobile subscribers, enterprises and other vertical segments. It is an important component in the 5G architecture, which supports a variety of innovative applications and services where ultra-low latency is required. This paper aims to present a comprehensive survey of relevant research and technological developments in the area of MEC. It provides the definition of MEC, its advantages, architectures, and application areas, highlighting in particular related research and future directions. Finally, security and privacy issues and related existing solutions are also discussed.
Affiliation | Communication Systems |
Project(s) | CROWN: Cross-layer Research on Green Cooperative Cognitive Radio Networks and Services |
Publication Type | Journal Article |
Year of Publication | 2017 |
Journal | IEEE Internet of Things Journal |
Volume | 5 |
Issue | 1 |
Date Published | 02/2018 |
Publisher | ACM IEEE |
Keywords | Fog Computing, IoT, Mobile cloud computing, Mobile edge computing |
DOI | 10.1109/JIOT.2017.2750180 |
Proceedings, refereed
A New Fault-Tolerant Routing Methodology for KNS Topologies
In 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB). IEEE, 2016. Status: Published
Exascale computing systems are being built with thousands of nodes. A key component of these systems is the interconnection network. The high number of components significantly increases the probability of failure. If failures occur in the interconnection network, they may isolate a large fraction of the machine. For this reason, an efficient fault-tolerant mechanism is needed to keep the system interconnected, even in the presence of faults. A topology recently proposed for these large systems is the hybrid KNS family that provides good performance and connectivity at a reduced hardware cost. This paper presents a fault-tolerant routing methodology for the KNS topology that degrades performance gracefully in the presence of faults and tolerates a reasonably large number of faults without disabling any healthy node. In order to tolerate network failures, the methodology uses a simple mechanism: for some source-destination pairs, only if necessary, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network), which allows faults to be avoided. The evaluation results show that the methodology tolerates a large number of faults. Furthermore, the methodology offers graceful performance degradation. For instance, performance degrades by only 1% for a 2-D network with 1,024 nodes and 1% faulty links.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era (HiPINEB) |
Pagination | 1-8 |
Date Published | 03/2016 |
Publisher | IEEE |
ISBN Number | 978-1-5090-2121-5 |
DOI | 10.1109/HIPINEB.2016.9 |
Fast Hybrid Network Reconfiguration for Large-Scale Lossless Interconnection Networks
In 15th IEEE International Symposium on Network Computing and Applications (NCA 2016). IEEE, 2016. Status: Published
Reconfiguration of high-performance lossless interconnection networks is a cumbersome and time-consuming task. For that reason, reconfiguration in large networks is typically limited to situations where it is absolutely necessary, for instance when severe faults occur. In contrast, due to the shared and dynamic nature of modern cloud infrastructures, performance-driven reconfigurations are necessary to ensure efficient utilization of resources. In this work we present a scheme that allows for fast reconfigurations by limiting the task to sub-parts of the network that can benefit from a local reconfiguration. Moreover, our method is able to use different routing algorithms for different sub-parts within the same subnet. We also present a Fat-Tree routing algorithm that reconfigures a network given a user-provided node ordering. Hardware experiments and large-scale simulation results show that we are able to significantly reduce reconfiguration times, by 50% to as much as 98.7% for very large topologies, while improving performance.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | 15th IEEE International Symposium on Network Computing and Applications (NCA 2016) |
Pagination | 101-108 |
Publisher | IEEE |
Keywords | Cloud computing, Fat-Tree, HPC, InfiniBand, Network Reconfiguration, scalability |
Realizing a Self-Adaptive Network Architecture for HPC Clouds
In The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase, 2016. Status: Published
Clouds offer significant advantages over traditional cluster computing architectures including ease of deployment, rapid elasticity, and an economically attractive pay-as-you-go business model. However, the effectiveness of cloud computing for HPC systems remains questionable. When clouds are deployed on lossless interconnection networks, challenges related to load balancing, low-overhead virtualization, and performance isolation hinder full utilization of the underlying interconnect. In this work, we attack these challenges and propose a novel holistic framework of a self-adaptive IB subnet for HPC clouds. Our solution consists of a feedback control loop that effectively incorporates optimizations based on the multidimensional objective function using current resource configuration and provider-defined policies. We build our system using a bottom-up approach, starting by prototyping solutions that tackle the associated individual research challenges, and later combining our novel solutions into a working self-adaptive cloud prototype. All our results are demonstrated using state-of-the-art industry software to enable easy integration into running systems.
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Proceedings, refereed |
Year of Publication | 2016 |
Conference Name | The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16) Doctoral Showcase |
Talks, invited
About Management of Exascale Systems
In ExaComm 2016, Frankfurt, 2016. Status: Published
Affiliation | Communication Systems |
Project(s) | ERAC: Efficient and Robust Architecture for the Big Data Cloud |
Publication Type | Talks, invited |
Year of Publication | 2016 |
Location of Talk | ExaComm 2016, Frankfurt |
Type of Talk | Invited talk |
Journal Article
Compact Network Reconfiguration in Fat-Trees
The Journal of Supercomputing 72, no. 12 (2016): 4438-4467. Status: Published
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2016 |
Journal | The Journal of Supercomputing |
Volume | 72 |
Issue | 12 |
Pagination | 4438–4467 |
Publisher | Springer |
Efficient Network Isolation and Load Balancing in Multi-Tenant HPC Clusters
Journal of Future Generation Computer Systems (2016). Status: Published
Affiliation | Communication Systems |
Project(s) | No Simula project |
Publication Type | Journal Article |
Year of Publication | 2016 |
Journal | Journal of Future Generation Computer Systems |
Date Published | 04/2016 |
Publisher | Elsevier |
DOI | 10.1016/j.future.2016.04.003 |
Proceedings, refereed
A Novel Query Caching Scheme for Dynamic InfiniBand Subnets
In 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). ACM/IEEE, 2015. Status: Published
In large InfiniBand subnets the Subnet Manager (SM) is a potential bottleneck. When an InfiniBand subnet grows in size, the number of paths between hosts increases polynomially and the SM may not be able to serve the network in a timely manner when many concurrent path resolution requests are received. This scalability challenge is further amplified in a dynamic virtualized cloud environment. When a Virtual Machine (VM) with InfiniBand interconnect live migrates, the VM addresses change. These address changes result in additional load to the SM as communicating peers send Subnet Administration (SA) path record queries to the SM to resolve new path characteristics.
In this paper we benchmark OpenSM to empirically demonstrate the SM scalability problems. Then we show that our novel SA Path Record Query caching scheme significantly reduces the load towards the SM. In particular, we show by using the Reliable Datagram Socket protocol that only a single initial SA path query is needed per communicating peer, independent of any subsequent (re)connection attempts.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) |
Publisher | ACM/IEEE |
A weighted fat-tree routing algorithm for efficient load-balancing in InfiniBand enterprise clusters
In Proceedings of the 23rd Euromicro International Conference on Parallel, Distributed Network-based Processing (PDP 2015). Turku, Finland: IEEE, 2015. Status: Published
InfiniBand (IB) has become a popular network interconnect for high-performance computing (HPC) systems. Many of the large IB-based HPC systems use some variant of the fat-tree topology to take advantage of the useful properties fat-trees offer. The fat-tree routing algorithm is one of the most efficient deterministic routing algorithms for fat-tree topologies. The algorithm ensures that the number of routes assigned to each link is balanced across the fabric. However, one problem with its load-balancing technique is that it assumes uniform traffic distribution in the network. When routes towards nodes that mainly consume large amounts of data are assigned to share links in the fabric while alternative links are underutilized, sub-optimal network throughput is obtained. Also, as the fat-tree algorithm routes nodes according to the indexing order, the performance may differ for two systems cabled in exactly the same way.
In this paper, we propose wFatTree, a novel fat-tree routing algorithm, which considers node traffic characteristics to balance load across the network links more evenly, and with predictable network performance. Our experiments and simulations show an improvement of up to 60% in total network throughput on large fat-tree installations when using wFatTree routing. Furthermore, wFatTree can also be used to prioritize traffic flowing towards the critical nodes in the network.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | Proceedings of the 23rd Euromicro International Conference on Parallel, Distributed Network-based Processing (PDP 2015) |
Pagination | 35-42 |
Date Published | 03/2015 |
Publisher | IEEE |
Place Published | Turku, Finland |
ISSN Number | 1066-6192 |
Accession Number | 15090056 |
Keywords | fat-tree networks, InfiniBand, Load-balancing, Routing algorithms |
DOI | 10.1109/PDP.2015.111 |
Partition-aware routing to improve network isolation in InfiniBand based multi-tenant clusters
In 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). Shenzhen, China: ACM/IEEE, 2015. Status: Published
InfiniBand (IB) is a widely used network interconnect for modern high-performance computing systems. In large IB fabrics, network isolation is provided through partitioning. However, routing is oblivious to the partitions in the network. Hence, physical links share flows from different partitions. This sharing of the intermediate links creates interference, which is particularly critical to avoid in multi-tenant environments, like cloud computing. In such systems, each tenant needs predictable network performance, unaffected by the workload of the other tenants. In addition, under the current routing schemes, links connecting nodes outside partitions are routed the same way as the other functional links, even though they are never used. This may result in degraded load-balancing.
In this paper, we present an implementation of a partition-aware fat-tree routing algorithm, pFTree. The pFTree utilizes a multifold mechanism to provide performance isolation among partitions belonging to the different tenant groups. Given the available network resources, pFTree starts isolating partitions at the physical link level, and then it moves on to utilize virtual lanes when needed. Our experiments and simulations show that pFTree is able to significantly reduce the effect of inter-partition interference without any additional functional overhead. Furthermore, pFTree also provides improved load-balancing over the state-of-the-art fat-tree routing algorithm.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) |
Pagination | 189-198 |
Date Published | 07/2015 |
Publisher | ACM/IEEE |
Place Published | Shenzhen, China |
ISBN Number | 978-1-4799-8006-2 |
DOI | 10.1109/CCGrid.2015.96 |
SlimUpdate: Minimal Routing Update for Performance-based Reconfigurations in Fat-Trees
In 1st IEEE International Workshop on High-Performance Interconnection Networks Towards the Exascale and Big-Data Era (HiPINEB 2015). IEEE Computer Society, 2015. Status: Published
As the size of high-performance computing systems grows, the number of events requiring a network reconfiguration, as well as the complexity of each reconfiguration, is likely to increase. In large systems, the probability of component failure is high. At the same time, with more network components, ensuring high utilization of network resources becomes challenging. Reconfiguration in interconnection networks, like InfiniBand (IB), typically involves computation and distribution of a new set of routes in order to maintain connectivity and performance. In general, current routing algorithms do not consider the existing routes in a network when calculating new ones. Such configuration-oblivious routing might result in substantial modifications to the existing paths, and the reconfiguration becomes costly as it potentially involves a large number of source-destination pairs.
In this paper, we propose a novel routing algorithm for IB based fat-tree topologies, SlimUpdate. SlimUpdate employs techniques to preserve existing forwarding entries in switches to ensure a minimal routing update, without any performance penalty, and with minimal computational overhead. We present an implementation of SlimUpdate in OpenSM, and compare it with the current de facto fat-tree routing algorithm. Our experiments and simulations show a decrease of up to 80% in the number of total path modifications when using SlimUpdate routing, while achieving similar or even better performance than the fat-tree routing in most reconfiguration scenarios.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | 1st IEEE International Workshop on High-Performance Interconnection Networks Towards the Exascale and Big-Data Era (HiPINEB 2015) |
Pagination | 849-856 |
Date Published | 10/2015 |
Publisher | IEEE Computer Society |
ISBN Number | 978-1-4673-6598-7 |
Accession Number | 15570970 |
DOI | 10.1109/CLUSTER.2015.142 |
Towards the InfiniBand SR-IOV vSwitch Architecture
In IEEE Cluster 2015. IEEE Cluster 2015: IEEE, 2015. Status: Published
To meet the demands of the Exascale era and facilitate Big Data analytics in the cloud while maintaining flexibility, cloud providers will have to offer efficient virtualized High Performance Computing clusters in a pay-as-you-go model. As a consequence, high-performance network interconnect solutions, like InfiniBand (IB), will be beneficial. Currently, the only way to provide IB connectivity on Virtual Machines (VMs) is by utilizing direct device assignment. At the same time, to be scalable, Single-Root I/O Virtualization (SR-IOV) is used. However, the current SR-IOV model employed by IB adapters is a Shared Port implementation with limited flexibility, as it does not allow transparent virtualization and live-migration of VMs.
In this paper, we explore an alternative SR-IOV model for IB, the virtual switch (vSwitch), and propose and analyze two vSwitch implementations with different scalability characteristics. Furthermore, as network reconfiguration time is critical to make live-migration a practical option, we accompany our proposed architecture with a scalable and topology agnostic dynamic reconfiguration method, implemented and tested using OpenSM. Our results show that we are able to significantly reduce the reconfiguration time as route recalculations are no longer needed, and in large IB subnets, for certain scenarios, the number of reconfiguration subnet management packets (SMPs) sent is reduced from several hundred thousand down to a single one.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2015 |
Conference Name | IEEE Cluster 2015 |
Date Published | 09/2015 |
Publisher | IEEE |
Place Published | IEEE Cluster 2015 |
Keywords | Dynamic Network Reconfiguration, InfiniBand, Live Migration, SR-IOV, Virtualization |
Public outreach
A Self-adaptive network architecture for InfiniBand based HPC clouds
In Talk at 7th Cloud Control Workshop. Nässlingen, Sweden: 7th Cloud Control Workshop, 2015. Status: Published
A Self-adaptive network architecture for InfiniBand based HPC clouds
The research on network optimization in InfiniBand (IB) networks has evolved in several directions, e.g. increasing network utilization, fault-tolerance, congestion control, and energy-aware systems. However, for efficient HPC clouds based on IB, the optimization problem becomes both complex and multi-dimensional, while individually proposed solutions often yield contradictory management decisions. We believe that a holistic closed-loop control system is required to effectively incorporate a multidimensional objective function in future IB systems. Based on control theory, a self-adaptive model for the IB subnet system may help achieve better network utilization while effectively keeping user-level SLAs in HPC clouds.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Public outreach |
Year of Publication | 2015 |
Secondary Title | Talk at 7th Cloud Control Workshop |
Publisher | 7th Cloud Control Workshop |
Place Published | Nässlingen, Sweden |
Type of Work | Discussion Session |
Journal Article
Early experiences with live migration of SR-IOV enabled InfiniBand
Journal of Parallel and Distributed Computing 78, no. C (2015): 39-52. Status: Published
Early experiences with live migration of SR-IOV enabled InfiniBand
Virtualization is the key to efficient resource utilization and elastic resource allocation in cloud computing. It enables consolidation, the on-demand provisioning of resources, and elasticity through live migration. Live migration makes it possible to optimize resource usage by moving virtual machines (VMs) between physical servers in an application transparent manner. It does, however, require a flexible, high-performance, scalable virtualized I/O architecture to reach its full potential. This is challenging to achieve with high-speed networks such as InfiniBand and remote direct memory access enhanced Ethernet, because these devices usually maintain their connection state in the network device hardware. Fortunately, the single root IO virtualization (SR-IOV) specification addresses the performance and scalability issues. With SR-IOV, each VM has direct access to a hardware assisted virtual device without the overhead introduced by emulation or para-virtualization. However, SR-IOV does not address the migration of the network device state. In this paper we present and evaluate the first available prototype implementation of live migration over SR-IOV enabled InfiniBand devices.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2015 |
Journal | Journal of Parallel and Distributed Computing |
Volume | 78 |
Issue | C |
Pagination | 39-52 |
Date Published | 04/2015 |
Publisher | Elsevier |
Keywords | Architecture, IO virtualization, SR-IOV, VM migration |
DOI | 10.1016/j.jpdc.2015.01.004 |
Efficient and Cost-Effective Hybrid Congestion Control for HPC Interconnection Networks
IEEE Transactions on Parallel and Distributed Systems 26, no. 1 (2015): 107-119. Status: Published
Efficient and Cost-Effective Hybrid Congestion Control for HPC Interconnection Networks
Interconnection networks are key components in High-Performance Computing (HPC) systems, and their performance strongly influences that of the overall system. However, at high load, congestion and its negative effects (e.g. head-of-line blocking) threaten the performance of the network, and thus that of the entire system. Congestion control (CC) is crucial to ensure efficient utilization of the interconnection network during congestion situations. As one major trend is to reduce the effective wiring in interconnection networks to reduce cost and power consumption, the network will operate very close to its capacity. Thus, congestion control becomes essential. Existing CC techniques can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. However, both approaches have different, but non-overlapping weaknesses: injection throttling techniques react slowly to congestion, while isolating traffic in special resources may lead the system to run out of those resources. In this paper we propose EcoCC, a new Efficient and Cost-Effective CC technique that combines injection throttling and congested-flow isolation to minimize their respective drawbacks and maximize overall system performance. This new strategy is suitable for current commercial switch architectures, where it could be implemented without requiring significant complexity. Experimental results, using simulations under synthetic and real trace-based traffic patterns, show that this technique improves by up to 55% over some of the most successful congestion control techniques.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2015 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 26 |
Issue | 1 |
Pagination | 107-119 |
Publisher | IEEE |
DOI | 10.1109/TPDS.2014.2307851 |
Journal Article
A New Proposal to Deal With Congestion in InfiniBand-Based Fat-Trees
Journal of Parallel and Distributed Computing 74 (2014): 1802-1819. Status: Published
A New Proposal to Deal With Congestion in InfiniBand-Based Fat-Trees
The overall performance of High-Performance Computing applications may depend largely on the performance achieved by the network interconnecting the end-nodes; thus, high-speed interconnect technologies like InfiniBand are used to provide high throughput and low latency. Nevertheless, network performance may be degraded due to congestion, so using techniques to deal with the problems derived from congestion has become practically mandatory. In this paper we propose a straightforward congestion-management method suitable for fat-tree topologies built from InfiniBand components. Our proposal is based on a traffic-flow-to-service-level mapping that prevents, as much as possible with the resources available in current InfiniBand components (basically Virtual Lanes), the negative impact of the two most common problems derived from congestion: head-of-line blocking and buffer-hogging. We also provide a mathematical approach to analyze the efficiency of our proposal and several others, by means of a set of analytical metrics. In certain traffic scenarios, we observe up to 68% of the ideal performance gain that could be achieved in HoL-blocking and buffer-hogging prevention.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2014 |
Journal | Journal of Parallel and Distributed Computing |
Volume | 74 |
Number | 1 |
Pagination | 1802-1819 |
Date Published | January |
Publisher | Elsevier |
Keywords | Congestion management, Fat-trees, High-performance computing, InfiniBand, Interconnection networks |
DOI | 10.1016/j.jpdc.2013.09 |
Talks, invited
Efficient and Robust Architecture for Big Data Cloud
In Hemavan, Sweden, 2014. Status: Published
Efficient and Robust Architecture for Big Data Cloud
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, invited |
Year of Publication | 2014 |
Location of Talk | Hemavan, Sweden |
Book Chapter
Switched Ethernet in Automation
In Industrial Communication Technology Handbook, 16:1-15. Second Edition. Boca Raton, FL: CRC Press, 2014. Status: Published
Switched Ethernet in Automation
Ethernet is now taking over as the preferred communication technology in industrial automation. Fieldbuses and proprietary solutions are rapidly being replaced by Ethernet based solutions with superior performance, robustness, flexibility and functionality. This transition means that automation systems are turning into complex communication networks. The main benefits of going for Ethernet for the manufacturers and end users are the wide availability of software, hardware and various tools, the attractive price/performance ratio and a vast potential for replacing costly manual intervention with smart network behaviour throughout the lifecycle of a system. This chapter will look at some key aspects of Ethernet as an automation network.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Book Chapter |
Year of Publication | 2014 |
Book Title | Industrial Communication Technology Handbook |
Edition | Second Edition |
Chapter | 16 |
Pagination | 16:1-15 |
Date Published | 08/2014 |
Publisher | CRC Press |
Place Published | Boca Raton, FL |
ISBN Number | 9781482207323 |
Journal Article
An Efficient, Low-Cost Routing Framework for Convex Mesh Partitions to Support Virtualisation
ACM Transactions on Embedded Computing Systems 12 (2013). Status: Published
An Efficient, Low-Cost Routing Framework for Convex Mesh Partitions to Support Virtualisation
At the core of an efficient chip multiprocessor (CMP) is support for unicast and multicast routing, low implementation costs, and the ability to isolate concurrent applications with maximum utilization of the CMP. We present an efficient logic-based unicast and multicast routing algorithm that guarantees isolation of local application traffic within any near-convex region on the chip, and the algorithms to recognize supported partitions and configure the cores accordingly. Evaluations show that the routing algorithm has a 57% more compact implementation than a recent multicast solution with the same coverage, and it achieves 5% higher throughput with 13% lower latency.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2013 |
Journal | ACM Transactions on Embedded Computing Systems |
Volume | 12 |
Number | 4 |
Date Published | June |
Publisher | ACM |
Enabling Power Efficiency Through Dynamic Rerouting On-Chip
ACM Transactions on Embedded Computing Systems 12 (2013). Status: Published
Enabling Power Efficiency Through Dynamic Rerouting On-Chip
Networks-on-chip (NoCs) are key components in many-core chip designs. Dynamic power-awareness is a new challenge present in NoCs that must be efficiently handled by the routing functionality as it introduces irregularities in the commonly used 2-D meshes. In this article, we propose a logic-based routing algorithm, iFDOR, oriented towards dynamically powering down one region within every application partition on the chip through dynamic rerouting, with low implementation costs. Results show that we can successfully shut down an arbitrary rectangular region within an application partition without significant impact on network performance.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2013 |
Journal | ACM Transactions on Embedded Computing Systems |
Volume | 12 |
Number | 4 |
Date Published | June |
Publisher | ACM |
Talks, contributed
Prototyping Live Migration With SR-IOV Supported InfiniBand HCAs
In HPC Advisory Council Spain Conference, 2013. Status: Published
Prototyping Live Migration With SR-IOV Supported InfiniBand HCAs
Live migration is challenging to achieve with high-speed networks because these devices usually maintain their connection state in the network device hardware. In this work we i) describe the challenges with live migration over SR-IOV enabled InfiniBand devices, and ii) present and evaluate the first available prototype implementation of live migration over SR-IOV enabled InfiniBand devices.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2013 |
Location of Talk | HPC Advisory Council Spain Conference |
Keywords | Conference |
Book Chapter
Switched Ethernet in Automation
In Industrial Communication Technology Handbook, 49.1-49.15. 2nd ed. CRC Press, 2013. Status: Published
Switched Ethernet in Automation
Ethernet is now taking over as the preferred communication technology in industrial automation. Fieldbuses and proprietary solutions are rapidly being replaced by Ethernet based solutions with superior performance, robustness, flexibility and functionality. This transition means that automation systems are turning into complex communication networks. The main benefits of going for Ethernet for the manufacturers and end users are the wide availability of software, hardware and various tools, the attractive price/performance ratio and a vast potential for replacing costly manual intervention with smart network behaviour throughout the lifecycle of a system.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Book Chapter |
Year of Publication | 2013 |
Book Title | Industrial Communication Technology Handbook |
Edition | 2 |
Chapter | 49 |
Pagination | 49.1-49.15 |
Publisher | CRC Press |
Proceedings, refereed
A Scalable Signalling Mechanism for VM Migration With SR-IOV Over InfiniBand
In 18th IEEE International Conference on Parallel and Distributed Systems (ICPADS). IEEE Computer Society, 2012. Status: Published
A Scalable Signalling Mechanism for VM Migration With SR-IOV Over InfiniBand
Single Root I/O Virtualization (SR-IOV) is a promising I/O virtualization approach for achieving high performance in virtualization over InfiniBand (IB) networks. One challenge is related to the hardware address assignment for each virtual IB device. There are two schemes for the hardware address assignment: static assignment and dynamic assignment. Static assignment always preserves the hardware address of a virtual IB device that is attached to a VM, but dynamic assignment does not. A drawback of static assignment, however, is that its communication will be disconnected after VM migration. In this paper, we point out the problem related to SR-IOV over IB that breaks the network connections after VM migration when static assignment is deployed. Then, we propose a signalling mechanism that can maintain the network connectivity after VM migration. The performance evaluation using an experimental test bed shows that the proposed signalling mechanism does not increase the service downtime during hot migration. We also optimize the signalling method, where the same event can only be forwarded to a physical server once regardless of the hosted VMs, to reduce the management message overhead from O(n * m) to O(n).
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2012 |
Conference Name | 18th IEEE International Conference on Parallel and Distributed Systems (ICPADS) |
Pagination | 384-391 |
Publisher | IEEE Computer Society |
ISBN Number | 978-0-7695-4903-3 |
Keywords | Conference |
Cost-Effective Contention Avoidance in a CMP With Shared Memory Controllers
In Proceedings of Euro-Par 2012. Springer Berlin Heidelberg, 2012. Status: Published
Cost-Effective Contention Avoidance in a CMP With Shared Memory Controllers
Efficient CMP utilisation requires virtualisation. This forces multiple applications to contend for the same network resources and memory bandwidth. In this paper we study the cause and effect of network congestion with respect to traffic local to the applications, and traffic caused by memory access. This reveals that applications close to the memory controller suffer because of congestion caused by memory controller traffic from other applications. We present a simple mechanism to reduce head-of-line blocking in the switches, which efficiently reduces network congestion, increases network performance, and evens out the performance differences between the CMP applications.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2012 |
Conference Name | Proceedings of Euro-Par 2012 |
Pagination | 741-752 |
Date Published | May |
Publisher | Springer Berlin Heidelberg |
Keywords | Conference |
Exploring the Scope of the InfiniBand Congestion Control Mechanism
In 2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). IEEE Computer Society, 2012. Status: Published
Exploring the Scope of the InfiniBand Congestion Control Mechanism
In a lossless interconnection network, network congestion needs to be detected and resolved to ensure high performance and good utilization of network resources at high network load. If no countermeasure is taken, congestion at a node in the network will stimulate the growth of a congestion tree that not only affects contributors to congestion, but also other traffic flows in the network. Left untouched, the congestion tree will block traffic flows, lead to underutilization of network resources and result in a severe drop in network performance. The InfiniBand standard specifies a congestion control (CC) mechanism to detect and resolve congestion before a congestion tree is able to grow and, by that, hamper the network performance. The InfiniBand CC mechanism includes a rich set of parameters that can be tuned in order to achieve effective CC. Even though it has been shown that the CC mechanism, properly tuned, is able to improve both throughput and fairness in an interconnection network, it has been questioned whether the mechanism is fast enough to keep up with dynamic network traffic, and whether a given set of parameter values for a topology is robust when it comes to different traffic patterns, or whether the parameters need to be tuned depending on the applications in use. In this paper we address both these questions. Using the three-stage fat-tree topology from the Sun Datacenter InfiniBand Switch 648 as a basis, and a simulator tuned against CC capable InfiniBand hardware, we conduct a systematic study of the efficiency of the InfiniBand CC mechanism as the network traffic becomes increasingly dynamic. Our studies show that the InfiniBand CC, even when using a single set of parameter values, performs very well as the traffic patterns become increasingly dynamic, outperforming a network without CC in all cases. Our results show throughput increases varying from a few percent to a seventeen-fold increase.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2012 |
Conference Name | 2012 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) |
Pagination | 1131-1143 |
Publisher | IEEE Computer Society |
DOI | 10.1109/IPDPS.2012.104 |
Journal Article
A Survey and Evaluation of Topology Agnostic Routing Algorithms
IEEE Transactions on Parallel and Distributed Systems 23 (2012): 405-425. Status: Published
A Survey and Evaluation of Topology Agnostic Routing Algorithms
Most standard interconnect technologies that have emerged over the last decade, mainly for clusters, are flexible with respect to network topology. This has spawned a substantial amount of research on topology agnostic routing algorithms, and some of the developed algorithms are widely deployed. These algorithms make no assumption about the structure of the network, thus they provide the flexibility needed to route on a network in the presence of faulty components. On the other hand, the advent of massive multi-core chips challenges the state of the art in interconnection networks. The topology of choice for point-to-point Networks-on-Chip (NoC) is the two-dimensional mesh, which is well understood, but the need for high yield from the production line requires that such chips be functional even if some components suffer from manufacturing defects. Such defects will turn the regular topology into a slightly irregular one, making topology dependent routing algorithms inapplicable. Also, heterogeneous multi-core chips, where different cores could have different sizes and shapes, naturally lead to (slightly) irregular topologies. Therefore, topology agnostic routing algorithms proposed for clusters find a new environment to be applied since they provide the flexibility needed to route on a NoC in the presence of faulty or missing components. Moreover, in some cases, they have been shown to perform comparably to special purpose algorithms for regular topologies. Unfortunately, the existing topology agnostic routing algorithms have been developed for varying purposes, giving them different and not always comparable properties. Furthermore, the information on these routing algorithms is scattered across different papers where they have been evaluated under different and not comparable conditions. In this paper, we present a comprehensive overview of the topology agnostic routing algorithms that have been proposed to date.
The algorithms are classified with respect to their most important properties, and their performance under equal, but varying conditions is analyzed. Thereby, we provide significant new insight into the properties of the different routing algorithms, and give substantial support for finding, given a set of requirements, the most suitable algorithm for both the on-chip and off-chip environments.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2012 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 23 |
Number | 3 |
Pagination | 405-425 |
Date Published | March |
Publisher | IEEE |
Talks, contributed
Allocating Irregular Partitions in Mesh-Based On-Chip Networks
OMHI 2012 Workshop at Euro-Par: unknown, 2012. Status: Published
Allocating Irregular Partitions in Mesh-Based On-Chip Networks
Modern CMPs require sophisticated resource management in order to provide good utilisation of the chip resources. There exist good allocation algorithms for compute clusters, but these are restricted to specific routing algorithms and not easily transferable to the on-chip domain. We present a novel resource allocation algorithm, TSB, that allows partitions with any shape that is supported by an algorithm implemented using LBDR/uLBDR, and show that this has low complexity and comparable utilisation to UDFlex.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2012 |
Publisher | unknown |
Place Published | OMHI 2012 Workshop at Euro-Par |
Fat-Trees and Dragonflies - a Perspective on Topologies
In Contributed talk at the HPC Advisory Council Switzerland Workshop, Lugano, Switzerland, 2012. Status: Published
Fat-Trees and Dragonflies - a Perspective on Topologies
One of the foundations of any HPC cluster is the network topology. The choice of topology is dictated by properties such as cost, network technology, and target applications. For InfiniBand (IB) the Fat-tree is the dominating topology, but recently we have also seen IB clusters based on a 3D Torus. From time to time new topologies are invented, and in this talk we will have a closer look at the recently proposed Dragonfly topology and compare it with the well-established Fat-tree topology.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2012 |
Location of Talk | Contributed talk at the HPC Advisory Council Switzerland Workshop, Lugano, Switzerland. |
Keywords | Workshop |
Technical reports
Appendix to Enabling Power Efficiency Through Dynamic Rerouting On-Chip
Simula Research Laboratory, 2011. Status: Published
Appendix to Enabling Power Efficiency Through Dynamic Rerouting On-Chip
Affiliation | Communication Systems, Communication Systems |
Publication Type | Technical reports |
Year of Publication | 2011 |
Date Published | September |
Publisher | Simula Research Laboratory |
Proceedings, refereed
Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks
In International Conference on Parallel Processing, ICPP 2011. IEEE Computer Society, 2011. Status: Published
Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks
Existing congestion control mechanisms in interconnects can be divided into two general approaches. One is to throttle traffic injection at the sources that contribute to congestion, and the other is to isolate the congested traffic in specially designated resources. These two approaches have different, but non-overlapping weaknesses. In this paper we present in detail a method that combines injection throttling and congested-flow isolation. Through simulation studies we first demonstrate the respective flaws of the injection throttling and of flow isolation. Thereafter we show that our combined method extracts the best of both approaches in the sense that it gives fast reaction to congestion, it is scalable and it has good fairness properties with respect to the congested flows.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | International Conference on Parallel Processing, ICPP 2011 |
Pagination | 662-672 |
Date Published | September |
Publisher | IEEE Computer Society |
ISBN Number | 978-1-4577-1336-1 |
DOI | 10.1109/ICPP.2011.80 |
DFtree - a Fat-Tree Routing Algorithm Using Dynamic Allocation of Virtual Lanes to Alleviate Congestion in InfiniBand Networks
In The Network-Aware Data Management Workshop to be held in conjunction with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC'11). ACM, 2011. Status: Published
DFtree - a Fat-Tree Routing Algorithm Using Dynamic Allocation of Virtual Lanes to Alleviate Congestion in InfiniBand Networks
End-point hotspots can cause major slowdowns in interconnection networks due to head-of-line blocking and congestion. Therefore, avoiding congestion is important to ensure high performance for the network traffic. It is especially important in situations where permanent congestion, which results in permanent slowdown, can occur. Permanent congestion occurs when traffic has been moved away from a failed link, when multiple jobs run on the same system and compete for network resources, or when a system is not balanced for the application that runs on it. In this paper we suggest a mechanism for dynamic allocation of virtual lanes and live optimisation of the distribution of flows between the allocated virtual lanes. The purpose is to alleviate the negative effect of permanent congestion by separating network flows into slow lane and fast lane traffic. Flows destined for an end-point hot-spot are placed in the slow lane and all other flows are placed in the fast lane. Consequently, the flows in the fast lane are unaffected by the head-of-line blocking created by the hot-spot traffic. We demonstrate the feasibility of this approach using a modified version of OFED and OpenSM with fat-tree routing on a small InfiniBand cluster. Our experiments show an increase in throughput ranging from 150% to 468% compared to the conventional fat-tree algorithm in OFED.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | The Network-Aware Data Management Workshop to be held in conjunction with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC'11) |
Pagination | 1-10 |
Date Published | November |
Publisher | ACM |
ISBN Number | 978-1-4503-1132-8 |
Efficient and Contention-Free Virtualisation of Fat-Trees
In IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum. Washington DC, USA: IEEE Computer Society Washington, 2011. Status: Published
Efficient and Contention-Free Virtualisation of Fat-Trees
Maintaining high system utilisation is a key factor for data centres. However, strictly partitioning the data centre resources to fully isolate the concurrent applications (contention freedom) leads to poor system utilisation because of fragmentation. We present an allocation algorithm for fat-trees (which are commonly found in large-scale data centres) capable of increasing system utilisation while maintaining application isolation. Results show at least a 10% increase in system utilisation compared to regular contention-free allocation mechanisms, at the cost of a slight reduction in network performance or application isolation.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum |
Pagination | 754-760 |
Date Published | September |
Publisher | IEEE Computer Society Washington |
Place Published | Washington DC, USA |
ISBN Number | 978-0-7695-4577-6 |
IFDOR - Dynamic Rerouting On-Chip
In Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip. New York, USA: ACM, 2011. Status: Published
IFDOR - Dynamic Rerouting On-Chip
Many-core chip design requires flexible routing solutions for the interconnect to handle faults, provide performance partitions, and react to dynamic changes in processing requirements and power/heat distribution. We have developed a logic based rerouting mechanism suitable for tolerating dynamic powering down of regions within the application partition on the chip. This mechanism is combined with the logic based FDOR routing algorithm to create a powerful routing algorithm with low implementation cost. This allows for higher system utilisation through enabling more efficient power management as well as supporting many irregular mesh topologies through flexible virtualisation. Results show that powering down a single switch results in an 8% throughput reduction in the worst case for the evaluated topology.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip |
Pagination | 11-14 |
Date Published | January |
Publisher | ACM |
Place Published | New York, USA |
ISBN Number | 978-1-4503-0272-2 |
On the Relation Between Congestion Control, Switch Arbitration and Fairness
In 11th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2011). IEEE, 2011. Status: Published
On the Relation Between Congestion Control, Switch Arbitration and Fairness
In lossless interconnection networks such as InfiniBand, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. The InfiniBand standard describes CC functionality for detecting and resolving congestion, but the design decisions on how to implement this functionallity is left to the hardware designer. One must be cautious when making these design decisions not to introduce fairness problems, as our study shows. In this paper we study the relationship between congestion control, switch arbitration, and fairness. Specifically, we look at fairness among different traffic flows arriving at a hot spot switch on different input ports, as CC is turned on. In addition we study the fairness among traffic flows at a switch where some flows are exclusive users of their input ports while other flows are sharing an input port (the parking lot problem). Our results show that the implementation of congestion control in a switch is vulnerable to unfairness if care is not taken. In detail, we found that a threshold hysteresis of more than one MTU is needed to resolve arbitration unfairness. Furthermore, to fully solve the parking lot problem, proper configuration of the CC parameters are required.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | 11th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid 2011) |
Pagination | 342-351 |
Date Published | May |
Publisher | IEEE |
ISBN Number | 978-1-4577-0129-0 |
DOI | 10.1109/CCGrid.2011.67 |
VFtree - a Fat-Tree Routing Algorithm Using Virtual Lanes to Alleviate Congestion
In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium. IEEE Computer Society Press, 2011. Status: Published
VFtree - a Fat-Tree Routing Algorithm Using Virtual Lanes to Alleviate Congestion
It is a well-known fact that multiple virtual lanes can improve performance in interconnection networks, but this knowledge has had little impact on real clusters. Currently, a large number of clusters using InfiniBand are based on fat-tree topologies that can be routed deadlock-free using only one virtual lane. Consequently, all the remaining virtual lanes are left unused. In this paper we suggest an enhancement to the fat-tree algorithm that utilizes virtual lanes to improve performance when hot-spots are present. Even though the bisection bandwidth in a fat-tree is constant, hot-spots are still possible and they will degrade performance for flows not contributing to them due to head-of-line blocking. Such a situation may be alleviated through adaptive routing or congestion control; however, these methods are not yet readily available in InfiniBand technology. To remedy this problem, we have implemented an enhanced fat-tree algorithm in OpenSM that distributes traffic across all available virtual lanes without any configuration needed. We evaluated the performance of the algorithm on a small cluster and performed a large-scale evaluation through simulations. In a congested environment, results show that we are able to achieve throughput increases of up to 38% on a small cluster and from 221% to 757%, depending on the hot-spot scenario, for a 648-port simulated cluster.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2011 |
Conference Name | Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium |
Pagination | 197-208 |
Publisher | IEEE Computer Society Press |
ISBN Number | 978-1-61284-372-8 |
Journal Article
Dynamic Fault Tolerance in Fat-Trees
IEEE Transactions on Computers 60, no. 4 (2011): 508-525. Status: Published
Dynamic Fault Tolerance in Fat-Trees
Fat-trees are a very common communication architecture in current large-scale parallel computers. The probability of failure increases with the number of components. We present a routing method for deterministically and adaptively routed fat-trees, applicable to both distributed and source routing, that is able to handle several concurrent faults and that transparently returns to the original routing strategy once the faulty components have recovered. The method is local and dynamic, completely masking the fault from the rest of the system. It only requires a small extra functionality in the switches to handle misrouting around a fault. The method guarantees connectedness and deadlock and livelock freedom for up to radix/2 - 1 arbitrary simultaneous faults. Our simulation experiments show a graceful degradation of performance as faults are added. Furthermore, they demonstrate that for most fault combinations, our method will even be able to handle significantly more faults beyond the radix/2 - 1 limit with high probability.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2011 |
Journal | IEEE Transactions on Computers |
Volume | 60 |
Issue | 4 |
Number | 4 |
Pagination | 508-525 |
Date Published | April |
Publisher | IEEE |
DOI | 10.1109/TC.2010.97 |
Talks, contributed
InfiniBand Congestion Control
In Contributed talk at the 2011 OpenFabrics International Workshop, Monterey, USA, 2011. Status: Published
InfiniBand Congestion Control
In InfiniBand networks, congestion control (CC) can be an effective mechanism to achieve high performance and high utilisation of network resources. Without CC, congestion in one node may severely degrade overall performance. In this talk we introduce the problem of congestion and how it can be avoided with InfiniBand congestion control.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2011 |
Location of Talk | Contributed talk at the 2011 OpenFabrics International Workshop, Monterey, USA |
VFtree - a Fat-Tree Routing Algorithm Using Virtual Lanes to Alleviate Congestion
In Invited talk at the HPC Advisory Council Switzerland Workshop 2011, Lugano, Switzerland, 2011. Status: Published
VFtree - a Fat-Tree Routing Algorithm Using Virtual Lanes to Alleviate Congestion
A large number of clusters using InfiniBand are based on fat-tree topologies that can be routed deadlock-free using only one virtual lane. Consequently, all the remaining virtual lanes are left unused. In this talk we present an enhancement to the fat-tree algorithm that utilizes virtual lanes to improve performance when hot-spots are present. In a congested environment, results show that we are able to achieve throughput increases of up to 38% on a small cluster and from 221% to 757%, depending on the hot-spot scenario, for a 648-port simulated cluster.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2011 |
Location of Talk | Invited talk at the HPC Advisory Council Switzerland Workshop 2011, Lugano, Switzerland. |
Proceedings, refereed
Achieving Predictable High Performance in Imbalanced Fat Trees
In Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems. IEEE Computer Society, 2010. Status: Published
Achieving Predictable High Performance in Imbalanced Fat Trees
The fat-tree topology has become a popular choice for InfiniBand fabrics due to its inherent deadlock freedom, fault-tolerance and full bisection bandwidth. InfiniBand is used by more than 40% of the systems on the latest Top 500 list, and many of these systems are based on a fat-tree topology. However, the current InfiniBand fat-tree routing algorithm suffers from flaws that reduce its scalability and flexibility. Counter-intuitively, the achievable throughput per node deteriorates both when the number of nodes in a tree decreases and when the node distribution among leaves is nonuniform. In this paper, we identify the weaknesses of the current enhanced fat-tree routing algorithm in OpenFabrics Enterprise Distribution and we propose extensions to it that alleviate all performance problems related to node distribution. The new algorithm is implemented in OpenSM for real world evaluation and for future contribution to the OpenFabrics community. We demonstrate that our solution achieves a predictable high throughput regardless of the number of nodes and their distribution. Furthermore, the simulations show that our extensions improve throughput by up to 30% depending on topology size and node distribution.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2010 |
Conference Name | Proceedings of the 16th IEEE International Conference on Parallel and Distributed Systems |
Pagination | 381-388 |
Date Published | December |
Publisher | IEEE Computer Society |
ISBN Number | 978-0-7695-4307-9 |
Admission and Power Control for Cognitive Radio Cellular Networks: a Multidimensional Knapsack Solution
In CogART 2010, invited paper. IEEE, 2010. Status: Published
Admission and Power Control for Cognitive Radio Cellular Networks: a Multidimensional Knapsack Solution
Admission and power control schemes play important roles in deploying cognitive radio technology to cellular networks. In this paper, we propose a method to calculate the power scale factor, introduce two pre-admission control schemes, and reformulate the problem as a multidimensional knapsack problem. In addition, we propose a novel admission and power control scheme called JAPC-MKP. Simulation results show that our proposed JAPC-MKP can very closely approach the optimal results from the optimization software MOSEK, and greatly outperform the existing schemes in most cases.
Affiliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2010 |
Conference Name | CogART 2010, invited paper |
Date Published | November |
Publisher | IEEE |
ISBN Number | 978-1-4244-8132-3 |
Energy Minimization Approach for Optimal Cooperative Spectrum Sensing in Sensor-Aided Cognitive Radio Networks
In 2010 The 5th Annual ICST Wireless Internet Conference (WICON 2010). ICST/Create-Net, 2010. Status: Published
Energy Minimization Approach for Optimal Cooperative Spectrum Sensing in Sensor-Aided Cognitive Radio Networks
In a sensor-aided cognitive radio network, collaborating battery-powered sensors are deployed to aid the network in cooperative spectrum sensing. These sensors consume energy for spectrum sensing and therefore deplete their lifetime; thus we study the key issue of minimizing the sensing energy consumed by such a group of collaborating sensors. The IEEE P802.22 standard specifies spectrum sensing accuracy by the detection and false alarm probabilities; hence we address the energy minimization problem under this detection accuracy constraint. Firstly, we derive the bounds for the number of sensors to simultaneously guarantee the thresholds for high detection probability and low false alarm probability. Secondly, with these bounds, we formulate the optimization problem to find the optimal sensing interval and the optimal number of sensors that minimize the energy consumption. Thirdly, the approximated analytical solutions are derived to solve the optimization accurately and efficiently in polynomial time. Finally, numerical results show that the minimized energy is significantly lower than the energy consumed by a group of randomly selected sensors. The mean absolute error of the approximated optimal sensing interval compared with the exact value is less than 4% and 8% under good and bad SNR conditions, respectively. The approximated optimal number of sensors is shown to be very close to the exact number.
Affiliation | Communication Systems, , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2010 |
Conference Name | 2010 The 5th Annual ICST Wireless Internet Conference (WICON 2010) |
Publisher | ICST/Create-Net |
ISBN Number | 978-963-9799-86-8 |
First Experiences With Congestion Control in InfiniBand Hardware
In 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). IEEE, 2010. Status: Published
First Experiences With Congestion Control in InfiniBand Hardware
In lossless interconnection networks, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. Without CC, congestion on one link may grow into a congestion tree that can degrade the performance severely. This degradation not only affects contributors to the congestion, but also throttles innocent traffic flows in the network. The InfiniBand standard describes CC functionality for detecting and resolving congestion. The InfiniBand CC concept is rich in the way that it specifies a set of parameters that can be tuned in order to achieve effective CC. There is, however, limited experience with the InfiniBand CC mechanism. To the best of our knowledge, only a few simulation studies exist. Recently, InfiniBand CC has been implemented in hardware, and in this paper we present the first experiences with such equipment. We show that the implemented InfiniBand CC mechanism effectively resolves congestion and improves fairness by solving the parking lot problem, if the CC parameters are appropriately set. By conducting extensive testing on a selection of the CC parameters, we have explored the parameter space and found a subset of parameter values that leads to efficient CC for our test scenarios. Furthermore, we show that the InfiniBand CC increases the performance of the well-known HPC Challenge benchmark in a congested network.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2010 |
Conference Name | 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) |
Pagination | 1-12 |
Publisher | IEEE |
ISBN Number | 978-1-4244-6442-5 |
DOI | 10.1109/IPDPS.2010.5470419 |
Host Side Dynamic Reconfiguration With InfiniBand
In 2010 IEEE International Conference on Cluster Computing. IEEE Computer Society, 2010. Status: Published
Host Side Dynamic Reconfiguration With InfiniBand
Rerouting around faulty components and migration of jobs both require reconfiguration of data structures in the Queue Pairs residing in the hosts on an InfiniBand cluster. In this paper we report an implementation of dynamic reconfiguration of such host side data structures. Our implementation preserves the Queue Pairs, and lets the application run without being interrupted. With this implementation, we demonstrate a complete solution to fault tolerance in an InfiniBand network, where dynamic network reconfiguration to a topology-agnostic routing function is used to avoid malfunctioning components. This solution is in principle able to let applications run uninterruptedly on the cluster, as long as the topology is physically connected. Through measurements on our test cluster we show that the increased cost of our method in setup latency is negligible, and that there is only a minor reduction in throughput during reconfiguration.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2010 |
Conference Name | 2010 IEEE International Conference on Cluster Computing |
Pagination | 126-135 |
Publisher | IEEE Computer Society |
ISBN Number | 978-0-7695-4220-1 |
Journal Article
Downlink Spectrum Sharing for Cognitive Radio Femtocell Networks
IEEE Systems Journal 4 (2010): 524-534. Status: Published
Downlink Spectrum Sharing for Cognitive Radio Femtocell Networks
The femtocell is envisioned as a highly promising solution for indoor wireless communications. The spectrum allocated to femtocells is traditionally from the same licensed spectrum bands as macrocells. In this case, the capacity of femtocell networks is highly limited due to the finite number of licensed spectrum bands and also the interference with macrocells and other femtocells. In this paper, we propose a radically new communication paradigm by incorporating cognitive radio in femtocell networks. The cognitive radio enabled femtocells are able to access spectrum bands not only from macrocells but also from other licensed systems (e.g. TV systems), provided the interference from femtocells to the existing systems is not harmful. This results in more channel opportunities for femtocells. Thus, the co-channel interference in femtocells can be greatly reduced and the network capacity can be significantly improved. Because femtocell networks differ from other traditional wireless networks, we argue that traditional spectrum sharing schemes such as coloring methods are not efficient for femtocell networks, especially in dense deployment scenarios. We formulate the downlink spectrum sharing problem in cognitive radio femtocell networks, and employ decomposition theories to solve the problem. Simulation results indicate that cognitive radio enabled femtocells could achieve much higher capacity than femtocell networks which do not employ agile spectrum access. Simulation results also show that our proposed scheme without any iteration can achieve almost twice the average capacity of the coloring method when the number of available channels is less than five. Moreover, our proposed scheme can converge very fast with a typical value of only five iterations, and it can achieve around two percent more average capacity than the fixed power control scheme.
Affiliation | , Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2010 |
Journal | IEEE Systems Journal |
Volume | 4 |
Number | 4 |
Pagination | 524-534 |
Date Published | December |
Notes | Special issue on Broadband Access Networks |
DOI | 10.1109/JSYST.2010.2083230 |
Ethernet for High Performance Data Centers - on the New IEEE Data Center Bridging Standards
IEEE Micro 30 (2010): 42-51. Status: Published
Ethernet for High Performance Data Centers - on the New IEEE Data Center Bridging Standards
Ethernet is about to enter the domain of data center and high performance computing by introducing several performance optimizations that will close the performance and functionality gap between Ethernet and its fiercest competitor, InfiniBand. Through the Data Center Bridging Task Group, the IEEE is about to expand the 802.1 standard with four new supplements that will both close the performance gap and make the converged network a reality. In a converged network all applications use a single physical infrastructure, e.g. Ethernet or InfiniBand. This is ideal for the next generation of data centers that are now emerging. In this paper we discuss the architectural challenges faced by Ethernet in order to improve performance and make the converged network a reality, and we present the Ethernet enhancements currently being standardized by the IEEE.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2010 |
Journal | IEEE Micro |
Volume | 30 |
Number | 4 |
Pagination | 42-51 |
Date Published | July/August |
Publisher | IEEE |
Medium Access Control Protocols in Cognitive Radio Networks
Wiley Wireless Communications and Mobile Computing 10 (2010): 31-49. Status: Published
Medium Access Control Protocols in Cognitive Radio Networks
In cognitive radio (CR) networks, medium access control (MAC) protocols play an important role in exploiting the spectrum opportunities, managing the interference to primary users (PUs), and coordinating the spectrum access amongst secondary users (SUs). In this paper, we first introduce the challenges in the design and implementation of CR MAC protocols. Then, we make a comprehensive survey of the state-of-the-art CR MAC protocols and categorize them on the basis of spectrum sharing modes, i.e., overlay and underlay modes. We also introduce some other classification metrics such as architecture (centralized or distributed), sharing behaviors (cooperative or non-cooperative), and access modes (contention based or contention free). Through the study, we find that most CR MAC protocols are designed for the overlay mode. The CR MAC protocols in underlay mode could yield higher spectrum utilization efficiency at the cost of more complicated power and admission control schemes. The centralized CR MAC protocols are more suitable for spectrum sensing using a quiet period of the whole network, while the distributed CR MAC protocols are more flexible to deploy. We identify several future directions, such as more practical CR MAC protocols for underlay mode, MAC protocols considering security, MAC protocols considering heterogeneous coexistence, etc.
Affiliation | , Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2010 |
Journal | Wiley Wireless Communications and Mobile Computing |
Volume | 10 |
Number | 1 |
Pagination | 31-49 |
Date Published | January |
Notes | Special issue on Recent Advances in Wireless Communications and Networks |
DOI | 10.1002/wcm.906 |
Talks, contributed
First Experiences With Congestion Control in InfiniBand Hardware
In Invited talk at the HPC Advisory Council Switzerland Workshop 2010, 2010. Status: Published
First Experiences With Congestion Control in InfiniBand Hardware
In InfiniBand networks, congestion control (CC) can be an effective mechanism to achieve high performance and high utilisation of network resources. Without CC, congestion in one node may severely degrade overall performance. In this talk we introduce the problem of congestion and how it can be avoided with InfiniBand congestion control.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2010 |
Location of Talk | Invited talk at the HPC Advisory Council Switzerland Workshop 2010 |
Proceedings, refereed
A Framework for Routing and Resource Allocation in Network Virtualization
In International Conference on High Performance Computing (HiPC'09). IEEE, 2009. Status: Published
A Framework for Routing and Resource Allocation in Network Virtualization
Computer architectures for high performance computing have traditionally been based on an assumption of one parallel application running alone on one machine. The current trend is, however, that huge computer installations offer compute power to a set of users or customers, each demanding only a subset of the available compute resources. This places new requirements on the architecture, in that it must support dynamic partitioning of the resources into several virtual servers as demand changes. We introduce a novel framework which supports flexible formation of such virtual servers while preventing interference between the communication of different virtual servers. This paper investigates the impacts of a shared interconnection network on applications running on virtual compute servers. We show that the interconnect performance supplied to each job is highly unpredictable, and that a job can experience a performance degradation of 97% when its traffic interferes with the traffic of concurrent jobs. With a minor reduction in the utilization of each processing node, this can be considerably improved through a combination of routing-containment in the interconnection network and a carefully designed resource allocation strategy.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2009 |
Conference Name | International Conference on High Performance Computing (HiPC'09) |
Pagination | 129-139 |
Publisher | IEEE |
ISBN Number | 978-1-4244-4921-7 |
Dynamic Spectrum Sharing in Cognitive Radio Femtocell Networks
In 2009 Fourth International Conference on Access Networks (ACCESSNETS 2009), invited paper. Vol. 37. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering 37. Springer, 2009. Status: Published
Dynamic Spectrum Sharing in Cognitive Radio Femtocell Networks
The femtocell is envisioned as a highly promising solution for communications in indoor environments, which has been a very challenging problem for mobile network operators. Currently, the spectrum allocated to femtocells is from the same licensed spectrum as macrocells, and from the same mobile network operator. In this case, the capacity of femtocell networks may be largely limited due to the finite number of licensed spectrum bands and also the interference with other femtocells and macrocells. In this paper, we propose a radically new communications paradigm by incorporating cognitive radio in femtocell networks (COGFEM). In COGFEM, the cognitive radio enabled femtocells are able to access licensed spectrum bands not only from macrocells but also from other licensed systems (e.g. TV systems). Thus, the co-channel interference in femtocells can be greatly reduced and the network capacity can be significantly improved. We formulate a joint channel allocation and power control problem in COGFEM, and present two intelligent algorithms for efficient spectrum sharing in COGFEM. Results indicate that COGFEM is able to achieve much higher capacity than femtocell networks which do not employ agile spectrum access.
Affiliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2009 |
Conference Name | 2009 Fourth International Conference on Access Networks (ACCESSNETS 2009), invited paper |
Volume | 37 |
Pagination | 164-178 |
Date Published | November |
Publisher | Springer |
ISBN Number | 978-3-642-11663-6 |
Efficient and Deadlock-Free Reconfiguration for Source Routed Networks
In Communication Architecture for Clusters (CAC). IEEE Computer Society, 2009. Status: Published
Efficient and Deadlock-Free Reconfiguration for Source Routed Networks
Overlapping Reconfiguration is currently the most efficient method to reconfigure an interconnection network, but is only valid for systems that apply distributed routing. This paper proposes a solution which enables utilization of Overlapping Reconfiguration in a source routed environment. We demonstrate how a synchronized injection of tokens has a significant impact on the performance of the method. Furthermore, we propose and evaluate an optimization of the original algorithm that reduces (and in some cases even eliminates) performance issues caused by the token forwarding regime, such as increased latency and decreased throughput.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2009 |
Conference Name | Communication Architecture for Clusters (CAC) |
Publisher | IEEE Computer Society |
ISBN Number | 978-1-4244-3750-4 |
Flexible DOR Routing for Virtualization of Multicore Chips
In International Symposium on System-on-Chip. IEEE, 2009. Status: Published
Flexible DOR Routing for Virtualization of Multicore Chips
The expected increase in the number of cores on a single chip leads to the necessity of high-performance on-chip interconnects (NoC). Furthermore, in order to fully utilize the abundance of cores, the chip is expected to support a number of applications running on the chip simultaneously. It is therefore necessary to partition the chip to support numerous applications without any risk of interference between them. The success of this depends on the flexibility of the underlying routing algorithm. This paper presents a flexible routing algorithm based on dimension ordered routing, which supports a large variety of irregular (2-D and 3-D) mesh topologies. The algorithm provides high efficiency at very low additional complexity, as is confirmed by experimental results.
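For readers unfamiliar with dimension ordered routing (DOR), the base technique named in this abstract, the following is a minimal sketch of plain XY routing on a regular 2-D mesh. It is only the textbook baseline, not the flexible algorithm of the paper, which additionally handles irregular topologies. The function name `dor_route` and the coordinate convention are assumptions made for the illustration.

```python
def dor_route(src, dst):
    """Nodes visited by plain XY dimension ordered routing on a 2-D mesh.

    Illustrative baseline only (hypothetical helper): the packet first
    corrects its X coordinate completely and only then its Y coordinate.
    Assumes a regular mesh in which every intermediate node exists.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:             # dimension 0: move along X first
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:             # dimension 1: then move along Y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(dor_route((0, 0), (2, 1)))
```

Because every packet finishes the X dimension before turning into Y, the route depends only on source and destination. On an irregular mesh some of the intermediate nodes may be missing, which is the situation a more flexible variant has to deal with.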
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2009 |
Conference Name | International Symposium on System-on-Chip |
Date Published | October |
Publisher | IEEE |
ISBN Number | 978-1-4244-4465-6 |
Optimal Cooperative Spectrum Sensing in Cognitive Sensor Networks
In The 5th International Wireless Communications and Mobile Computing Conference (IWCMC '09). ACM New York, NY, USA, 2009. Status: Published
Optimal Cooperative Spectrum Sensing in Cognitive Sensor Networks
This paper addresses the problem of optimal cooperative spectrum sensing in a cognitive-enabled sensor network where cognitive sensors can cooperate in the sensing of the spectrum. Such sensor networks are assumed to be power resource constrained. With a given threshold for the accuracy of the spectrum detection, we find the optimal number of cognitive sensors participating in the cooperative spectrum sensing and the optimal sensing interval that minimize the total energy consumption of the cooperative sensing. First, the mathematical lower bound and upper bound for the number of cooperative cognitive sensors are found. Then the optimization problem to minimize the total energy consumed by a group of sensors is presented. Finally, an efficient approximate solution to the optimization problem is proposed. Numerical calculations validate the accuracy and the performance of the proposed scheme. The impact of the noise uncertainty, the choice of the energy detection threshold, and the spectrum bandwidth on the detection accuracy and the minimum total energy consumption is also studied.
Affiliation | Communication Systems, , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2009 |
Conference Name | The 5th International Wireless Communications and Mobile Computing Conference (IWCMC '09) |
Pagination | 1073-1079 |
Publisher | ACM New York, NY, USA |
ISBN Number | 978-1-60558-569-7 |
RecTOR: a New and Efficient Method for Dynamic Network Reconfiguration
In Euro-Par 2009. Springer Berlin / Heidelberg, 2009. Status: Published
RecTOR: a New and Efficient Method for Dynamic Network Reconfiguration
Reconfiguration of an interconnection network is fundamental for the provision of a reliable service. Current reconfiguration methods either include deadlock-avoidance mechanisms that impose performance penalties during the reconfiguration, or are tied to the Up*/Down* routing algorithm which achieves relatively low performance. In addition, some of the methods require complex network switches, and some are limited to distributed routing systems. This paper presents a new dynamic reconfiguration method, RecTOR, which ensures deadlock-freedom during the reconfiguration without causing performance degradation such as increased latency or decreased throughput. Moreover, it is based on a simple concept, is easy to implement, is applicable for both source and distributed routing systems, and assumes Transition-Oriented Routing which achieves excellent performance. Our simulation results confirm that RecTOR supports a better network service to the applications than Overlapping Reconfiguration does.
Affiliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2009 |
Conference Name | Euro-Par 2009 |
Pagination | 1052-1064 |
Publisher | Springer Berlin / Heidelberg |
ISBN Number | 978-3-642-03868-6 |
Journal Article
A New Distributed Management Mechanism for ASI Based Networks
Computer Communications 32, no. 2 (2009): 294-304. Status: Published
A New Distributed Management Mechanism for ASI Based Networks
Affiliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2009 |
Journal | Computer Communications |
Volume | 32 |
Issue | 2 |
Number | 2 |
Pagination | 294-304 |
Publisher | Elsevier |
DOI | 10.1016/j.comcom.2008.10.010 |
QoS Aware Admission and Power Control for Cognitive Radio Cellular Networks
Wiley Wireless Communications and Mobile Computing 9, no. 11 (2009): 1520-1531.Status: Published
QoS Aware Admission and Power Control for Cognitive Radio Cellular Networks
In cognitive radio cellular networks, the Secondary Users (SUs) are allowed to access the channels licensed to the Primary Users (PUs) including Primary Transmitters (PTs) and Primary Receivers (PRs), only if the interference to the PRs is less than the predefined threshold, and the Quality of Service (QoS) requirements of PTs are guaranteed. In addition, different SUs may require different levels of QoS, and pay differently depending on the provided QoS. The network operator achieves different secondary revenues by admitting SUs in different QoS levels. The problem we address in this paper is to maximize the total secondary revenue relative to the interference constraints on PRs, and QoS requirements for both PTs and SUs. We formulate this optimization problem, and propose a power control scheme for both PTs and SUs. Then, we introduce three solutions including an exact solution using dynamic programming, a greedy heuristic algorithm, and a minimal SINR removal algorithm. Based on these algorithms, we propose three QoS aware admission and power control schemes, one optimal solution called QAPC-Dynamic, and two approximate solutions called QAPC-Greedy and QAPC-MSRA, respectively. Numerical results show that QAPC-Dynamic always achieves the highest secondary revenue while QAPC-MSRA gives the lowest secondary revenue. Since the time complexity of QAPC-Dynamic is much higher than the other two schemes, QAPC-Greedy is recommended considering the trade-off between the computation complexity and performance gain.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2009 |
Journal | Wiley Wireless Communications and Mobile Computing |
Volume | 9 |
Issue | 11 |
Number | 11 |
Pagination | 1520-1531 |
Date Published | November |
Publisher | Wiley |
DOI | 10.1002/wcm.765 |
Technical reports
Dynamic Fault Tolerance in Fat-Trees - Research Note
Simula, 2009. Status: Published
Dynamic Fault Tolerance in Fat-Trees - Research Note
Fat-trees are a very common communication architecture in current large-scale parallel computers. The probability of failure in these systems increases with the number of components. We present a routing method for deterministically and adaptively routed fat-trees, applicable to both distributed and source routing, that is able to handle several concurrent faults and that transparently returns to the original routing strategy once the faulty components have recovered. The method is local and dynamic, completely masking the fault from the rest of the system. It only requires a small extra functionality in the switches to handle misrouting around a fault. The method guarantees connectedness and deadlock and livelock freedom for up to k-1 arbitrary simultaneous switch and/or link faults, where k is half the number of ports in the switches. Our simulation experiments show a graceful degradation of performance as more faults occur. Furthermore, we demonstrate that for most fault combinations, our method will even be able to handle significantly more faults beyond the k-1 limit with high probability.
Affiliation | Communication Systems |
Publication Type | Technical reports |
Year of Publication | 2009 |
Date Published | November |
Publisher | Simula |
Book Chapter
Network Reconfiguration in High-Performance Interconnection Networks
In Autonomic Computing and Networking, 313-332. Springer, 2009. Status: Published
Network Reconfiguration in High-Performance Interconnection Networks
High-performance interconnection networks like InfiniBand, ASI, Autonet and Myrinet have the property of link-level flow control. As such, altering the routing function while the network is operational is a complex process, because the danger of packet deadlocks must be addressed. Over the last 5-6 years this problem has been addressed and solved by the research community. This chapter collects several proposals for updating a routing function in interconnection networks after the occurrence of a topological change. Both traditional and recent reconfiguration schemes are covered, and the selection includes both technology-dependent and generic techniques.
Affiliation | Communication Systems |
Publication Type | Book Chapter |
Year of Publication | 2009 |
Book Title | Autonomic Computing and Networking |
Chapter | 8 |
Pagination | 313-332 |
Publisher | Springer |
ISBN Number | 978-0-387-89827-8 |
Scalable Interconnection Networks
In Simula Research Laboratory - by thinking constantly about it, 129-162. Heidelberg: Springer, 2009. Status: Published
Scalable Interconnection Networks
A modern supercomputer or large-scale server consists of a huge set of processing units and units that perform different forms of input/output and memory functions. These components unite in a complex collaboration to perform the main tasks of the system. Such collaboration requires communication between the components, which is supported by an infrastructure called the interconnection network. This book chapter describes the interconnection networks research carried out at Simula over the last five years by the ICON group. ICON has focused on how to connect point-to-point links and switches into scalable network topologies, and how to route packets efficiently in order to yield the highest possible performance. This also poses various requirements regarding fault tolerance, quality of service (QoS), congestion control, virtualization, and other non-functional aspects. ICON's research results have been published in several of the most respected IEEE journals and magazines within our field. Furthermore, some of ICON's solutions have had a major impact on the routing architecture of modern supercomputers.
Affiliation | Communication Systems |
Publication Type | Book Chapter |
Year of Publication | 2009 |
Book Title | Simula Research Laboratory - by thinking constantly about it |
Chapter | 14 |
Pagination | 129-162 |
Publisher | Springer |
Place Published | Heidelberg |
ISBN Number | 978-3-642-01155-9 |
Journal Article
A Proposal for Managing ASI Fabrics
Journal of Systems Architecture 54 (2008): 664-678. Status: Published
A Proposal for Managing ASI Fabrics
In recent years, computer performance has increased significantly. As a consequence, data I/O systems have become bottlenecks within systems. To alleviate this problem, Advanced Switching was recently proposed as a new standard for future interconnects. The Advanced Switching specification establishes a fabric management infrastructure, which is in charge of updating the set of fabric paths each time a topological change takes place. The use of source routing and passive switches makes it infeasible to adapt many existing proposals for handling topological changes in switched interconnection networks to this new technology. This paper presents a fabric management mechanism for Advanced Switching that is also suitable for other source routing interconnects. Furthermore, the work presents a detailed performance evaluation for this proposal. This evaluation allows us to identify the main drawbacks of the mechanism and to define future improvements.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2008 |
Journal | Journal of Systems Architecture |
Volume | 54 |
Number | 7 |
Pagination | 664-678 |
Date Published | July |
An Efficient and Deadlock-Free Network Reconfiguration Protocol
IEEE Transactions on Computers 57 (2008): 762-779. Status: Published
An Efficient and Deadlock-Free Network Reconfiguration Protocol
Component failures and planned component replacements cause changes in the topology and routing paths supplied by the interconnection network of a parallel processor system over time. Such changes may require the network to be reconfigured such that the existing routing function is replaced by one which enables packets to reach their intended destinations amid the changes. Efficient reconfiguration methods are desired that allow the network to function uninterruptedly over the course of the reconfiguration process while remaining free from deadlocking behavior. In this paper, we propose, evaluate, and prove deadlock freedom of a new network reconfiguration protocol that overlaps various phases of “static” reconfiguration processes traditionally used in commercial and research systems to provide performance efficiency on par with that of recently proposed “dynamic” reconfiguration processes, but without their complexity. Simulation results show that the proposed Overlapping Static Reconfiguration protocol can reduce reconfiguration time by up to 50%, reduce packet latency by several orders of magnitude, reduce packet dropping by an order of magnitude, and provide unhalted packet injection as compared to traditional static reconfiguration while allowing network throughput similar to dynamic reconfiguration.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2008 |
Journal | IEEE Transactions on Computers |
Volume | 57 |
Number | 6 |
Pagination | 762-779 |
Date Published | June |
On the Potential of NoC Virtualization for Multicore Chips
Scalable Computing: Practice and Experience 9 (2008): 165-177. Status: Published
On the Potential of NoC Virtualization for Multicore Chips
As the end of Moore's law is on the horizon, power becomes a limiting factor to continuous increases in performance gains for single-core processors. Processor engineers have shifted to the multicore paradigm and many-core processors are a reality. Within the context of these multicore chips, three key metrics stand out as being of major importance: performance, fault-tolerance (including yield), and power consumption. A solution that optimizes all three of these metrics is challenging. As the number of cores increases, the importance of the interconnection network-on-chip (NoC) grows as well, and chip designers should aim to optimize these three key metrics in the NoC context as well. In this paper we identify and discuss the main properties that a NoC must exhibit in order to enable such optimizations. In particular, we propose the use of virtualization techniques at the NoC level. As a major finding, we identify the implementation of unicast and broadcast routing algorithms as a key design parameter in order to achieve an effective virtualization of the chip. The intention behind this paper is for it to serve as a position paper on the topic of virtualization for NoC and the challenges that should be met at the routing layer in order to optimize performance, fault-tolerance and power consumption in multicore chips.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2008 |
Journal | Scalable Computing: Practice and Experience |
Volume | 9 |
Number | 3 |
Pagination | 165-177 |
Date Published | September |
The Interconnection Network - Architectural Challenges for Utility Computing Data Centres
IEEE Computer Magazine 41 (2008): 62-69. Status: Published
The Interconnection Network - Architectural Challenges for Utility Computing Data Centres
The mode of operation employed by Computational Data Centres that offer Utility Computing differs significantly from that of traditional supercomputers and server clusters and as such presents new architectural problems that should be studied and solved. In this paper we concentrate on issues facing the interconnection network. We argue that this is a part of the overall architecture where shortcomings in present-day solutions are most severe and present a model for the mode of operation of a Utility Computing Data Centre where virtualisation is a main ingredient. Based on this model we identify several areas where the interconnection network faces new challenges and needs new solutions. In each of these areas we give a brief introduction to previous results before we identify the new challenges.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2008 |
Journal | IEEE Computer Magazine |
Volume | 41 |
Number | 9 |
Pagination | 62-69 |
Date Published | September |
Publisher | IEEE |
Proceedings, refereed
An Analysis of Connectivity and Yield for 2D Mesh Based NoC With Interconnect Router Failures
In 11th EUROMICRO Conference on Digital System Design (DSD). University of Parma, 2008. Status: Published
An Analysis of Connectivity and Yield for 2D Mesh Based NoC With Interconnect Router Failures
The manufacturing process of modern day processors is both costly and complex and there are many different factors that influence the quality of a chip when it comes off the production line. Typically, hundreds of chips are manufactured from a single silicon wafer and as we go deeper into the sub-micron era of microchip manufacturing, the potential for defects during production increases. The advent of multi-core computing may introduce problems related to connectivity and yield for high volume manufacturing (HVM). In this paper we explore potential benefits that fault tolerant routing provides within the NoC (network-on-chip) paradigm with a study of the relationship between connectivity and yield at the interconnect routing level. For dimension-order routing based mesh NoCs, we describe two methods that are logically straightforward to implement and that can be used to increase the yield of chips with interconnect router faults.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2008 |
Conference Name | 11th EUROMICRO Conference on Digital System Design (DSD) |
Date Published | September |
Publisher | University of Parma |
ISBN Number | 978-0-7695-3277-6 |
Joint Admission and Power Control for Cognitive Radio Cellular Networks
In the 11th IEEE International Conference on Communication Systems 2008 (ICCS 2008), invited paper. IEEE, 2008. Status: Published
Joint Admission and Power Control for Cognitive Radio Cellular Networks
In cognitive radio cellular networks (CogCell), the Secondary Users (SUs) can be admitted to the Base Station (BS) provided that the interference caused by SUs to the Primary Users (PUs) is no higher than the pre-defined threshold. In addition, different SUs may require different Quality of Service (QoS) and hence make different payments based on the provided QoS level. In this paper, we investigate the maximally achievable revenue obtained from SUs subject to the interference constraints on PUs and QoS requirements of SUs. To solve the identified issue, we introduce the revenue efficiency factor and propose an efficient Joint Admission and Power Control scheme using a Minimal Revenue Efficiency Removal algorithm (JAPC-MRER). Compared with the other two schemes, JAPC-MSRA and JAPC-Random, our strategy is able to achieve much higher revenue with guaranteed interference requirements and QoS demands. In addition, the proposed scheme is evaluated with respect to the effects of the key parameters, i.e. the number of PUs, the number of SUs and the interference threshold.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2008 |
Conference Name | the 11th IEEE International Conference on Communication Systems 2008 (ICCS 2008), invited paper |
Pagination | 1519-1523 |
Date Published | November |
Publisher | IEEE |
ISBN Number | 978-1-4244-2423-8 |
DOI | 10.1109/ICCS.2008.4737437 |
Maintaining Quality of Service With Dynamic Fault Tolerance in Fat Trees
In International Conference on High Performance Computing (HiPC). Vol. 1. Berlin: Springer-Verlag, 2008. Status: Published
Maintaining Quality of Service With Dynamic Fault Tolerance in Fat Trees
A very important ingredient in the computing landscape is Utility Computing Data Centres (UCDCs), large-scale computing systems that offer computational services to concurrently running applications. In a UCDC, virtual servers containing a subset of the available resources are dynamically created to fulfil user demands. Typically, each virtual server will have its own service level agreement, which should to the largest extent be unaffected by the behaviour of all other virtual servers in the system. As UCDC systems increase in size and the mean time between failures decreases, it is becoming an increasingly important challenge to expediently tolerate failures (dynamically), while distributing the effects of the failure amongst the virtual servers according to their service level agreements. In this paper we propose and evaluate a strategy for offering predictable service in fat trees experiencing faults, by reprioritising packets. The strategy is able to distribute the effect of network faults in order to satisfy a number of quality of service demands. These may include guaranteeing that high-priority packets not encountering the fault are unaffected by the fault event, guaranteeing high network throughput for all high-priority traffic, or ensuring that the negative effects of the fault are evenly and fairly spread throughout the network. We find that which demands to favour depends on the computer system and the characteristics of the applications it is running, and that in the presence of a moderate number of faults it is to some degree possible to meet the demands.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2008 |
Conference Name | International Conference on High Performance Computing (HiPC) |
Volume | 1 |
Pagination | 451-464 |
Date Published | December |
Publisher | Springer-Verlag |
Place Published | Berlin |
ISBN Number | 3-54089893-x |
On the Potential of NoC Virtualization for Multicore Chips
In International Workshop on Multi-Core Computing Systems (MuCoCoS'08). IEEE, 2008. Status: Published
On the Potential of NoC Virtualization for Multicore Chips
As the end of Moore's law is on the horizon, power becomes a limiting factor to the continuous increases in performance gains for single-core processors. Processor engineers have shifted to the multicore paradigm and many-core processors are a reality. Within the context of these multi-core chips, three key metrics stand out as being of major importance: performance, fault-tolerance (including yield), and power consumption. A solution that optimizes all three of these metrics is challenging. As the number of cores increases, the importance of the interconnection network-on-chip (NoC) grows as well, and chip designers should aim to optimize these three key metrics. In this paper we identify and discuss the main properties that a NoC must exhibit in order to enable such optimizations. In particular, we propose the use of virtualization techniques at the NoC level. The implementation of routing algorithms for NoC is a key design parameter in order to achieve an effective virtualization of the chip that should also support broadcast within the virtualized context. The intention behind this paper is for it to serve as a position paper on the topic of virtualization for NoC and the challenges that should be met at the routing layer in order to optimize performance, fault-tolerance and power consumption.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2008 |
Conference Name | International Workshop on Multi-Core Computing Systems (MuCoCoS'08) |
Pagination | 801-807 |
Date Published | March |
Publisher | IEEE |
QoS-Aware Channel Selection in Cognitive Radio Networks: a Game-Theoretic Approach
In IEEE Global Communications Conference (GLOBECOM 2008). IEEE, 2008. Status: Published
QoS-Aware Channel Selection in Cognitive Radio Networks: a Game-Theoretic Approach
In a cognitive radio wireless network, each node can sense and opportunistically access the under-utilized spectrums in the primary system. Since the unoccupied spectrum is location-dependent and time-dependent, the available spectrums in each node are different. With this spectrum heterogeneity and different Quality-of-Service (QoS) requirements, different nodes may have different preferences in using a particular channel to communicate with their neighboring nodes. In this paper, we formulate this channel selection problem using a cooperative game theoretical approach such that the QoS requirement of each node is guaranteed and the total throughput is maximized. To further improve the performance, a learning negotiation mechanism is introduced. The key motivation is to derive new mixed strategies of the nodes with reference to the historical profiles of all selected strategies in the past. Simulation results are presented to show the fast convergence of the game, the high efficiency of the learning mechanism, and the effect of mobility.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2008 |
Conference Name | IEEE Global Communications Conference (GLOBECOM 2008) |
Pagination | 1-7 |
Publisher | IEEE |
ISBN Number | 978-1-4244-2324-8 |
DOI | 10.1109/GLOCOM.2008.ECP.934 |
Proceedings, refereed
A Distributed Approach to Handle Topological Changes in Advanced Switching
In 2nd ACM International Workshop on Performance Monitoring, Measurement, and Evaluation of Heterogeneous Wireless and Wired Networks. ACM Press, 2007. Status: Published
A Distributed Approach to Handle Topological Changes in Advanced Switching
Advanced Switching Interconnect (ASI) is a new high-performance serial interconnect. In order to support high availability, its specification establishes a management infrastructure, which is in charge of maintaining the network operation after the occurrence of a topological change. During the change assimilation process, a management mechanism must discover the new topology, obtain a set of paths, and finally distribute them to the fabric endpoints. Recently, an implementation for this mechanism has been proposed, in which a management entity performs all the previous tasks in a centralized way. In this paper, we propose a distributed solution for computing the new set of paths, with the aim of reducing the negative impact on the network service of the centralized proposal.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | 2nd ACM International Workshop on Performance Monitoring, Measurement, and Evaluation of Heterogeneous Wireless and Wired Networks |
Pagination | 37-44 |
Date Published | October |
Publisher | ACM Press |
ISBN Number | 978-1-59593-805-3 |
A Model-Based Admission Control for 802.11e EDCA Using Delay Predictions
In The 26th IEEE International Performance Computing and Communications Conference (IPCCC). IEEE Computer Society Press, 2007. Status: Published
A Model-Based Admission Control for 802.11e EDCA Using Delay Predictions
This paper presents a unique approach for a model-based admission control algorithm for the IEEE 802.11e Enhanced Distributed Channel Access (EDCA) standard. The analytical model used as the foundation for the algorithm covers both non-saturation and saturation conditions. This allows us to keep the system out of saturation by monitoring several variables. Since the medium access delay represents the service time of the system, it is used as the threshold condition to ensure that the queuing delay is within reasonable bounds. The paper describes the admission control algorithm and several simulation results are presented and discussed.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | The 26th IEEE International Performance Computing and Communications Conference (IPCCC) |
Pagination | 226-235 |
Date Published | April |
Publisher | IEEE Computer Society Press |
ISBN Number | 1-4244-1138-6 |
A Routing Methodology for Dynamic Fault Tolerance in Meshes and Tori
In International Conference on High Performance Computing (HiPC). LNCS 4873. Springer-Verlag, 2007. Status: Published
A Routing Methodology for Dynamic Fault Tolerance in Meshes and Tori
This paper proposes a fully distributed fault-tolerant routing methodology for tori and meshes. A dynamic fault-model is supported, enabling the network to remain fully operational at all times. Contrary to most previous proposals that support a dynamic fault-model, the methodology is able to tolerate concave fault regions, thereby avoiding disabling healthy nodes in most practical scenarios. The methodology provides high network performance through the use of adaptive routing and provides graceful performance degradation in the presence of faults.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | International Conference on High Performance Computing (HiPC) |
Pagination | 514-527 |
Publisher | Springer-Verlag |
ISBN Number | 978-3-540-77219-4 |
Boosting Ethernet Performance by Segment-Based Routing
In Proceedings of the 15th Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP 2007). IEEE Computer Society Press, 2007. Status: Published
Boosting Ethernet Performance by Segment-Based Routing
In this paper we embed an efficient topology-agnostic routing algorithm with fault tolerance capabilities into back-pressured Ethernet technology. This makes it possible to use off-the-shelf equipment to build cost-effective systems with an efficient use of all network components. This stands in contrast to the inefficient use of network resources (links) supported by the Spanning Tree Protocol (STP). The Segment-Based Routing Algorithm (SR) is a deterministic routing algorithm that achieves high performance without the use of virtual channels. Furthermore, it is topology agnostic, meaning it can handle any topology and any combination of faults derived from the original topology when combined with static reconfiguration. Through simulations we verify an overall improvement in throughput by a factor of 1.2 to 10.0 compared to the conventional Ethernet routing algorithm, the STP, and other topology-agnostic routing algorithms such as Up*/Down* and Tree-based Turn-prohibition, both of which are applicable to Ethernet.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | Proceedings of the 15th Euromicro Conference on Parallel, Distributed and Network-based Processing (PDP 2007) |
Pagination | 55-62 |
Date Published | February |
Publisher | IEEE Computer Society Press |
ISBN Number | 0-7695-2784-1 |
Effective Shortest Path Routing for Gigabit Ethernet
In Proceedings of the IEEE International Conference on Communications 2007. IEEE Communications Society, 2007. Status: Published
Effective Shortest Path Routing for Gigabit Ethernet
Since its invention at Xerox PARC in 1973, Ethernet technology has proven to be both robust and adaptable. Through several evolutionary steps Ethernet has become an almost ubiquitous communication technology, spanning from local area networking through high performance backplane interconnects (a recent initiative) to metropolitan networking. However, an obstacle still remains for Ethernet to effectively make inroads in application areas such as interconnection and backbone networks. Ethernet's native routing algorithm, the Spanning Tree Protocol, becomes a major performance and utilization bottleneck when network connectivity increases. Since the Spanning Tree Protocol avoids deadlocks and infinitely looping packets by turning any topology into a tree, it leaves a large portion of links unused and thus wastes bandwidth. In this paper we address this weakness by proposing a new routing algorithm which achieves the same goals as the Spanning Tree Protocol, but without disabling any links or prohibiting any turns, and at the same time guaranteeing shortest path routing. Through the use of layered routing we show how to improve performance with respect to both the Spanning Tree Protocol and a more recent proposal called Tree-Based Turn-Prohibition. Extensive simulations show that we are able to increase throughput by a factor of more than 3.5 compared to the Spanning Tree Protocol and a factor of 1.8 compared to Tree-Based Turn-Prohibition. Our concept relies on features introduced in IEEE standards 802.1Q, 802.1D and 802.3x, as well as changes currently discussed in IEEE task forces. We also discuss backwards compatibility together with the changes necessary for enabling layered shortest path routing in Ethernet.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | Proceedings of the IEEE International Conference on Communications 2007 |
Pagination | 6419-6424 |
Date Published | June |
Publisher | IEEE Communications Society |
ISBN Number | 1-4244-0353-7 |
Real Life Field Trial Over a Pre-Mobile WiMAX System With 4th Order Diversity
In The 7th International Conference on Next Generation Teletraffic and Wired/Wireless Advanced Networking (NEW2AN). Lecture Notes in Computer Science. LNCS Springer, 2007. Status: Published
Real Life Field Trial Over a Pre-Mobile WiMAX System With 4th Order Diversity
Mobile WiMAX is a promising wireless technology approaching market deployment. Much discussion concentrates on whether mobile WiMAX will reach a tipping point and become 4G or not. As a pre-mobile WiMAX system is being delivered, we decided to set up a real life field trial and perform the most important measurements over the system setup. The system was delivered with 4th order diversity. In this paper we analyze physical system performance based on field trial measurements, especially at locations with non-line-of-sight conditions in urban areas. We investigate the gain with 2nd and 4th order base station diversity and derive analytical expressions. The system path loss is plotted and found to approach the Cost-231 Hata model for urban areas. Throughput is also measured and analyzed. Sub-channelization in the uplink turns out to be an important feature for enhanced coverage.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | The 7th International Conference on Next Generation Teletraffic and Wired/Wireless Advanced Networking (NEW2AN) |
Pagination | 121-132 |
Date Published | September |
Publisher | LNCS Springer |
ISBN Number | 978-3-54074-832-8 |
Routing-Contained Virtualization Based on Up*/Down* Forwarding
In High Performance Computing - HiPC 2007. 4873 ed. LNCS 4873. Berlin Heidelberg: Springer-Verlag, 2007. Status: Published
Routing-Contained Virtualization Based on Up*/Down* Forwarding
Virtualization of computing resources is becoming increasingly important both in high-end servers and in multi-core CPUs. In a virtualized system, the set of resources that constitute a virtual compute entity should be spatially separated from each other. Dividing the cores on a chip, or the CPUs in a high-end server, into disjoint sets for each task is a trivial problem. Ensuring that they use disjoint parts of the interconnection network is, however, complex, and in existing methods the requirement of routing-containment of each virtual partition severely degrades the utilization of the system. In this paper, we present an allocation strategy that is based on Up*/Down* routing. Through simulations, we demonstrate increases (in some cases above 30%) in system utilization relative to the state of the art in a Dimension Order routed mesh - a topology that is assumed to be widely deployed in Networks on Chip.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | High Performance Computing - HiPC 2007 |
Edition | 4873 |
Pagination | 500-513 |
Publisher | Springer-Verlag |
Place Published | Berlin Heidelberg |
ISBN Number | 978-3-540-77219-4 |
The Physical Performance and Path Loss in a Fixed WiMAX Deployment
In International Wireless Communications and Mobile Computing Conference. ACM Press, 2007. Status: Published
The Physical Performance and Path Loss in a Fixed WiMAX Deployment
Fixed WiMAX is being deployed worldwide, and the networks are increasing in size. Measurements have been performed, but the number of measurements is small and therefore does not demonstrate performance in a real life deployment. We have performed extensive analyses of the physical performance in a fixed WiMAX deployment which has been operative for a year and where the number of subscribers constantly increases. The analyses presented in this paper focus on received signal strength and signal to noise ratio. Based on the measured parameters, we present a Path Loss model for fixed WiMAX which will hopefully be of great reference value due to the large number of measurements presented. Finally, our Path Loss model is compared to other well known Path Loss models and is found to approach the free space loss model.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2007 |
Conference Name | International Wireless Communications and Mobile Computing Conference |
Pagination | 439-444 |
Date Published | August |
Publisher | ACM Press |
ISBN Number | 978-1-59593-695-0 |
Technical reports
The Interconnection Network - Architectural Challenges for Utility Computing Data Centres
Simula Research Laboratory, 2007.Status: Published
The Interconnection Network - Architectural Challenges for Utility Computing Data Centres
Afilliation | Communication Systems |
Project(s) | No Simula project |
Publication Type | Technical reports |
Year of Publication | 2007 |
Publisher | Simula Research Laboratory |
Notes | This technical report is an earlier version of a published journal article. The published version can be found here: |
Proceedings, refereed
A Class Based Dynamic Admitted Time Limit Admission Control Algorithm for 802.11e EDCA
In Proceedings of the 6th International Workshop on Applications and Services in Wireless Networks (ASWN). Fraunhofer FOKUS, 2006.Status: Published
A Class Based Dynamic Admitted Time Limit Admission Control Algorithm for 802.11e EDCA
This paper presents a class-based dynamic admission control algorithm for the IEEE 802.11e Enhanced Distributed Channel Access (EDCA) standard. The strength of our admission control is the dynamic nature and flexibility of the algorithm, which adapts to the situation and thus achieves higher throughput than other admission controls for 802.11 EDCA. The achievements of our admission control are presented and evaluated. Class and flow utilization is discussed before a special case of our admission control algorithm aimed at flow utilization is given.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2006 |
Conference Name | Proceedings of the 6th International Workshop on Applications and Services in Wireless Networks (ASWN) |
Pagination | 243-249 |
Date Published | May |
Publisher | Fraunhofer FOKUS |
ISBN Number | ISBN 10:3-8167-7111-4, ISBN 13:978-3-8167-7111-1 |
Combining Source Routing and Dynamic Fault Tolerance
In Proceedings of The 18th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). Washington, DC, USA: IEEE Computer Society, 2006.Status: Published
Combining Source Routing and Dynamic Fault Tolerance
An increasing number of current and emerging interconnect technologies rely on source routing to forward packets through the network. It is therefore important to develop methods for fault tolerance that are well suited for source routed networks. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted to reconfigure it. Source routing readily supports the source node choosing a different path when a fault occurs, but using this approach, packets already in the network will be lost. Local dynamic fault tolerance, where the packet is routed around the fault locally, would prevent much of the traffic being lost during failures, but this is cumbersome to achieve in source routed networks since packets encountering a fault will need to follow a path different from that encoded in the packet header. In this paper we present a mechanism to achieve local dynamic fault tolerance in source routed fat trees, a topology that has widespread use in supercomputer systems, and compare it with endpoint dynamic fault tolerance. We also show that by combining the two approaches we achieve performance superior to either of the two individually.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2006 |
Conference Name | Proceedings of The 18th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) |
Pagination | 151-158 |
Date Published | October |
Publisher | IEEE Computer Society |
Place Published | Washington, DC, USA |
ISBN Number | 0-7695-2704-3 |
Dynamic Fault Tolerance With Misrouting in Fat Trees
In Proceedings of the International Conference on Parallel Processing (ICPP). IEEE Computer Society, 2006.Status: Published
Dynamic Fault Tolerance With Misrouting in Fat Trees
Fault tolerance is critical for efficient utilisation of large computer systems. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted to reconfigure it. Although dynamic fault tolerance may lead to less efficient solutions than static fault tolerance, it allows for a much higher availability of the system. In this paper we devise a dynamic fault-tolerant adaptive routing algorithm for the fat tree, a much used interconnect topology, which relies on misrouting around link faults. We show that we are guaranteed to tolerate any combination of fewer than (num_switch_ports/2) link faults without the need for additional network resources for deadlock freedom. There is also a high probability of tolerating an even larger number of link faults. Simulation results show that network performance degrades very little when faults are dynamically tolerated.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2006 |
Conference Name | Proceedings of the International Conference on Parallel Processing (ICPP) |
Pagination | 33-45 |
Date Published | August |
Publisher | IEEE Computer Society |
ISBN Number | 22 |
On the Use of Rate Configuration in the Interoperation Between DiffServ and 802.11e EDCA
In Proceedings of the 15th IST Mobile & Wireless Communications. Heroon Polytechniou, 15773 Athens, Greece: National Technical University of Athens, 2006.Status: Published
On the Use of Rate Configuration in the Interoperation Between DiffServ and 802.11e EDCA
This paper investigates rate configuration of the Expedited Forwarding (EF) class of Differentiated Services (DiffServ) when used with 802.11e EDCA. The rate configuration problem is presented, and several approaches are tested and evaluated in order to solve the problem. Results reveal that the contention window makes rate configuration very hard for the highest priority class (AC_VO) in 802.11e EDCA. Our evaluations show that 802.11e EDCA is not able to conform to DiffServ's EF PHB specifications.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2006 |
Conference Name | Proceedings of the 15th IST Mobile & Wireless Communications |
Date Published | June |
Publisher | National Technical University of Athens |
Place Published | Heroon Polytechniou, 15773 Athens, Greece |
ISBN Number | NA |
Segment-Based Routing: an Efficient Fault-Tolerant Routing Algorithm for Meshes and Tori
In 20th IEEE International Parallel & Distributed Processing Symposium. Washington, USA: IEEE Computer Society, 2006.Status: Published
Segment-Based Routing: an Efficient Fault-Tolerant Routing Algorithm for Meshes and Tori
Computers get faster every year, but the demand for computing resources seems to grow at an even faster rate. Science keeps demanding more processing power for calculations and simulations, growth in e-commerce requires powerful servers to offer seamless online shopping, and massively multiplayer online games require powerful and stable systems to keep their virtual worlds running 24 hours a day. Depending on the problem domain, this demand for more power can be satisfied by either massively parallel computers or clusters of computers. Common to both approaches is the dependence on high-performance interconnect networks such as Myrinet, InfiniBand, or 10 Gigabit Ethernet. While high throughput and low latency are key features of interconnection networks, the issue of fault tolerance is now becoming increasingly important. As the number of network components grows, so does the probability of failure; thus it becomes important to also consider the fault-tolerance mechanism of interconnection networks. The main challenge then lies in combining performance and fault tolerance, while still keeping cost and complexity low. This paper proposes a new deterministic routing methodology for tori and meshes, which achieves high performance without the use of virtual channels. Furthermore, it is topology agnostic in nature, meaning it can handle any topology derived from any combination of faults when combined with static reconfiguration. The algorithm, referred to as Segment-based Routing (SR), works by partitioning a topology into subnets, and subnets into segments. This allows us to place bidirectional turn restrictions locally within a segment. As segments are independent, we gain the freedom to place turn restrictions within a segment independently from other segments. This results in a larger degree of freedom when placing turn restrictions compared to other routing strategies.
In this paper a way to compute segment-based routing tables is presented and applied to meshes and tori. Preliminary evaluation results show that the concept of segments leads to an increase in performance by a factor of 1.8 over FX and up*/down* routing.
Afilliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2006 |
Conference Name | 20th IEEE International Parallel & Distributed Processing Symposium |
Pagination | 1-10 |
Date Published | April |
Publisher | IEEE Computer Society |
Place Published | Washington, USA |
ISBN Number | 1-4244-0054-6 |
Journal Article
A Routing Methodology for Achieving Fault Tolerance in Direct Networks
IEEE Transactions on Computers 55 (2006): 400-415.Status: Published
A Routing Methodology for Achieving Fault Tolerance in Direct Networks
Massively parallel computing systems are being built with thousands of nodes. The interconnection network plays a key role for the performance of such systems. However, the high number of components significantly increases the probability of failure. Additionally, failures in the interconnection network may isolate a large fraction of the machine. It is therefore critical to provide an efficient fault-tolerant mechanism to keep the system running, even in the presence of faults. This paper presents a new fault-tolerant routing methodology that does not degrade performance in the absence of faults and tolerates a reasonably large number of faults without disabling any healthy node. In order to avoid faults, for some source-destination pairs, packets are first sent to an intermediate node and then from this node to the destination node. Fully adaptive routing is used along both subpaths. The methodology assumes a static fault model and the use of a checkpoint/restart mechanism. However, there are scenarios where the faults cannot be avoided solely by using an intermediate node. Thus, we also provide some extensions to the methodology. Specifically, we propose disabling adaptive routing and/or using misrouting on a per-packet basis. We also propose the use of more than one intermediate node for some paths. The proposed fault-tolerant routing methodology is extensively evaluated in terms of fault tolerance, complexity, and performance.
Afilliation | , Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2006 |
Journal | IEEE Transactions on Computers |
Volume | 55 |
Number | 4 |
Pagination | 400-415 |
Date Published | April |
An Overview of QoS Capabilities in InfiniBand, Advanced Switching Interconnect, and Ethernet
IEEE Communications Magazine 44 (2006): 32-38.Status: Published
An Overview of QoS Capabilities in InfiniBand, Advanced Switching Interconnect, and Ethernet
A recent trend in interconnection network technologies is the inclusion of various mechanisms to support a variety of Quality of Service concepts. This has been necessitated by an increasing number of application areas that require some level of performance guarantees from the network for parts of its traffic. In this paper we describe and compare the capabilities and support for Quality of Service of the three most important interconnection network technology standards of today. Similarities between the technologies are explained and differences are clarified.
Afilliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2006 |
Journal | IEEE Communications Magazine |
Volume | 44 |
Number | 7 |
Pagination | 32-38 |
Date Published | July |
Publisher | IEEE |
Layered Routing in Irregular Networks
IEEE Transactions on Parallel and Distributed Systems 17 (2006): 51-65.Status: Published
Layered Routing in Irregular Networks
Freedom from deadlock is a key issue in Cut-Through, Wormhole and Store and Forward networks, and such freedom is usually obtained through careful design of the routing algorithm. Most existing deadlock-free routing methods for irregular topologies do, however, impose severe limitations on the available routing paths. We present a method called Layered Routing, which gives rise to a series of routing algorithms, some of which perform considerably better than previous ones. Our method groups virtual channels into network layers, and to each layer it assigns a limited set of source/destination address pairs. This separation of traffic yields a significant increase in routing efficiency. We show how the method can be used to improve the performance of irregular networks, both through load balancing and by guaranteeing shortest-path routing. The method is simple to implement, and its application does not require any features in the switches other than the existence of a modest number of virtual channels. The performance of the approach is evaluated through extensive experiments within three classes of technologies. These experiments reveal a need for virtual channels as well as an improvement in throughput for each technology class.
Afilliation | Communication Systems, Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2006 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 17 |
Number | 1 |
Pagination | 51-65 |
Date Published | January |
Publisher | IEEE |
Routing for the ASI Fabric Manager
IEEE Communications Magazine 44 (2006): 39-44.Status: Published
Routing for the ASI Fabric Manager
Afilliation | , Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2006 |
Journal | IEEE Communications Magazine |
Volume | 44 |
Number | 7 |
Pagination | 39-44 |
Date Published | July |
Timeliness of Real-Time IP Communication in Switched Industrial Ethernet Networks
IEEE Transactions on Industrial Informatics 2 (2006): 25-39.Status: Published
Timeliness of Real-Time IP Communication in Switched Industrial Ethernet Networks
Since its invention at Xerox PARC in 1973, the Ethernet technology has proven to be both robust and adaptable. Through several giant evolution steps Ethernet has become an almost ubiquitous communication technology, spanning from enterprise or local area networking through high performance backplane interconnects (a very recent initiative) to metropolitan (telecommunication) networking. Being nimble enough to maneuver into new application areas, it is now making inroads in factory communication. Automation systems are, however, different from many of the other application areas mentioned, first and foremost since they require real-time performance from the network technology. In this article we will look at some critical aspects of Ethernet as an automation network, usually referred to as Industrial Ethernet. More specifically, we focus on the application-to-application delay and jitter characteristics of such networks, when using Internet protocols such as UDP and TCP. We show the importance of taking control of the latency in the station nodes, since the main communication delays are inside the nodes, and present different solutions for controlling these delays. In particular, a priority-based protocol stack is assessed. Our results show a significant evolution in the applicability of real-time Ethernet based IP communication, which is now adequate even for demanding automation applications. In this paper we use substation automation (power distribution) as an example of a demanding automation system.
Afilliation | , Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2006 |
Journal | IEEE Transactions on Industrial Informatics |
Volume | 2 |
Number | 1 |
Pagination | 25-39 |
Date Published | February |
Talks, contributed
The Realization of Virtual Compute Resources in a Utility Computing Data Center (UCDC) in the Many Core Era
In 2006 Workshop on On- and Off-Chip Interconnection Networks for Multicore Systems, 2006.Status: Published
The Realization of Virtual Compute Resources in a Utility Computing Data Center (UCDC) in the Many Core Era
Afilliation | Communication Systems, Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2006 |
Location of Talk | 2006 Workshop on On- and Off-Chip Interconnection Networks for Multicore Systems |
Zeitsynchronisierung Im Industrial Ethernet
In etz - Elektrotechnik und Automation, VDE Verlag GMBH, Heft 8/2006, 2006.Status: Published
Zeitsynchronisierung Im Industrial Ethernet
In industrial automation applications, latency jitter in the transmission of data packets inside the switches limits the use of Ethernet networks. Depending on the network load, the size of the data packet, and the number of switches between server and client, this jitter can amount to several milliseconds. The precise data evaluation required for automation, for example to synchronize multiple axes in assembly machines, is therefore not possible. One solution to this problem is precision Ethernet switches with time stamping. This article describes the principles of the SNTP/NTP Internet time protocol and its implementation in an Ethernet switch and an Ethernet-enabled end node in order to obtain a time accuracy of better than 1 microsecond, independent of the network load.
Afilliation | , Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2006 |
Location of Talk | etz - Elektrotechnik und Automation, VDE Verlag GMBH, Heft 8/2006 |
Proceedings, refereed
A Dynamic Fault-Tolerant Routing Algorithm for Fat-Trees
In International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, Nevada, USA, June 27-30. CSREA Press, 2005.Status: Published
A Dynamic Fault-Tolerant Routing Algorithm for Fat-Trees
The fat tree is a network topology well suited for use as the interconnection network in systems such as parallel computers. Its large number of paths between every source/destination pair gives the fat tree the ability to provide high throughput. This also gives it a high probability of tolerating network faults statically, but few algorithms to dynamically tolerate faults in fat trees have previously been proposed. In this paper we present a deadlock-free routing method for providing dynamic fault tolerance through misrouting downwards in the network. We show that the algorithm is one-fault tolerant, and that with a certain probability it can tolerate a large number of faults.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2005 |
Conference Name | International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, Nevada, USA, June 27-30 |
Pagination | 318-324 |
Publisher | CSREA Press |
ISBN Number | 1-932415-58-0 |
Ethernet As a Lossless Deadlock Free System Area Network
In Proceedings of the International Symposium on Parallel and Distributed Processing and Applications, Nanjing, China May 2-5. Lecture Notes in Computer Science. Heidelberg, Germany: Springer-Verlag GmbH, 2005.Status: Published
Ethernet As a Lossless Deadlock Free System Area Network
The way conventional Ethernet is used today differs in two aspects from how dedicated system area networks are used. Firstly, dedicated system area networks are lossless and only drop frames when bit errors occur, while conventional Ethernet drops frames whenever congestion occurs. Secondly, these networks are either deadlock free or use mechanisms which avoid deadlock situations, while still using all available links. Ethernet avoids deadlocks by using a spanning tree protocol which turns any topology into a tree. A drawback of this approach is that we are left with a lot of unused links and thus waste resources. In this paper we describe how to obtain a lossless deadlock-free network with the best possible performance, while adhering to the current Ethernet standard and using off-the-shelf Ethernet equipment. We achieve this by introducing flow control in all network nodes and by taking control over the routing algorithm. Also, we use TCP to illustrate the effect of flow control on higher layer protocols. Through simulations we verify the following three improvements. Firstly, the activation of flow control turns Ethernet into a lossless network. Secondly, taking control over the routing algorithm allows us to build any topology without the limitations of the spanning tree protocol. And thirdly, an overall improvement in throughput is achieved by combining these enhancements.
Afilliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2005 |
Conference Name | Proceedings of the International Symposium on Parallel and Distributed Processing and Applications, Nanjing, China May 2-5 |
Pagination | 901-914 |
Date Published | November |
Publisher | Springer-Verlag GmbH |
Place Published | Heidelberg, Germany |
ISBN Number | 3-540-29769-3 |
Siamese-Twin: a Dynamically Fault-Tolerant Fat Tree
In International Parallel and Distributed Processing Symposium (IPDPS), Denver, Colorado, USA, April 4-8. IEEE Computer Society, 2005.Status: Published
Siamese-Twin: a Dynamically Fault-Tolerant Fat Tree
Fat-trees are a special case of multistage interconnection networks with quite good static fault tolerance capabilities. They cannot, however, straightforwardly provide local dynamic fault tolerance. In this paper we propose a network topology based on the fat-tree using two parallel networks with crossover links between them in an effort to enable dynamic fault tolerance. We evaluate and compare this topology with two other similar fat-tree topologies and show through simulations that the new topology is able to improve slightly upon the ability to tolerate faults statically. More importantly, we show that the new network topology is the only one of the evaluated topologies able to tolerate one fault dynamically, with a superior network performance in the face of dynamically handled faults.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2005 |
Conference Name | International Parallel and Distributed Processing Symposium (IPDPS), Denver, Colorado, USA, April 4-8 |
Publisher | IEEE Computer Society |
ISBN Number | 0-7695-2312-9 |
Edited books
Proceedings of the 2005 International Conference on Parallel Processing Workshops
IEEE, 2005.Status: Published
Proceedings of the 2005 International Conference on Parallel Processing Workshops
Afilliation | Communication Systems, Communication Systems |
Publication Type | Edited books |
Year of Publication | 2005 |
Date Published | June |
Publisher | IEEE |
ISBN Number | 0-7695-2381-1 |
Proceedings, refereed
A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes
In Proceedings of IFIP International Conference on Network and Parallel Computing. Lecture Notes in Computer Science 3222. Springer-Verlag, 2004.Status: Published
A Fully Adaptive Fault-Tolerant Routing Methodology Based on Intermediate Nodes
Massively parallel computing systems are being built with thousands of nodes. Because of the high number of components, it is critical to keep these systems running even in the presence of failures. Interconnection networks play a key role in these systems, and this paper proposes a fault-tolerant routing methodology for use in such networks. The methodology supports any minimal routing function (including fully adaptive routing), does not degrade performance in the absence of faults, does not disable any healthy node, and is easy to implement both in meshes and tori. In order to avoid network failures, the methodology uses a simple mechanism: for some source-destination pairs, packets are forwarded to the destination node through a set of intermediate nodes (without being ejected from the network). The methodology is shown to tolerate a large number of faults (e.g., five/nine faults when using two/three intermediate nodes in a 3D torus). Furthermore, the methodology offers a graceful performance degradation: in an 8 × 8 × 8 torus network with 14 faults the throughput is only decreased by 6.49%.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | Proceedings of IFIP International Conference on Network and Parallel Computing |
Pagination | 341-356 |
Date Published | October 18-20 |
Publisher | Springer-Verlag |
A New Adaptive Fault-Tolerant Routing Methodology for Direct Networks
In Proceedings of the International Conference on High Performance Computing. Lecture Notes in Computer Science 3296. Springer-Verlag, 2004.Status: Published
A New Adaptive Fault-Tolerant Routing Methodology for Direct Networks
Interconnection networks play a key role in the fault tolerance of massively parallel computers, since faults may isolate a large fraction of the machine containing many healthy nodes. In this paper, we present a methodology to design fully adaptive fault-tolerant routing algorithms for direct interconnection networks that can be applied to different regular topologies. The methodology is mainly based on the selection of an intermediate node (if needed) for each source-destination pair. Packets are adaptively routed to the intermediate node and, from this node, they are adaptively forwarded to their destination. This methodology requires only one additional virtual channel, even for tori. Evaluation results show that the methodology is 7-fault tolerant, and for up to 14 faults, more than 99% of the combinations are tolerated, also without significantly degrading performance in the presence of faults.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | Proceedings of the International Conference on High Performance Computing |
Pagination | 462-473 |
Date Published | December 19-22 |
Publisher | Springer-Verlag |
Achieving Flow Level QoS in Cut-Through Networks Through Admission Control and DiffServ
In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA). Las Vegas, Nevada: CSREA Press, 2004.Status: Published
Achieving Flow Level QoS in Cut-Through Networks Through Admission Control and DiffServ
Cluster networks will serve as the future access networks for multimedia streaming, massively multiplayer online gaming, e-commerce, network storage, etc. For those application areas, provisioning of Quality of Service (QoS) is becoming an important issue. DiffServ as specified by the IETF is foreseen to be the most prominent concept for providing predictability in the future Internet. To enable seamless interoperation with the higher level IETF concepts, the QoS architecture of the lower layers should comply with the DiffServ paradigm as well. Previous work on predictability in cut-through networks has only studied class-based QoS. In this paper we set out to achieve flow level QoS using flow-aware admission control in combination with a flow-negligent DiffServ-inspired QoS mechanism. Our results show that flow level bandwidth guarantees are achievable with the use of the Link-by-Link and the Probe-based schemes. In addition, we are able to achieve an order of magnitude improvement in jitter and latency in individual flows.
Afilliation | Communication Systems, Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA) |
Pagination | 1084-1090 |
Date Published | June 21-24 |
Publisher | CSREA Press |
Place Published | Las Vegas, Nevada |
An Effective Fault-Tolerant Routing Methodology for Direct Networks
In Proceedings of International Conference on Parallel Processing. IEEE Computer Society Press, 2004.Status: Published
An Effective Fault-Tolerant Routing Methodology for Direct Networks
Current massively parallel computing systems are being built with thousands of nodes, which significantly affects the probability of failure. M. E. Gomez proposed a methodology to design fault-tolerant routing algorithms for direct interconnection networks. The methodology uses a simple mechanism: for some source-destination pairs, packets are first forwarded to an intermediate node, and later, from this node to the destination node. Minimal adaptive routing is used along both subpaths. For those cases where the methodology cannot find a suitable intermediate node, it combines the use of intermediate nodes with two additional mechanisms: disabling adaptive routing and using misrouting on a per-packet basis. While the combination of these three mechanisms tolerates a large number of faults, each one requires adding some hardware support in the network and also introduces some overhead. In this paper, we perform an in-depth analysis of the impact of these mechanisms on network behaviour. We analyze the impact of the three mechanisms separately and combined. The ultimate goal of this paper is to obtain a suitable combination of mechanisms that is able to meet the trade-off between fault-tolerance degree, routing complexity, and performance.
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | Proceedings of International Conference on Parallel Processing |
Pagination | 222-231 |
Date Published | August 15-18 |
Publisher | IEEE Computer Society Press |
LASH-TOR: a Generic Transition-Oriented Routing Algorithm
In Proceedings of the IEEE International Conference on Parallel and Distributed Systems (ICPADS). IEEE Computer Society Press, 2004.Status: Published
LASH-TOR: a Generic Transition-Oriented Routing Algorithm
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | Proceedings of the IEEE International Conference on Parallel and Distributed Systems (ICPADS) |
Pagination | 595-604 |
Publisher | IEEE Computer Society Press |
Simple Deadlock-Free Dynamic Network Reconfiguration
In High Performance Computing - HiPC 2004: 11th International Conference, Bangalore, India, December 19-22, 2004. Lecture Notes in Computer Science. Springer Verlag, 2004.Status: Published
Simple Deadlock-Free Dynamic Network Reconfiguration
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | High Performance Computing - HiPC 2004: 11th International Conference, Bangalore, India, December 19-22, 2004 |
Pagination | 504-515 |
Date Published | December |
Publisher | Springer Verlag |
The Existence of a Network of Fixed-Sized Switches That Satisfies Any Communication Needs
In The 2004 International Conference. Las Vegas, Nevada, USA: CSREA Press, 2004.Status: Published
The Existence of a Network of Fixed-Sized Switches That Satisfies Any Communication Needs
Afilliation | , Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2004 |
Conference Name | The 2004 International Conference |
Pagination | 1056-1062 |
Publisher | CSREA Press |
Place Published | Las Vegas, Nevada, USA |
Journal Article
An Efficient Fault-Tolerant Routing Methodology for Meshes and Tori
IEEE Computer Architecture Letters 3, no. 1 (2004): 3.Status: Published
An Efficient Fault-Tolerant Routing Methodology for Meshes and Tori
In this paper we present a methodology to design fault-tolerant routing algorithms for regular direct interconnection networks. It supports fully adaptive routing, does not degrade performance in the absence of faults, and supports a reasonably large number of faults without significantly degrading performance. The methodology is mainly based on the selection of an intermediate node (if needed) for each source-destination pair. Packets are adaptively routed to the intermediate node and, at this node, without being ejected, they are adaptively forwarded to their destinations. In order to allow deadlock-free minimal adaptive routing, the methodology requires only one additional virtual channel (for a total of three), even for tori. Evaluation results for a 4x4x4 torus network show that the methodology is 5-fault tolerant. Indeed, for up to 14 link failures, the percentage of fault combinations supported is higher than 99.96%. Additionally, network throughput degrades by less than 10% when injecting three random link faults without disabling any node. In contrast, a mechanism similar to the one proposed in the BlueGene/L, that disables some network planes, would strongly degrade network throughput by 79%.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2004 |
Journal | IEEE Computer Architecture Letters |
Volume | 3 |
Issue | 1 |
Pagination | 3 |
Date Published | May |
Publisher | IEEE |
DOI | 10.1109/L-CA.2004.1 |
Methods for Service Discovery in Bluetooth Scatternets
Computer Communications 27, no. 11 (2004): 1087-1096.Status: Published
Methods for Service Discovery in Bluetooth Scatternets
This paper presents methods for service discovery in multi-hop Bluetooth ad hoc networks, so-called scatternets. Two service discovery protocols based on filtering of service requests are proposed. Extensive simulation results are presented showing that the protocols significantly reduce network traffic. Reducing the network traffic is important as many Bluetooth devices have limited power sources and, therefore, benefit from keeping links idle in power saving modes. It is also explained how the proposed protocols can interact with reactive routing protocols and effectively assist route discovery. Finally, an implementation providing functionality for both searching and browsing for services is suggested, effectively extending the Bluetooth SDP to the scatternet.
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2004 |
Journal | Computer Communications |
Volume | 27 |
Issue | 11 |
Number | 11 |
Pagination | 1087-1096 |
Publisher | ACM |
DOI | 10.1016/j.comcom.2004.01.013 |
Book Chapter
Switched Ethernet in Automation Networking
In The Handbook on Information Technology in Industrial Automation, 49-1-49-15. CRC Press, 2004.Status: Published
Switched Ethernet in Automation Networking
Affiliation | Communication Systems |
Publication Type | Book Chapter |
Year of Publication | 2004 |
Book Title | The Handbook on Information Technology in Industrial Automation |
Pagination | 49-1-49-15 |
Publisher | CRC Press |
Notes | ISBN 0-8493-1985-4 |
Talks, contributed
The Interconnection Network: Architectural Challenges for Datacenters in the Computational GRID
Presented at the eVITA Workshop, 2004.Status: Published
The Interconnection Network: Architectural Challenges for Datacenters in the Computational GRID
Affiliation | Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2004 |
Location of Talk | Presented at the eVITA Workshop |
Notes | Presented at the eVITA Workshop on eScience and Applications |
Proceedings, refereed
A Criterion for Cost Optimal Construction of Irregular Networks
In IPDPS'03: 17th International Parallel and. Nice, France: IEEE, 2003.Status: Published
A Criterion for Cost Optimal Construction of Irregular Networks
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2003 |
Conference Name | IPDPS'03: 17th International Parallel and |
Publisher | IEEE |
Place Published | Nice, France |
Notes | Workshop on Communication Architecture for Clusters (D. K. Panda, J. Duato and C. Stunkel eds.) |
Admission Control for DiffServ Based Quality of Service in Cut-Through Networks
In Proceedings of the 10th International Conference on High Performance Computing (HiPC 2003). Lecture Notes in Computer Science. Heidelberg: Springer, 2003.Status: Published
Admission Control for DiffServ Based Quality of Service in Cut-Through Networks
Previous work on Quality of Service in cut-through networks shows that resource reservation mechanisms are only effective below the saturation point. Admission control in these networks will therefore need to keep network utilization below the saturation point, while still utilising the network resources to the maximum extent possible. In this paper we propose and evaluate three admission control schemes. Two of these use a centralised bandwidth broker, while the third is a distributed measurement-based approach. We combine these admission control schemes with a DiffServ-based QoS scheme for virtual cut-through networks to achieve class-based bandwidth and latency guarantees. Our simulations show that the measurement-based approach favoured in the Internet communities performs poorly in cut-through networks. Furthermore, they demonstrate that detailed knowledge of link utilization improves performance significantly.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2003 |
Conference Name | Proceedings of the 10th International Conference on High Performance Computing (HiPC 2003) |
Pagination | 118-129 |
Date Published | December 17-20 |
Publisher | Springer |
Place Published | Heidelberg |
Applying the DiffServ Model on Cut-Through Networks
In Proceedings of the 2003 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA2003). Las Vegas, Nevada: CSREA Press, 2003.Status: Published
Applying the DiffServ Model on Cut-Through Networks
Understanding the nature of traffic in high-speed communication systems is essential for achieving QoS in these networks. A first step towards this goal is understanding how basic QoS mechanisms work and affect the network predictability before we introduce more complex mechanisms such as admission control. In this paper we analyse the effect of a DiffServ-inspired QoS concept applied to virtual cut-through networks. The main findings from our study are that (i) throughput differentiation can be achieved by weighting of virtual lanes (VLs) and by classifying VLs as either low or high priority, (ii) the balance between VL weighting and VL load is not crucial when the network is operating below the saturation point, and (iii) jitter, however, is large, and good jitter characteristics seem unachievable with such a relative scheme.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2003 |
Conference Name | Proceedings of the 2003 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA2003) |
Pagination | 1089-1095 |
Date Published | June 23 - 26 |
Publisher | CSREA Press |
Place Published | Las Vegas, Nevada |
Service Discovery in Bluetooth Scatternets
In Proc. Workshop on Mobile Ad Hoc Networking. Sophia-Antipolis, France: Institut EURECOM, 2003.Status: Published
Service Discovery in Bluetooth Scatternets
The article covers methods for service discovery in multi-hop Bluetooth ad hoc networks, so-called scatternets. A service discovery algorithm is presented that can be used to extend the Bluetooth Service Discovery Protocol to the scatternet without the use of broadcast messages. Extensive simulation results are presented showing that significant reductions in the number of messages sent are achieved compared to using a broadcast approach. Avoiding network flooding and reducing the number of messages is important as many Bluetooth devices have limited power sources and therefore benefit from keeping links idle in power saving modes.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2003 |
Conference Name | Proc. Workshop on Mobile Ad Hoc Networking |
Pagination | 69-74 |
Date Published | March 6 |
Publisher | Institut EURECOM |
Place Published | Sophia-Antipolis, France |
Service Discovery in Highly Dynamic Scatternets
In Proceedings of the 3rd IEEE Workshop on Applications and Services in Wireless Networks. Bern, Switzerland: University of Bern, 2003.Status: Published
Service Discovery in Highly Dynamic Scatternets
This paper covers methods for service discovery in multi-hop Bluetooth ad hoc networks, so-called scatternets. An efficient service discovery protocol suitable for highly dynamic scatternets is proposed. Extensive simulation results are presented showing that the protocol significantly reduces network traffic. Reducing the network traffic is important as many Bluetooth devices have limited power sources and therefore benefit from keeping links idle in power saving modes. Finally, it is explained how the proposed protocol can interact with reactive routing protocols and effectively assist route discovery.
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2003 |
Conference Name | Proceedings of the 3rd IEEE Workshop on Applications and Services in Wireless Networks |
Pagination | 211-220 |
Date Published | July 2-4 |
Publisher | University of Bern |
Place Published | Bern, Switzerland |
Talks, contributed
Introducing End-to-End Priority Across Switched Ethernet
In Industrial Ethernet Book (IEB), 14:10-16, 2003.Status: Published
Introducing End-to-End Priority Across Switched Ethernet
Affiliation | Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2003 |
Location of Talk | Industrial Ethernet Book (IEB), 14:10-16 |
Journal Article
Ethernet in Substation Automation
IEEE Control Systems Magazine 22 (2002): 43-51.Status: Published
Ethernet in Substation Automation
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 2002 |
Journal | IEEE Control Systems Magazine |
Volume | 22 |
Number | 3 |
Pagination | 43-51 |
Publisher | IEEE |
Proceedings, refereed
Evaluation of Minimal Deterministic Routing in Irregular Networks
In Proceedings of the SSGRR International Conference on Infrastructure for e-Business, e-education, e-Science, and e-Medicine (SSGRR), 2002.Status: Published
Evaluation of Minimal Deterministic Routing in Irregular Networks
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2002 |
Conference Name | Proceedings of the SSGRR International Conference on Infrastructure for e-Business, e-education, e-Science, and e-Medicine (SSGRR) |
Layered Shortest Path (LASH) Routing in Irregular System Area Networks
In Proceedings of Communication Architecture for Clusters (CAC'02). IEEE Computer Society, 2002.Status: Published
Layered Shortest Path (LASH) Routing in Irregular System Area Networks
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2002 |
Conference Name | Proceedings of Communication Architecture for Clusters (CAC'02) |
Publisher | IEEE Computer Society |
Minimal Routing in Irregular Networks
In Proceedings of the International Conference on Communications in Computing (CIC). CSREA Press, 2002.Status: Published
Minimal Routing in Irregular Networks
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2002 |
Conference Name | Proceedings of the International Conference on Communications in Computing (CIC) |
Publisher | CSREA Press |
The Road to an End-to-End Deterministic Ethernet
In Proceedings of 4th IEEE International Workshop on Factory Communication Systems (WFCS), 2002.Status: Published
The Road to an End-to-End Deterministic Ethernet
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2002 |
Conference Name | Proceedings of 4th IEEE International Workshop on Factory Communication Systems (WFCS) |
Proceedings, refereed
Highly Accurate Time Synchronization Over Switched Ethernet
In Proceedings of 8th IEEE conference on Emerging Technologies and Factory Automation (ETFA), 2001.Status: Published
Highly Accurate Time Synchronization Over Switched Ethernet
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2001 |
Conference Name | Proceedings of 8th IEEE conference on Emerging Technologies and Factory Automation (ETFA) |
Load Balancing of Irregular System Area Networks Through Multiple Roots
In Proceedings of the International Conference on Communication in Computing, CIC 2001. CSREA Press, 2001.Status: Published
Load Balancing of Irregular System Area Networks Through Multiple Roots
Affiliation | Communication Systems |
Publication Type | Proceedings, refereed |
Year of Publication | 2001 |
Conference Name | Proceedings of the International Conference on Communication in Computing, CIC 2001 |
Pagination | 165-171 |
Publisher | CSREA Press |
Topologies and Routing in Gigabit Switching Fabrics
In Proceedings of 2nd International Conference on Communications in Computing (CIC2001). Las Vegas, Nevada: CSREA Press, 2001.Status: Published
Topologies and Routing in Gigabit Switching Fabrics
Cluster networks will serve as the future access networks for multimedia streaming, massive multiplayer online gaming, e-commerce, network storage, etc. For those application areas, provisioning of Quality of Service (QoS) is becoming an important issue. DiffServ as specified by the IETF is foreseen to be the most prominent concept for providing predictability in the future Internet. To enable seamless interoperation with the higher-level IETF concepts, the QoS architecture of the lower layers should comply with the DiffServ paradigm as well. Previous work on predictability in cut-through networks has only studied class-based QoS. In this paper we set out to achieve flow-level QoS using flow-aware admission control in combination with a flow-negligent, DiffServ-inspired QoS mechanism. Our results show that flow-level bandwidth guarantees are achievable with the use of the Link-by-Link and the Probe-based schemes. In addition, we are able to achieve an order of magnitude improvement in jitter and latency in individual flows.
Affiliation | Communication Systems |
Project(s) | No Simula project |
Publication Type | Proceedings, refereed |
Year of Publication | 2001 |
Conference Name | Proceedings of 2nd International Conference on Communications in Computing (CIC2001) |
Pagination | 142-149 |
Date Published | June 25 - 28 |
Publisher | CSREA Press |
Place Published | Las Vegas, Nevada |
Talks, contributed
Switched Synchronization
In Industrial Ethernet Book (IEB), 7:24-27, 2001.Status: Published
Switched Synchronization
Affiliation | Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2001 |
Location of Talk | Industrial Ethernet Book (IEB), 7:24-27 |
VoIP Drives the Realtime Ethernet
In Industrial Ethernet Book (IEB), 5, 2001.Status: Published
VoIP Drives the Realtime Ethernet
Affiliation | Communication Systems |
Publication Type | Talks, contributed |
Year of Publication | 2001 |
Location of Talk | Industrial Ethernet Book (IEB), 5 |
Proceedings, refereed
A Fault-Tolerant Method for Wormhole Multistage Networks
In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications. CSREA Press, 1998.Status: Published
A Fault-Tolerant Method for Wormhole Multistage Networks
Affiliation | Software Engineering |
Publication Type | Proceedings, refereed |
Year of Publication | 1998 |
Conference Name | Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications |
Pagination | 637-644 |
Publisher | CSREA Press |
ISBN Number | 1-880843-13-7 |
Handling Multiple Faults in Wormhole Mesh Networks
In Proceedings of the 4th International Euro-Par Conference on Parallel Processing. Lecture Notes in Computer Science, Springer-Verlag, 1998.Status: Published
Handling Multiple Faults in Wormhole Mesh Networks
We present a fault-tolerant method tailored for n-dimensional mesh networks that is able to handle multiple faults, even for two-dimensional meshes. The method does not require the existence of virtual channels. The traditional way of achieving fault tolerance, based on adaptivity and added virtual channels as the main mechanisms, has not shown the ability to handle multiple faults in wormhole mesh networks. In this paper we propose another strategy to provide a high degree of fault tolerance: we describe a technique which alters the routing function on the fly. The alteration action is always taken locally and distributed to a limited number of non-neighbor nodes.
Publication Type | Proceedings, refereed |
Year of Publication | 1998 |
Conference Name | Proceedings of the 4th International Euro-Par Conference on Parallel Processing |
Pagination | 1076-1088 |
Publisher | Lecture Notes in Computer Science, Springer-Verlag |
ISBN Number | 978-3-540-64952-6 |
Switched SCI Systems
In Proceedings of SCI Europe'98. Bordeaux, France, 1998.Status: Published
Switched SCI Systems
Publication Type | Proceedings, refereed |
Year of Publication | 1998 |
Conference Name | Proceedings of SCI Europe'98 |
Pagination | 13-24 |
Date Published | 29-30 September |
Place Published | Bordeaux, France |
Notes | ISBN 1-901864-02-2 |
Technical reports
Adapting to Faults in Deterministic Wormhole Networks
Department of Informatics, University of Oslo, 1998.Status: Published
Adapting to Faults in Deterministic Wormhole Networks
Publication Type | Technical reports |
Year of Publication | 1998 |
Number | 260 |
Publisher | Department of Informatics, University of Oslo |
Journal Article
Avoiding Head-of-Line Blocking in Wormhole Routed Networks
Microprocessors and Microsystems 21 (1998): 455-462.Status: Published
Avoiding Head-of-Line Blocking in Wormhole Routed Networks
This paper focuses on the problem of head-of-line blocking in wormhole routed networks, such as the IEEE 1355. It is well known that the STC104 router only achieves about 60% performance owing to head-of-line blocking. The simulation results presented indicate that the head-of-line blocking problem of the routers may effectively be reduced by introducing central buffering. Both the buffer size and how buffer space is allocated to the input links have a severe impact on the performance.
Publication Type | Journal Article |
Year of Publication | 1998 |
Journal | Microprocessors and Microsystems |
Volume | 21 |
Number | 7 |
Pagination | 455-462 |
One-Fault Tolerance and Beyond in Wormhole Routed Meshes
Microprocessors and Microsystems 21, no. 7-8 (1998): 471-481.Status: Published
One-Fault Tolerance and Beyond in Wormhole Routed Meshes
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 1998 |
Journal | Microprocessors and Microsystems |
Volume | 21 |
Issue | 7-8 |
Pagination | 471-481 |
Publisher | Elsevier |
PhD Thesis
Topics in Interconnect Networking
Department of Informatics, University of Oslo, 1998.Status: Published
Topics in Interconnect Networking
Publication Type | PhD Thesis |
Year of Publication | 1998 |
Publisher | Department of Informatics, University of Oslo |
Thesis Type | phd |
Proceedings, refereed
Scalable Non-Blocking Networks With Fixed Size Routers
In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'97), 1997.Status: Published
Scalable Non-Blocking Networks With Fixed Size Routers
Publication Type | Proceedings, refereed |
Year of Publication | 1997 |
Conference Name | Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'97) |
Pagination | 1308-1314 |
Journal Article
A HIC Based Concentrator for ATM Traffic
Real-Time Magazine 3 (1996): 57-64.Status: Published
A HIC Based Concentrator for ATM Traffic
Affiliation | Communication Systems |
Publication Type | Journal Article |
Year of Publication | 1996 |
Journal | Real-Time Magazine |
Volume | 3 |
Pagination | 57-64 |
Publisher | Real Time |
Grouped Adaptive Routing in HIC Networks
Real-Time Magazine 3 (1996): 50-56.Status: Published
Grouped Adaptive Routing in HIC Networks
Publication Type | Journal Article |
Year of Publication | 1996 |
Journal | Real-Time Magazine |
Volume | 3 |
Pagination | 50-56 |