Using InfiniBand as a Unified Cluster and Storage Fabric


InfiniBand has been the leading interconnect technology for HPC since it was first introduced in 2001, offering the highest bandwidth and lowest latency year after year.  Although it was originally designed for inter-process communication, what many people may not realize is that InfiniBand brings advantages to nearly every use of an interconnect fabric in today's modern data center.

First, let’s review what a fabric actually does in the context of a Beowulf-architecture HPC cluster.  In addition to the inter-process communication already mentioned, compute nodes need access to shared services such as storage, network boot or imaging, internet access, and out-of-band management.  Traditionally, it was common when designing an HPC cluster to build one or more Ethernet networks alongside InfiniBand for some of these services.

The primary use of a high performance fabric in any HPC cluster is inter-process communication (IPC), with support for RDMA and higher-level protocols such as MPI, SHMEM, and UPC.  Mellanox InfiniBand host channel adapters (HCAs) support RDMA with less than 1% CPU utilization, and the switches in an InfiniBand fabric can work in tandem with HCAs to offload nearly 70% of the MPI protocol stack to the fabric itself – effectively enlisting the network as a new generation of co-processor.  And speaking of co-processors, newer capabilities such as GPUDirect and rCUDA extend many of these same benefits to attached GPGPUs and other co-processor architectures.
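The send/receive semantics that MPI layers on top of the fabric can be sketched in plain Python, using `multiprocessing` pipes as a stand-in for the RDMA transport an InfiniBand network would actually provide (the "rank" roles and the payload here are purely illustrative):

```python
from multiprocessing import Process, Pipe

def rank1(conn, result):
    # "Recv" side: block until the message from rank 0 arrives, then reduce it.
    msg = conn.recv()
    result.send(sum(msg["payload"]))
    result.close()

def send_recv_demo():
    # Rank 0 "sends" over a pipe, a stand-in for the fabric transport.
    a, b = Pipe()
    r, w = Pipe(duplex=False)
    p = Process(target=rank1, args=(b, w))
    p.start()
    a.send({"src": 0, "payload": [1.0, 2.0, 3.0]})
    total = r.recv()
    p.join()
    return total

if __name__ == "__main__":
    print(send_recv_demo())  # prints 6.0
```

In a real MPI job these calls map to `MPI_Send`/`MPI_Recv`, and it is exactly this message traffic that the HCA and switch offloads accelerate.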

The language of the internet is TCP/IP, which is also supported by an InfiniBand fabric using a protocol known as IPoIB (IP over InfiniBand).  Simply put, every InfiniBand HCA port is presented to the kernel as a network device which can be assigned an IP address and fully utilize the same IPv4 and IPv6 network stacks as Ethernet devices.  Additionally, a technology called Virtual Protocol Interconnect (VPI) allows any InfiniBand port to operate as an Ethernet port when connected to an Ethernet device, and Mellanox manufactures “bridging” products that forward TCP/IP traffic from the IPoIB network to an attached Ethernet fabric for full internet connectivity.
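Because an IPoIB port appears to the kernel as an ordinary network device, it is configured like any other interface.  A hypothetical RHEL-style interface file for a first IPoIB port might look like this (the device name `ib0` and the addresses are assumptions for illustration):

```
# /etc/sysconfig/network-scripts/ifcfg-ib0 (illustrative addresses)
DEVICE=ib0
TYPE=InfiniBand
BOOTPROTO=static
IPADDR=10.10.0.11
NETMASK=255.255.255.0
ONBOOT=yes
# Connected mode permits a much larger MTU than datagram mode
CONNECTED_MODE=yes
MTU=65520
```

Once the interface is up, any sockets-based application can use it unchanged.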

Storage can also utilize the IP protocol, but parallel filesystems such as GPFS, Lustre, and other clustered filesystems also support RDMA as a data path for enhanced performance.  The ability to support both IP and RDMA on a single fabric makes InfiniBand an ideal way to access parallel storage for HPC workloads.  End-to-end data protection features and offloads of other storage-related protocols such as NVMe over Fabrics (for PCIe-connected solid state storage) and erasure coding further enhance the ability of InfiniBand to support and accelerate access to storage.
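As one concrete example, a Lustre client reaches its servers over LNet's o2ib (InfiniBand RDMA) network type directly in the mount source.  A hypothetical /etc/fstab entry might look like this (the server address and filesystem name are assumptions for illustration):

```
# /etc/fstab: mount a Lustre filesystem whose MGS is reached via
# LNet's o2ib (InfiniBand RDMA) network type; names are illustrative
10.10.0.1@o2ib:/scratch  /mnt/scratch  lustre  defaults,_netdev  0 0
```

The same IPoIB addresses used for TCP/IP services identify the nodes, while the bulk data moves over RDMA.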

Mellanox ConnectX® InfiniBand adapters also support a feature known as FlexBoot.  FlexBoot enables remote boot over InfiniBand or Ethernet, or even boot over iSCSI (Bo-iSCSI). Combined with VPI technologies, FlexBoot enables the flexibility to deploy servers with one adapter card into either InfiniBand or Ethernet networks, with the ability to boot from LAN or from remote storage targets. This technology is based on the PXE (Preboot Execution Environment) standard specification, and the FlexBoot software is based on the open source iPXE project.
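On the server side, a FlexBoot/iPXE client is typically steered by DHCP: the client identifies itself as iPXE, and the server hands it a boot script instead of a second copy of the boot binary.  A hypothetical ISC dhcpd fragment (filenames and addresses are assumptions for illustration):

```
# ISC dhcpd snippet: chainload iPXE-capable clients into a boot
# script; names and addresses are illustrative
next-server 10.10.0.5;
if exists user-class and option user-class = "iPXE" {
    filename "http://10.10.0.5/boot.ipxe";
} else {
    filename "undionly.kpxe";
}
```

This is the standard iPXE chainloading pattern; FlexBoot clients follow the same flow over IPoIB or Ethernet.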

Hyperconverged datacenters, Web 2.0, machine learning, and non-traditional HPC practitioners are now taking note of the maturity and flexibility of InfiniBand and adopting it to realize accelerated performance and improved ROI from their infrastructures.  The advanced offload and reliability features offered by Mellanox InfiniBand adapters, switches, and even cables mean that many workloads can realize greater productivity, acceleration, and increased stability.  Our new InfiniBand router, which supports L3 addressing, can even interconnect multiple fabrics with different topologies, making InfiniBand able to scale to an almost limitless number of nodes.

InfiniBand is an open standard for computer interconnect, backward- and forward-compatible and supported by over 220 members of the InfiniBand Trade Association (IBTA).  Mellanox remains the industry leader, committed to advancing this technology generations ahead of our competitors with leading-edge silicon products integrated into our adapters (HCAs), switching devices, cables, and more.  If you want the highest performance, lowest latency, best scaling fabric for all of your interconnect needs, consider converging on Mellanox InfiniBand.

Join me on Tuesday, March 14th at 10 a.m. for our webinar, One Scalable Fabric for All: Using InfiniBand as a Unified Cluster and Storage Fabric with IBM.
