Boosting Performance With RDMA – A Case Study

 
The following is a guest blog by E8 Storage.

E8 Storage recently published the Deploying E8 Storage with IBM Spectrum Scale white paper to demonstrate the simplest way to deploy E8 Storage as the underlying storage for an IBM Spectrum Scale (formerly known as GPFS) cluster. In that paper, we covered how to enable RDMA between NSD servers and clients to improve the performance of the GPFS cluster with E8 Storage. Now we would like to share with our customers what a tremendous improvement a few simple commands can deliver to the performance of your Spectrum Scale cluster.

E8 Storage leverages and greatly benefits from Remote Direct Memory Access (RDMA), running on InfiniBand or RDMA over Converged Ethernet (RoCE). RDMA is a key feature of our Mellanox ConnectX® adaptors and of the Mellanox switches we use in our test lab environment. RoCE is supported by virtually all data-center-grade Ethernet switches and by a wide variety of NICs, primarily from Mellanox. Having a high-speed Ethernet infrastructure already in place enables customers to extract additional value from their hardware and software investments by moving NSD server-client block communication away from traditional 1GbE networks and onto the fast, reliable and, most importantly, already paid-for RDMA infrastructure. That alone provides a significant performance boost in the form of reduced latency and increased throughput.
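As a quick sanity check (not a step from the white paper), you can confirm that RDMA-capable devices are visible on a host with the standard ibverbs utilities; the device name mlx5_0 below is only an example and will differ per adapter:

    # Illustrative check that RDMA-capable devices are present on a host.
    # Device names such as mlx5_0 are examples; yours may differ.
    ibv_devices            # list RDMA devices visible to the verbs stack
    ibv_devinfo -d mlx5_0  # show port state and link layer (Ethernet for RoCE)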

It doesn’t stop there. By enabling RDMA data transfer between NSD servers and clients through the VERBS API, customers can drive latency down and throughput up even further.
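For reference, the change itself is small. The sketch below shows the relevant Spectrum Scale commands, assuming a single ConnectX adapter whose first port carries the RDMA traffic; the white paper remains the authoritative source for the full procedure:

    # Minimal sketch: enable verbs RDMA for NSD server/client data transfer.
    # The verbsPorts value (mlx5_0/1) is an example; use your own adapter/port names.
    mmchconfig verbsRdma=enable
    mmchconfig verbsPorts="mlx5_0/1"

    # The settings take effect when the GPFS daemon is restarted
    # (shown cluster-wide here for simplicity).
    mmshutdown -a && mmstartup -a

    # Verify the configuration.
    mmlsconfig | grep -i verbs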

The steps to enable the VERBS API can be found in the Deploying E8 Storage with IBM Spectrum Scale white paper. To measure performance, we used a small 3-node cluster consisting of 2 NSD servers and 1 client, each connected at 50GbE to a Mellanox SN2700 network switch via Mellanox ConnectX-4 adaptors. The client node had no local access to E8 Storage volumes, so all I/O had to go through one of the NSD servers. The only I/O load on the cluster came from the I/O generator and performance measurement tool, FIO v3.5.

The testing methodology was simple: run random read jobs against the mounted GPFS file system /e8fs1. The workloads we used for the performance comparison were as follows (approximate FIO command lines are shown after the list):

  • 4K, 100% random read, 8 FIO threads, queue depth/thread of 14 for a total queue depth of 112.
  • 128K, 100% random read, 8 FIO threads, queue depth/thread of 7 for a total queue depth of 56.
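The exact job files used in the test are not reproduced here, but the two workloads above roughly correspond to FIO invocations along these lines (the ioengine, file size and runtime are assumptions, not values taken from the test):

    # Approximate 4K random-read workload: 8 threads, iodepth 14 each (112 total).
    fio --name=rr4k --directory=/e8fs1 --rw=randread --bs=4k \
        --numjobs=8 --thread --iodepth=14 --ioengine=libaio --direct=1 \
        --size=10g --runtime=300 --time_based --group_reporting

    # Approximate 128K random-read workload: 8 threads, iodepth 7 each (56 total).
    fio --name=rr128k --directory=/e8fs1 --rw=randread --bs=128k \
        --numjobs=8 --thread --iodepth=7 --ioengine=libaio --direct=1 \
        --size=10g --runtime=300 --time_based --group_reporting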

The results below speak for themselves. For client performance we’re talking about:

  • Over 5x improvement in small block latency and throughput
  • Over 2x improvement in large block latency and throughput

Note how the large block I/O is now able to nearly max out all available bandwidth of the 50GbE connection: 50 Gb/s works out to 6.25 GB/s of raw line rate, and after protocol overhead the practical maximum throughput is about 5.5 GB/s.

————

As you can see from the results, RDMA delivers a significant performance boost, not only for storage built for NVMe but also for general communication between hosts and servers. The foundation of this result is Mellanox’s Ethernet Storage Fabric, which dramatically increased the performance of GPFS using RDMA.

Mellanox’s ConnectX-4 NICs delivered high bandwidth and provided a robust implementation of RoCE (RDMA over Converged Ethernet), a key component of this NVMe-oF-based system. Mellanox’s SN2700 Ethernet switch delivered the non-blocking high performance and consistently low latency across all of its ports that were necessary to achieve these benchmark results.

E8 Storage customers are uniquely positioned to extract additional value from the fast, reliable and cost-effective solution they already have at their disposal by following a few simple steps to enable RDMA within their existing Ethernet or InfiniBand networks.

If you want to learn more, contact us and join us for a webinar next week.
