Mellanox and Zettar Crush World Record LOSF Performance Using ESnet OSCARS Test Circuit

Ethernet, InfiniBand, RDMA

In the wake of SC16, Mellanox has just broken the record for Lots of Small Files (LOSF) performance, measuring 70Gb/s over the ESnet OSCARS test circuit. Preliminary results show a ten-fold improvement (40+Gb/s) over the best results DOE researchers have reported for LOSF to date (4Gb/s)[i], even with TLS encryption enabled and despite the bandwidth cap and QoS limitations of the ESnet OSCARS test circuit.

At SC16, Mellanox and Zettar demonstrated real-time transfers round-trip from SLAC to Atlanta, Georgia, and back to SLAC over a 5,000-mile ESnet OSCARS loop. The two companies also exhibited real-time data transfers over two 100Gb/s LAN links, showing line-rate performance moving data memory-to-memory and file-to-file between clusters. The configuration leveraged Mellanox 100Gb/s InfiniBand connectivity on the storage backend as well as Mellanox 100Gb/s Ethernet connectivity on the front end. The motivation for these endeavors is that the next-generation Linac Coherent Light Source experiment (LCLS-II) at SLAC is anticipated to achieve an event rate 1,000 times that of today’s LCLS. The majority of the data analysis will be performed at the NERSC supercomputer center at Lawrence Berkeley National Laboratory, so a solution capable of supporting this distributed, data-intensive project is essential.
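The reason LOSF is treated as a distinct challenge is that each small file carries fixed per-file overhead (open/close, metadata round trips), so serial transfers leave the link idle most of the time; overlapping many per-file operations is what recovers throughput. The sketch below illustrates that idea only, using local file copies and a worker pool; the function names are illustrative and do not represent Zettar's software or any Mellanox API.

```python
# Illustrative-only sketch: amortizing per-file overhead in a lots-of-small-files
# (LOSF) workload by fanning transfers out over a worker pool. Local copies
# stand in for network transfers; all names here are hypothetical.
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def transfer_one(src: str, dst_dir: str) -> str:
    """Copy a single file; stands in for one small-file transfer with its
    fixed per-file cost (open, metadata, close)."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    return dst

def transfer_many(srcs, dst_dir, workers=8):
    """Overlap the per-file costs by running transfers on a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: transfer_one(s, dst_dir), srcs))

# Build a directory of many small files, then move them concurrently.
src_dir = tempfile.mkdtemp()
dst_dir = tempfile.mkdtemp()
srcs = []
for i in range(100):
    path = os.path.join(src_dir, f"file_{i:03d}.dat")
    with open(path, "wb") as f:
        f.write(os.urandom(512))  # many ~512-byte files, the LOSF pattern
    srcs.append(path)

copied = transfer_many(srcs, dst_dir)
print(len(copied))  # 100
```

Production LOSF tools take the same principle much further (pipelined metadata operations, RDMA data paths, multiple TCP/RDMA streams per node), but the core design choice is the same: keep many small operations in flight so fixed per-file latency does not serialize the link.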

Mellanox was delighted and honored to participate in this important technology demonstration, which leveraged a complete, state-of-the-art 100Gb/s InfiniBand and Ethernet connectivity solution. By demonstrating that the foundational interconnect requirements of the LCLS-II project can be met, we now have hard evidence that co-design and open standards are on the trajectory needed to drive future-generation requirements for both science and data.

“Providing a scale-out data transfer solution consisting of matching software and a transfer system design will be paramount for the amount of data generated by projects such as LCLS-II,” said Dr. Chin Fang, CEO and Founder, Zettar, Inc. “Harnessing the latest capabilities of RDMA, 100Gb/s InfiniBand and Ethernet with Zettar’s scale-out data transfer solution, we can achieve the performance needed to satisfy the demands of the future data centers for the most data-intensive research such as LCLS-II, even with the formidable challenges found in LOSF transfers.”

“The rates to transfer the data to NERSC are expected to reach several hundred Gb/s soon after the project turns on in 2020 and to exceed a terabyte per second by 2025,” said Dr. Les Cottrell, SLAC National Accelerator Laboratory. “This demonstration will bring to light the growing need we are experiencing for data transfer and High Performance Computing (HPC) for analysis.”

ESnet provides the high-bandwidth, reliable connections that link scientists at national laboratories, universities and other research institutions, enabling them to collaborate on some of the world’s most important scientific challenges including energy, climate science, and the origins of the universe. Funded by the DOE Office of Science, ESnet is managed and operated by the Scientific Networking Division at Lawrence Berkeley National Laboratory. As a nationwide infrastructure and DOE User Facility, ESnet provides scientists with access to unique DOE research facilities and extensive computing resources.

Zettar Inc. delivers scale-out data transfer software and architected a data transfer cluster design that proved the feasibility of using compact, energy-efficient, high-density servers for high-performance big-data transfers. The design leverages the industry-leading Mellanox 100Gb/s ConnectX-4 adapters, Switch-IB 2 switches, and the Mellanox SN2410 Spectrum-based 48-port 25GbE + 8-port 100GbE Open Ethernet Platform switch.

Supporting Resources:

[i] See Figures 6 and 7

About Scot Schultz

Scot Schultz is an HPC technology specialist with broad knowledge of operating systems, high-speed interconnects and processor technologies. Joining the Mellanox team in March 2013 as Director of HPC and Technical Computing, Schultz is a 25-year veteran of the computing industry. Prior to joining Mellanox, he spent 17 years at AMD in various engineering and leadership roles, most recently in strategic HPC technology ecosystem enablement. Scot was also instrumental in the growth and development of the OpenFabrics Alliance as co-chair of its board of directors. He currently maintains his role as Director of Educational Outreach for the HPC Advisory Council, of which he is a founding member, and participates in various other industry organizations. Follow him on Twitter: @ScotSchultz
