Why Oracle Deployed RoCE Network to serve AI & HPC workloads in Oracle Cloud Infrastructure (OCI)

 

As digital transformation continues, a growing amount of data is becoming available to help enterprises make more accurate business decisions. Organizations must now move beyond traditional thinking and develop “digital thinking” skills for their decision processes. Companies that lack these capabilities will be at a disadvantage against their competitors. DevOps decisions need to be based on analysis of the relevant data. Doing so helps remove traditional silos, improves communication between teams within the organization, and thus boosts corporate efficiency.

This trend has not been lost on enterprise decision makers, who have already started to build and use AI/ML solutions. These solutions include not only CPUs but also other high-performance components, such as GPUs and faster All Flash Array (AFA) storage that uses NVMe or Persistent Memory (PM), all connected by the high-performance networking gear required to handle the massive data communication within AI/ML clusters. Faster data communication results in faster analytics, and a company that is first to make the right decision gains a competitive edge. This is very similar to companies that run High Frequency Trading (HFT), where every nanosecond counts and clusters are connected by the fastest networking technology. In these environments the analytics cluster is located very close to the stock exchange data, even within the same building, to minimize the delay that optical cables introduce.
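To put the optical cable delay in perspective, here is a minimal Python sketch of one-way propagation delay through fiber. The refractive index of 1.47 is an assumed typical value for silica fiber, and the cable lengths are purely illustrative; neither comes from Oracle or from any HFT deployment. With that assumption, light covers a meter of fiber in roughly 5 ns.

```python
# Approximate one-way propagation delay through optical fiber.
# Assumption: refractive index of 1.47, a typical value for silica fiber.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47         # assumed, not taken from the article


def fiber_delay_ns(length_m: float,
                   refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """Return the one-way propagation delay in nanoseconds for a fiber run."""
    speed_in_fiber = SPEED_OF_LIGHT_M_PER_S / refractive_index
    return length_m / speed_in_fiber * 1e9


if __name__ == "__main__":
    # Illustrative distances: same rack row, same building, across a campus,
    # and across a metro area.
    for length_m in (10, 100, 1_000, 50_000):
        print(f"{length_m:>6} m of fiber: {fiber_delay_ns(length_m):>12,.1f} ns")
```

The sketch covers propagation only. Switch, NIC, and software latency come on top of it, which is why both physical proximity and a fast interconnect matter.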

Figure 1: Oracle Cloud Infrastructure presentation at OOW’18, showing RDMA cluster networking in Oracle’s cloud network

 

In addition to building AI/ML or HPC clusters, enterprises must also embrace a hybrid strategy to enable higher scalability and higher efficiency. Oracle recently released services that run over clusters connected by RoCE v2. RoCE v2 is an RDMA over Converged Ethernet networking technology that takes the CPU out of the data path between nodes and storage, which increases efficiency (for a brief description of RoCE advantages see: RoCE with Mellanox).
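As a rough illustration of what running RDMA over Ethernet costs on the wire, the Python sketch below adds up the per-packet headers of a basic RoCE v2 frame: the InfiniBand transport header (BTH) and its ICRC are carried inside UDP/IP, with UDP destination port 4791. It assumes IPv4 with no VLAN tag, ignores the Ethernet preamble and inter-frame gap, and leaves out the extended transport headers that some RDMA operations add. The payload sizes are illustrative and are not taken from Oracle's configuration.

```python
# Per-packet header overhead of a basic RoCE v2 frame.
# Assumptions: IPv4, no VLAN tag, no extended InfiniBand transport headers,
# Ethernet preamble and inter-frame gap not counted.

ROCE_V2_UDP_DST_PORT = 4791  # IANA-assigned UDP destination port for RoCE v2

HEADER_BYTES = {
    "Ethernet header": 14,
    "IPv4 header": 20,
    "UDP header": 8,
    "InfiniBand BTH": 12,  # Base Transport Header
    "ICRC": 4,             # invariant CRC from the InfiniBand transport
    "Ethernet FCS": 4,
}


def wire_efficiency(payload_bytes: int) -> float:
    """Return the fraction of frame bytes that carry RDMA payload."""
    overhead = sum(HEADER_BYTES.values())
    return payload_bytes / (payload_bytes + overhead)


if __name__ == "__main__":
    print(f"RoCE v2 UDP destination port: {ROCE_V2_UDP_DST_PORT}")
    print(f"Header overhead per packet: {sum(HEADER_BYTES.values())} bytes")
    # Illustrative RDMA payload sizes per packet.
    for payload in (256, 1024, 4096):
        print(f"{payload:>5}-byte payload: {wire_efficiency(payload):.1%} efficient")
```

Because the frame is ordinary UDP/IP, it can be routed across a layer 3 Ethernet fabric, which is what makes RoCE v2 practical at cloud scale.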

RDMA technology isn’t new to Oracle. It was used in Oracle’s Exadata, and in other solutions such as Big Data, which relied on RDMA-enabled InfiniBand networks to achieve record-setting performance. With RDMA’s performance already proven, the decision to use RoCE in their high-performance clusters was an easy one.

Figure 2: HPC workloads over Oracle Cloud Infrastructure (OCI), presented at OOW’18

 

Oracle’s benchmark results for high-performance computing (HPC) workloads running over their RDMA-enabled cluster network compare OCI to other cloud solutions and demonstrate the competitive advantages that RoCE enables.

The use of AI/ML-based applications to develop “digital thinking” skills is just getting started, and much more is expected to be developed soon. Out of this we’ll see new, real-time user interface solutions based on voice and video. Like today’s AI/ML workloads, these too will require faster response times, which RDMA-enabled networks are well positioned to deliver.

We’re excited to work with Oracle to further develop these technologies and look forward to collaborating on their current Clustered Network offering.

 

About Motti Beck

Motti Beck is Sr. Director, Enterprise Market Development at Mellanox Technologies Inc. Before joining Mellanox, Motti was a founder of BindKey Technologies, an EDA startup that provided deep submicron semiconductor verification solutions and was acquired by DuPont Photomasks, and of Butterfly Communications, a pioneering startup provider of Bluetooth solutions that was acquired by Texas Instruments. Prior to that, he was a Business Unit Director at National Semiconductor. Motti holds a B.Sc. in computer engineering from the Technion – Israel Institute of Technology. Follow Motti on Twitter: @MottiBeck
