It’s not every day that you wake up with the thought that you are taking part in changing the world. Well, it happened to me the other day, and I’ve decided to blog about it.
I’ll start with a 5-year-old memory: “OCP Engineering Workshop – UTSA, San Antonio – Now Open! Thursday, October 24, 2013”. Mellanox had just launched its Open Ethernet campaign to promote whitebox switching. The motivation behind the Open Ethernet approach was to move from the arena of closed and expensive OEM Ethernet switches to an open ecosystem of switch platforms, switch ASICs and switch network OSs. This would allow customers to consume network components according to their needs.
In its early days in 2013, the Open Compute Project (OCP) networking group focused on disaggregation – separating switch system software from hardware. The first step here was the ONIE boot loader that enabled loading the switch network operating system software (aka NOS) from a network drive. While trying to promote the Open Ethernet agenda, I booked a workshop slot to present my idea on the next level of disaggregation – that is, opening the NOS itself. The first obstacle to having an open Network OS was the lack of open interfaces to configure the switch ASIC.
Each vendor has its own specific SDK APIs, which are traditionally proprietary and closed and thus cannot be part of an open-source NOS. Another problem has to do with compensating for the differences between the various vendors’ data plane implementations, as reflected by their APIs. A truly open NOS should be hardware-agnostic to enable the said ecosystem.
To address these problems, Mellanox proposed a common interface to both serve as the OCP standard and be adopted by all the hardware vendors in the “room”. We (surprisingly 😉) named the interface, “Open Ethernet Switch APIs” or OES.
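To make the idea of a common interface concrete, here is a minimal sketch in Python. It is an illustration only and not the OES or SAI API: the class names, the `create_route` method and the vendor SDK call strings are all hypothetical. The point it shows is that the NOS code talks to a single vendor-neutral interface, while each ASIC vendor hides its own SDK behind it.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SwitchAsic(ABC):
    """Hypothetical vendor-neutral interface (illustrative, not the real SAI API)."""

    @abstractmethod
    def create_route(self, prefix: str, next_hop: str) -> str:
        """Program one route; return a description of the vendor call made."""


class VendorAAsic(SwitchAsic):
    """Hypothetical vendor A backend wrapping its own (made-up) SDK call."""

    def create_route(self, prefix: str, next_hop: str) -> str:
        return f"vendor_a_sdk.route_add({prefix}, {next_hop})"


class VendorBAsic(SwitchAsic):
    """Hypothetical vendor B backend with a differently shaped (made-up) SDK call."""

    def create_route(self, prefix: str, next_hop: str) -> str:
        return f"vendor_b_sdk.set_l3_entry(dst={prefix}, nh={next_hop})"


def program_routes(asic: SwitchAsic, routes: Dict[str, str]) -> List[str]:
    """NOS-side code: it depends only on the common interface, never on a vendor SDK."""
    return [asic.create_route(prefix, nh) for prefix, nh in routes.items()]


if __name__ == "__main__":
    routes = {"10.0.0.0/24": "192.168.1.1"}
    for asic in (VendorAAsic(), VendorBAsic()):
        print(program_routes(asic, routes))
```

The same `program_routes` function runs unchanged against either backend, which is the hardware-agnostic property the open NOS needs.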
After the presentation, I had some side discussions with multiple members of the OCP networking group. The common opinion was that certain business interests would prevent OES from reaching the common ground needed between the ASIC vendors. We were blocked.
Something changed in 2014. Microsoft stepped in and began driving the specification for common APIs. This is when the Switch Abstraction Interface (SAI) was born. In July 2015, SAI was officially accepted by the Open Compute Project (OCP). Today, all the major switch ASIC vendors support SAI, and some have even chosen to adopt SAI as their sole interface. But SAI is only the first important step towards the end goal: a truly open NOS. At the OCP summit in 2016, Microsoft officially presented Software for Open Networking in the Cloud (SONiC) as the software to power Microsoft Azure.
Needless to say, Mellanox has been extremely supportive of SAI and SONiC from day one, as both resonate well with Open Ethernet. Since then, as members of the SAI and the SONiC communities, we’ve been investing resources and verifying that our Spectrum-based switches run SONiC to production level.
OK, now that we’ve arrived at the end of our history class, why did I wake up with this elevating thought? Because SONiC is becoming a big success. Open Ethernet is alive and kicking!
SONiC has crossed the Pacific Ocean and landed safely in data center cluster switches. I recently visited our major T1 and T2 cloud and web customers in China, where most of our discussions focused on SONiC architecture, production roadmaps and feature co-development.
The question is why now, why SONiC? The answer lies in its maturity and ecosystem. SONiC is a NOS coming from a hyperscaler who is putting it in large-scale production. This means SONiC is mature enough. And SONiC relies on SAI, which means it is truly hardware-agnostic. So, it’s open, it’s mature and there is no vendor lock-in. You see? This is Open Ethernet!
Tier 1 companies have an obvious motivation to move to a Linux-based open NOS where they don’t need to invest to get BGP to work, yet they can run a container on the switch itself for their own differentiated, home-grown applications such as telemetry streaming, load balancing, unique tunneling, etc.
Tier 2 companies are interested in the whitebox / disaggregation CAPEX savings of an open NOS while enjoying the bigger players’ experience and contribution. The more conservative Telco and enterprise players will be slower to adopt, but keep in mind what happened to VxWorks: it all moved to Linux…
Some of these new directions were presented at the last OCP summit by David Maltz, Distinguished Engineer, Microsoft. The session touted SONiC as a containerized, open-source switch OS created for cloud-scale networking, with an ever-growing ecosystem as well as new capabilities and new scenarios that SONiC can support.
In the session, David showed a SONiC TOR switch running a container with an implementation of a load balancer. Such functionality traditionally runs at an ISP edge router or on a dedicated server. The key point behind the demo is the openness of the NOS, which provides the ability to break the traditional barrier between the network and the application camps. David explained the new world of opportunities SONiC brings and highlighted the fact that the story isn’t about an open NOS, but about bringing innovation to the network and providing the solutions at the right place.
As always, there are some plot holes in every good story; I’ll try to list some of them:
“SONiC is an open system born to enable fast evolution of switch hardware and networking software. The continuous emergence of new scenarios and increasing stringent requirement for performance in cloud computing are the driving force of SONiC innovation. Started with basic L2/L3 routing functionality in 2016, SONiC has made leaps in performance, monitoring, reliability and virtualization area, leading innovations in RDMA, streaming telemetry and hitless upgrade. The supported platform has grown from a few ASIC types and platforms to all major ASIC chips and 33 different platforms, with richer classes of CPU type and chassis type. In the coming few years, we will see SONiC extending to many more roles, e.g. management network, WAN, etc., with continuous advancing in the area of telemetry, reliability, security, usability and manageability,” says Xin Liu, principal product manager and SONiC project manager at Microsoft Corp.
The mission of ODCC is to build an active, efficient and internationally competitive data center ecosystem and open platform; create the unified and international data center specification and standard; and promote industrial cooperation, innovation and applications for the data center.
The Network is one of the important projects in the ODCC. Baidu, Tencent, Alibaba, China Telecom, China Mobile, CAICT and other data center enterprises have joined forces to form the Open Data Center Committee (ODCC). ODCC embraced SONiC with the China Open Ethernet network project and is putting resources into testing its SONiC branch named Phoenix. Mellanox has been a part of ODCC for years and is allocating even more resources to enhance and improve the Phoenix test-bed.
“ODCC is excited from the world of opportunities that accompany SONiC” Dr. Jie Li, Vice President of ODCC, says, “The Chinese market embraces the Open Ethernet initiative. We strive for an open Network Operating System with unified management and monitoring interfaces, deployed in large-size Data Centers within and outside of China. Phoenix, the ODCC’s version of SONiC, provides the stability that is gained from the extensive testing and mass-production deployment in Chinese Data Centers. We would like to thank Mellanox for its contribution to ODCC and SONiC.”
The momentum around SONiC is obvious. Of course, there are other open-source initiatives, but none has matured enough to arrive at the stage where SONiC is today. This is a huge snowball effect that is not going to stop. We ARE changing the world!
With this spirit, I’m looking forward to the SONiC community workshop, Friday, October 19 @ Beijing.