Open Ethernet at Full Speed with Leaf-Spine Architecture: Cumulus and SONiC VXLAN EVPN Interoperate over Mellanox

 
How top companies are mixing and matching SONiC and Linux switches in production data centers.

Should we be worried when deploying an open source network operating system in production?

Microsoft is not worried. Facebook is not worried. Successful medium and large companies in Russia, France, and China are not worried.

They are all deploying SONiC and Linux Switch in production data centers. They are leading a powerful trend in IT, contributing to a large open source community, while controlling their own destiny. They are relying on the stability and feature richness of open source protocol stacks, benefiting from a flexible and customizable network, and reducing costs without compromising their performance requirements.

A typical virtualized network environment today uses a standard leaf/spine topology, and the configuration becomes very simple: it’s BGP, MP-BGP for VXLAN EVPN, and *maybe* multihoming or MLAG. That’s about all.

The FRRouting (FRR, originally Free Range Routing) protocol stack can do all of that today. It is one of the strongest packages for the network control plane, largely thanks to the contributions of Cumulus Networks, whose Cumulus Linux uses this protocol stack as its routing engine. Guess what? FRR is also in SONiC, and it can also be used in the Mellanox Linux Switch.

Diagram 1 below shows a leaf/spine topology with different NOSs (network operating systems), all running the same FRR protocol stack. This design strives for openness: whether the platform is OpenStack, Kubernetes, or anything else Linux based, the network can be managed and monitored with the same tools used for the servers.

Diagram 1

Are there any benefits to this mix-and-match game? Yes! Obviously it can reduce capex, but it can also reduce opex: move to an open source NOS where it suffices, and mix in a commercial Linux-based NOS for the elements that require features not yet implemented or certified. From the support perspective, there are now more options for complete support contracts that also cover the open source parts.

Suggested Setup:

The minimum setup to bring up an EVPN network for a PoC or demo is two spine switches and two leaf switches.

  • Spines: SN2700 – Mellanox Spectrum-based product, 32 ports of 100GbE each
  • Leaves: SN2100 – Mellanox Spectrum-based product, 16 ports of 100GbE each

The SN2100 is a 16x100GbE, half-width 19" switch, which makes it a great fit for a small, low-cost PoC setup.

Note: One could use the SN2700 for the spine and the SN2410 (48x25GbE + 8x100GbE) as a leaf switch, but it doesn’t really matter: all the Spectrum products are based on the same silicon and offer the same set of features and performance.

  • Servers: two PCIe Gen3 x16 servers, each with 12 cores (Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz) and a ConnectX-5 100GbE network card
  • Protocols: eBGP for the underlay, and VXLAN EVPN
  • Leaves: running Cumulus
  • Spines: both running SONiC, master version with some modifications

Leaves:

The Cumulus leaf configuration is very straightforward. I am using NCLU (the Cumulus CLI), which is very intuitive and easy to use. Here is a partial snapshot from leaf1:

eBGP for underlay:

    net add bgp autonomous-system 65551
    net add bgp router-id 1.1.1.1
    net add bgp bestpath as-path multipath-relax
    net add bgp neighbor 172.16.1.1 remote-as 65500
    net add bgp neighbor 172.16.1.5 remote-as 65500

Enabling EVPN:

    net add bgp l2vpn evpn neighbor 172.16.1.1 activate
    net add bgp l2vpn evpn neighbor 172.16.1.5 activate

VLAN-VNI mapping:

    net add vxlan vni100 bridge access 100
    net add vxlan vni100 vxlan local-tunnelip 1.1.1.1
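Since the leaf commands follow a fixed pattern, they are easy to generate per leaf. Here is a minimal Python sketch of that idea. The function name and its parameters are made up for illustration, and it assumes, as on leaf1, that the VXLAN local tunnel IP is the same address as the BGP router ID. It only reproduces the partial snapshot above, not a complete leaf configuration.

```python
def leaf_nclu_commands(asn, router_id, spine_peers, vlan, vni_name):
    """Return the NCLU commands for one leaf, in the order shown above.

    spine_peers maps each spine-facing neighbor IP to that spine's ASN.
    Assumption: the VXLAN local tunnel IP equals the BGP router ID.
    """
    cmds = [
        f"net add bgp autonomous-system {asn}",
        f"net add bgp router-id {router_id}",
        "net add bgp bestpath as-path multipath-relax",
    ]
    # eBGP underlay sessions towards the spines
    for peer, peer_asn in spine_peers.items():
        cmds.append(f"net add bgp neighbor {peer} remote-as {peer_asn}")
    # Activate the same sessions for the EVPN address family
    for peer in spine_peers:
        cmds.append(f"net add bgp l2vpn evpn neighbor {peer} activate")
    # VLAN-VNI mapping and tunnel source
    cmds.append(f"net add vxlan {vni_name} bridge access {vlan}")
    cmds.append(f"net add vxlan {vni_name} vxlan local-tunnelip {router_id}")
    return cmds


leaf1 = leaf_nclu_commands(
    asn=65551,
    router_id="1.1.1.1",
    spine_peers={"172.16.1.1": 65500, "172.16.1.5": 65500},
    vlan=100,
    vni_name="vni100",
)
print("\n".join(leaf1))
```

For a second leaf only the arguments change; its addresses are not listed in this post, so they are left out here.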

Note: Cumulus has a great feature called BGP unnumbered. With it, the next hop for each prefix is an IPv6 link-local address, which is assigned automatically to each interface. Because the next hop is an IPv6 link-local address instead of an IPv4 unicast address, BGP unnumbered saves you from having to configure IPv4 addresses on each interface.

Unfortunately, I was not able to enable that on SONiC. IPv6 link-local is not supported there for now (although the kernel supports it), and that is why I configured an IPv4 address on each switch.
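To make "assigned automatically" a bit more concrete, here is a small Python sketch of the classic modified EUI-64 derivation of a link-local address from an interface MAC. The MAC below is made up, and this is only one way such addresses get generated: a given Linux system may be configured to use a different address generation mode, so treat this as an illustration rather than a description of what any particular switch does.

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Derive the fe80::/64 address that modified EUI-64 yields for a MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 48-bit MAC address, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get 64 bits
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80::/64 prefix (fe80 followed by 48 zero bits) + interface ID
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


example = eui64_link_local("00:11:22:33:44:55")  # made-up MAC
print(example)
```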

Spines:

On the SONiC spines, the master version uses FRR as the protocol stack, but EVPN will not work without some tweaking.

First, change the default login password.

The most important change is to introduce the EVPN address family, which is not available in the release.

Look for a file called /etc/sonic/frr/frr.conf. It is generated from conf.db by a Jinja2 template called frr.conf.j2 that ships inside the bgp container (bgp is the only enabled daemon in FRR). When the Docker container is reloaded, it loads the FRR config into the FRR vtysh, and that becomes the running config for the spine.

To make magic happen, move to the docker shell:

    admin@l-csi-demo-2700s-24:~$ docker exec -it bgp bash
    root@sonic-spine4:/# cd /usr/share/sonic/templates/

Update frr.conf.j2 by adding the l2vpn evpn address family and activating each BGP neighbor under it.

Reload, and it’s ready to go! The SONiC spines now have the EVPN address family enabled, and since they are spines there is no need for VTEP functionality.

Below is a partial snapshot of the Spine1 SONiC/FRR running config:
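The net effect of the template change is that every BGP neighbor gets activated under an l2vpn evpn address family. The Python sketch below shows that rendering logic with plain string formatting instead of Jinja2. The function name is hypothetical, and the real template iterates over SONiC's own neighbor data, which is not reproduced here; the two addresses are spine1's leaf-facing neighbors in this setup.

```python
def render_evpn_address_family(neighbors):
    """Return the FRR lines that activate each neighbor for l2vpn evpn."""
    lines = [" address-family l2vpn evpn"]
    lines += [f"  neighbor {addr} activate" for addr in neighbors]
    lines.append(" exit-address-family")
    return "\n".join(lines)


stanza = render_evpn_address_family(["172.16.1.2", "172.16.2.2"])
print(stanza)
```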

    router bgp 65500
     bgp router-id 1.1.1.101
     bgp log-neighbor-changes
     no bgp default ipv4-unicast
     bgp graceful-restart
     bgp bestpath as-path multipath-relax
     neighbor 172.16.1.2 remote-as 65551
     neighbor 172.16.1.2 timers 3 10
     neighbor 172.16.2.2 remote-as 65552
     neighbor 172.16.2.2 timers 3 10
     !
     address-family ipv4 unicast
      network 1.1.1.101/32
      neighbor 172.16.1.2 activate
      neighbor 172.16.1.2 allowas-in 1
      neighbor 172.16.2.2 activate
      neighbor 172.16.2.2 allowas-in 1
      maximum-paths 64
     exit-address-family
     !
     address-family l2vpn evpn
      neighbor 172.16.1.2 activate
      neighbor 172.16.2.2 activate
     exit-address-family
    !
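The neighbor addresses on both sides pair up as point-to-point links: leaf1 peers with a spine at 172.16.1.1, and spine1 peers with leaf1 at 172.16.1.2. The prefix length is not shown in the snapshots, so the Python sketch below assumes a /30 per link, which is consistent with those addresses but remains an assumption. The helper name is made up for illustration.

```python
import ipaddress


def p2p_peer(address: str, prefix_len: int = 30) -> str:
    """Return the other usable host on the point-to-point subnet of address."""
    iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    others = [host for host in iface.network.hosts() if host != iface.ip]
    if len(others) != 1:
        raise ValueError(f"{iface.network} is not a two-host subnet")
    return str(others[0])


# Spine-side address as seen from leaf1, and the matching leaf-side address
print(p2p_peer("172.16.1.1"))
# Leaf2-side address as seen from spine1, and the matching spine-side address
print(p2p_peer("172.16.2.2"))
```

A check like this is handy before pushing configs, because a neighbor statement that points outside the local link subnet is a common reason for an eBGP session that never comes up.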

Use vtysh for other show commands to see that all is working well.

By the way, you can also use vtysh on the Cumulus switches, but for me NCLU is much more user friendly. Here is the bridge MAC information:

Two MAC addresses were learned on the bridge: the local server, and the remote one over the tunnel IP.

Servers:

We recommend creating a sub-interface 100 on each server's interface and configuring the IP address on it. Both addresses, 12.143.34.112 and 12.143.34.111, are in the same subnet, but obviously they are running on an overlay network on top of the BGP underlay.

Then, run a basic iperf test to see how much can be gained from this setup:

93.8 Gb/s! That’s almost 100GbE line rate on a VXLAN EVPN setup, with no special tuning.
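For context on why TCP throughput can never quite reach 100 Gb/s, here is a back-of-the-envelope Python sketch of per-packet overhead. Everything in it is an assumption for illustration: the MTU, the use of TCP timestamps, and the VLAN tagging are not stated for this test, and the function name is made up. The 50 bytes of VXLAN encapsulation are the outer Ethernet, IPv4, UDP, and VXLAN headers.

```python
# Per-packet byte accounting; all values are standard header sizes.
ETH_HEADER = 14      # destination MAC + source MAC + EtherType
VLAN_TAG = 4         # 802.1Q tag
FCS = 4              # Ethernet frame check sequence
PREAMBLE_IFG = 20    # 8-byte preamble/SFD + 12-byte inter-frame gap
IPV4_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMPS = 12  # timestamp option including padding
VXLAN_ENCAP = 50     # outer Ethernet 14 + outer IPv4 20 + UDP 8 + VXLAN 8


def goodput_gbps(link_gbps, mtu, vlan_tagged, vxlan, timestamps=True):
    """Upper bound on TCP payload rate for full-size packets on one link."""
    payload = mtu - IPV4_HEADER - TCP_HEADER
    if timestamps:
        payload -= TCP_TIMESTAMPS
    wire = mtu + ETH_HEADER + FCS + PREAMBLE_IFG
    if vlan_tagged:
        wire += VLAN_TAG
    if vxlan:
        wire += VXLAN_ENCAP
    return link_gbps * payload / wire


# Assumed MTU of 1500 on a VLAN-tagged server link (no VXLAN on that hop)
access = goodput_gbps(100, 1500, vlan_tagged=True, vxlan=False)
# Same packets on a leaf-spine link, VXLAN encapsulated
fabric = goodput_gbps(100, 1500, vlan_tagged=False, vxlan=True)
print(f"access link: {access:.1f} Gb/s, fabric link: {fabric:.1f} Gb/s")
```

With these assumed values the bounds come out at roughly 94 Gb/s on the server link and roughly 91 Gb/s on a single VXLAN-encapsulated fabric link. They are theoretical ceilings under the stated assumptions, not an explanation of the measured figure.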

Cumulus is Mellanox’s #1 partner for the network operating system today. SONiC is an operating system that Mellanox significantly contributes to, develops, and now also supports.

Mellanox offers different operating systems, from our own Onyx, to Cumulus, to SONiC and Linux Switch. Each of our customers can choose, and sometimes even mix and match, based on their requirements and objectives.

The performance is 100% Mellanox quality, OPEN Ethernet at full speed.

A special thanks to Ezra Dayan who helped me understand the mysteries of SONiC.


About Avi Alkobi

Avi Alkobi is Ethernet Technical Marketing manager for EMEA in Mellanox Technologies. For the past 8 years he has worked at Mellanox in various roles focusing on the Ethernet switch product line, first as a SW developer, then as a team leader of the infrastructure team responsible for the first Ethernet switch management infrastructure. More recently, he has worked as a senior application engineer supporting the field on post sales, pre sales and complex proof of concepts for E2E Ethernet and InfiniBand solutions. Before coming to Mellanox, he worked for five years at Comverse Technology and prior to this, in the Israeli security industry as a software developer. He holds a Bachelor of Science degree in Computer Science and M.B.A from the Bar-Ilan University in Israel.
