Should we be worried when deploying an open source network operating system in production?
Microsoft is not worried. Facebook is not worried. Successful medium and large companies in Russia, France, and China are not worried.
They are all deploying SONiC and Linux Switch in production data centers. They are leading a powerful trend in IT, contributing to a large open source community, while controlling their own destiny. They are relying on the stability and feature richness of open source protocol stacks, benefiting from a flexible and customizable network, and reducing costs without compromising their performance requirements.
A typical virtualized network environment today uses a standard leaf/spine topology, and the configuration becomes very simple: it's BGP, MP-BGP for VXLAN EVPN, and *maybe* multihoming or MLAG – that's about all.
The Free Range Routing (FRR) protocol stack can do all of that today. It is one of the strongest network control-plane packages, largely thanks to the contributions of Cumulus Networks, since FRR is the routing engine in Cumulus Linux. Guess what? FRR is also in SONiC, and it can be used in the Mellanox Linux Switch as well.
Diagram 1 below shows a leaf/spine topology with different NOSs (network operating systems), all running the same FRR protocol stack. This design strives for openness: whether the platform is OpenStack, Kubernetes, or any other Linux-based environment, the network can be managed and monitored with the same tools used for the servers.
Diagram 1
Are there any benefits in this mix-and-match game? Yes! Obviously it can reduce capex, but it can also reduce opex: move to an open source NOS, and mix in a commercial Linux-based NOS for the elements that require features not yet implemented or certified. From the support perspective, there are now more options for complete support contracts that also cover the open source parts.
Suggested Setup:
The minimum setup to bring up an EVPN network for a PoC or demo would be two spine switches and two leaf switches.
The SN2100 is a 16x100GbE half-width 19” switch that fits this exercise of a small, low-cost PoC setup just great.
Note: One could use the SN2700 for the spine and the SN2410 (48x25GbE + 8x100GbE) as a leaf switch, but it doesn’t really matter: all Spectrum products are based on the same silicon and offer the same set of features and performance.
The two spines are running SONiC, the master version with some modifications.
Leafs:
The Cumulus leaf configuration is very straightforward. I am using NCLU (the Cumulus CLI), which is very intuitive and easy to use. Here is a partial snapshot from leaf1:
eBGP for underlay:
net add bgp autonomous-system 65551
net add bgp router-id 1.1.1.1
net add bgp bestpath as-path multipath-relax
net add bgp neighbor 172.16.1.1 remote-as 65500
net add bgp neighbor 172.16.1.5 remote-as 65500
Enabling EVPN:
net add bgp l2vpn evpn neighbor 172.16.1.1 activate
net add bgp l2vpn evpn neighbor 172.16.1.5 activate
VLAN-VNI mapping:
net add vxlan vni100 bridge access 100
net add vxlan vni100 vxlan local-tunnelip 1.1.1.1
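NCLU stages changes until they are committed; after entering the commands above, review and apply them:

```shell
net pending   # review the staged changes
net commit    # apply them atomically
```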
Note: Cumulus has a great feature called BGP unnumbered. With it, the next-hop address for each prefix is an IPv6 link-local address, which is assigned automatically to each interface. By using the IPv6 link-local address as the next hop instead of an IPv4 unicast address, BGP unnumbered saves you from having to configure IPv4 addresses on each interface.
Unfortunately, I was not able to enable that on SONiC: IPv6 link-local peering is not yet supported there (the kernel support exists), which is why I configured an IPv4 address on each switch.
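For reference, a BGP unnumbered session on Cumulus is configured per interface rather than per IP address (swp51 and swp52 are hypothetical uplink names, not taken from this setup):

```shell
# peer over the interface itself; the IPv6 link-local next hop is automatic
net add bgp neighbor swp51 interface remote-as external
net add bgp neighbor swp52 interface remote-as external
```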
Spines:
On the SONiC spines, the master version uses FRR as the protocol stack, but EVPN will not work without some tweaking.
First, change the default login password.
The most important change is to introduce the EVPN address family, which is not available in the release.
Look for a file called /etc/sonic/frr/frr.conf. It is generated from conf.db by a Jinja2 template called frr.conf.j2 that ships within the bgp container (bgp is the only enabled FRR daemon there). When the Docker container is reloaded, it loads the FRR config into vtysh, and that becomes the running config for the spine.
To make magic happen, move to the docker shell:
admin@l-csi-demo-2700s-24:~$ docker exec -it bgp bash
root@sonic-spine4:/# cd /usr/share/sonic/templates/
Update frr.conf.j2 by adding:
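A sketch of what that addition can look like: an l2vpn evpn block that activates every configured neighbor. The loop variable names below are assumptions and should be matched to the neighbor loop already present in the template:

```jinja
address-family l2vpn evpn
{# assumed loop variables -- reuse the template's existing neighbor iteration #}
{% for neighbor_addr, bgp_session in bgp_sessions.items() %}
 neighbor {{ neighbor_addr }} activate
{% endfor %}
exit-address-family
```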
Reload, and it’s ready to go! Now the SONiC spines have the EVPN address family enabled, and since they are spines there is no need for VTEP functionality.
Below is a partial snapshot from the spine1 SONiC/FRR running config:
router bgp 65500
bgp router-id 1.1.1.101
bgp log-neighbor-changes
no bgp default ipv4-unicast
bgp graceful-restart
bgp bestpath as-path multipath-relax
neighbor 172.16.1.2 remote-as 65551
neighbor 172.16.1.2 timers 3 10
neighbor 172.16.2.2 remote-as 65552
neighbor 172.16.2.2 timers 3 10
!
address-family ipv4 unicast
network 1.1.1.101/32
neighbor 172.16.1.2 activate
neighbor 172.16.1.2 allowas-in 1
neighbor 172.16.2.2 activate
neighbor 172.16.2.2 allowas-in 1
maximum-paths 64
exit-address-family
!
address-family l2vpn evpn
neighbor 172.16.1.2 activate
neighbor 172.16.2.2 activate
exit-address-family
!
Use vtysh for the other show commands to verify that everything is working well.
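For example, these FRR show commands display the EVPN session state and the EVPN routes that were exchanged:

```shell
vtysh -c "show bgp l2vpn evpn summary"   # EVPN peer sessions
vtysh -c "show bgp l2vpn evpn route"     # received type-2/type-3 routes
```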
By the way, you can also use vtysh on the Cumulus switches, but NCLU is much more user friendly to me. Here is the bridge MAC information:
Two MAC addresses were learned on the bridge: the local server’s and the remote one’s, reached over the tunnel IP.
Servers:
We recommend creating sub-interface 100 on each server’s interface and configuring an IP address on each. Both are in the same subnet (12.143.34.111 and 12.143.34.112), but obviously running on an overlay network on top of the BGP underlay.
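On a Linux server this takes a few iproute2 commands. The interface name ens1f0 below is an assumption for illustration; use the second address on the other server:

```shell
# VLAN 100 matches the vni100 mapping configured on the leafs
ip link add link ens1f0 name ens1f0.100 type vlan id 100
ip addr add 12.143.34.111/24 dev ens1f0.100
ip link set ens1f0.100 up
```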
Then, run a basic iperf test to see how much can be gained from this setup:
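Something like the following, with one side acting as the iperf server (the stream count and duration are just reasonable defaults for a quick check, not tuned values):

```shell
# on the first server
iperf -s
# on the second server
iperf -c 12.143.34.112 -P 4 -t 30
```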
93.8Gb/s! That’s almost 100GbE line rate on a VXLAN EVPN setup, with no special tuning.
Cumulus is Mellanox’s #1 network operating system partner today. SONiC is an operating system that Mellanox significantly contributes to, develops, and now also supports.
Mellanox offers different operating systems, from our own Onyx, to Cumulus, to SONiC and Linux Switch. Each of our customers can choose, and sometimes even mix and match, based on their requirements and objectives.
The performance is 100% Mellanox quality, OPEN Ethernet at full speed.
A special thanks to Ezra Dayan who helped me understand the mysteries of SONiC.
Supporting Resources: