All posts by Ash Bhalgat

About Ash Bhalgat

Ash is Senior Director of Cloud Marketing at Mellanox. He leads market strategy, product/solutions marketing and ecosystem engagements for the cloud service provider and telco cloud markets. Ash is an accomplished high-tech industry leader focused on the intersection of technology, products/solutions and markets. He has deep experience in building and marketing high-impact, fast-growing products and solutions at large companies, mid-size companies and startups, including Cisco, Mellanox, Polycom, Luxoft and Hapoose. His professional experience spans diverse technology domains including Cloud Services, Software Defined Networking, Server and Network Virtualization, Routing, Switching, Wi-Fi, Unified Communication and Collaboration, Network Management, Mobile Apps and End User Devices. Ash graduated summa cum laude with a B.E. in Electrical Engineering from the University of Pune (India), and holds an M.S. in Computer Engineering from the University of Cincinnati and an M.B.A. from Santa Clara University. Connect with Ash on LinkedIn or Twitter.

“AMD EPYC™ 7002 Series Processors and Mellanox ConnectX Accelerate 5G Wireless Performance”

OVS over ASAP²… OMG! AMD EPYC™ 7002 Series Processors and Mellanox SmartNICs Deliver Epic 5G Performance

Over the last five years, compute and storage technology has achieved substantial performance increases, while at the same time being hampered by the bandwidth limitations of PCI Express Gen3 (PCIe Gen3). AMD is the first x86 processor company to release support for the PCIe fourth-generation bus (PCIe Gen4) with the AMD EPYC™ 7002 Series Processor. This is the second-generation AMD EPYC™ processor, but the first x86 data center processor with PCIe Gen4 support, delivering substantial system performance improvements by doubling the bandwidth available to storage, networking, and other peripherals compared to CPUs that only support PCIe Gen3. AMD EPYC™ 7002 Series Processors also offer more PCIe lanes and support for more DRAM capacity, allowing the AMD EPYC™ 7002 Series Processor to provide the industry’s highest PCIe bandwidth and memory capacity.
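As a quick sanity check on that "doubling" claim (this back-of-the-envelope calculation is ours, not part of the original announcement), the theoretical bandwidth of a PCIe x16 slot can be computed from the per-lane transfer rate and the 128b/130b encoding used by both generations:

```python
# Back-of-the-envelope PCIe bandwidth for an x16 slot (theoretical, one direction,
# before protocol overhead). Gen3 runs at 8 GT/s and Gen4 at 16 GT/s per lane,
# both with 128b/130b encoding.
def pcie_bandwidth_gbytes(transfer_rate_gt_s: float, lanes: int = 16,
                          encoding_efficiency: float = 128 / 130) -> float:
    """Usable bandwidth in GB/s for one direction of a PCIe link."""
    bits_per_second = transfer_rate_gt_s * 1e9 * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9

gen3 = pcie_bandwidth_gbytes(8.0)    # ~15.8 GB/s per x16 slot
gen4 = pcie_bandwidth_gbytes(16.0)   # ~31.5 GB/s per x16 slot
print(f"PCIe Gen3 x16: ~{gen3:.1f} GB/s, PCIe Gen4 x16: ~{gen4:.1f} GB/s ({gen4 / gen3:.1f}x)")
```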

The new AMD EPYC™ 7002 Series Processor

The new AMD EPYC™ 7002 Series Processor delivers advanced processing capabilities, unleashing giant performance gains for a wide variety of workloads and addressing new data center challenges. The new AMD EPYC™ 7002 Series Processor offers up to 64 multithreaded cores per chip, for a total of 128 processing cores in a dual-processor server. A single socket delivers dual-socket performance and I/O without the dual-socket price tag. AMD is also the first to bring to market an x86 data center processor based on 7nm process technology. With double the core density and optimizations that improve instructions per cycle, the result is 4x the floating-point performance of 1st Gen AMD EPYC™. The 7nm process technology also brings energy efficiency, so the 2nd Gen AMD EPYC™ can provide the same performance at half the power consumption[1]. That is amazing!

 

Alongside its high core count, there is an extra pair of memory channels, allowing the AMD EPYC™ 7002 Series Processors to take advantage of up to 4TB of RAM in a single-socket server and 8TB in a dual-socket server with 256GB DIMMs. For companies looking to host multi-tenant workloads, the option of adding more DRAM means more tenants can be added per server, which translates to a substantial increase in revenue streams.

Mellanox ConnectX Adapters

Mellanox ConnectX offers 200Gb/s InfiniBand (HDR) and Ethernet connectivity, with sub-600 nanosecond latency and up to 200 million messages per second. Mellanox ConnectX SmartNICs and BlueField I/O Processing Units (IPUs) are the world’s first PCIe Gen4 smart adapters. The ConnectX smart adapter solutions are optimized to provide breakthrough performance and scalability with the new AMD EPYC™ 7002 Series Processor for the most demanding compute and storage infrastructures. By using more of the faster PCI Express 4.0 lanes, Mellanox ConnectX 100 and 200 gigabit per second adapters can achieve full I/O throughput with direct connectivity to 24 NVMe storage drives in a single system. The combination of Mellanox adapters with PCIe Gen4 support and the 2nd Gen AMD EPYC™ processor is ideal for advanced server and storage solutions, providing high-performance computing, artificial intelligence, cloud and enterprise data centers the high data bandwidth they need for the most compute- and storage-demanding applications. By leveraging the PCIe Gen4 support in both 2nd Gen AMD EPYC™ processors and ConnectX adapters, mutual customers can maximize their data center return on investment.

Kernel Bypass Technology

Network and storage processing are very CPU-intensive operations; the CPU not only has to handle these data movement and processing tasks, it must also run the application workloads themselves. Mellanox ConnectX adapters utilize offloads and accelerators such as Accelerated Switching and Packet Processing (ASAP²), Remote Direct Memory Access (RDMA), and overlay network encap/decap (e.g. for VXLAN) to relieve the CPU from I/O tasks and enable the industry’s lowest network latency. This allows more efficient data movement for the network, storage devices and application workloads, resulting in lower application latency and leaving more CPU cycles available to accelerate applications and processes.
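To make the overlay encap/decap offload concrete, the short sketch below uses the scapy library to build a VXLAN-encapsulated frame in software; the MAC/IP addresses and the VNI are made-up illustration values. In an ASAP²/ConnectX deployment this header wrapping and unwrapping is done by the NIC hardware instead of by host software like this:

```python
# Illustrative only: software VXLAN encapsulation with scapy (pip install scapy).
# The outer Ethernet/IP/UDP/VXLAN headers built here are exactly what an overlay
# offload adds and strips in hardware; addresses and the VNI are made up.
from scapy.layers.l2 import Ether
from scapy.layers.inet import IP, UDP
from scapy.layers.vxlan import VXLAN

inner = (Ether(src="52:54:00:aa:bb:01", dst="52:54:00:aa:bb:02") /
         IP(src="192.168.10.1", dst="192.168.10.2") /
         UDP(sport=12345, dport=80) / b"tenant payload")

outer = (Ether(src="b8:59:9f:00:00:01", dst="b8:59:9f:00:00:02") /
         IP(src="10.0.0.1", dst="10.0.0.2") /
         UDP(sport=49152, dport=4789) /   # 4789 is the IANA-assigned VXLAN port
         VXLAN(vni=5001))                 # hypothetical tenant network identifier

packet = outer / inner
packet.show()  # dump the fully encapsulated frame, layer by layer
print(f"Encapsulation overhead: {len(packet) - len(inner)} bytes per packet")
```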

Impact on Compute and Storage

The improved PCIe Gen4 bandwidth and added PCIe lane count directly help tackle the growing need for more compute processing and storage bandwidth. Most of that bandwidth need falls on the PCIe bus, as the path to local and networked storage and to the network links connecting other servers. The added memory is a bonus for storage solutions where a large memory cache is needed, and up to 4TB of memory on a single socket provides plenty of headroom for future workloads.

Where will we see AMD EPYC™ 7002 Series Processors fitting in initially? There are many use cases but to name a few, first might be single socket Windows Storage Spaces Direct (S2D) solutions. These are typically 1U and 2U platforms that support a multi-node, hyperconverged infrastructure (HCI) deployment. Building them with the 2nd Gen AMD EPYC™ processors will allow more dedicated NVMe PCIe lanes without the need for a PCIe NVMe switch. That means more NVMe SSDs with higher storage throughput and IOPS available for workloads running on these platforms.

In a Hyper-Converged solution, one could configure the system for higher clock speed rather than core depth, since most common virtual machine workloads use 2-4 virtual CPUs each. With 16 cores and 1TB of RAM, the AMD EPYC™ 7002 Series Processor provides a solution that bumps up core density per socket without adding the cost of a dual-socket setup.

Again leading the charge to adopt new technology, the cloud computing market is already taking advantage of the massive compute capacity of AMD EPYC™ 7002 Series Processors. Microsoft Azure is already offering its customers industry-leading compute performance for all their workloads. After being the first global cloud provider to announce the deployment of AMD EPYC™ 7001 Series Processor based Azure Virtual Machines in 2017, Microsoft has been working together with AMD and Mellanox to continue bringing the latest computing innovation to enterprises of all sizes and shapes. Azure Virtual Machines using the new AMD EPYC™ 7002 processor and Mellanox SmartNICs give customers more choice to meet a broad range of requirements for general purpose workloads.

Impact on 5G, NFV and Edge Cloud

For telecommunication carriers and multi-service operators looking to deploy virtualized telco cloud infrastructure to support 3GPP 5G CUPS, Network Functions Virtualization (NFV) and Multi-Access Edge Computing (MEC) workloads, pairing the highest-capacity, most economical compute with the fastest, most efficient network means the highest performance at the lowest cost for service provider applications. Given the CapEx and OpEx reduction pressure on the service provider industry, the combination of AMD EPYC™ 7002 Series Processors and Mellanox SmartNICs quickly translates to the highest return on investment and the fastest time to ARPU (average revenue per user).

When the Rubber Meets the Road

We decided to put an AMD EPYC™ 7002 Series Processor based server with Mellanox ConnectX-5 PCIe Gen4 SmartNICs to the test in both virtualized and bare metal OpenStack cloud environments. The OMG performance results of our telco benchmark testing are summarized below.

AMD EPYC™ 7002 Series Processor with ConnectX NICs Delivers 197 Million Packets per Second and Near Line Rate on Bare Metal Servers

Bare Metal Telco Cloud Testing: AMD EPYC™ 7002 Series Processor-Based Server with ConnectX-5 PCIe Gen4 100G Adapters

In bare metal server testing, we saw over 197 Million Packets Per Second (Mpps) at 64-byte frames, and over 93Gbps, or just over 97% of line rate, when running 1518-byte frames, utilizing dual ports of a ConnectX-5 with PCIe Gen4 connectivity to an AMD EPYC™ 7002 Series Processor with 16 cores. There was still ample room left for application processing, with three fourths of the cores unused and available. Theoretically, with just a single-socket, 64-core AMD EPYC™ 7002 Series Processor system that supports 4 PCIe Gen4 slots, using Mellanox ConnectX-5 SmartNICs, one could achieve a 600 Mpps packet rate or 400Gbps aggregate throughput on a single-CPU server. That really is OMG performance!
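For readers who want to check these percentages, the theoretical Ethernet line rate behind them is straightforward to compute (each frame carries 20 extra bytes of preamble and inter-frame gap on the wire). This little helper is our own illustration, not part of the benchmark tooling:

```python
# Theoretical Ethernet line rate for a given frame size and link speed.
# Every frame on the wire carries 20 extra bytes: 7B preamble + 1B SFD + 12B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20

def line_rate_mpps(frame_bytes: int, link_gbps: float) -> float:
    """Maximum packets per second (in millions) a link can carry at this frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame / 1e6

def l2_throughput_gbps(mpps: float, frame_bytes: int) -> float:
    """Layer-2 throughput (frame bytes only, excluding preamble/IFG) for a packet rate."""
    return mpps * 1e6 * frame_bytes * 8 / 1e9

# A single 100GbE port tops out at ~148.8 Mpps with 64-byte frames, so ~197 Mpps
# across two ports is roughly two thirds of the combined theoretical line rate.
print(f"64B   @ 100GbE: {line_rate_mpps(64, 100):6.1f} Mpps max per port")
print(f"1518B @ 100GbE: {line_rate_mpps(1518, 100):6.2f} Mpps max per port, "
      f"{l2_throughput_gbps(line_rate_mpps(1518, 100), 1518):.1f} Gbps of L2 payload")
```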

AMD EPYC™ 7002 Series Processor and ConnectX-5 utilizing ASAP² deliver up to 10X better performance than when using DPDK

Virtualized Telco Cloud Testing: AMD EPYC™ 7002 Series Processor Server with ConnectX-5 PCIe Gen4 100G Adapters

 

AMD EPYC™ 7002 Series Processor and ConnectX-5 with ASAP² deliver up to 2.5X better performance than OVS-DPDK

Virtualized Telco Cloud Testing: AMD EPYC™ 7002 Series Processor Server with ConnectX-5 PCIe Gen4 100G Adapters

In a virtualized server environment, comparing ASAP² OVS hardware offload against OVS-DPDK with multi-tenant UDP traffic, ASAP² was able to achieve 67Mpps at 114-byte frame size and 87.84% of line rate at 1518-byte frame size, all without any CPU cores required for the network load (i.e. UDP VXLAN packet processing). With OVS-DPDK for multi-tenant UDP traffic, we were able to achieve only 6.6 Mpps for 114-byte frames, and just 33.2Gb/s, or 33.2% of line rate, for 1518-byte frames, while still consuming 12 CPU cores for packet processing. Thus, by utilizing ASAP², we were able to achieve up to 10X (1000%) the packet rate and 2.5X (250%) the throughput of OVS-DPDK for overlay UDP traffic without consuming any CPU cores. Without ASAP² technology, the massive compute capacity available in AMD EPYC™ 7002 Series Processors could remain untapped for lack of high-speed network traffic to feed it. Indeed, this proves the well-known adage that faster compute needs faster networks! Mellanox SmartNICs achieve OMG performance with AMD EPYC™ 7002 Series Processors.

AMD EPYC™ 7002 Series Processor and ConnectX-5 with DPDK deliver up to 25.8 Mpps for 64-byte packets

Virtualized Telco Cloud Testing: AMD EPYC™ 7002 Series Processor Server with ConnectX-5 PCIe Gen4 100G Adapters

In our final test case, we tested OVS performance for UDP-only traffic with all 12 CPU cores dedicated to OVS running over DPDK. The graph above shows the performance results for various packet sizes and the percentage of line rate achieved for this test methodology. At 64-byte frame size, OVS was able to achieve 25.8Mpps. That is amazing performance!

Summary

With the release of the industry’s first PCIe Gen4-capable x86 CPU, the AMD EPYC™ 7002 Series Processor, AMD has given the computing industry massive compute capacity to take advantage of for all kinds of workloads. The collaboration between Mellanox and AMD has been at the heart of this sea change. Together with AMD EPYC™ 7002 Series Processors, Mellanox SmartNICs are enabling smarter, better, and faster networking without compromising the efficiency of modern cloud native data centers. Beyond the phenomenal benchmarking performance already demonstrated for HPC, storage and cloud computing workloads, Mellanox has now also validated the OMG performance gained from the combination of AMD EPYC™ 7002 Series Processors and ConnectX network adapters for telecommunications and service provider use cases.

[1] Based on June 8, 2018 AMD internal testing of same-architecture product ported from 14 to 7 nm technology with similar implementation flow/methodology, using performance from SGEMM.  EPYC-07

 

 

How 100Gb Ethernet and DPDK drivers Are Enabling 5G Services

Unlocking the Promise of 5G

Wireless carriers have been hyping the next generation of cellular technology, 5G, for years, but the reality of it is certain to start rolling out this year. Wireless networks are always evolving, but this is more than a cellular upgrade: 5G will not only increase speeds but also improve latency, vastly improving responsiveness. That will open new capabilities in wireless technology, such as offering a replacement for traditional home internet service, a boost to self-driving cars, and new possibilities for the Internet of Things (IoT).

Key Market Trends That Are Driving 5G Technology Adoption

Demand for gigabit-per-second mobile device performance will dramatically change how people work and interact in the cloud. 5G is expected to deliver significantly enhanced performance compared to 4G LTE, including massive connectivity, higher bandwidth, lower latency, increased reliability, and faster mobility. The following key technologies are driving the performance requirements:

  • Self-driving cars and mission-critical virtual healthcare services that require ultra-reliable, high-bandwidth, and low-latency communications.
  • IoT and machine communications for smart factories, smart homes, and smart communities that will exponentially increase the number of internet connections.
  • Emergent technologies such as artificial intelligence, virtual reality (VR), augmented reality (AR), and drones that change the way humans interact with machines and with each other. For those who think drone-based services aren’t yet a reality, Chinese online retailer JD.com already delivers shopping packages by drone in mainland China.
  • Continued massive growth in video traffic requiring increased network efficiency, faster performance, and improved network latencies.

Analysts predict that service providers will be able to quickly monetize new 5G services in areas such as consumer-based media and entertainment, self-driving cars, smart cities, healthcare, and automated factories.

When to Expect 5G Services

After years of talk about 5G, the industry’s most influential telecom carriers are starting to roll out 5G services. According to Fierce Wireless, analyst firm CCS Insight now expects that 5G connections will approach 280 million by 2021—a 25% rise from its October 2017 forecast. Further, 5G connections are predicted to pass the 1 billion mark in mid-2023 and reach 2.7 billion by 2025 (that’s one third of the world’s population!).

F5 Networks recently announced several new solutions and enhancements designed to allow service providers to launch 5G services. Included are improvements to their network functions virtualization (NFV) offering that enable the optimization and scale of existing 4G LTE and new 5G networks. These F5 solutions are powered by high performance networking based on Mellanox Ethernet technology including 100G network adapters, switches and cables.

How F5 Ensures a Successful Move to 5G

Although an operator’s transition to a 5G network seems like a giant leap, the good news is that many of the F5 solutions and technologies that support 5G capabilities are already well defined and in use by service provider and enterprise organizations around the globe. Thankfully, many of the F5 NFV solutions can also be used to optimize and help secure existing 4G LTE networks. These include:

  • Network Functions Virtualization (NFV)
  • L4-L7 Network Services Consolidation
  • Multi-access Edge Computing
  • Automation and Orchestration
  • Network Slicing
  • GPRS Tunneling Protocol (GTP) Security
  • IoT Solutions for the Access Network and Data Center
  • DDoS Solutions in the Access Network, Data Center, and Cloud
  • Intrusion Prevention Systems (IPS)
  • Web Application Firewalls (WAF)
  • Load Balancers and Application Traffic Managers

Getting 5G capabilities up and running as quickly as possible will help operators to maintain their competitive edge and secure new 5G service revenues. At the same time, operators must maintain and optimize existing 4G LTE networks for their customer base. Hence, F5 believes it will be critical to follow three key imperatives to succeed while transitioning to 5G:

  • Optimize Your Network – Simplify and scale your existing 4G LTE network while transitioning to 5G, leveraging high-performance virtualized software solutions.
  • Secure Your Platform – Protect your 5G network at massive scale at every layer and for multiple threats.
  • Monetize New Solutions – Accelerate the time to market of new, compelling, and differentiated 5G services to your enterprise customers and consumer base.

Delivering a high performance and efficient network is of paramount importance to enable all the above essentials of providing a 5G network at scale.

Leverage Virtualized/Cloud-Based Technologies to Meet 5G Network Demands

With the rapid transition to high performance virtualized/cloud-based edge, core, and data networks, service providers can scale and simplify their existing 4G LTE network and evolve to 5G with high-performance virtualized software solutions from F5. Thus, with F5 operators can:

  • Simplify their core network architecture and operations and reduce costs with the integration of L4-L7 network services into a single platform, deployable as hardware and virtual appliances.
  • Migrate seamlessly to a NFV (Network Functions Virtualization) infrastructure using a broad range of Virtual Network Functions (VNFs) and a VNF Manager
  • Meet 5G’s latency and high throughput requirements with Multi-access Edge Computing (MEC) solutions
  • Support transition from 4G to 5G and services migration
  • Leverage automation and orchestration tools to simplify operations and improve efficiency.
  • Transition from CapEx to OpEx consumption models with subscription-based licensing models.

Virtualized Cloud-Based Technologies Save the Day!

While virtualization and cloud-based technologies improve scalability, agility and operational simplicity, they also impose significant performance penalties by consuming host CPU cycles to process networking traffic. This problem becomes more critical as bandwidth increases to 25/40/50/100 and 200Gb/s, which drives higher CPU consumption and leads to server proliferation.

To solve this challenge, higher performance and increased throughput are enabled through F5’s BIG-IP Virtual Edition support of Mellanox’s flagship ConnectX family of network interface adapters, including 100Gb Ethernet and DPDK drivers. Working together, Mellanox and F5 provide a solution that boosts data plane performance to near line rate using optimized DPDK drivers that reduce the overhead associated with processing packets. Mellanox network adapters significantly improve the performance of the entire F5 BIG-IP VNF portfolio to near line rate at 100Gbps throughput. The Mellanox ConnectX family of network adapters, with 200/100/50/25/10G Ethernet speeds and networking offload engines, is purpose-built to meet the extreme networking bandwidth required for a 5G infrastructure upgrade and its services. With multiple times the performance packed into the same infrastructure footprint, service providers working with F5 and Mellanox can quickly reap the benefits of new 5G services to maximize the return on their 5G network build-out.

Mellanox DPDK: Unmatched Performance in the Industry

Mellanox DPDK

The Data Plane Development Kit (DPDK) is a software acceleration technique, comprised of a set of software libraries and drivers, that reduces the CPU overhead caused by interrupts sent each time a new packet arrives for processing. DPDK instead polls for new packets, with the key benefits of significantly improved processing performance, lower per-packet overhead, and hardware independence. Although DPDK technology consumes some CPU cycles, Mellanox ConnectX-5 Intelligent NICs offer the industry’s highest bare metal packet rate of 148 million packets per second for running cloud applications such as VNFs over DPDK. Mellanox is an active participant in the DPDK software community and leads in driving its innovation.
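To illustrate why polling in bursts beats an interrupt per packet, here is a deliberately simplified toy cost model in Python; the per-wakeup and per-packet costs are hypothetical numbers, and this is a conceptual sketch only, not DPDK itself (which is a C library):

```python
# Toy cost model (not DPDK itself): contrast per-packet, interrupt-style handling
# with DPDK-style burst polling. The only point illustrated is that amortizing a
# fixed per-wakeup cost over a burst of packets raises the achievable packet rate.
# All cost figures below are hypothetical.
PACKETS = 1_000_000
WAKEUP_COST_S = 2e-6      # hypothetical fixed cost per interrupt / context switch
PER_PACKET_COST_S = 1e-7  # hypothetical cost to actually process one packet
BURST = 32                # a typical burst size for a poll-mode receive call

def interrupt_model() -> float:
    """Total seconds when every packet pays the wakeup cost."""
    return PACKETS * (WAKEUP_COST_S + PER_PACKET_COST_S)

def polling_model() -> float:
    """Total seconds when one poll serves a whole burst of packets."""
    return (PACKETS / BURST) * WAKEUP_COST_S + PACKETS * PER_PACKET_COST_S

for name, seconds in (("interrupt per packet", interrupt_model()),
                      ("burst polling", polling_model())):
    print(f"{name:20s}: {PACKETS / seconds / 1e6:5.2f} Mpps (modelled)")
```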

Mobile World Congress 2019 Demo

In the technology world, seeing is believing! Therefore, F5 and Mellanox have put together a demo that shows a 400Gbps, high-performance, ultra-high-density Ethernet fabric in a single rack, portraying the scale and performance needed for 5G network infrastructure. Several BIG-IP VNFs, including Traffic Management, DDoS, Firewall, and Load Balancer, will perform their respective L4-L7 network services at massive scale over this high-performance Ethernet fabric to manage real-world 5G traffic.

Here’s a sneak peek of the setup and configuration:

400Gbps high performance and ultra-high density Ethernet fabric

The focus of this demo is on using commercial off-the-shelf (COTS) hardware running BIG-IP VE with Mellanox NICs and switches to achieve 100G+ throughput in an ultra-high-density solution that is available for purchase today. F5, along with Mellanox, is demonstrating real-world performance of a qualified/certified, commercially available solution, which includes the following:

  • Two Servers with two Mellanox dual-port 100G ConnectX-5 NICs for traffic generation
  • Mellanox SN2010 and SN2100 Ethernet Switches delivering line rate throughput – with Zero Packet Loss!
  • Two Servers with Mellanox dual-port 100G NICs for High Availability, capable of taking tremendous loads and fending off attacks
  • Performance of 400G minus 15% DPDK driver overhead = 340Gbps throughput by bonding multiple Mellanox 100G NICs

We would like to invite you to join us in Barcelona at Mobile World Congress, Monday February 25th through Thursday February 28th, 2019, where the demonstration will be shown live in the F5 booth.

Please visit F5 Booth Hall 5 Stand#19 to learn more!

Vote NOW for Mellanox on Open Infrastructure Summit, Denver Presentations

The 2019 Open Infrastructure Summit (formerly called the OpenStack Summit) has opened up voting for presentations to be given on April 29 – May 1 in Denver, USA. Mellanox has a long history of supporting OpenStack with technology, products and solutions, and we have submitted a number of technical papers ready for voting! The OpenStack Foundation receives more than 1,500 submissions and selects only 25-35% of these for participation, so every vote counts!

Voting on a topic is super easy and takes less than 5 seconds per topic. Each topic below has a link that will take you to its voting page. In order to vote, an OpenStack account is needed. This can be easily created at the top of the page.

Voting closes on Monday, February 4 at 11:59pm Pacific Time (Tuesday, February 5 at 7:59 UTC). So please don’t delay and VOTE “WOULD LOVE TO SEE” TODAY! It’s fast and easy.

Highly Efficient Edge Cloud Data Center for 5G

Power, cooling and real estate are the key constraints for edge clouds, but meeting them should not come at the expense of performance and efficiency. What if we could build high-density micro data centers for the edge cloud without sacrificing performance and efficiency? Sounds too good to be true? Come and learn about data plane acceleration!

Nokia and Mellanox have teamed up to create an integrated edge cloud infrastructure supporting “switch on SmartNIC” offloading for Virtualized and Containerized infrastructure.  This solution boosts the network performance by an order of magnitude. Specifically, SDN using overlay networking such as VXLAN can be completely offloaded.  Not only can we terminate tunneling on the edge cloud servers, but we can also switch traffic between various VNFs/CNFs, apply ACLs and much more.

This review session covers the design, performance, and benefits of a high performance and efficient edge cloud data center.

Vote for Highly Efficient Edge Cloud Data Center for 5G. 

It’s a Cloud… it’s a SuperComputer… no, it’s SuperCloud!

CSIRO is Australia’s national science agency and to lead in science, it needs the best, uncompromising IT.

Since the beginning of the cloud revolution, scientists have had to choose between the top performance of supercomputers and the agility of the cloud. To address this challenge, CSIRO, Mellanox and Red Hat created SuperCloud – a bare-metal OpenStack system with SDN InfiniBand.

First presented at the Vancouver Summit, the project has evolved, gaining features: HA, ephemeral hypervisors, software defined storage, GPU/ML support and InfiniBand-connected containers. SuperCloud is now the platform supporting all of CSIRO’s scientific computing needs.

Building upon open standards, SuperCloud brings Infrastructure as Code methodology to bare metal. It supports the vast array of DevOps tools, enabling users to programmatically request HPC resources, from compute, to NVMe, to InfiniBand networks. This allows building HPC clusters, RDMA storage and containerised workloads quickly, with a simple playbook.
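As a rough picture of what "programmatically request HPC resources" can look like against an OpenStack cloud, the sketch below uses the standard openstacksdk Python client; the cloud name, image, flavor and network are placeholders rather than real SuperCloud resources, so treat it purely as an illustration of the workflow:

```python
# Minimal illustration of "HPC resources as code" against an OpenStack cloud,
# using openstacksdk (pip install openstacksdk). All names below are placeholders,
# not real SuperCloud flavors, images or networks.
import openstack

# Credentials and endpoints come from a clouds.yaml entry or environment variables.
conn = openstack.connect(cloud="supercloud")

# Request a bare-metal-class compute instance.
server = conn.create_server(
    name="hpc-node-01",
    image="centos-hpc",            # placeholder image name
    flavor="baremetal.ib.64core",  # placeholder flavor name
    network="ib-tenant-net",       # placeholder tenant network (e.g. InfiniBand-backed)
    wait=True,
)

# Attach an NVMe-backed volume for scratch data.
volume = conn.create_volume(size=500, name="scratch-nvme")  # size in GB
conn.attach_volume(server, volume)

print(f"{server.name} is {server.status}")
```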

Vote for It’s a Cloud…it’s a SuperComputer…no, it’s SuperCloud!

NVMe over Fabrics – Demystified

The new NVMe SSD interface, now supported in Cinder, can be connected across a network. In fact, it can be connected across many different fabrics: Ethernet (three approaches), Fibre Channel, InfiniBand, and PCIe to date. OpenStack data centers want to share storage readily among multiple compute nodes and be able to perform clustering, failover, and other system-wide operations at NVMe SSD speeds. NVMe over Fabrics (NVMe-oF) is the solution. This talk will describe the technology in its many forms, cover enterprise and cloud use cases where it is being applied, and finish with the potential future directions it is heading in.

Vote for NVMe Over Fabrics – Demystified.

Cyber Security SmartNIC for Modern Cloud Scale Data Centers

Traditionally, data centers have been protected by perimeter-based security technologies, placed at key ingress and egress points, designed to restrict and analyze the traffic coming in and out of the data center (north-south). Lateral (east-west) traffic within the data center was assumed to occur in a well-protected trusted zone and therefore not restricted.

Organizations are still dealing with security breaches originating within their own data centers. Isolated networking is of paramount importance to address this problem. A SoC-based SmartNIC is a computer in front of a computer: it creates a trust zone for software security controls by separating the infrastructure from the host applications and a potential attacker. Accelerating the network along with trust-based security functions addresses cloud-scale data center security risks without sacrificing performance.

Vote for Cyber Security SmartNIC for Modern Cloud Scale Data Centers.

 

An Efficient Scale-Out Deep Learning Cloud – Your Way

The Duchess of Windsor famously said that you could not be too rich or too thin.  And whether or not that is correct, a similar observation is definitely true when trying to match deep learning applications and compute resources: you cannot have enough horsepower.

Intractable problems in fields as diverse as finance, security, medical research, resource exploration, self-driving vehicles, and defense are being solved today by “training” a complex neural network how to behave rather than programming a more traditional computer to take explicit steps.  And even though the discipline is still relatively young, the results have been astonishing.

The training process required to take a Deep Learning model and turn it into a computerized savant is extremely resource-intensive. The basic building blocks for the necessary operations are GPUs, and though they are already powerful – and getting more so all the time – the kinds of applications identified will take whatever you can throw at them and ask for more.

In order to achieve the necessary horsepower, the GPUs need to be used in parallel, and there lies the rub.  The easiest way to bring more GPUs to bear is to simply add them to the system.  This scale-up approach has some real-world limitations.  You can only get a certain number of these powerful beasts into a single system within reasonable physical, electrical, and power dissipation constraints.  If you want more than that – and you do – you need to scale-out.

This means that you need to provide multiple nodes in a cluster, and for this to be useful the GPUs need to be shared among the nodes.  Problem solved, right?  It can be – but this approach brings with it a new set of challenges.

The basic challenge is that just combining a whole bunch of compute nodes into a large cluster – and making them work together seamlessly – is not simple.  In fact, if it is not done properly, the performance could become worse as you increase the number of GPUs, and the cost could become unattractive.

Mellanox has partnered with One Convergence to solve the problems associated with efficiently scaling on-prem or bare metal cloud Deep Learning systems.

Mellanox supplies end-to-end Ethernet solutions that exceed the most demanding criteria and leave the competition in the dust. For example, we can easily see the performance advantages with TensorFlow over a Mellanox 100GbE network versus a 10GbE network, both taking advantage of RDMA in the chart below.

 

RDMA over Converged Ethernet (RoCE) is a standard protocol which enables RDMA’s efficient data transfer over Ethernet networks, allowing transport offload with a hardware RDMA engine implementation and superior performance. Because distributed TensorFlow takes full advantage of RDMA to eliminate processing bottlenecks, even with large-scale images the Mellanox 100GbE network delivers the expected performance and exceptional scalability from the 32 NVIDIA Tesla P100 GPUs. For both 25GbE and 100GbE, it’s evident that those who are still using 10GbE are falling short of any return on investment they might have thought they were achieving.
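For readers curious where RDMA plugs into distributed TensorFlow, the fragment below shows the classic TensorFlow 1.x parameter-server setup selecting the RDMA ("grpc+verbs") transport. It assumes a TensorFlow build compiled with verbs support and uses placeholder host names; it is our sketch, not the exact configuration used for the benchmark above:

```python
# Sketch of a TensorFlow 1.x distributed job using the RDMA ("verbs") transport.
# Requires a TensorFlow 1.x build compiled with --config=verbs; hosts are placeholders.
import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example:2222"],
    "worker": ["worker0.example:2222", "worker1.example:2222"],
})

# protocol="grpc+verbs" moves tensor transfers onto RDMA instead of plain gRPC over TCP.
server = tf.train.Server(cluster,
                         job_name="worker",
                         task_index=0,
                         protocol="grpc+verbs")

with tf.device(tf.train.replica_device_setter(cluster=cluster)):
    # ... the model graph would be built here ...
    global_step = tf.train.get_or_create_global_step()

with tf.train.MonitoredTrainingSession(master=server.target) as sess:
    print("Connected to", server.target, "- global step:", sess.run(global_step))
```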

Beyond the performance advantages, the economic benefits of running AI workloads over Mellanox 25/50/100GbE are substantial. Spectrum switches and ConnectX network adapters deliver unbeatable performance at an even more unbeatable price point, yielding an outstanding ROI. With flexible port counts and cable options allowing up to 64 fully redundant 10/25/50/100 GbE ports in a 1U rack space, Mellanox end-to-end Ethernet solutions are a game changer for state-of-the-art data centers that wish to maximize the value of their data.

The performance and cost of the system are the ticket in the door, but an attractive on-prem Deep Learning platform needs to go further; much further.  You need to:

  • Provide a mechanism to share the GPUs without users or applications monopolizing the resources or getting starved.
  • Ensure that the GPUs are fully utilized as much as possible. These beasts are expensive!
  • Automate the process of identifying the resources on each node, and making them immediately useful without a lot of manual intervention or tuning.
  • Make the system easy to use overall, since experts are always the minority in any population.
  • Base the system on best-in-class open platforms to enhance ease of use, and to enable rapid integration of better frameworks and algorithms as they become available.

Many of these challenges have been overcome for organizations that are able to offload their compute needs to the public cloud.  But for those companies that cannot take this path for regulatory, competitive, security, bandwidth or cost reasons, there has not been a satisfactory solution.

In order to achieve the goals highlighted above, One Convergence has created a full stack Deep Learning-as-a-service application called DKube that addresses these challenges for on-prem and bare metal cloud users.

DKube provides a variety of valuable and integrated capabilities:

  • It abstracts the underlying networked hardware consisting of compute nodes, GPUs, and storage, and allows them to be accessed and managed in a fully distributed, composable manner. The complexity of the topology, device details, and utilization are all handled in a way that makes on-prem cloud operation as simple as – and in some ways simpler than – the public cloud. The system can be scaled out by adding nodes to the cluster, and the resources on the nodes will automatically be recognized and made useful immediately.
  • It allows the resources, especially the expensive GPU resources, to be efficiently shared among the users of the application. The GPUs are allocated on-demand, and are available when not actively being used by a job.
  • It is based on best-in-class open platforms, which make the components familiar to data scientists. DKube is based on the container orchestration standard Kubernetes, and is compatible with Kubeflow, supporting Jupyter, TensorFlow, and PyTorch – with more to come. It integrates the components into a unified system, guiding the workflow and ensuring that the pieces operate together in a robust and predictable way.
  • It provides an out-of-the-box, UI-driven deep learning package for model experimentation, training, and deployment. The Deep Learning workflow coordinates with the hardware platform to remove the operational headaches normally associated with deploying on-prem.  Users can focus on the problems at hand – which are difficult enough – rather than the plumbing.
  • It is simple enough for users to get the application installed on an on-prem cluster and be working with models within 4 hours.

 

At the KubeCon + CloudNativeCon conference, to be held in Seattle from Dec 10-13, 2018, you can see a demo of the One Convergence DKube system using Mellanox high speed network adapters that enable Deep Learning-as-a-Service for on-prem platforms or bare metal deep learning clouds.

 

Supporting Resources:

The Best Smart NIC for the Cloud Enables software-defined, hardware-accelerated networking for high performance with programmable flexibility

Best SmartNIC for Building the Smart Cloud: PART II

As the pendulum swings away from hardware-defined infrastructure, the best smart clouds will be software-defined and hardware-accelerated.

Figure 1: Hardware Accelerated, Software Defined World Achieves Total Infrastructure Efficiency

 

In part one of my blog, I drew the conclusion that smart devices around us are changing our lives in remarkable ways. Yet the infrastructure to support these smart innovations hasn’t fully evolved in terms of flexibility, performance and efficiency. A software defined world offers flexibility, but at the cost of performance and efficiency. A hardware defined world holds some distinct advantages in terms of performance but lacks flexibility. To bridge the agility, performance and efficiency gaps between the hardware defined and software defined worlds, we need to strike a balance between flexibility, performance and efficiency. A hardware accelerated, software defined world is the ultimate nirvana for solving the total infrastructure efficiency challenge when building cloud scale and cloud native architectures. Mellanox SmartNICs are at the forefront of the next infrastructure transformation. We are leading the market with purpose-built hardware that complements and turbocharges software defined infrastructure for cloud and communication service providers.

Networking Adapters for the Smart Cloud have evolved from basic/foundational NICs to Intelligent NICs and now to high-performance SmartNICs.

Figure 2: Evolution of the Network Adapters

 

Mellanox Smart Network Adapters Offload Common Tasks from the CPU

Unlike basic/foundational NICs, SmartNICs including Mellanox ConnectX-5 and BlueField are built on the key tenets of maximizing performance and agility without sacrificing efficiency. Mellanox Smart Network Adapters, or SmartNICs, offer many smart networking offloads, including network overlay offloads for multi-tenant cloud data centers, virtual switch and virtual router offloads, flow classification, traffic steering, routing, switching, network address translation (NAT), port address translation (PAT), quality of service, and many other features that are handled inefficiently by general purpose CPUs. Storage offloads include RDMA over Converged Ethernet (RoCE), NVMe over Fabrics (NVMe-oF), Erasure Coding, iSER over RDMA, and more. In addition, Mellanox Smart Network Adapters provide offloads to boost the performance of a variety of cloud native workloads, including artificial intelligence using TensorFlow and big data using the Spark and Hadoop frameworks.

Optimized networking ASICs are the most efficient silicon choice for building efficient SmartNICs for the cloud.

Figure 3: Choosing the Right Hardware Acceleration Option

 

Mellanox BlueField High Performance SmartNIC delivers Speed, Efficiency, and Flexibility

When it comes to selecting the right high-performance SmartNIC for the network design task at hand, it is important to evaluate flexibility versus efficiency. Achieving one at the expense of the other is like going back to the initial problem of hardware defined vs. software defined networks. ASIC (Application Specific Integrated Circuit) based Intelligent Network Adapters such as the Mellanox ConnectX family are ideal for well-defined, price-power-performance efficient networking offloads. Given that ASICs are pre-programmed for the highest efficiency in silicon, flexibility isn’t the highest with such network adapters. Mellanox BlueField System on Chip (SoC) SmartNICs offer a combination of highly efficient and flexible networking for the SDX world. The BlueField SmartNIC SoC combines an embedded ARM processor’s flexibility with the ConnectX-5 ASIC’s native offload capabilities, taking smart networking to a whole new level. Due to the BlueField SmartNIC’s bump-in-the-wire architecture, it is a perfect SmartNIC for running network services right on the wire, such as virtual switch (vSwitch) or virtual router (vRouter) control and data planes, host isolation, traffic engineering, real-time threat detection and mitigation, and advanced storage offloads such as NVMe-oF. A quick comparison of SmartNICs based on price-performance, ease of programming and flexibility is important when choosing the right tool for the network design job at hand.

Mellanox works with a large ecosystem of server, cloud, storage and software partners to deliver efficient cloud infrastructure with programmable hardware acceleration

Figure 4: Broad Ecosystem Support for Deploying Efficient Cloud Infrastructure

Mellanox is the inventor and pioneer of smart adapter technology. For a decade, Mellanox smart network adapters such as ConnectX have been certified by all leading compute and storage server vendors, including HPE, Dell, IBM, Cisco, Lenovo, Supermicro, Oracle and Nokia. The world’s top cloud service providers, Web 2.0 and hyperscale customers, including Microsoft Azure, Alibaba and Tencent, trust Mellanox to build their cloud scale, virtualized data centers using Mellanox smart adapters. As Albert Greenberg of Microsoft Azure mentioned in his ONS 2014 keynote, Microsoft Azure networking achieves fantastic storage scale and efficiency by using the performance and efficiency advantages of RoCE networking to make storage cheaper. Several other cloud providers have also been using Mellanox smart adapters to accelerate OpenStack networking.

Alibaba, the Chinese e-commerce giant, has been using Mellanox smart adapter technologies such as RoCE and DPDK to optimize online transaction processing times and boost topline revenues. Tencent built a high-performance, low-latency artificial intelligence cloud service using Mellanox Ethernet and InfiniBand smart network adapters. But note that SmartNICs aren’t limited to cloud scale customers. Any business, be it a service provider or a large enterprise, that needs to achieve a competitive edge must think about building a smart, cloud scale infrastructure driven by total infrastructure efficiency.

As the world around us becomes smarter, it is imperative that the next generation cloud scale infrastructure helping to connect billions of these smart things to consumers and businesses also becomes smarter. Mellanox is leading the transformation to SmartNIC-driven cloud infrastructure. With end-to-end smart technologies surrounding us, the Smart Era has finally arrived!

Additional Resources:

The Best SmartNIC for the Cloud Enables software-defined, hardware-accelerated networking for a high performance SmartNIC with programmable flexibility

Best Smart NICs for Building the Smart Cloud: PART I

The Smart Cloud is here and it needs the Best Smart NICs

Amazon recently announced that Alexa, the smart personal voice assistant, is coming out of a small desktop gadget (pick your flavor: Echo Dot, Spot, Hub, Show, etc.) and getting into the very fabric of our lives. Anything and everything is getting an Alexa boost, from microwaves and home security to car infotainment and even your wall clock. Soon, smart devices are going to get even smarter – they can even hear you whisper (e.g. when your kids are napping in the living room) and answer you back with a whisper (so as not to disturb their sleep). That is so smart!! And thus, a smart fabric of artificial intelligence and machine learning is being built around us to make our lives easier.

The growing use of smart personal assistants like Alexa and Google Assistant is putting increasing load on servers and computer networks in smart cloud infrastructure.

Figure 1: Smart Personal Assistant Products

 

But Smart is not limited to smart personal assistants. Today there are a variety of smart personal assistants from Amazon, Google, Apple and others in the market, but the key thing to remember is that “Smart” is an evolution of life. We have already seen many innovations happening around us that are termed “Smart”. Over the last decade, we have happily adopted anything and everything that made our life easier. Be it the SmartPhone that Apple introduced in 2007, which gave us a new and unique human-compute interface, or smart cars that are environmentally friendly, fuel efficient and easier to park in large cities. Or smart credit cards with embedded microprocessors that can securely process payment transactions, protecting consumers from fraud and identity theft. Life is getting smarter all around us. I call this living in a Smart Era!

Smart Innovations are everywhere in our phones, cars, credit cards and appliances. Managing them requires a Smart Cloud.

Figure 2: Smart Innovations Are Everywhere: SmartPhone, SmartCar, SmartCard

 

Moving to Software-Defined Networking and Software-Defined Everything

While the world around us has evolved to be “Smarter”, certainly consumer technologies have been leading the charge. However, the backend infrastructure to support smart devices was still stuck in a dinosaur era. For decades, the “backbone” supporting these devices was built using proprietary and purpose-built hardware appliances such as routers, switches, load balancers, firewalls, CDNs, WAN accelerators, gateways, and so on. All these purpose-built and proprietary hardware appliances are great at what they do, but they are not flexible, they hinder innovation and they are locked down. I call the hardware defined everything world ‘a dinosaur era’ because it is hard to sustain.

The evolution from hardware-defined appliances to software-defined everything has increased flexibility but at the cost of decreased performance.

Figure 3: Evolution from hardware to software defined infrastructure

 

Agility, effectiveness and efficiency are the key drivers for the long term success of any business. In 2007, we saw a movement to disrupt this age-old way of building monolithic and vendor-locked networking infrastructure. Software took over hardware, and open source technologies such as the kernel-based virtual machine (KVM), Open vSwitch (OVS) and OpenStack became the building blocks of cloud infrastructure and revolutionized the way next generation networks would be built. Agility, automation, and community-driven standards were at the heart of this software defined transformation. I call this the software defined everything world. The pendulum swung all the way to the other end, and hardware became a commodity for compute, network and storage infrastructure. Software became the brain and the most value-added asset in the software defined everything world. We entered an evolutionary phase with a radically different way of building software defined infrastructure: server virtualization, network virtualization, software defined networks, network function virtualization and software defined storage. So many innovations, all happening pretty quickly. Great news for the industry, right? Well, not quite.

 

What Is The True Cost of Software-Defined Networking?

Nothing in life is free. As we know, there is a hidden cost to every transformation. This is true for the software defined world as well. As we shifted network functions from purpose-built hardware appliances to software-driven virtual appliances, we saw that general purpose CPUs took the center stage. While these CPUs are great for general application tasks, they aren’t specialized in handling network and storage workloads.

Running Virtual Network Functions (VNFs) on the CPU impacts networking performance in terms of throughput, packet rate, latency, and I/O operations for storage workloads. Further, disaggregation of software and hardware, along with the virtualization of servers and networks, consumes even more CPU due to software overheads. Think of the layers of abstraction that software virtualization such as hypervisors, virtual routers or overlay networking creates in order to run workloads in a hardware-agnostic manner. The result of all of this is decreased efficiency, which looks something like the figure below.

Using high server virtualization and network virtualization with increased security requirements greatly reduces performance compared to bare metal servers. But this can be remedied in a smart cloud network by moving virtualization and security tasks to a high performance Smart NIC.

Figure 4: Software Defined Everything Creates Bottlenecks

 

Best Smart Cloud Performance Requires the Best Smart NIC

We achieved flexibility at the expense of efficiency. As seen above, we started with a clean slate: a high performance, bare metal, commodity server. Due to SDX overheads such as virtualization, disaggregation and security, we ended up with only 20-25% of the compute power available for running business applications, or for offering that power to tenants for infrastructure as a service (IaaS). Imagine a server costing $20,000 with only $5,000 worth of available compute capacity. Essentially, due to SDX penalties, we need to pay 400% of the capital expense to match a bare metal server’s compute power. This is great news for server CPU vendors as it increases the server footprint, but it defeats the very essence of software defined everything. In terms of capital savings, we are at the same place as before, or even worse, when building next generation software defined infrastructure with commodity servers. In addition to the CapEx increase, a purely software defined world performs worse than a hardware defined world, due to CPU-bound network and storage processing.
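The arithmetic behind that 400% figure is easy to reproduce; the numbers below are the illustrative ones from the paragraph above, not measured data:

```python
# Reproduce the back-of-the-envelope CapEx argument from the paragraph above.
server_cost = 20_000        # illustrative cost of a commodity server, in dollars
usable_fraction = 0.25      # ~20-25% of compute left after SDX (virtualization,
                            # overlay networking, security) overheads

usable_value = server_cost * usable_fraction   # -> $5,000 of usable compute
servers_needed = 1 / usable_fraction           # -> 4 servers to match one bare-metal box

print(f"Usable compute per ${server_cost:,} server: ~${usable_value:,.0f}")
print(f"Servers needed to match one bare-metal server: {servers_needed:.0f} "
      f"({servers_needed:.0%} of the original CapEx)")
```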

At this point, you would obviously ask – “But what does this all have to do with the Smart Era?” In the next blog, I will explain how Smart NICs are revolutionizing the software defined world to achieve smartness in building next generation cloud scale Infrastructure. Indeed, SmartNICs are a key innovation to make this era truly a Smart Era.

Additional Resources:

 

OpenStack Berlin 2018

Vote NOW for Mellanox on OpenStack Berlin Presentations

The 2018 OpenStack Summit has opened up voting for presentations to be given on Nov. 13-15 in Berlin, Germany. Mellanox has a long history of supporting OpenStack with technology, products and solutions, and we have submitted a number of technical papers ready for voting! The OpenStack Foundation receives more than 1,500 submissions and selects only 25-35% of these for participation, so every vote counts!

Voting on a topic is super easy and takes less than 5 seconds per topic. Each topic below has a link that will take you to its voting page. In order to vote, an OpenStack account is needed. This can be easily created at the top of the page.

Voting closes tomorrow, Thursday, July 26 at 11:59 PM PDT (Friday, July 27 at 06:59 UTC). So please don’t delay and VOTE “WOULD LOVE TO SEE” TODAY! It’s fast and easy.

Mellanox and Red Hat Deliver Unprecedented Performance, Efficiency and Simplicity for NFV Infrastructure and Agile Cloud Data Centers

At Red Hat Summit 2018, Mellanox announced an open, high performance and easy to deploy Network Functions Virtualization Infrastructure (NFVI) and cloud data center solution combining Red Hat Enterprise Linux cloud software with in-box support for Mellanox NIC hardware. Our close collaboration and joint validation with Red Hat has yielded a fully integrated NFV and cloud data center solution that delivers high performance and efficiency and is easy to deploy. The solution includes open source datapath acceleration technologies including the Data Plane Development Kit (DPDK) and Open vSwitch acceleration.

Private cloud and communication service providers are transforming their infrastructure in order to achieve the agility and efficiency of hyperscale public cloud providers. This transformation is based on two fundamental tenets: disaggregation and virtualization. Disaggregation decouples the network software from the underlying hardware. Server and network virtualization drive higher efficiency through sharing of industry standard servers and networking gear using a hypervisor and overlay networks. While these disruptive capabilities offer benefits such as flexibility, agility, and software programmability, they impose significant network performance penalties, because kernel-based hypervisors and virtual switching inefficiently consume host CPU cycles for network packet processing. Over-provisioning CPU cores to compensate for degraded network performance leads to high CapEx, defeating the goal of gaining hardware efficiency through server virtualization.

To address these challenges, Red Hat and Mellanox are bringing to market a highly efficient, hardware accelerated and tightly integrated NFVI and cloud data center solution combining Red Hat Enterprise Linux OS with Mellanox ConnectX-5 network adapters running DPDK and Accelerated Switching and Packet Processing (ASAP2) OVS offload technologies.

ASAP2 OVS Offload Acceleration:

An OVS hardware offload solution accelerates slow, software-based virtual switch packet performance by an order of magnitude. Essentially, OVS hardware offload offers the best of both worlds: hardware acceleration of the data path along with an unmodified OVS control path for flexibility and programming of match-action rules. Mellanox is a pioneer of this groundbreaking technology and has led the open architecture needed to support this innovation within the OVS, Linux kernel, DPDK and OpenStack open source communities.


Figure 1: ASAP2 OVS Offload Solution

As indicated in Figure 1, Mellanox’s open ASAP2 OVS offload technology fully and transparently offloads virtual switch and router datapath processing to the NIC’s embedded switch (e-switch). Mellanox has contributed heavily to the upstream development of the core framework and APIs, such as TC Flower, which are now available in upstream Linux kernel and OVS versions. These APIs dramatically accelerate networking functions such as overlays, switching, routing, security and load balancing. As verified during performance tests conducted in Red Hat labs, Mellanox ASAP2 technology delivered near 100G line rate throughput for large VXLAN packets without consuming any CPU cycles. For small packets, ASAP2 boosted the OVS VXLAN packet rate by 10X, from 5 million packets per second using 12 CPU cores to 55 million packets per second consuming zero CPU cores. Thus, cloud and communications service providers and enterprises can achieve total infrastructure efficiency from an ASAP2-based high performance solution while freeing up CPU cores for packing more VNFs and cloud native applications onto the same server. This helps customers reduce their server footprint and achieve substantial CapEx savings. Mellanox ASAP2 is fully integrated with RHEL 7.5, and is available out of the box as a tech preview for trials.

 

OVS DPDK Acceleration:

Customers who want to maintain the existing, slower OVS virtio data path but still need some acceleration can use Mellanox’s DPDK solution to boost OVS performance. As shown in Figure 2 below, the OVS over DPDK solution uses DPDK software libraries and a poll mode driver (PMD) to substantially improve packet rate at the expense of consuming CPU cores.


Figure 2: OVS-DPDK Solution

 

Using open source DPDK technology, Mellanox ConnectX-5 NICs deliver the industry’s best bare metal packet rate of 139 million packets per second for running OVS, VNF or cloud applications over DPDK, and are fully supported by Red Hat for RHEL 7.5.

Network architects are often faced with many options when choosing the technology that best fits their IT infrastructure needs. When it comes to deciding between ASAP2 and DPDK, thankfully the decision is much easier due to the substantial benefits of ASAP2 technology over DPDK. Due to its SR-IOV data path, ASAP2 OVS offload achieves dramatically higher performance than OVS over DPDK, which uses the traditional, slower virtio data path. Further, ASAP2 saves CPU cores by offloading flows to the NIC, whereas DPDK consumes CPU cores to process the packets suboptimally. Note that, similar to DPDK, ASAP2 OVS offload is an open source technology that is fully supported in the open source communities and is gaining wider adoption in the industry.

Mellanox is an open networking company and is among the top ten contributors to the Linux kernel community. Through our cutting-edge NIC technologies and joint innovation with open software leaders such as Red Hat, we have eliminated the performance barriers associated with deploying modern cloud data center and NFV solutions. Moreover, these groundbreaking performance numbers are achieved without sacrificing valuable server resources or ease of deployment. The intelligence and parallel flow processing capabilities of the Mellanox ConnectX family of Ethernet adapters impose minimal burden on precious CPU and memory resources, empowering NFV platforms to do what they are supposed to do: network services and application processing, rather than handling packet I/O.

Supporting Resources

  • More information about Mellanox ConnectX-5 NICs
  • Video: Mellanox DPDK for Cloud and NFV
  • Video: Mellanox ASAP2 for Cloud and NFV

Watch Efficient Cloud Solutions for MEC and Virtual Machine Network Optimization at MWC 2018!

Organized by the GSMA and held in the Mobile World Capital, Barcelona, the event kicked off this morning and runs from February 26th through March 1st. Mellanox, along with our partners and open source communities, will be demonstrating cool new technologies for mobile edge computing (MEC) and virtual machine network optimization that empower mobile carriers and cloud service providers to build an efficient, agile and programmable cloud data center infrastructure. Our innovative software defined networking and network function virtualization technologies are at the core of the high performance, software defined data centers that enable key emerging mobile use cases including MEC, virtual machine network optimization, central office transformation into cloud data centers, IoT, artificial intelligence and machine learning.

As you browse the MWC show floor, you will see how Mellanox-powered technologies such as Single Root IO Virtualization (SR-IOV), Data Plane Development Kit (DPDK), RDMA over Converged Ethernet (RoCE), VXLAN offloads, OVS offloads and security offloads are integrated in many of our partners’ solutions that highlight open, efficient and high performance network virtualization infrastructure. So, when you see these demos, do ask what’s under the hood and see for yourself how Mellanox ConnectX adapters, LinkX cables and transceivers, and Spectrum Ethernet switches boost the total infrastructure efficiency of telco cloud and cloud service provider infrastructure.

While there are many partners demonstrating our solutions, a couple are not to be missed. After all, seeing is believing!

1. Accelerated Switching and Packet Processing (ASAP2) Demo in Lenovo Booth

Lenovo is unveiling a performance optimized offering at MWC based on Lenovo ThinkSystem SR650/SR630 servers and switches, Red Hat OpenStack Platform, and Mellanox ConnectX-4 NICs for accelerated switching and packet processing. This demo highlights how the Open vSwitch performance bottleneck can be overcome using Mellanox ASAP2, an open source OVS offload technology available in standard Linux distributions including RHEL.

Virtual machine network optimization

As seen in the demo setup above, two Lenovo ThinkSystem SR630 servers with ConnectX-4 Lx PCIe 25Gbps dual-port SFP28 Ethernet adapters are running RHEL 7.4 with the KVM hypervisor. The VMs are running Ubuntu 16.04. Traffic is running between VM1-VM2 and VM3-VM2 with and without ASAP² direct offloads. Spoiler alert: the line-rate throughput and very high packet rate numbers with ASAP2 will blow you away! Also, do not forget to ask about the CPU utilization while running high-throughput traffic with ASAP².
Come to the Lenovo booth (Hall 3, Stand 3N30) and see for yourself the amazing performance numbers when you enable Mellanox ASAP² OVS offload technology.

2. M-CORD Programmable Dataplane Switch Fabric Demo in ONF Booth

Mobile Central Office Re-architected as a Data Center (M-CORD) is ONF’s reference design and implementation for transforming mobile carriers’ legacy networking infrastructure into a programmable SDN and NFV cloud data center. Mellanox is a proud member of the ONF and CORD open source communities and is participating in this multi-vendor demo of a programmable data plane switch fabric for M-CORD VNF offloading. This demo also shows ONAP and CORD integration for multi-access edge computing (MEC) and machine learning use cases.
The demo setup above makes use of ONF’s M-CORD pod with a P4-enabled programmable data plane switch fabric of two leaves and two spines. Mellanox’s high performance, flexible, reliable and scalable 6.4Tb/s Spectrum SN2700 Open Ethernet switches are part of the leaf-spine programmable switch fabric and support P4Runtime interfaces for programming the data plane. In this demo, you will experience building a variant of the CORD architecture for the mobile use case, in which user traffic is processed entirely on a multi-terabit programmable switching fabric. For this purpose, a P4-enabled CORD fabric implements LTE’s Serving and Packet Gateway for GTP termination, filtering, and billing.
Following are key contributions from Mellanox in this demonstration:

  • Spectrum switch running the fabric.p4 pipeline, configured as a spine node, using the open source P4 compiler combined with a Mellanox Spectrum backend target.
  • Auto-generated P4Runtime info, APIs and code that is executed on the switch.
  • Spectrum P4Runtime gRPC server that handles the ONOS controller’s pipeline configuration, table entry requests, link discovery and counter queries (a minimal client-side sketch of a table entry write follows below).
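
To give a flavor of what those table entry requests look like, here is a hypothetical, minimal controller-side sketch of installing one GTP-termination entry over P4Runtime. It is not taken from the demo code: the switch address, table ID, field ID and action ID are placeholders, and a real client would look them up in the pipeline’s P4Info and perform mastership arbitration over the StreamChannel RPC before writing.

    # Minimal P4Runtime write sketch (pip install p4runtime grpcio).
    import grpc
    from p4.v1 import p4runtime_pb2, p4runtime_pb2_grpc

    channel = grpc.insecure_channel("spectrum-spine1:9559")   # placeholder address
    stub = p4runtime_pb2_grpc.P4RuntimeStub(channel)

    # One exact-match entry keyed on a GTP TEID (IDs are illustrative placeholders).
    entry = p4runtime_pb2.TableEntry(
        table_id=0x0100,                       # placeholder: GTP termination table
        match=[p4runtime_pb2.FieldMatch(
            field_id=1,                        # placeholder: GTP TEID field
            exact=p4runtime_pb2.FieldMatch.Exact(value=(0x1234).to_bytes(4, "big")),
        )],
        action=p4runtime_pb2.TableAction(
            action=p4runtime_pb2.Action(action_id=0x0200),  # placeholder: decap action
        ),
    )

    # Wrap the entry in an INSERT update and send it to the switch.
    request = p4runtime_pb2.WriteRequest(device_id=1)
    request.election_id.low = 1
    update = request.updates.add()
    update.type = p4runtime_pb2.Update.INSERT
    update.entity.table_entry.CopyFrom(entry)

    stub.Write(request)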

Visit the ONF booth (Hall 5, Stand 5161) and experience this exciting demo to witness the flexibility and programmability of Mellanox Spectrum.

Happy MWC 2018!

Rett Syndrome Awareness: HPC and Cloud Technologies Giving Hope to the Silent Angels and those with Rare Diseases

In October, we paid tribute to Breast Cancer Awareness and with it, the brave souls that battle breast cancer, a disease that receives a huge amount of media attention. It is small wonder: in 2017, it is estimated that about 30 percent of newly diagnosed cancers in women will be breast cancers. There is even talk of a cure on the horizon, because the most common forms of cancer naturally tend to get the most research, funding and attention. But what if you or someone you love doesn’t have something so common? What if you or someone you love is stricken with that one-in-a-million disease? Now, imagine that that disease is actually a disorder that rolls autism, Parkinson’s disease, cerebral palsy, anxiety disorders and epilepsy all into one. That’s Rett syndrome. Rett syndrome is a postnatal neurological disorder that occurs almost exclusively in females and becomes apparent in babies after 6-18 months of early normal development. This rare condition leads to lifelong disabilities. More than half of those afflicted will lose their ability to walk and to use their hands meaningfully. Most girls impacted by this neurological disorder are non-verbal but are cognitively able to understand spoken language. Yet their inability to communicate is often misconstrued as cognitive incompetence. Although these Rett angels enjoy social interaction, they often experience isolation due to their lack of speech, which leads to frustration.

In the United States, a rare disease is defined as a condition that affects fewer than 200,000 people. Rare diseases have also become known as orphan diseases because drug companies are usually not interested in developing treatments for them. There may be as many as 7,000 rare diseases, and the total number of Americans living with a rare disease is estimated at between 25 and 30 million. This estimate has been used by the rare disease community for several decades to highlight that while individual diseases may be rare, the total number of people with a rare disease is large. Just as concerning, in the United States only a few types of rare diseases are tracked when a person is diagnosed. These include certain infectious diseases, birth defects, and cancers.

Aside from being Breast Cancer Awareness Month, October was also designated Rett Syndrome Awareness Month. Rett syndrome occurs in only about one in 10,000 females, which raises the question: what kind of support for research and a cure can those with rare diseases expect? Dr. Mary Jones of Katie’s Clinic at UCSF Benioff Children’s Hospital in Oakland, California is dedicated to helping individuals with Rett syndrome by providing health care and making recommendations for therapies. She says, “We have learned much about the disorder since it was first described in 1982. The gene was discovered in 1999, making it possible for research to be directed toward a cure. We know that there are over two hundred mutation types and that no two girls are alike.”

Although there is no cure for Rett syndrome at the present time, there is good news! With the remarkable discovery of the reversal of Rett syndrome symptoms in lab models in 2007, we now know that this disorder can be cured. Current research shows that Rett syndrome could be the first reversible genetic brain disorder, and it could also unlock the door to treatments and a cure for other neurological disorders, such as autism and schizophrenia. But more needs to be done. Research requires the computational firepower of high performance computing and, increasingly, cloud-based solutions that allow researchers from remote parts of the globe to share information and findings. Dr. Jones said, “Scientists worldwide are searching for a safe way to find a cure for Rett Syndrome girls. Studies in three Rett Syndrome Research Centers in Italy, Australia and the USA have proven that mice with the Rett mutation have benefitted from environmental enrichment, showing improvement in motivation to try new tasks, motor coordination and balance.” Numerous clinical trials are underway to develop treatments that will allow girls with Rett syndrome to be more functional, and gene therapy, along with other therapies, holds the promise of a full-on cure.

That’s where Mellanox high performance computing and cloud solutions come in. With the ability to run millions, even billions of complex scenarios, high performance computing can help to narrow down the possibilities and push research forward for even the most elusive disease. Being able to analyze and model an individual human’s genome in a realistic timeframe to guide medical treatment is both a big data and high performance computing challenge. The addition of deep learning and artificial intelligence techniques to facilitate diagnosis and recommend therapies requires even more processing power.

Mellanox prides itself on high speed interconnect products and solutions that enable faster processing of petabytes of data. The selection of a network and its required capabilities is one of the major challenges when building a cloud-based Apache Spark big data analytics cluster, as workloads vary significantly. The goal is to buy enough network capacity so that all nodes in the cloud cluster can communicate with each other in a non-blocking manner. Mellanox’s end-to-end Spark and Hadoop networking solutions deliver the necessary performance to eliminate data movement and processing bottlenecks. Whether Ethernet or InfiniBand, Mellanox switches, cables, and adapter cards provide enough bandwidth to sustain today’s advanced flash storage array throughput. Using Apache Spark cloud big data applications, Rett syndrome research communities distributed across the globe can collaborate efficiently to share genome databases, identify common mutation patterns, and propose gene therapy ideas and treatments to cure Rett syndrome. Indeed, technology can prove to be a blessing in improving the quality of life of the silent Rett syndrome angels.
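
To make that last point concrete, here is a minimal, purely illustrative PySpark sketch of the kind of collaborative analysis described above: tallying how often each MECP2 mutation type appears across a shared, cloud-hosted variant dataset. The storage path and column names (sample_id, gene, mutation) are hypothetical assumptions, not a real dataset schema.

    # Count MECP2 mutation types across a shared variant dataset with PySpark.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("rett-mutation-patterns").getOrCreate()

    # Variant calls contributed by collaborating research centers (hypothetical path).
    variants = spark.read.parquet("s3a://shared-rett-research/variants.parquet")

    # Tally the most common MECP2 mutation types across all contributed samples.
    mutation_counts = (
        variants
        .filter(F.col("gene") == "MECP2")
        .groupBy("mutation")
        .agg(F.countDistinct("sample_id").alias("num_samples"))
        .orderBy(F.desc("num_samples"))
    )

    mutation_counts.show(20)
    spark.stop()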

Supporting Resources: