All posts by Kevin Deierling

About Kevin Deierling

Kevin Deierling has served as Mellanox's VP of marketing since March 2013. Previously he served as VP of technology at Genia Technologies, chief architect at Silver Spring Networks and ran marketing and business development at Spans Logic. Kevin has contributed to multiple technology standards and has over 25 patents in areas including wireless communications, error correction, security, video compression, and DNA sequencing. He is a contributing author of a text on BiCMOS design. Kevin holds a BA in Solid State Physics from UC Berkeley. Follow Kevin on Twitter: @TechseerKD

Call Me Stupid…But as Moore’s Law Ends, All I Hear is Tock, Tock, Tock

At the recent Silicon 100 Summit, Intel senior vice president Jim Keller gave a keynote titled: “Moore’s Law isn’t dead and if you think that, you’re stupid.” Well, then call me stupid, because I’ve been arguing for several years that Moore’s Law is dying, and making the case that this will have a profound impact on semiconductor, system, and data center innovation and dictate new development strategies required to succeed. In fact, my #1 prediction for 2019 was that Moore’s Law would be declared officially dead.

But before we go any further it’s important to define what we mean by Moore’s Law. For that, it’s useful to go back to Gordon Moore’s original article, “Cramming More Components onto Integrated Circuits,” published in Electronics magazine in 1965. It was in this article that Moore first publicly made the now-famous observation, recognizing the trend of the number of transistors on a chip doubling every twelve months, and predicted that this would continue for at least the next ten years. This observation and prediction proved so prescient that it was eventually accorded “Law” status. (Later, in 1975, citing the “end of cleverness,” Moore adjusted his Law to state that the doubling would occur every two years.)

Having started my career as a process development engineer in the wafer fabs, I was always irked by the notion that there was a law that transistor counts would keep doubling, as it seemed to imply that this was some sort of fundamental law of nature, and thus undercut the tremendous R&D and engineering effort required to stay on this exponential path of advancement. Nonetheless, all of us in the fabs struggled mightily to stay on this ever more complicated path.

So, if viewed through a narrow lens, Moore’s Law might be defined as the prediction of chip device density doubling every two years. But Gordon said much more, noting that simple, regular improvements in photolithography were all that were needed to maintain this doubling:

“It is not even necessary to do any fundamental research or to replace present processes. Only the engineering effort is needed.”

So if you can shrink lithographic dimensions by 30% every two years, the result is a chip with half the area (0.7 × 0.7 ≈ 0.5), or equivalently a chip that doubles the transistor count for a fixed die size.

But Gordon went well beyond this, predicting that this doubling would also have the benefits of faster devices that consumed less power:

“In fact, shrinking dimensions on an integrated structure makes it possible to operate the structure at higher speed for the same power per unit area.”

But perhaps most importantly, Moore’s Law also predicted that the reduction in cost per device would follow this same exponential trend. As Gordon himself later said: “Moore’s Law is really about economics,” and he clearly understood the impact that this would have on society, including in his original 1965 article this wonderful figure:

I just love this image because it demonstrates what a truly mind-blowing prediction Gordon made and just how visionary he was. Today we take computers that you can slide into your pocket for granted, but in 1965, when computers were the size of a house and required massive cooling plants to keep them running, the idea of a “handy home computer” was truly fantastic.

In fact, there are many concurrent innovations and challenges required to achieve each transition to the next process node. Meeting these challenges requires not just advances in photolithography, but addressing the entire range of engineering and physics challenges.

The device physics that underpins Moore’s Law was first clearly articulated by Robert Dennard in 1974, in a paper where he outlined the various device parameters and how they needed to scale together. He described how, as a MOSFET transistor shrinks, it also becomes:

    • Smaller
    • Faster
    • Lower power
    • Cheaper

The net takeaway from Dennard scaling is truly remarkable. If you scale things correctly, everything just gets better: smaller, faster, cheaper. And even though the number of transistors per unit area increases by the square of the scaling factor, the power of each device decreases by precisely the same square-law amount. The net result is that power density stays constant, so you get both faster and more devices in the same area and power budget, and it all costs the same. In short, Dennard scaling explains Moore’s Law at the level of fundamental device physics, but it also points out all the things that need to be scaled simultaneously to remain on track.
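To make the Dennard arithmetic concrete, here is a minimal sketch (in Python, using an illustrative 0.7x-per-node scaling factor; these are textbook scaling ratios, not measurements from any particular process) of how density, speed, and power move together when linear dimensions shrink by a factor k:

```python
# Illustrative Dennard-scaling arithmetic: shrink linear dimensions, voltage,
# and capacitance by a factor k (e.g. k ~ 0.7 per process node).
def dennard_scale(k: float) -> dict:
    density = 1 / k**2        # transistors per unit area rises as 1/k^2
    delay = k                 # gate delay shrinks ~k, so frequency rises ~1/k
    power_per_device = k**2   # C*V^2*f scaling: k * k^2 * (1/k) = k^2
    power_density = density * power_per_device  # stays ~constant
    return {
        "density_gain": density,
        "frequency_gain": 1 / delay,
        "power_per_device": power_per_device,
        "power_density": power_density,
    }

if __name__ == "__main__":
    for name, value in dennard_scale(0.7).items():
        print(f"{name}: {value:.2f}")
```

Running this with k = 0.7 shows roughly 2x the devices, about 1.4x the frequency, half the power per device, and a power density of 1.0, i.e. unchanged, which is exactly the "everything gets better" result described above.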

And Moore’s Law and the associated Dennard scaling have been wildly successful. Even as transistor counts repeatedly doubled every two years, the semiconductor cost per unit area has remained relatively constant over many decades at about one billion dollars per acre (the only thing I know of that is more expensive than Silicon Valley real estate).

It is this larger, economic part of Moore’s Law which has made it the powerful driving force behind the impact technology has made in our lives over the last 50+ years.

Thus, if you define Moore’s Law as simply the doubling of devices every couple of years, achieved by any means imaginable, then you can make a sort of weak argument that it will continue. I will stipulate that technology will continue to progress. Jim is a smart guy and I think that was the point he was making.

But 3D memory processes, multichip packaging, through-silicon vias, die stacking, and new vertical FinFET processes definitely don’t meet Gordon’s original criterion that no fundamental research or replacement of existing processes would be needed. Just ask the thousands of engineers involved in driving these chip technologies forward.

So call me stupid if you will, but Moore’s Law is Dead.

… at least Moore’s Law writ large, in the way Gordon originally articulated it.

Former Intel CTO Justin Rattner even admitted as much when talking about the end of the classical phase of Moore’s Law, in this interesting conversation at the Computer History Museum back in 2013: “for Intel, Moore’s Law for silicon gate MOS technology ended 3 generations back…”

At some point, one or more of these Dennard knobs hits practical limits, which puts stress on the overall scaling that underpins Moore’s Law. As dimensions shrink and geometries approach atomic diameters, second order electrical and even quantum effects become dominant, and staying on the Moore’s Law progression requires fundamental changes in process technology. In fact, such roadblocks were encountered in the previous decade, forcing the silicon foundries to abandon the standard silicon gate planar process and the simple “engineering effort” that served the first four decades of chips.

Figure 4: CPU clock speed increases stalled around 2002 (http://www.ni.com/white-paper/14565/en/)

So achieving each process step has become increasingly difficult, and key parts of both Moore’s Law and Dennard Scaling have broken. For example, the doubling of device performance stalled around 2002 and chips simply have not gotten much faster since.

Individual transistors have continued to speed up; however, at the chip level, practical issues like noise, crosstalk, clock distribution and jitter, and on-chip variation have all conspired to limit continued performance improvement. But even as processor clock speeds stalled, parts of Moore’s Law marched along. Indeed, just a few years later in 2005, AMD introduced its first multi-core processors, with Intel following closely behind. So we no longer got faster processors, but instead more and more of them.

But other parts of Dennard Scaling have hit a wall too – in particular as vertical tunneling of electrons through gate oxides became significant, constant power density collapsed, and chips consumed more and more power. This in turn required intricate and expensive packaging and cooling technologies at both the chip and system level.

And even allowing for jumping through all sorts of process hoops to maintain the crushing requirements imposed by any exponential law, former Intel CEO Brian Krzanich admitted on Intel’s Q2 2015 earnings call that the Moore’s Law Tick-Tock “cadence today is closer to 2.5 years than two.”

This “BK Brake” was a stunning admission by the Intel CEO, but in fact turned out to be wildly optimistic in terms of the actual delays that would occur moving from the 14nm to the 10nm node. It also highlighted the Tick-Tock model – the vaunted development cadence that Intel pioneered to take advantage of Moore’s Law.

The Tick-Tock Model of Chip Development

To understand this Tick-Tock model of chip development, you first need to understand how chips were developed before Intel invented it. In the time before Tick-Tock, process engineers, CPU architects, and chip designers were all part of one big, fully integrated team and moved in lock step together. A big new program would be kicked off to build a new chip architecture on a brand-new process node. This created giant leaps in performance with each new product, but had a couple of big flaws. First of all, you would only get a new product out every two years, as shown in the diagram below:

The second problem is that it requires solving two extraordinarily complex problems at the same time – bringing up a new manufacturing process with good yields in high volume, while simultaneously bringing up and debugging a new architecture. Solving two problems concurrently is always much more difficult and riskier than solving just one. It is like the adage of the man who chases one rabbit and has supper versus the man who chases two rabbits and starves.

But once you become fully convinced and committed to the two-year cadence of new process nodes promised by Moore’s Law, a new model becomes possible. This fundamental business model innovation takes advantage of parallel, rather than serial development.

The TICK-TOCK-TICK-TOCK model of chip development simply staggers the process and architecture development by one year. Each development still requires the same amount of time, but because they are staggered it allows a once a year cadence of new product introduction. One year the new chip is a process shrink of an existing architecture. The next year the new chip uses the same process node but introduces a new architecture.

The beauty of this chip development model is that it is a business innovation that takes advantage of a classic pipeline that hides latency to achieve throughput.

By staggering the development of a new architecture by one year from that of a new process, you can pipeline the introduction of new products. That is, you can hide the two-year development cycles of architecture and process by overlapping and staggering them. So even though it takes around two years to come up with a new architecture, and also roughly two years to develop a new process, you still manage to introduce a new product every year. By adopting this Tick-Tock development pipeline you not only get twice as many new products in a given time, but also reduce the risk by tackling only one of the two new developments with each new product.
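Here is a minimal sketch (Python; the starting year and node naming are illustrative, not Intel's actual roadmap) of how staggering two roughly two-year development pipelines yields a new product every year:

```python
# Illustrative Tick-Tock pipeline: process ("tick") and architecture ("tock")
# developments each take ~2 years, but staggering them by one year yields a
# new product every year.
def tick_tock_schedule(start_year: int, years: int) -> list[str]:
    products = []
    for i in range(years):
        year = start_year + i
        if i % 2 == 0:
            products.append(f"{year}: TICK - shrink the existing architecture onto a new process node")
        else:
            products.append(f"{year}: TOCK - new architecture on the now-mature process node")
    return products

if __name__ == "__main__":
    for line in tick_tock_schedule(2006, 6):
        print(line)
```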

In a nutshell, this is the Tick-Tock model of chip development, and it worked extremely well for many, many years. But like all pipelines, its cadence is limited by the slowest stage – and unfortunately one of the stages has slowed to a crawl. And now Moore’s Law is Dead … and with it, the Tick-Tock model.

So all we hear now from Intel is Tock, Tock, Tock.

That is five successive architectural TOCK products without a new process TICK. Now, to be fair, this problem is not unique to Intel. All chip companies are experiencing the too-much-TOCK problem, but its impact on Intel is particularly pronounced. After all, Intel originated the TICK-TOCK model of processor development more than a decade ago, and this cadence became a core part of its business strategy.

But all chip companies share the same laws of semiconductor physics, and as we approach the atomic limits of scaling, Moore’s Law has slowed to a crawl for everyone. So with the end of scaling comes the end of Moore’s Law (although I would argue it is the laws of economics at least as much as physics that are contributing to its demise). For a whole bunch of different reasons, it simply won’t be possible anymore to introduce a new manufacturing process every two years. So even if it is not dead, Moore’s Law is deeply fractured, and the Tick-Tock model has broken along with it: the vaunted cadence has been stuck at 14nm, with 4 or 5 new devices in the same process node coming out on a TOCK-TOCK-TOCK cadence.

So call me stupid, but rather than deny the demise of Moore’s Law, I prefer to embrace this new reality, understand the implications, and develop new strategies to cope with this brave new world. In my next blog I’ll discuss the implications and recommend some strategies to thrive in the post Moore’s world.


Achieving a Cloud Scale Architecture with SmartNICs

In the first blog of this series, I argued that it is function and not form that defines a SmartNIC (or smart NIC). I also introduced another category of data center NIC called an intelligent NIC (iNIC), which includes both hardware transport and a programmable data path for virtual switch acceleration. These capabilities are necessary but not sufficient for a NIC to be a SmartNIC. A true SmartNIC must also include an easily extensible, C-programmable Linux environment that enables data center architects to virtualize all resources in the cloud and make them appear as local. To understand why SmartNICs need this, we go back to what created the need for smarter NICs in the first place.

 

Why the World Needs SmartNICs

One of the most important reasons why the world needs SmartNICs is that modern workloads and data center designs impose too much networking overhead on the CPU cores. With faster networking (now up to 200Gb/s per link), the CPU simply spends too many of its valuable cycles classifying, tracking, and steering network traffic. These expensive CPU cores are designed for general purpose application processing, and the last thing they should be doing is consuming all that processing power simply looking at and managing the movement of data. After all, application processing that analyzes data and produces results is where the real value creation occurs.

The introduction of compute virtualization makes this problem worse, as it creates more traffic on the server both internally–between VMs or containers—and externally to other servers or storage. Applications such as software-defined storage (SDS), hyperconverged infrastructure (HCI), and big data also increase the amount of east-west traffic between servers, whether virtual or physical, and often Remote Direct Memory Access (RDMA) is used to accelerate data transfers between servers.

On top of the traffic increases, the use of overlay networks such as VXLAN, NVGRE, or Geneve, increasingly popular for public and private clouds, further complicates the network by introducing layers of encapsulation. Software-defined networking (SDN) imposes additional packet steering and processing requirements and burdens the CPU with even more work, such as running the Open vSwitch (OVS).
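To make the encapsulation overhead concrete, here is a minimal sketch (Python; the header layout follows the published VXLAN format, but the VNI and payload are made up for illustration) of the extra wrapping that every tenant packet picks up on an overlay network, work that either the CPU or the NIC has to do:

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags byte (I bit set), reserved bytes, 24-bit VNI."""
    flags = 0x08  # "I" flag: VNI field is valid
    return struct.pack("!B3xI", flags, vni << 8)  # VNI sits in the upper 3 bytes

def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    # In a real overlay, outer Ethernet/IP/UDP headers (UDP dst port 4789) are
    # prepended as well; only the VXLAN shim is shown here for brevity.
    return vxlan_header(vni) + inner_frame

original = b"\x00" * 64           # stand-in for a tenant's Ethernet frame
wrapped = encapsulate(original, vni=42)
print(len(original), "->", len(wrapped), "bytes (plus outer L2/L3/L4 headers)")
```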

Smarter NICs can handle all this virtualization (SRIOV, RDMA, overlay network traffic encapsulation, OVS offload) faster, more efficiently, and at lower cost than standard CPUs.

 

Another Reason for the SmartNIC—Security Isolation

In addition, sometimes you want to isolate the networking from the CPU for security reasons. The network is the most likely vector for a hacker attack or malware intrusion and the first place you’d look to detect or stop a hack. It’s also the most likely place you’ll want to implement in-line encryption. The SmartNIC—being a NIC—is the first, easiest, and best place to inspect network traffic, block attacks, and encrypt transmissions. This has both performance and security benefits, as it eliminates the frequent need to route all incoming and outgoing data back to the CPU and across the PCIe bus. It also provides security isolation by running separately from the main CPU: if the main CPU is compromised, the SmartNIC can still detect or block malicious activity, and it can do so without immediately involving the CPU.

The security benefits of a SmartNIC are covered more in this Security blog by Bob Doud.

The Mellanox BlueField Smart NIC offers extremely fast hardware acceleration of networking and storage functions plus C-programmable Arm cores

Virtualizing Storage and the Cloud

A newer use case for SmartNICs is to virtualize software-defined storage, hyperconverged infrastructure, and other cloud resources. Before the virtualization explosion, most servers just ran local storage, which is not always efficient but is easy to consume—every OS, application, and hypervisor knows how to use local storage. Then came the rise of network storage—SAN, NAS, and more recently NVMe over Fabrics (NVMe-oF). But not every application is natively SAN-aware, and some operating systems and hypervisors (like Windows and VMware) don’t speak NVMe-oF yet. Something smart NICs can do is virtualize networked storage (which is more efficient and easier to manage) to look like local storage (which is easier for applications to consume). A SmartNIC could even virtualize GPUs (or other neural network processors) so that any server can access as many GPUs as it needs, whenever it needs them, over the network.

A similar advantage applies to software-defined storage and hyperconverged infrastructure, as both use a management layer (often running as a VM or as a part of the hypervisor itself) to virtualize and abstract the local storage—and the network—to make it available to other servers or clients across the cluster. This is wonderful for rapid deployments on commodity servers and is good at sharing storage resources, but the layer of management and virtualization soaks up many CPU cycles that should be running the applications. And like with standard servers, the faster the networking runs and the faster the storage devices are, the more CPU must be devoted to virtualizing these resources.

Once again, enter the intelligent NIC (smarter NIC) or SmartNIC. The first offloads and helps virtualize the networking (accelerating private and public cloud, which is why they are sometimes called CloudNICs), and the second can offload both the networking and much or all of the storage virtualization. SmartNICs can also offload a wide variety of functions for SDS and HCI, such as compression, encryption, deduplication, RAID, reporting, etc., all in the name of sending the expensive CPU cores back to what they do best—running applications.

Mellanox ConnectX adapters are intelligent NICs that offer hardware offload of popular networking and virtualization functions

Choosing the best SmartNIC—Must have Hardware Acceleration

Having covered the major SmartNIC use cases, we know we need them and where they can provide the greatest benefit. They must be able to accelerate and offload network traffic and also might need to virtualize storage resources, share GPUs over the network, support RDMA, and perform encryption. Now what are the top SmartNIC requirements? First all SmartNICs (and smarter NICs) must have hardware acceleration.  Hardware acceleration offers the best performance and efficiency, which also means more offloading with less spending. The ability to have dedicated hardware for certain functions is key to the raison d’être of SmartNICs.

 

Must be Programmable

While for the best performance most of the acceleration functions must run in hardware, for the greatest flexibility, the control and programming of these functions needs to run in software.

There are many functions that could be programmed on a smart NIC, a few of which are outlined in the feature table of my previous blog. Usually the specific offload methods, encryption algorithms, and transport mechanisms don’t change much, but the routing rules, flow tables, encryption keys, and network addresses change all the time. We recognize the former as data plane functions and the latter as control plane functions. The data plane rules and algorithms can be coded into silicon once they are standardized and established. The control plane rules and programming change too quickly to be hard-coded into silicon, but can run on an FPGA (modified occasionally, but with difficulty) or in a C-programmable Linux environment (modified easily and often).
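To make the data plane/control plane split concrete, here is a minimal sketch (Python; the rule format, field names, and actions are invented purely for illustration) of a control plane installing flow rules into the kind of match/action table a SmartNIC would execute in hardware:

```python
# Illustrative data-plane / control-plane split. In a SmartNIC the flow_table
# lookup and actions run in hardware; the rule installation below is the kind
# of control-plane logic that can run on the NIC's Arm cores or on the host CPU.
flow_table = {}  # (src_ip, dst_ip, dst_port) -> action

def install_rule(src_ip, dst_ip, dst_port, action):
    """Control plane: decide policy and program the data plane."""
    flow_table[(src_ip, dst_ip, dst_port)] = action

def process_packet(src_ip, dst_ip, dst_port):
    """Data plane: per-packet match/action lookup, fast and fixed-function."""
    return flow_table.get((src_ip, dst_ip, dst_port), "send_to_control_plane")

install_rule("10.0.0.1", "10.0.0.2", 80, "forward_to_vf_3")
install_rule("10.0.0.1", "10.0.0.9", 443, "encap_vxlan_vni_42")
print(process_packet("10.0.0.1", "10.0.0.2", 80))   # hit: forward_to_vf_3
print(process_packet("10.0.0.5", "10.0.0.2", 22))   # miss: punt to control plane
```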

How much of the Programming Needs to Live on the SmartNIC?

We have a choice on how much of a SmartNIC’s programming is done on the smart NIC. That is, the NIC’s handling of packets must be hardware-accelerated and programmable, but the control of that programming can live on the NIC or elsewhere.  If it’s the former, we say the NIC has a programmable data plane (executing the packet processing rules) and control plane (setting up and managing the rules). In the latter case, the NIC only does the data plane while the control plane lives somewhere else, like the CPU.

For example with Open vSwitch, the packet switching can be done in software or hardware, and the control plane can run on the CPU or on the SmartNIC. With a regular foundational or “dumb” NIC, all the switching and control is done by software on the CPU.  With a smarter NIC the switching is run on the NIC’s ASIC but the control is still on the CPU.  With a true SmartNIC the switching is done by ASIC-type hardware on the NIC while the control plane also runs on the NIC in easily-programmable Arm cores.

With Open vSwitch offload, the packet processing is offloaded to a ConnectX NIC eSwitch while the control plane runs on a CPU, or both the packet processing and control plane can run on a BlueField SmartNIC

ConnectX-5 NIC offloads OVS switching to NIC hardware
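As a concrete illustration of where the offload knob lives, here is a minimal sketch (Python shelling out to standard Open vSwitch tooling; it assumes a Linux host with OVS installed and a NIC/driver combination that supports hardware offload, and the service name varies by distribution):

```python
# Minimal sketch: enabling OVS hardware offload so that flow match/action work
# is pushed down to the NIC's embedded switch, while the OVS control plane
# keeps running in software on the host (or on a SmartNIC's Arm cores).
import subprocess

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(["ovs-vsctl", "set", "Open_vSwitch", ".", "other_config:hw-offload=true"])
run(["systemctl", "restart", "openvswitch"])   # service name may be openvswitch-switch
run(["ovs-vsctl", "get", "Open_vSwitch", ".", "other_config:hw-offload"])
```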

 

So Which SmartNIC is Best for Me, and Which Mellanox Adapters are SmartNICs?

Both transport offload and a programmable data path with hardware offload for virtual switching are vital functions to achieve application efficiency in the data center. According to the definition in my earlier blog Defining a SmartNIC, these functions are part of an intelligent NIC and are table stakes on the path to a SmartNIC. But just transport and programmable virtual switching offload by themselves don’t raise an intelligent NIC to the level of a SmartNIC or Genius NIC.

Very often we find customers who tell us they must have a SmartNIC because they need programmable virtual switching hardware acceleration. This is mainly because a competing vendor with an expensive, barely programmable offering has told them a “SmartNIC” is the only way to achieve this. In this case we are happy to deliver the very same functionality with our ConnectX family of intelligent NICs, which after all are very smart NICs.

But by my reckoning there are a few more things required to take a NIC to the exalted level of a SmartNIC, such as running the control-plane on the NIC and offering C-programmability with a Linux environment. In those cases, we’re proud to offer our BlueField SmartNIC, which includes all the smarter NIC features of our ConnectX adapters plus from 4 to 16 64-bit Arm cores, all of course running Linux and easily programmable.

As you plan your next infrastructure build-out or refresh, remember my key points:

  • SmartNICs are increasingly useful for offloading networking functions and virtualizing resources like storage, networking, and GPUs
  • Intelligent NICs (iNICs or smarter NICs) accelerate data plane tasks in hardware but run the control plane in software
  • The control plane software—and other management software—can run on the regular CPU or on the very smart (Genius) NICs
  • Mellanox offers best-in-class intelligent NICs (ConnectX), FPGA NICs (Innova), and fully programmable data plane/control plane SmartNICs (BlueField SmartNIC)

 


What Is a SmartNIC?

Everyone is talking about SmartNICs, but without ever answering one simple question: what is a “SmartNIC” and what does it do? Now, NIC of course stands for “Network Interface Card”; practically speaking, a NIC is a PCIe card that plugs into a server or storage box to enable connectivity to an Ethernet network. A SmartNIC goes beyond simple connectivity and implements network traffic processing on the NIC that would otherwise have to be performed by the CPU in the case of a foundational NIC.

Some vendors’ definition of a SmartNIC is focused entirely on the implementation. But this is problematic, as different vendors have different architectures, and thus a SmartNIC can be ASIC, FPGA, or System-on-a-Chip (SoC) based. Naturally, vendors who make just one kind of NIC seem to insist that only the type of NIC they make should qualify as a SmartNIC:

A SmartNIC definition could include NICs built using an ASIC, an FPGA, or a System on Chip (SOC).

There are various tradeoffs between these different implementations with regards to cost, ease of programming, and flexibility. An ASIC is very cost effective and may deliver the best price performance, but it suffers from limited flexibility. While an ASIC-based NIC (like the Mellanox ConnectX-5) can have a programmable data path that is relatively simple to configure, ultimately functionality will have limitations based on what functions are defined within the ASIC, and that can prevent certain workloads from being supported. An FPGA NIC (like the Mellanox Innova-2 Flex) by contrast is highly programmable and with enough time and effort can be made to support almost any functionality relatively efficiently (within the constraints of the available gates). However FPGAs are notoriously difficult to program and expensive. So for the more complex use cases, the SOC (like the Mellanox BlueField SmartNIC) provides what appears to be the best SmartNIC implementation option: good price performance, easy to program, and highly flexible.

The ASIC SmartNIC, FPGA SmartNIC, and SoC SmartNIC have different pros and cons with the ASIC-based NIC having the best price-performance and the SoC NIC having the greatest flexibility.

But focusing on how a particular vendor implements a SmartNIC doesn’t really do justice to addressing exactly what it is capable of or how it should be ideally architected. In the case of Mellanox we actually have products that could be classified as ‘SmartNIC’ based on each of these architectures – and in fact customers use each of these products for different workloads depending on their needs. So the focus on implementation – i.e. ASIC vs. FPGA vs. SoC – reverses the ‘form follows function’ philosophy that underlies the best architectural achievements.

So rather than focusing on implementation, I think this PC Magazine entry gives a pretty good working definition of what makes a NIC a SmartNIC:

SmartNIC: A network interface card (network adapter) that offloads processing tasks that the system CPU would normally handle. Using its own on-board processor, the SmartNIC may be able to perform any combination of encryption/decryption, firewall, TCP/IP and HTTP processing. SmartNICs are ideally suited for high-traffic Web servers.

There are two things I like about this definition. First, it focuses on the function more than the form; second, it hints at the form with the statement “…using its own on-board processor … to perform any combination of …” network processing tasks. So the embedded processor is key to achieving the flexibility to perform almost any networking function. I would just modernize that definition in two ways. First, SmartNICs might also perform network, storage, or GPU virtualization. Second, SmartNICs are also ideally suited for telco, security, machine learning, software-defined storage, and hyperconverged infrastructure servers—not just Web servers.

So let’s look at some of the functions that network adapters can support and use the ability to accelerate different workloads to differentiate three categories of NICs:

Comparing exactly which functions and workloads are accelerated is the proper way to differentiate between a SmartNIC, an intelligent NIC (iNIC), and a foundational NIC.

Here I’ve defined three classes of NICs: foundational NIC, intelligent NIC (iNIC), and SmartNIC, based on their ability to accelerate specific functionality. The basic, or foundational, NIC simply moves network traffic and has few or no offloads, other than possibly SRIOV and basic TCP acceleration, so it doesn’t save any CPU cycles and can’t offload packet steering or traffic flows. At Mellanox we don’t even sell a foundational NIC anymore; our ConnectX adapter family features a programmable data path and accelerates a whole range of functions that first became important in public cloud use cases. For this reason I’ve defined this type of NIC as an “intelligent NIC” (iNIC), although today on-premises enterprise, telco, and private clouds are just as likely as public cloud providers to need this type of programmability and acceleration functionality. So another name for it could be “smart NIC” or “smarter NIC” without the capital “S.”

In many cases customers tell us they need SmartNIC capabilities that are being offered by a competitor with either an FPGA or a NIC combined with custom, proprietary processing engines. But when customers really look at the functions they need for their specific workloads, ultimately they decide that the ConnectX family of iNICs provides all the function, performance, and flexibility of other so-called SmartNICs at a fraction of the power and cost. So by the definition of SmartNIC that some competitors use, our ConnectX NICs are indeed SmartNICs, though we might call them intelligent NICs or smarter NICs. Our FPGA NIC (Innova) is also a SmartNIC in the classic sense, and our SoC NIC (using BlueField) is the smartest of SmartNICs, to the extent that we could call them Genius NICs.

So what is a SmartNIC? A SmartNIC is a network adapter that accelerates functionality and offloads it from the server (or storage) CPU.

But as to how one should build a SmartNIC and which SmartNIC is the best for each workload—the devil is in the details and it’s important to dig into exactly what data path and virtualization accelerations are available and how they can be used. I’ll do this in my next blog.

 


Rethinking Data Center Security – From an M&M to a Jawbreaker Model

With the explosion of mobile devices and applications, more and more of our personal data and online interaction is captured, stored, and analyzed. As a result, security and privacy are becoming major issues at the government, corporate, and personal levels. At the same time there is a rapid and fundamental transformation occurring, with applications and data shifting from traditional enterprises to the cloud. This in turn causes a major change in network security architectures. In traditional enterprise architectures, a perimeter-based security model was sufficient, with ingress and egress points to the internal network protected, but all internal access and applications within the enterprise assumed to be trusted. This is the so-called “M&M” security model – hard on the outside but soft on the inside!

 

The Problem

However, with the cloud this model breaks down. In the cloud, providers actually invite their customers, and their customers’ customers, right inside the data center. In fact, in a multi-tenant cloud this means that customers are able to spin up their own applications, potentially running on the exact same physical infrastructure as their most bitter competitors or other “hostiles.” Clearly in this environment the M&M model is insufficient, and what is required instead is the “Jawbreaker” security model – hard on the outside *and* hard on the inside. For example, the recent Spectre and Meltdown security exploits enable applications hosted on shared cloud resources to steal data and harm other users’ applications. This great security blog explains how to avoid these types of threats using BlueField SmartNICs.

So security is a key part of what is driving virtualization, at both the server and network levels. Server virtualization provides good isolation between applications running on Virtual Machines (VMs). However, this protection of application data comes at a real cost, as the hypervisor effectively implements security functions in software to isolate the VMs, and this consumes massive CPU resources. This means that the most expensive part of the server, the CPU and memory subsystem, is being consumed by tasks not related to the actual customer application that needs to be run. The amount of CPU consumed is increasing dramatically as both data and east-west traffic increase. These problems are exacerbated as VMs move to containers – creating an explosion of microservices, increased network traffic, and diminished isolation between containers. So many more CPU cycles are consumed to deliver distributed security.

Similarly network isolation is being provided by overlay networks which create virtual layer 2 networks over a layer 3 physical underlay network. While this provides good isolation of network traffic in a multi-tenant cloud, again this creates massive CPU utilization to map and bridge between the logical overlay and physical underlay networks.

The Solution – Smart Network Accelerators Enable Distributed Security Everywhere

Fortunately, both server and network security can be addressed by smart network accelerators within SmartNICs and next generation Ethernet switches. Networking vendors such as Mellanox are at the forefront of this acceleration technology – delivering scalable, secure solutions that can protect an individual’s activities, data, and privacy no matter where a consumer’s digital wanderings may take them. The new BlueField SmartNIC is a great way to achieve comprehensive distributed security without sacrificing application performance.

These accelerators are implemented by technologies such as NFV, OVS, VXLAN, DPDK, and ASAP2. But in many cases these technologies are extending core security functions, which used to reside at the hard edge of the M&M model, into the hard interior of the new Jawbreaker security architecture.

This means functions like load balancing, firewalls, address translation, and application delivery control move from edge appliances to distributed software services running on servers throughout the network. Software-only implementations are extending and improving the security of data and resources – but at the huge cost of reduced efficiency of the server and storage infrastructure implementing these features. Fortunately, a new generation of smart networking adapters is able to deliver the needed security in hardware, leaving expensive server CPU resources available to run applications.

So the network is becoming more important than ever, allowing a distributed Jawbreaker security model, while simultaneously enabling cloud and service providers to achieve total infrastructure efficiency from their server and storage investments.

Seven Predictions for 2017

Fibre Channel Market Collapsing in 2017


As we look forward to 2017 it is time to peer out into the distance and think about what will happen during the year:

 

  1. 2017 will mark the beginning of the Hunger Games for high performance Optical Component Startups.

This fight to the death is inevitable because there is no de facto industry standard for 25, 50, and 100G optical interconnects. The early adopters of advanced optical interconnect are the Super Seven – and each has its own view on what the technology should look like. Some want to use multi-mode fiber while others insist on single mode. Some use QSFP and parallel fiber while others insist on WDM (wavelength division multiplexing) over single-fiber SFP. Still others want to use breakout or pigtailed options. This fragmentation of the market means that small manufacturers can’t develop all of these different options. And with only one major customer for each variant, it is a very dangerous game to play. Some players who are “one-trick ponies” will find themselves unable to achieve scale and maintain the investment required to compete.

 

  2. NVMe Over Fabrics will “Cross the Chasm” and accelerate the decline of Fibre Channel

NVMe over Fabrics (NVMe-oF) arrived with a bang just a scant 18 months ago and is being driven forward by the performance advantages of RDMA and RoCE. Often a new technology is over-hyped into a state of “Overstated Expectations” and eventually falls into a “Trough of Disillusionment.” But like RoCE before it, NVMe-oF will cross the chasm in 2017, with GA solutions appearing that deliver true business value. This will accelerate the decline of Fibre Channel, making that market collapse even faster.

 

  3. Flash Memory will demonstrate remarkable market resilience vs. the new class of Non-Volatile memory competitors such as 3D-Xpoint and ReRAM

Many have predicted the end of Flash memory with the advent of new non-volatile memory technologies such as 3D-Xpoint and ReRAM. In true Mark Twain fashion, the news reports of the death of Flash are greatly exaggerated. Flash memory will continue to thrive even as the new technologies struggle to become reliable and manufacturable in high volumes. In fact the major Flash memory manufacturers will innovate to dramatically improve the read and write latency of Flash, thereby closing the gap on the main advantage of these new technologies.

 

  4. We’ll see a Flash Crash with several prominent All Flash Array vendors finally succumbing

With success comes fierce competition. So despite the overall success of Flash Memory storage there will be winners and losers. Violin Memory will be first and foremost among the struggling All Flash Array vendors that will finally give up the ghost. The competition will get fierce as the big boys and especially the now colossal Dell-EMC hit their stride. It will become increasingly difficult for the smaller guys (and maybe even some of the bigger guys) to compete. Consolidations and pink slips will be the order of the day in 2017.

 

  5. NFV will finally start Functioning in 2017

In 2016 the much ballyhooed potential of Network Function Virtualization (NFV) to eliminate the need for purpose-built appliances failed to materialize. Unfortunately, vendors found that when they ported their applications to industry standard servers, the performance of their network functions (such as load balancers and firewalls) was dramatically degraded. The promise of better agility with pure software-defined virtual network functions (VNFs) unfortunately came at the expense of untenable tradeoffs on price, performance, and power. Instead of reducing cost, the performance limitations of VNFs running on x86 servers meant more boxes, dollars, and megawatts.

 

But this will all change in 2017. Advanced 25, 50, and 100G network adapters now have built-in Open vSwitch (OVS) hardware accelerators that allow vendors to achieve the agility and DevOps capabilities of software-defined VNFs without the performance penalties previously suffered. That, combined with nimble software providers developing true cloud-native VNFs based on scalable microservices, will make 2017 the year that NFV finally starts to function!

 

  6. The grand vision of SDN will stall and SDN will become “just an overlay” technology

The original grand vision of Software Defined Networking was to create an entirely new, centrally managed, flow-based networking architecture. But displacing 30+ years of router technology is a tall order. In reality, all but the largest cloud providers have rejected the forklift upgrade required to replace all their BGP routers with a flow-based manager – and instead have adopted only a subset of the grand SDN vision. The use of “overlay” networks (based on VXLAN, NVGRE, or GENEVE technologies) is becoming widely adopted. This form of network virtualization enables isolation within a multi-tenant service provider environment and, importantly, allows tenants to span across L3 routers transparently to both the traditional routers and the tenant software. So SDN and network virtualization are becoming a reality, but only the overlay part of the grand vision.

 

  7. OpenFlow will Morph From a Protocol Into an Interface

Closely related to SDN, the original grand vision was to use OpenFlow to replace traditional end-point, path-based routing algorithms and instead treat every flow as a separate entity. Unfortunately, this vision didn’t take into account the scalability challenges of flow-based forwarding, nor the robustness and feature set that has evolved around traditional network routing, quality of service, and management. So the funeral for OpenFlow as a network routing technology will be held in 2017, but it will persist as an API to configure flow policies at end points and within gateways.

 

WooHoo!! We’re Two!! Happy Birthday 25G Ethernet Consortium!

Why 25G Ethernet is like a back-Flipping Wall-Climbing 2 Year Old!

Happy Birthday! Today marks the 2nd birthday of the 25G Ethernet Consortium. It was July 1, 2014 that Microsoft, Google, Mellanox, Arista, and Broadcom first announced the formation of the consortium in order to define interoperable 25 and 50 Gigabit Ethernet solutions. There is a lot of history behind why 25G Ethernet was defined by this consortium, rather than in the IEEE, which you can read about in this Electronic Design article.

Fast forward two years and there has been tremendous progress. Today the specification is at version 1.4 which has allowed multiple vendors to develop interoperable solutions. The first 25/50G Consortium Plugfest is being held next month with interoperability demonstrations expected from 20+ companies. Of course we’ll be there with our end to end line of 25, 50, & 100Gb/s Ethernet solutions of adapters, switches, and copper and optical cables and transceivers.

UPDATE: I neglected to mention in my original post that on June 30, the IEEE also approved the 802.3by spec for 25G Ethernet! From the ieee802.org 25G reflector:

“Congratulations to all!  The IEEE-SA Standards Board today approved 802.3by as an IEEE Standard!  We are done!”

 

25G Ethernet Adapters

Perhaps even more important than the standards is the fact that major server vendors including Dell, HPE, and Lenovo have 25G network adapter solutions based on our 25/50G ConnectX-4 Lx device.


For example HPE offers 25G Adapters as both regular standup PCIe cards and in the compact FlexibleLOM form factors.

25G, 50G  & 100G Ethernet Switches

In addition, a broad range of 25, 50, and 100Gb/s Ethernet switches is now available. This includes the complete line of Mellanox 25/50/100GbE switches, including the half-rack-width SN2100, the 48+8 port SN2410, and the 32 port SN2700. Based on the Spectrum switching silicon, these switches offer the best performance and predictability in the industry. You can read the Predictable Performance blog to learn more about how Spectrum-based switches deliver the lowest latency, best congestion resilience, predictable performance, fairness, and Zero Packet Loss.

  • SN2100: 16 ports @ 100G in a half rack width; can be 64 ports @ 25G with breakout cables
  • SN2410: 48 ports @ 25G + 8 ports @ 100G
  • SN2700: 32 ports @ 100G

25G, 50G  & 100G Ethernet Cables and Transceivers

And lastly we have a complete line of LinkX copper and fiber cables and transceivers. These include both VCSEL based multimode short reach and Silicon Photonics based single mode long reach optical cables and transceivers. We’ve got the best 100G optical modules in the business and are looking forward to the expected ramp in 25, 50, and 100G data centers from hyperscale customers in the second half of 2016.


Analysts have been predicting a rapid ramp for 25GbE technology, and this Network Computing article explains three of the key drivers behind this explosive growth. The bottom line: for a technology that is only two years old, it is amazing to see how rapidly the entire 25, 50, and 100G Ethernet ecosystem has come together with a robust end-to-end line of GA products. The 25G Ethernet market is really taking off! Can’t wait to see what it will look like at 3!

The Race to 25G Ethernet – Seven Critical Issues that will Decide the Winner

Often marketers treat new technology like a foot race, and for some it seems the ultimate goal is to be the first to announce a new product. But in reality the first to announce, just like the first out of the blocks, doesn’t actually determine the end of the story as this race video shows.

In reality there are many issues that need to be considered when choosing the right partners with which to deploy new technologies, and an aggressive marketing department willing to announce a product just to be first is the least important of these considerations.

The new 25Gb/s Ethernet technology is a great case in point, and it is now hitting full stride with major server vendors announcing support for both adapters and switches. Mellanox has been at the vanguard of this technology, being one of the original founders of the 25G Ethernet Consortium, along with hyperscale providers like Google and Microsoft.

So if “first to announce” isn’t the primary consideration to determine ultimate success with a new technology, it raises the question what is important?

Here is my take on the top seven considerations for evaluating companies, which will actually determine success in the new 25GbE technology:

  1. Technology
  2. Manufacturing and Operational Capabilities
  3. Price/Performance
  4. Ease of Adoption
  5. Product Robustness and Reliability
  6. Corporate and Financial Stability
  7. End to End Portfolio

1.     Technology

The first and most critical consideration for most customers is the core features and capabilities of a new technology. What is most important here is that the technology just works and that the advanced feature set can be easily consumed and delivers true business value. The good news here is that Mellanox offers ConnectX-4 Lx 25/50 GbE adapters that deliver not just 2.5X higher bandwidth, but combine this with advanced networking offloads that accelerate cloud, virtualization, storage, and communications. These offloads mean that more CPU power is available to run applications rather than being consumed by moving data. So the ultimate benefit is application and infrastructure efficiency that results in a better data center ROI using 25GbE.

2.     Manufacturing and Operational Capabilities

So even if the technology works and has the features you need, it’s vital to consider whether your technology partner can manufacture and deliver 25GbE products in high volume and in a timely fashion that meets your business needs. There is nothing more frustrating than having significant customer system revenue opportunity delayed or lost because of supply chain problems with a single component.

The good news here is that Mellanox has proven itself as a reliable supplier shipping to the largest OEM and data center customers in the world. We are the market leader today in high performance Ethernet NICs (>10Gb/s), with over 90% market share. We are shipping millions of ConnectX adapters to the largest public cloud, Web 2.0, storage, and server OEM customers every year with reliable and dependable delivery. Our ConnectX-4 Lx adapters are a mature product line, with a broad set of software driver support, and have been battle-hardened in real-world deployments. We maintain significant inventory staged throughout the world to enable us to meet upside demand on an expedited schedule.

3.     Price/Performance

Industry analysts are predicting that 25GbE will have the fastest adoption ramp ever for a new Ethernet technology.


Figure 2: Fastest Ever Adoption Forecast for 25Gb/s Ethernet

Making this forecast a reality requires 25GbE technology that is not just manufacturable with better features and capabilities, but that also delivers a true price/performance advantage.


Figure 3: Price Performance Advantage of 25GbE

And here 25GbE delivers on the price/performance front as well, as can be seen in the Crehan forecast. While 25GbE is slightly more expensive than 10GbE, when normalized for performance it is much cheaper in terms of $/Gb/s of bandwidth.

In fact the 25GbE pricing is very competitive, with only a 30%-40% premium over 10GbE, and this premium is expected to come down over time. Achieving these competitive pricing levels requires devices that are optimized to support 25GbE.
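To see why a 30%-40% premium still wins on a per-bandwidth basis, here is a minimal sketch (Python; the absolute adapter prices are purely hypothetical, and only the 2.5x bandwidth ratio and the premium range come from the discussion above):

```python
# Hypothetical prices purely for illustration; only the bandwidth ratio
# (25 vs 10 Gb/s) and the ~35% premium are taken from the text above.
price_10gbe = 100.0                 # assumed 10GbE adapter price (arbitrary units)
price_25gbe = price_10gbe * 1.35    # ~30-40% premium over 10GbE

cost_per_gbps_10 = price_10gbe / 10
cost_per_gbps_25 = price_25gbe / 25

print(f"10GbE: {cost_per_gbps_10:.2f} per Gb/s")
print(f"25GbE: {cost_per_gbps_25:.2f} per Gb/s")
print(f"25GbE is {(1 - cost_per_gbps_25 / cost_per_gbps_10) * 100:.0f}% cheaper per Gb/s")
```

With these assumed numbers, 25GbE comes out roughly 45% cheaper per Gb/s of bandwidth, which is the normalization the Crehan comparison above is making.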

This is precisely why Mellanox introduced the new ConnectX-4 Lx silicon for our 25GbE adapter products. The ConnectX-4 Lx is a dedicated 25/50GbE device with an x8 PCIe interface. This is in contrast to the larger and more expensive ConnectX-4 device, which has a wider PCIe interface and is capable of supporting 100GbE performance levels. Other offerings that try to cut corners with a one-size-fits-all approach won’t be able to meet the aggressive price targets required by this market.

4.     Ease of Adoption


Figure 4: ConnectX-4 Lx Adapter with Backwards Compatible SFP28 Connectors

At Mellanox we’ve worked hard to ensure that 25Gb/s Ethernet offers a seamless upgrade for existing 10GbE environments, with backwards compatibility that uses the same 10GbE LC fiber cabling that has already been deployed in the data center. Other 25GbE NIC offerings require the use of special QSFP to SFP28 breakout cables, and thus do not provide backwards compatibility with existing LC fiber. In fact, there is no solution to connect these NICs to fiber at all.

By contrast the ConnectX-4 Lx offers ordinary SFP style connectors enabling a choice of either copper or fiber connectivity in the same manner as existing 10GbE is deployed.

5.     Product Robustness and Reliability

It is critical that a new technology is robust and reliable. Even a few bad customer experiences can create a perception that a technology has issues and is not ready for primetime. The perception of poor reliability is difficult to overcome and can set back the adoption of a new technology for years.

Building a robust and reliable product is hard and requires everything (silicon, hardware, software, and components) to be designed to the highest standards and built to last. Often weakness in one area can cascade and cause challenges that impact the entire system design and limit product reliability.

For example a high powered device may require special cooling such as a mechanical fan. This should be a red flag as it can cause many thermal and mechanical challenges and has the potential to limit the overall reliability and adversely impact mean time between failures.


Figure 5: Fans on competitors’ offerings are a big red flag that indicates high power, which can limit product lifetime

Fortunately Mellanox ConnectX-4 Lx Adapters are low power and don’t require fans. The adapters are fully qualified and shipping as GA  products. All of these products have undergone rigorous qualification screening and are designed for reliable operation.

6.     Corporate and Financial Stability

When you choose a technology provider you are also choosing a business partner and it is important to consider the financial health and corporate well-being of the company behind the technology. After all supporting and qualifying a new technology is difficult and requires a significant resource investment by the system vendor. You want to make sure that your business partner is financially healthy with a strong leadership team in place and will continue to invest in software and hardware to drive the technology forward. A company that is financially strong with growing revenues and profits has the ability to continue to invest R&D resources to expand application support and develop new technology. It can be a huge setback to find that all of your key contacts at your technology supplier suddenly don’t work there anymore. So when you choose your technology partners consider not just the technology, operational capabilities, and reliability but also the financial health and stability of the companies you work with.

7.     End to End Portfolio

When introducing a new technology it is important that a comprehensive product offering is in place that allows for end to end connectivity. Mellanox has an entire end to end product line including 25, 50, and 100 GbE ConnectX-4/4Lx adapters, Spectrum switches, and LinkX cables that are generally available and shipping in volume. This is important as it allows Mellanox to perform integration and optimization at every level of the product line to ensure that solutions just work. By qualifying our end to end product line we learn a great deal about each individual component which allows us to improve on all fronts.

But it is equally important that we have interoperability with the entire 25Gb/s Ethernet ecosystem. As one of the founding members of the 25G Ethernet Consortium, a member of the Ethernet Alliance, and a founding member of the RoCE Initiative, Mellanox is committed to compliance and interoperability in order to drive 25Gb/s Ethernet technology forward.

Conclusion

The race to 25G Ethernet technology has just begun, and it is important to be a leader in delivering this new technology. However, it goes well beyond what a company says, and there are many concerns far more important than being first to announce. Here we’ve outlined seven critical issues to consider that will ultimately determine who the winner is in the race to 25G Ethernet technology. But no matter which provider wins, one thing is for certain – 25G Ethernet is a great new technology that delivers compelling value, and the customer will win for sure.

 

Mellanox and NexentaEdge Crank Up OpenStack Storage with 25GbE!

Mellanox and NexentaEdge High Performance Scale-Out Block & Object Storage Deliver Line Rate Performance on 25Gb/s and 50Gb/s Fabrics.

This week at the OpenStack Summit in Austin, we announced that Mellanox end-to-end Ethernet solutions and the NexentaEdge high performance scale-out block and object storage are being deployed by Cambridge University for their OpenStack cloud.

Software-Defined Storage (SDS) is a key ingredient of OpenStack cloud platforms, and Mellanox networking solutions, together with Nexenta storage, are the key to achieving efficient and cost-effective deployments. Software-Defined Storage fundamentally breaks the legacy storage model that requires a separate Storage Area Network (SAN) interconnect and instead converges storage onto a single integrated network.

NexentaEdge block and object storage is designed for any petabyte scale, OpenStack or Container-based cloud and is being deployed to support Cambridge’s OpenStack research cloud. The Nexenta OpenStack solution supports Mellanox Ethernet solutions from 10 up to 100 Gigabit per second.

NexentaEdge is a ground-breaking high performance scale-out block and object SDS storage platform for OpenStack environments. NexentaEdge is the first SDS offering for OpenStack to be specifically designed for high-performance block services with enterprise grade data integrity and storage services. Particularly important in the context of all-flash scale-out solutions, NexentaEdge provides always-on cluster-wide inline deduplication and compression, enabling extremely cost-efficient high performance all-flash storage for OpenStack clouds.

Over the last couple of weeks, Mellanox and Nexenta worked jointly to verify our joint solution’s ability to linearly scale cluster performance with the Mellanox fabric line rate. The testbed comprised three all-flash storage nodes with Micron SSDs and a single block gateway. All four servers in the cluster were connected with Mellanox ConnectX-4 Lx adapters, capable of either 25Gbps or 50Gbps Ethernet.

NexentaEdge, configured with Nexenta Block Devices on the gateway node, demonstrated 2x higher performance as the Mellanox fabric line rate increased from 25Gbps to 50Gbps.


For example, front-end 100% random write bandwidth (with 128KB I/Os) on the NBD devices scaled from 1.3GB/s with 25Gbps networking, to 2.8GB/s with 50Gbps networking. If you consider a 3x replication factor for data protection, these front-end numbers correspond to 25Gbps and 50Gbps line rate performance on the interface connecting the Gateway server to the three storage nodes in the cluster. While NexentaEdge deduplication and compression were enabled, to maximize network load, the dataset used for testing was non-dedupable and non-compressible.

Building and deploying an OpenStack cloud is made easier with reliable components that have been tested together. Mellanox delivers predictable end-to-end Ethernet networks that don’t lose packets, as detailed in the Tolly Report. NexentaEdge takes full advantage of the underlying physical infrastructure to enable high performance OpenStack cloud platforms that deliver both CapEx and OpEx savings, as well as extreme performance scaling compared to legacy SAN-based storage offerings.

Ethernet Performance – Can you Afford an Unpredictable Network?

Overlay networks (such as VXLAN) are the most widely deployed implementation of Software Defined Networking (SDN). But when deploying an overlay SDN network, a bulletproof underlay network is required. Many simply assume that the underlay network will perform reliably and predictably. But it turns out that at the highest network speeds, predictable performance is extremely hard to deliver, and some vendors actually fall short. Unfortunately, for application level and data center architects, the unpredictability of the underlying network can be hidden from view. It is fruitless trying to debug unpredictable application behavior at a system or application level when it is the underlying network that is behaving chaotically and dropping packets. At Mellanox, we deliver predictable networks so that we take the network out of the equation and let providers and customers focus only on their applications – knowing that data communications just work. So for SDN deployments, Spectrum is the best underlay for your overlay network.

For those not familiar with overlay networks, this SDN Whitepaper explains more about overlay networks and other options to implement SDN networks.

 

In order to achieve predictable performance, it’s important to understand how modern, open  networking equipment is built. At this year’s Open Compute Project (OCP) Summit in San Jose, we introduced Open Composable Networks (OCN) – which represents the realization of the vision of the Open-Ethernet initiative first launched early in 2013. OCN demonstrates the power of open networking as is explained in the blog: Why are Open Composable Networks like Lego?

 

By disaggregating switches, OCN enables customers to choose the best hardware and the best software. At Mellanox, we are happy to provide customers with solutions at multiple levels, as we know that fundamentally we deliver predictable performance with the best switching solutions available, from the platform all the way down to the ASIC level. This blog provides the details to support that claim and on how the Spectrum switches deliver predictable performance.

 

The most obvious advantages of the Spectrum switch are 37% lower power and less than half the latency of Broadcom devices. But, in fact, Predictable Performance is perhaps even more important to application performance and customer experience.

Today’s advanced switching devices are complex beasts and unfortunately sometimes all their features get reduced to a short list of simple bullets. So when comparing the Mellanox Spectrum based switches to Broadcom Tomahawk based offerings (Tolly Report), one might make the error of thinking they are roughly the same.


RoCE has Crossed the Chasm

In my previous post, I outlined how Gartner and The Register were predicting a gloomy outcome for Fibre Channel over Ethernet (FCoE) and made the assertion that in contrast RDMA over Converged Ethernet (RoCE) had quite a rosy future.  The key here is that RoCE has crossed the chasm from technology enthusiasts and early adopters to mainstream buyers.

 

In his book Crossing the Chasm, Geoffrey Moore outlines that the main challenge is that the Early Majority are pragmatists interested in the quality, reliability, and business value of a technology. Whereas visionaries and enthusiasts relish new, disruptive technologies, the pragmatist values solutions that integrate smoothly into the existing infrastructure. Pragmatists prefer well-established suppliers and seek references from other mature customers in their industry. And pragmatists look for technologies where there is a competitive multi-vendor ecosystem that gives them flexibility, bargaining power, and leverage.

To summarize, the three key requirements needed for a technology to cross the chasm are:

  1. Demonstration that the technology delivers clear business value
  2. Penetration of key beachhead in a mainstream market
  3. Multi-vendor, competitive ecosystem of suppliers

 

On all three fronts RoCE has crossed the chasm.
