Simplifying Composable Infrastructure with NVMe Hardware Virtualization

 

 

Today, Mellanox introduced NVMe SNAP technology, which enables hardware virtualization of NVMe flash storage and delivers the efficiency and management benefits of remote storage with the simplicity of local storage.

At the 2009 World Swimming Championships in Rome, controversial new full-body swimsuits came to prominence and gave the top swimmers the edge they needed to break barriers otherwise blocked by physics. An astounding 43 world records were set at the championships. A year earlier, at the 2008 Beijing Olympics, 94% of all swimming wins and 98% of all swimming medals were achieved by swimmers wearing the new suits. Those who didn’t invest in the new technology found it nearly impossible to win. Until the suits were banned in 2009, we saw “swimsuit wars”, with manufacturers competing to create better swimsuits and swimmers violating sponsorship agreements to wear the fastest technology rather than their contracted sponsors’ swimsuits.


Revolutionary swimsuit technology leads to winning Olympic medals

When everyone is fighting to be number one, new technology can provide the needed edge. Cloud hyperscalers and large enterprises continuously pursue the adoption of new server and storage virtualization technologies to maximize their utilization of resources and enable novel use of their ever-growing infrastructure.

 

The Appeal of Bare Metal Cloud

In a traditional cloud, the cloud service provider (CSP) delivers VMs or containers with virtualized CPUs, memory, and network storage. It’s easy to deliver the right virtualized resources to the right customer on demand (also known as composable infrastructure), but you don’t know exactly what kind of physical hardware underlies it, nor what (or who) else is running on the same physical servers. Bare-metal cloud offers on-demand provisioning of physical servers to customers instead of VMs. It’s a rapidly growing market, and there are three reasons to go bare metal: 1) Control; 2) Performance; and 3) Security.

  • Control: Customers choose exactly which operating system and applications they install, and don’t have to conform applications to the CSP’s processes for OS/VM/hypervisor provisioning.
  • Performance: The server is dedicated to one application so there is no risk of a “noisy neighbor” hurting application performance. There is no hypervisor performance “tax” and the customer knows exactly which workloads are running on each server.
  • Security: The server is not shared, eliminating the risk that competing tenants (like Coke and Pepsi) could end up sharing the same server. Some customers don’t want (or are not allowed) to share servers.

The customer still gets the flexibility to spin up or decommission physical servers quickly and can install whatever software they require, including customized operating systems or hypervisors. As a bonus, there are no worries about CPU-based software licenses, which may charge for every CPU core on the physical server, even those that are being used by other tenants and other applications.

 

Composable Storage Virtualization Breaks Bare Metal Cloud Shackles

A CSP’s bare metal offering will usually include, for the sake of simplicity, local storage for the customer’s use. This provides easily accessed, fast storage under the customer’s full control. But it comes at a price for CSPs, because it limits their ability to efficiently provision remote storage that is easy to migrate and protect. Therein lies a conflict when designing a bare metal cloud offering: what is best for the customer (local storage) versus what is best and most easily composable for the CSP (networked storage).

In 2017, Amazon Web Services (AWS) introduced its “Nitro” technology to resolve this conflict and became the envy of its cloud competitors. Nitro enables AWS to offer a ‘bare metal cloud’ with virtualized NVMe storage that is far more efficient than the offerings of other cloud providers. The Nitro technology enables near-local flash storage performance with the added flexibility, serviceability, and dynamic elasticity of virtualized networked storage. To achieve this, Amazon builds specialized NICs that can virtualize remote, networked storage into local storage for the bare-metal server. While some of the bare metal cloud infrastructure is now in fact virtualized, customers don’t need to be aware of this and can operate as if they have fully dedicated infrastructure, including storage, while still receiving benefits of virtualization that are otherwise unattainable in bare-metal offerings.

NVMe SNAP makes composable storage simple for bare metal cloud servers by virtualizing networked storage while making that storage appear as local NVMe SSDs.

NVMe Virtualization for Storage Composability

 

Mellanox NVMe SNAP™ Technology

Mellanox is now introducing NVMe SNAP™ (Software-defined Network Accelerated Processing), which enables in-hardware virtualization of NVMe storage and addresses these and other use cases for seamless storage virtualization. The NVMe SNAP framework lets our customers integrate with any cloud provider or enterprise environment, using any networked storage solution or storage protocol. NVMe SNAP brings virtualized storage to bare-metal clouds and enables the disaggregation of compute and storage to allow fully optimized resource utilization.

Mellanox NVMe SNAP enables hardware virtualization of NVMe storage

 

NVMe SNAP logically presents networked storage as a local NVMe flash drive on the PCIe bus to the host OS, hypervisor, and software. This allows the host OS or hypervisor to use its standard NVMe driver, unaware that the storage is being provided not by a local physical SSD but by NVMe SNAP connected to remote storage. Furthermore, the NVMe SNAP framework allows customers to apply sophisticated data management logic (mirroring/RAID, compression, encryption, etc.) to the data that it transmits over the network and stores remotely.
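To make the idea concrete, here is a minimal toy model in Python. It is only a sketch of the concept, not the NVMe SNAP API: the class names `RemoteTarget` and `VirtualNvmeDrive`, the 4 KiB block size, and the choice of zlib compression plus simple mirroring are all assumptions made for illustration. The caller reads and writes blocks by logical block address as if the drive were local, while the data management logic and the remote targets stay hidden behind that interface.

```python
import zlib


class RemoteTarget:
    """Hypothetical stand-in for a networked storage target (a dict keyed by LBA)."""

    def __init__(self):
        self.blocks = {}

    def put(self, lba, payload):
        self.blocks[lba] = payload

    def get(self, lba):
        return self.blocks[lba]


class VirtualNvmeDrive:
    """Toy model of an emulated local drive backed by mirrored remote targets."""

    BLOCK_SIZE = 4096  # assumed logical block size for this sketch

    def __init__(self, targets):
        if not targets:
            raise ValueError("at least one remote target is required")
        self.targets = list(targets)

    def write(self, lba, data):
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        payload = zlib.compress(data)   # data management step: compression
        for target in self.targets:     # data management step: mirroring
            target.put(lba, payload)

    def read(self, lba):
        # Try each mirror in turn so a missing copy on one target is tolerated.
        for target in self.targets:
            try:
                return zlib.decompress(target.get(lba))
            except KeyError:
                continue
        raise KeyError(f"LBA {lba} not found on any target")


primary, mirror = RemoteTarget(), RemoteTarget()
drive = VirtualNvmeDrive([primary, mirror])
block = b"A" * VirtualNvmeDrive.BLOCK_SIZE
drive.write(0, block)
del primary.blocks[0]            # simulate losing one remote copy
assert drive.read(0) == block    # the caller still gets the same block back
```

In the real product this logic runs on the SmartNIC rather than in host software, which is why the host needs nothing beyond its standard NVMe driver.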

NVMe SNAP gives customers the freedom to implement their own storage solutions on top of the NVMe SNAP framework. The framework runs on the embedded Arm cores of the Mellanox BlueField™ SmartNIC, together with the SmartNIC’s embedded hardware acceleration engines. This powerful combination is agile yet completely transparent to host software, allowing it to be integrated into almost any storage solution. Just as new-tech swimsuits enabled swimmers to go faster, Mellanox SmartNIC and NVMe SNAP virtualization technology enable cloud infrastructures to run more efficiently.

 

Disaggregate and Scale Out NVMe-oF

The hardware-accelerated, software-defined storage virtualization that NVMe SNAP provides for bare-metal clouds can also serve another, no less important goal: optimizing the utilization of data-center resources (compute, storage, network, etc.).

Data center architecture has a strong influence on overall resource utilization. The traditional fixed architecture, which glues fixed ratios of compute, storage, and networking resources together, presents a workload resource allocation challenge that is typically addressed through virtualization and over-provisioning. But in reality, not all workloads are the same: some are compute intensive, while others are storage hungry. As a result, service providers must over-provision local storage, which leads to under-utilized resources and an inefficient 40%-50% average storage utilization rate at data-center scale. This in turn translates to higher than necessary capital and operational spending on the half (or more) of purchased storage capacity that is essentially unused. Hyper-converged storage partially addresses storage utilization, but in an ever-growing data center, adding fixed nodes with compute and storage glued together does not allow storage to scale elastically and independently of compute, or vice versa.
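The arithmetic behind that claim is simple enough to sketch. The short Python snippet below is only an illustration: the function name `unused_storage_spend`, the 1,000 TB fleet size, and the cost of 100 currency units per TB are hypothetical values chosen for the example. Only the 40%-50% utilization range comes from the text above.

```python
def unused_storage_spend(purchased_tb, cost_per_tb, utilization):
    """Return the spend tied up in purchased capacity that holds no data."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    return purchased_tb * cost_per_tb * (1.0 - utilization)


# Hypothetical fleet: 1,000 TB of local flash at 100 currency units per TB.
for utilization in (0.40, 0.50):
    idle = unused_storage_spend(1000, 100.0, utilization)
    print(f"utilization {utilization:.0%}: {idle:,.0f} spent on unused capacity")
```

At utilization rates in this range, half or more of the storage budget buys capacity that sits idle, which is the gap disaggregation aims to close.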

Networked storage technology has been available for quite some time, but typically at the cost of poor latency, low throughput, high CPU consumption, or all of the above. With new technologies such as NVMe-over-Fabrics (NVMe-oF) using RoCE (RDMA over Converged Ethernet), these performance issues have been overcome, but only for compute nodes that support the specific high-performance network storage protocols (like NVMe-oF).

Networked storage together with NVMe SNAP virtualization technology enables hyperscalers to remove physical disks entirely from their compute nodes while connecting each compute node to a storage cluster transparently and seamlessly. This disaggregation of compute and storage makes storage part of a composable infrastructure and offers huge savings in acquisition cost, maintenance, and operating overhead. The design is also simple to adopt, requiring no software alterations and imposing no performance impact on the infrastructure. Compute nodes can now be added independently of storage and vice versa, allocating exactly the right amount of storage for each compute node and optimizing for any workload. The CSP can now offer logical pools of resources that maximize utilization of the entire rack or cluster.
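A logical storage pool of this kind can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any Mellanox software: the `StoragePool` class, its method names, and the node names and sizes are all hypothetical. The point is that each compute node receives exactly the capacity it needs, and that the pool can grow without adding compute.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Hypothetical shared pool of networked storage, sized in GB."""

    capacity_gb: int
    allocations: dict = field(default_factory=dict)  # node name -> GB attached

    @property
    def used_gb(self):
        return sum(self.allocations.values())

    @property
    def free_gb(self):
        return self.capacity_gb - self.used_gb

    def attach(self, node, size_gb):
        """Give a compute node exactly the capacity it asks for."""
        if size_gb <= 0:
            raise ValueError("size_gb must be positive")
        if size_gb > self.free_gb:
            raise RuntimeError("pool exhausted: grow the pool, not the compute tier")
        self.allocations[node] = self.allocations.get(node, 0) + size_gb

    def release(self, node):
        """Return a node's capacity to the pool and report how much was freed."""
        return self.allocations.pop(node, 0)

    def grow(self, extra_gb):
        """Scale storage independently of compute."""
        self.capacity_gb += extra_gb

    def utilization(self):
        return self.used_gb / self.capacity_gb


pool = StoragePool(capacity_gb=10_000)
pool.attach("compute-01", 500)      # compute-heavy workload, little storage
pool.attach("database-01", 8_000)   # storage-hungry workload
print(f"pool utilization: {pool.utilization():.0%}")

pool.grow(10_000)                   # add storage only; no new compute nodes
pool.attach("analytics-01", 6_000)
print(f"free capacity: {pool.free_gb} GB")
```

Because capacity is tracked per pool rather than per server, utilization is measured across the whole rack or cluster instead of being stranded inside individual nodes.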

Disaggregate and scale out with Mellanox NVMe SNAP

 

NVMe-oF Everywhere

The adoption of local server-attached NVMe flash SSDs has become widespread, with all major operating systems offering support. NVMe-oF technology extends this to provide remote access to NVMe storage. NVMe-oF is on the rise, with 2019 predicted to be the year of mass deployments. Nonetheless, there is still limited support for NVMe-oF from major OS suppliers, for example in Microsoft Windows/Hyper-V and VMware ESXi. This chicken-and-egg problem is slowing the adoption of NVMe-oF in the cloud and especially in the enterprise. Mellanox NVMe SNAP storage virtualization technology overcomes this issue with an “OS agnostic” approach that enables applications to use local NVMe drivers to transparently access remote NVMe storage. This enables storage providers to offer enterprises and clouds a solution that does not depend on the OS providing NVMe-oF support, or even being aware that remote storage is being accessed, while also shortening integration and deployment time.

 

Summary

Mellanox network infrastructure doesn’t just connect servers and storage. It also delivers efficiency, enabling our customers to reach the full potential of their compute and storage infrastructure while offloading repetitive tasks that would otherwise waste precious CPU cycles. Building state-of-the-art public and private clouds is more competitive than ever, and just like advanced swimsuits, a technology edge can make all the difference.

With the introduction of NVMe SNAP technology, Mellanox has leveled the playing field and provided the means for enterprises and service providers to win the race to easily composable storage, efficient utilization, and advanced bare metal cloud features.

For more information on Mellanox’s NVMe SNAP offering, please contact your local Mellanox sales rep or authorized channel partner.


About Erez Scop

Erez is Director of Product Management at Mellanox Technologies, managing storage, Data Plane Development Kit (DPDK) and software acceleration product lines. Erez is a member of the dpdk.org governing board that manages the open source project. Before joining Mellanox, Erez was Product Manager at AudioCodes Ltd. where he led their main product lines in the telecom, VoIP and unified communication fields for over 5 years. Erez brings more than 8 years of experience in Product Management backed up by over 10 years in R&D managerial roles. Erez holds a B.Sc. in Electrical and Electronics Engineering and an MBA.
