Today, Mellanox introduced NVMe SNAP technology enabling hardware virtualization of NVMe Flash storage, to achieve all the efficiency and management benefits of remote storage, with the simplicity of local storage.
At the 2009 World Swimming Championships in Rome, controversial new full-body swimsuits came to prominence and gave the top swimmers the edge they needed to break barriers otherwise blocked by physics. An astounding 43 world records were set at the championships, and at the 2008 Beijing Olympics, 94% of all swimming wins and 98% of all swimming medals were achieved by swimmers wearing the new suits. Those who didn’t invest in the new technology found it nearly impossible to win. Until their ban in 2009, we saw “swimsuit wars” with manufacturers competing to create better swimsuits and swimmers violating sponsorship agreements to wear the fastest technology rather than their contracted sponsors’ swimsuits.
When everyone is fighting to be number one, new technology can provide the needed edge. Cloud hyperscalers and large enterprises continuously pursue the adoption of new server and storage virtualization technologies to maximize their utilization of resources and enable novel use of their ever-growing infrastructure.
In a traditional cloud, the cloud service provider (CSP) delivers VMs or containers with virtualized CPUs, memory, and network storage. It’s easy to deliver the right virtualized resources to the right customer on demand (also known as composable infrastructure), but you don’t know exactly what kind of physical hardware underlies it, nor what (or who) else is running on the same physical servers. Bare-metal cloud offers on-demand provisioning of physical servers to customers instead of VMs. It’s a rapidly growing market, and there are three reasons to go bare metal: 1) control; 2) performance; and 3) security.
The customer still gets the flexibility to spin up or decommission physical servers quickly and can install whatever software they require, including customized operating systems or hypervisors. As a bonus, there are no worries about CPU-based software licenses, which may charge for every CPU core on the physical server, even those that are being used by other tenants and other applications.
A CSP’s bare-metal offering will usually include, for the sake of simplicity, local storage for the customer’s use. This provides easily accessed and fast-performing storage under the customer’s full control. But this comes at a price for CSPs, limiting their ability to efficiently provision remote storage that is easy to migrate and protect. Therein lies a conflict when designing a bare-metal cloud offering: what is best for the customer (local storage) differs from what is best and most easily composable for the CSP (networked storage).
In 2017, Amazon Web Services (AWS) introduced its “Nitro” technology to resolve this conflict and became the envy of its cloud competitors. Nitro enables AWS to offer a “bare-metal cloud” with virtualized NVMe storage that is far more efficient than the offerings of other cloud providers. The Nitro technology enables near-local flash storage performance with the added flexibility, serviceability, and dynamic elasticity of virtualized networked storage. To achieve this, Amazon builds specialized NICs that can virtualize remote, networked storage into local storage for the bare-metal server. While some of the bare-metal cloud infrastructure is now in fact virtualized, customers don’t need to be aware of this and can operate as if they have fully dedicated infrastructure, including storage, while still receiving benefits of virtualization that are otherwise unattainable on bare-metal offerings.
Mellanox NVMe SNAP™ Technology
Mellanox is now introducing NVMe SNAP™ (Software-defined Network Accelerated Processing), which enables in-hardware virtualization of NVMe storage and addresses these and other use cases for seamless storage virtualization. Mellanox’s NVMe SNAP framework enables our customers to integrate into any cloud provider or enterprise with any networked storage solution or storage protocol. NVMe SNAP brings virtualized storage to bare-metal clouds and enables the disaggregation of compute and storage to allow fully optimized resource utilization.
NVMe SNAP logically presents networked storage as a local NVMe flash drive on the PCIe bus to the host OS, hypervisor, and software. This allows the host OS or hypervisor to use its standard NVMe driver, unaware that the storage is being provided not by a local physical SSD but rather by NVMe SNAP connected to remote storage. Furthermore, the NVMe SNAP framework allows customers to apply sophisticated data management logic (mirroring/RAID, compression, encryption, etc.) to the data that it transmits over the network and stores remotely.
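The idea can be illustrated with a toy model. The sketch below is not the NVMe SNAP API; every name in it (BlockDevice, LocalDisk, RemoteBackedDisk, host_workload) is hypothetical, and a Python dict stands in for the network and the remote storage target. It only shows the shape of the arrangement: host code talks to one block interface, while the device behind that interface may be local media or a layer that compresses data and forwards it elsewhere.

```python
import zlib
from typing import Dict, Protocol

BLOCK_SIZE = 4096  # assumed block size for this toy model


class BlockDevice(Protocol):
    """The only interface the host-side code knows about (hypothetical)."""

    def write_block(self, lba: int, data: bytes) -> None: ...
    def read_block(self, lba: int) -> bytes: ...


class LocalDisk:
    """Stands in for a physical local SSD."""

    def __init__(self) -> None:
        self._blocks: Dict[int, bytes] = {}

    def write_block(self, lba: int, data: bytes) -> None:
        self._blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        # Unwritten blocks read back as zeros.
        return self._blocks.get(lba, bytes(BLOCK_SIZE))


class RemoteBackedDisk:
    """Same interface, but blocks are compressed and kept in a remote store.

    `remote_store` is a plain dict standing in for networked storage.
    """

    def __init__(self, remote_store: Dict[int, bytes]) -> None:
        self._remote = remote_store

    def write_block(self, lba: int, data: bytes) -> None:
        # Data management step (compression here) applied before "sending".
        self._remote[lba] = zlib.compress(data)

    def read_block(self, lba: int) -> bytes:
        payload = self._remote.get(lba)
        if payload is None:
            return bytes(BLOCK_SIZE)
        return zlib.decompress(payload)


def host_workload(dev: BlockDevice) -> bytes:
    """Host-side code: identical no matter which device it is handed."""
    dev.write_block(7, b"A" * BLOCK_SIZE)
    return dev.read_block(7)


remote_store: Dict[int, bytes] = {}
local_result = host_workload(LocalDisk())
remote_result = host_workload(RemoteBackedDisk(remote_store))
```

Both calls to host_workload return the same bytes, while the copy held in remote_store is the compressed form. In the real product this separation happens below the host's NVMe driver rather than in application code, so the analogy is only structural.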
NVMe SNAP empowers customers with the freedom to implement their own storage solutions on top of the NVMe SNAP framework. The NVMe SNAP framework runs on the embedded Arm cores of the Mellanox BlueField™ SmartNIC, together with our embedded hardware acceleration engines. This powerful combination is agile yet completely transparent to host software, allowing it to be integrated into almost any storage solution. Just as new-tech swimsuits enabled swimmers to go faster, Mellanox SmartNIC and NVMe SNAP virtualization technology enable cloud infrastructures to run more efficiently.
The NVMe SNAP hardware-accelerated, software-defined storage virtualization described above, used in a bare-metal cloud, can be re-used for another, no less important goal of optimizing data-center resource utilization (compute, storage, network, etc.).
Data center architecture has a strong influence on overall resource utilization. The traditional fixed architecture, which glues fixed ratios of compute, storage, and networking resources together, presents a workload resource allocation challenge that is typically addressed by over-provisioning in the virtualization layer. But in reality, not all workloads are the same: some are compute intensive, while others are storage hungry. Overall, this requires that service providers over-provision local storage, leading to under-utilized resources and an inefficient 40%-50% average storage utilization rate at data-center scale. This in turn translates to higher-than-necessary capital and operational spending on the half (or more) of purchased storage capacity that is essentially unused. Hyper-converged storage partially addresses storage utilization, but in an ever-growing data center, adding fixed nodes with compute and storage glued together does not allow for the elastic scaling of storage independently from compute, or vice versa.
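A small worked example shows how fixed ratios can produce a figure in that range. The numbers below are illustrative assumptions, not measurements: five servers are each built with the same local capacity, sized for the most storage-hungry workload any of them might host.

```python
# Illustrative per-workload storage demand in TB (assumed values).
demands_tb = [2, 10, 1, 4, 3]

# Fixed architecture: every server carries enough local storage
# for the largest workload it might be asked to run.
per_server_tb = max(demands_tb)
purchased_tb = per_server_tb * len(demands_tb)

used_tb = sum(demands_tb)
utilization_pct = 100 * used_tb / purchased_tb
idle_tb = purchased_tb - used_tb

print(f"purchased={purchased_tb} TB, used={used_tb} TB, "
      f"utilization={utilization_pct:.0f}%, idle={idle_tb} TB")
# prints: purchased=50 TB, used=20 TB, utilization=40%, idle=30 TB
```

With these assumed demands, 30 of the 50 TB purchased sit idle, even though no single server is misconfigured for the workload it might have to run.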
Networked storage technology has been available for quite some time, but typically at the cost of poor latency, low throughput, high CPU consumption, or all of the above. With new technologies such as NVMe-over-Fabrics (NVMe-oF) using RoCE (RDMA over Converged Ethernet), these performance issues have been overcome, but only for compute nodes that support the specific high-performance network storage protocols (like NVMe-oF).
Networked storage together with NVMe SNAP virtualization technology enables hyperscalers to remove their physical disks entirely from their compute nodes while connecting each compute node to a storage cluster transparently and seamlessly. This disaggregation of compute and storage enables storage to be part of a composable infrastructure and offers huge savings in acquisition cost, maintenance, and operating overhead, while leveraging the simplicity of this design with no software alterations or performance impact on the infrastructure at all. Compute nodes can now be added independently of storage and vice versa, allocating exactly the right amount of storage for each compute node and optimizing for any workload. The CSP can now offer logical pools of resources that maximize utilization of the entire rack or cluster.
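As a rough sketch of what composable storage means in practice, the model below treats the storage cluster as one shared pool from which each compute node is given exactly the volume size it asks for. StoragePool and its methods are hypothetical names invented for this illustration; they do not correspond to any Mellanox interface, and the capacities are assumed values.

```python
from typing import Dict


class StoragePool:
    """Toy model of a disaggregated storage cluster (hypothetical)."""

    def __init__(self, capacity_tb: int) -> None:
        self.capacity_tb = capacity_tb
        self.allocations: Dict[str, int] = {}

    @property
    def free_tb(self) -> int:
        return self.capacity_tb - sum(self.allocations.values())

    def attach(self, node: str, size_tb: int) -> None:
        """Give a compute node a volume of exactly the requested size."""
        if size_tb > self.free_tb:
            raise ValueError(f"pool has only {self.free_tb} TB free")
        self.allocations[node] = self.allocations.get(node, 0) + size_tb

    def detach(self, node: str) -> None:
        """Return a node's capacity to the pool."""
        self.allocations.pop(node, None)

    def grow(self, extra_tb: int) -> None:
        """Add storage without touching any compute node."""
        self.capacity_tb += extra_tb


pool = StoragePool(capacity_tb=24)
demands = {"node-a": 2, "node-b": 10, "node-c": 1, "node-d": 4, "node-e": 3}
for node, demand_tb in demands.items():
    pool.attach(node, demand_tb)

free_before = pool.free_tb   # 24 - 20 = 4 TB of headroom
pool.grow(8)                 # storage scaled independently of compute
pool.attach("node-f", 6)     # a new compute node, sized to its workload
free_after = pool.free_tb    # 32 - 26 = 6 TB
```

In this model the same 20 TB of demand is served from a 24 TB pool, and either side can grow without forcing a purchase on the other.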
The adoption of local server-attached NVMe flash SSDs has become widespread, with all major operating systems offering support. NVMe-oF technology extends this to provide remote access to NVMe storage. NVMe-oF is on the rise, with 2019 predicted to be the year of mass deployments. Nonetheless, there is still limited support for NVMe-oF from major OS suppliers, such as Microsoft Windows/Hyper-V and VMware ESXi. This chicken-and-egg problem is slowing the adoption of NVMe-oF in the cloud and especially in the enterprise. Mellanox NVMe SNAP storage virtualization technology overcomes this issue with an “OS agnostic” technology that enables applications to use local NVMe drivers to transparently access remote NVMe storage. This enables storage providers to offer enterprises and clouds a solution that does not depend on the OS providing NVMe-oF support, or even being aware that remote storage is being accessed, while shortening integration and deployment time.
Mellanox network infrastructure doesn’t just connect servers and storage. It also delivers efficiency, enabling our customers to reach the full potential of their compute and storage infrastructure while offloading repetitive tasks so that CPUs do not waste precious cycles on them. Building state-of-the-art public and private clouds is more competitive than ever, and just like advanced swimsuits, a technology edge can make all the difference.
With the introduction of NVMe SNAP technology, Mellanox has evened the game and provided the means for enterprises and service providers to win the race to easily composable storage, efficient utilization, and advanced bare-metal cloud features.
For more information on Mellanox’s NVMe SNAP offering, please contact your local Mellanox sales rep or authorized channel partner.