Sunday, August 24, 2014

Death of SRIOV and Emergence of virtio

SRIOV functionality in PCIe devices was introduced to solve the problem of sharing one physical device across multiple virtual machines on a physical server. SRIOV enables a PCIe device to expose multiple virtual functions (VFs), typically with one VF assigned to each virtual machine. Since each VF can be memory mapped, virtual machines get direct access to the device, bypassing the VMM (hypervisor).
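To make this concrete, below is a minimal sketch of how VFs are typically created on a Linux host through sysfs. The PCI address and VF count here are hypothetical; the sriov_numvfs interface assumes a reasonably recent Linux kernel.

    # Minimal sketch: creating SR-IOV VFs on a Linux host via sysfs.
    # The PCI address below is hypothetical; substitute your NIC's
    # physical function (PF) address.
    PF_ADDR = "0000:03:00.0"

    def enable_vfs(pf_addr, num_vfs):
        # Writing to sriov_numvfs asks the PF driver to create VFs;
        # the kernel then enumerates each VF as its own PCI device,
        # which the VMM can pass through to a VM.
        path = "/sys/bus/pci/devices/%s/sriov_numvfs" % pf_addr
        with open(path, "w") as f:
            f.write(str(num_vfs))

    enable_vfs(PF_ADDR, 4)  # create 4 VFs, e.g. one per VM

Each VF then shows up as an ordinary PCI device that can be assigned directly to a guest.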

NIC, Crypto, Compression and PCRE accelerator vendors have enhanced their PCIe devices to support virtual functions (VFs).

It worked for some time. But with the growing popularity of public and private clouds, the supply chain equation changed. It is no longer the case that one or a few vendors provide the complete solution to operators. As long as a few vendors got together to provide the complete solution, it was possible for VM image vendors to support a small number of PCIe NICs and accelerators by adding their drivers to the VM images. Then operators started procuring the components that make up a system themselves: physical servers, accelerators, NICs, and virtual machine images, each potentially from a different vendor. Operators found this model working for them, as the cost savings are huge for a relatively small amount of integration work on their part.

In this new model, virtual machine image vendors don't know what kinds of accelerators and NIC cards the operators will use. Operators might procure newer accelerator and NIC cards from a new vendor in the future. If a virtual machine image is built for certain SRIOV NIC cards, that image may no longer be valid if operators later procure some other NIC card from a new vendor. That kind of dependency is not good for any of the parties - operators, NIC/accelerator companies, or virtual machine image companies.

One more major change happened in the industry. Operators are no longer satisfied with a simple demux of incoming packets to the destination VM based on VLAN ID or MAC address. Operators want control over the packet flows for various reasons, such as:

  • Service Function Chaining
  • Traffic Probes
  • Traffic Control
  • Traffic Policing
  • Filtering
  • Overlays
  • Secure Overlays
  • And many more...
Many NIC cards only offer simpler mechanisms such as VLAN- or MAC-level demux. Since operators need a lot more flexibility, they started to avoid assigning VFs to VMs. SDN and OVS were born out of that need. OVS running in the VMM allows operators to implement the functions above without having to depend on the NIC vendors, as sketched below.
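As an illustration of that flexibility, here is a minimal sketch of programming flows into OVS from the VMM using the standard ovs-ofctl tool. The bridge name, port numbers, VLAN, and MAC address are all hypothetical examples.

    # Minimal sketch: flow-level control with OVS from the VMM.
    # The bridge, ports, VLAN and MAC below are hypothetical.
    import subprocess

    def add_flow(bridge, flow):
        subprocess.check_call(["ovs-ofctl", "add-flow", bridge, flow])

    # Mirror one tenant's VLAN to a probe port in addition to its VM
    # port (traffic probe), and drop traffic from a blacklisted MAC
    # (filtering).
    add_flow("br0", "dl_vlan=100,actions=output:2,output:9")
    add_flow("br0", "dl_src=de:ad:be:ef:00:01,actions=drop")

None of this depends on what the physical NIC can do; the NIC just delivers packets to the VMM.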

These two reasons - A. removal of the dependencies among VM images, NIC vendors, and operators, and B. the operators' need to control traffic - are slowly killing SRIOV for NICs.

At least the first reason - removal of the dependencies among VM images, PCI device vendors, and operators - is also killing SRIOV-based accelerator cards.

Virtio:

virtio is a virtualization standard for Ethernet, disk, and other device classes. Virtio in the VMM emulates devices as PCIe devices (emulated PCIe devices) and assigns them to the guests. When a guest comes up, it discovers these devices and enables them by loading the appropriate virtio frontend drivers. Since these devices are emulated by the VMM in software, in theory there is no practical limit on the number of devices that can be emulated and assigned to the guests.
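For the curious, here is a minimal sketch of what that discovery looks like from inside a Linux guest. Emulated virtio devices appear on the PCI bus under Red Hat's vendor ID 0x1af4, and the guest kernel binds the matching frontend driver (virtio-net, virtio-blk, ...) to each one.

    # Minimal sketch: listing virtio devices from inside a Linux guest.
    # Emulated virtio PCI devices carry the Red Hat vendor ID 0x1af4.
    import os

    VIRTIO_VENDOR_ID = "0x1af4"

    for dev in sorted(os.listdir("/sys/bus/pci/devices")):
        with open("/sys/bus/pci/devices/%s/vendor" % dev) as f:
            if f.read().strip() == VIRTIO_VENDOR_ID:
                print("virtio device at", dev)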

In today's world, virtio is used for emulated Ethernet, SCSI, and other devices. Since this emulation is the same irrespective of the physical devices, it naturally provides the flexibility of VM images that are independent of physical NICs. That became very attractive for VM image vendors: as long as they support the virtio frontend drivers, they can rest assured that their images run on any physical server, with any physical NIC or SCSI device.

Taking the NIC as an example, the VMM receives packets from the physical NIC, runs them through various functions such as OVS, Linux iptables, etc., and then sends each packet to the right guest using a virtio emulated Ethernet device. In this case, only the VMM needs drivers for the physical NIC; all VM images only need to worry about virtio-Ethernet. One quick way to confirm this from inside a guest is sketched below.
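Here is a minimal sketch of that check, assuming a Linux guest whose first NIC is named eth0 (the interface name is an assumption).

    # Minimal sketch: confirming from inside a guest that its NIC is
    # virtio-Ethernet. The interface name "eth0" is an assumption.
    import os

    driver_link = "/sys/class/net/eth0/device/driver"
    driver = os.path.basename(os.readlink(driver_link))
    # Prints "virtio_net" for a virtio-Ethernet device, regardless of
    # which physical NIC sits underneath the VMM.
    print(driver)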

Next Step in Evolution

Though VM images gained freedom from physical devices, operators soon started to run into performance problems. Since packets traverse the VMM in both directions (ingress and egress), major performance drops were observed.

That gave birth to smart-NICs, which do almost all of the VMM packet processing. To keep packet processing enhancements flexible, most smart-NICs started to be built using programmable entities (FPGAs, network processors, etc.). In addition, smart-NIC vendors have also started to implement the virtio-Ethernet interface to connect to VMs without involving the VMM. With this, smart-NIC vendors solve both issues - the performance issue, as the device is directly connected to the guests, and VM image independence from hardware. Since virtio-Ethernet is exposed by the smart-NIC, VMs need not know the difference between the VMM emulating virtio-Ethernet and the smart-NIC emulating it.

With the great success in Ethernet, virtio-based interfaces are being extended to other hardware devices such as iSCSI and crypto/compression/regex accelerators.

The death of SRIOV has only just begun, and in a few years I won't be surprised if nobody talks about SRIOV at all.

Comments?