Wednesday, March 5, 2014

OpenFlow switches - flow setup & termination rate

I attended ONS yesterday.  This is the first time I have seen so many OF hardware switch vendors (based on NPUs, FPGAs, or ASICs) advertising performance numbers.  Though the throughput numbers are impressive, the flow setup/termination rates are, in my view, disappointing.  The flow setup rate claims I saw were anywhere between 100 and 2000/sec.

Software-based virtual switches support flow setup rates of up to 10K/sec per core. If a 4-core part is used, one can easily get 40K flow setups/sec.
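The arithmetic behind the comparison above, as a quick sketch (the per-core and hardware figures are the claims quoted in this post, not measurements of mine):

```python
# Back-of-the-envelope comparison of claimed flow setup rates.
PER_CORE_SETUP_RATE = 10_000      # flow setups/sec per core (software switch claim)
CORES = 4                         # typical multi-core part

software_rate = PER_CORE_SETUP_RATE * CORES
hardware_rate_range = (100, 2000)  # hardware OF switch claims seen at ONS

print(software_rate)                                   # 40000
print(software_rate // hardware_rate_range[1])         # 20x the best hardware claim
```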

In my view, unless the flow setup rate is improved, it is very unlikely that hardware-based OF switches will become popular, as their market addressability is limited.

From talking to a few vendors, I understand that the poor flow setup rate is mainly due to the way TCAMs are used.  As I understand it, every flow update (add/delete) requires rearranging the existing entries, and that leads to bad performance. Many of these vendors also told me that they intend to support algorithmic search accelerators to improve the performance of flow insertions and deletions, and that this could improve performance to hundreds of thousands of flow setups/sec.
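A minimal sketch of why naive TCAM insertion is slow (this is an illustrative model, not any vendor's actual implementation): TCAM entries are matched in physical order, so inserting a rule at a given priority can force every lower-priority entry to be physically moved down one slot.

```python
# Illustrative model: a TCAM as an ordered list where index == physical slot.
# Matching picks the first (lowest-index) hit, so higher-priority rules must
# sit physically above lower-priority ones.

class NaiveTcam:
    def __init__(self):
        self.entries = []  # list of (priority, rule), highest priority first

    def insert(self, priority, rule):
        # Find the first slot holding a lower-priority entry; everything
        # from that slot down must be rewritten -- O(n) entry moves.
        idx = len(self.entries)
        for i, (p, _) in enumerate(self.entries):
            if p < priority:
                idx = i
                break
        moves = len(self.entries) - idx   # entries physically shifted
        self.entries.insert(idx, (priority, rule))
        return moves

tcam = NaiveTcam()
tcam.insert(10, "flow-a")              # empty TCAM: 0 moves
tcam.insert(5, "flow-b")               # lowest priority, goes last: 0 moves
moves = tcam.insert(20, "flow-c")      # highest priority: shifts both entries
print(moves)                           # 2
```

With millions of entries, that per-insert shifting is exactly what caps the setup rate; algorithmic search accelerators aim to avoid the physical reordering.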

Unless the following are taken care of, hardware-based OF switch adoption will be limited.
  • Higher flow setup rate (it must be better than that of software-based switches)
  • Ability to maintain millions of flows
  • Support for multiple tables (all 256 tables)
  • VxLAN Support
  • VxLAN with IPSec
Throughput performance is very important, and hardware-based OF switches are very good at that.  In my view, all of the above are also important for any deployment to seriously consider hardware-based OF switches in place of software-based switches.


Anand said...

Just curious, why do we need IPSec on VxLAN?

Srini said...

When VxLAN-based virtual networks are used, packets from all VMs of various tenants/virtual networks leave compute nodes with a VxLAN/UDP/IP header attached. Today, underlying L2/L3 switches treat all VxLAN packets the same way. If somebody enables a tap on the underlay, all tenants' traffic gets captured. Previously, with VLAN traffic, one had the ability to capture on a specific VLAN; that is no longer true with VxLAN. Since the tap captures all the packets, some may be concerned about security. Hence, some think that all traffic leaving the compute node should be encrypted (IPSec).
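A small sketch of the point above: every tenant's frame leaves the compute node behind the same VxLAN/UDP outer encapsulation, differing only in the 24-bit VNI buried inside the VxLAN header, so an underlay tap filtering on the outer headers sees all tenants at once. Header layout follows RFC 7348; the VNI values here are made up for illustration.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned VxLAN destination port, same for all tenants

def vxlan_header(vni):
    # 8-byte VxLAN header: flags word (0x08 in the top byte = "VNI valid",
    # rest reserved) followed by the 24-bit VNI shifted into the upper bits.
    return struct.pack("!II", 0x08 << 24, vni << 8)

# Two tenants, two VNIs -- but identical outer UDP/IP framing, so the
# underlay cannot filter per tenant the way it could per VLAN tag.
tenant_a = vxlan_header(100)
tenant_b = vxlan_header(200)
print(tenant_a.hex())  # 0800000000006400
print(tenant_b.hex())  # 080000000000c800
```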

Brent Salisbury said...

Great post as usual. Until the edge ToR resembles a compute node, I struggle to find the advantage of anything other than hypervisor compute. Re-ordering in TCAM for non-switching/non-BCAM lookups is a tall order: with every flowmod, an insertion can trigger a reorder if there isn't a free slot at the list index (priority) where it lands. I am also wary of TCAM throughput; unless clock rates have changed significantly over the past year (I haven't been paying attention), last I looked it was still fairly slow.

Even more concerning for me is what happens when you need the one feature that isn't available due to an ASIC limitation, or, even more likely, a vendor's feature priorities not aligning with yours. For example, I am coding OpenStack security groups tonight and needed a dirty workaround for reflexive ACLs in OVS. After taking a look, it's there; if it wasn't, we could cobble something together. In the case of hardware, it isn't, and it would be unrealistic in any worthwhile time frame to find a proxy on the ONF to submit a "patch" to a WG for a spec modification, etc. In a roundabout way, what I am getting at is this: it's not just performance; hardware feels inflexible, and the prospect of new features on software-style life cycles doesn't feel very near. Then again, aggregation at SP provider edges is a pretty different use case for classification, so it may not be significant.

Keep up the posts, this is absolutely one of my favorite blogs!!