Saturday, February 5, 2011

Clustering of devices with traffic distribution by L2 Switch - One limitation & Mitigation

In my last post, "Data Center/Enterprises Clustering of Devices", I discussed how L2 switches enable equipment vendors to offer cluster solutions that absorb the increasing load on networks. Many L2 switches can parse several types of layer 2 headers to reach the inner IP packet, and then hash on its source and destination IP address fields to distribute traffic across the devices in the cluster. L2 switches typically understand Ethernet- and MPLS-related headers such as Ethernet DIX, LLC/SNAP, 802.1Q VLAN headers and MPLS label headers. That is, an L2 switch can get hold of the IP packet as long as it is carried over the L2 headers mentioned above.
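As a rough illustration of the hash distribution described above, here is a minimal Python sketch. The function name `pick_device`, the use of CRC-32 as the hash, and the sorting of the two addresses (so that both directions of a connection map to the same device) are all assumptions made for the example, not a description of any particular switch.

```python
import ipaddress
import zlib


def pick_device(src_ip: str, dst_ip: str, num_devices: int) -> int:
    """Map the inner IP address pair of a packet to a cluster device index."""
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    # Assumption: order the two addresses so that both directions of a
    # connection produce the same key and land on the same device.
    low, high = sorted((a, b))
    key = low.to_bytes(16, "big") + high.to_bytes(16, "big")
    # CRC-32 stands in for whatever hash a real switch uses.
    return zlib.crc32(key) % num_devices


forward = pick_device("10.1.1.5", "192.0.2.9", 4)
reverse = pick_device("192.0.2.9", "10.1.1.5", 4)
print(forward, reverse)  # the two values are equal
```

Because the key depends only on the address pair, every packet of a connection reaches the same device, which is the property the cluster relies on.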

In some deployments, IP packets may be encapsulated in additional L2 headers or L3 tunnels. Some examples are: PW (pseudowire) header, Ethernet over Ethernet using PW, GRE/UDP, GTP/UDP, IPinIP, Mobile IP and many more. L2 switches in the market today are not capable of understanding these headers to get to the inner IP header. In these cases, distribution based on inner IP header fields is not possible, and the switches may need to resort to distribution based on L2 header fields, such as source and destination MAC, or on tunnel IP header fields. Unfortunately, distribution based on these fields may not be good at all. Take the example of a cluster placed between two routers: the MAC addresses of every packet traversing the cluster will be the same. Hence any distribution based on the MAC addresses would send all traffic to only one device in the cluster. The same happens if distribution depends on the tunnel IP header fields, since every packet in a tunnel carries the same tunnel endpoint addresses.
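The router example can be made concrete with a small sketch. The MAC addresses, the flow list and the function `device_by_mac` are made up for illustration, and CRC-32 again stands in for the switch's real hash.

```python
import zlib

NUM_DEVICES = 4
ROUTER_A_MAC = bytes.fromhex("02000000000a")  # made-up addresses
ROUTER_B_MAC = bytes.fromhex("02000000000b")


def device_by_mac(src_mac: bytes, dst_mac: bytes) -> int:
    """Hash only the outer MAC pair, as a switch does when it cannot
    see the inner IP header."""
    return zlib.crc32(src_mac + dst_mac) % NUM_DEVICES


# 1000 different inner flows, but every frame carries the same router MACs,
# so the inner addresses never influence the choice.
inner_flows = [(f"10.0.{i // 250}.{i % 250 + 1}", "198.51.100.7") for i in range(1000)]
devices_used = {device_by_mac(ROUTER_A_MAC, ROUTER_B_MAC) for _flow in inner_flows}
print(len(devices_used))  # 1
```

However many flows pass through, only one of the four devices ever receives traffic.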

Switches have one capability though. They can generate the hash from a CRC calculated on some part of the packet, or from the CRC of the Ethernet frame. The CRC of the Ethernet frame can be assumed to provide good distribution, as it is computed over the complete frame, including the inner IP packet payload. Switches can take a few bits of the CRC to distribute the packets across the cluster devices. But the issue with this is that packets belonging to the same connection would go to different cluster devices. As described in my earlier post, cluster devices assume that packets belonging to a connection always land on the same device. This assumption no longer holds if the cluster solution is deployed in the environments mentioned above. These types of environments are not common at all in Data Center and Enterprise networks, so this problem may not arise in many instances. But in service provider environments, multiple L2 and tunnel headers are not uncommon.
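To see why CRC-based distribution breaks connection affinity, consider this sketch. `zlib.crc32` stands in for the switch's CRC, and the byte strings are placeholders rather than real header encodings; both are assumptions for the example.

```python
import zlib

NUM_DEVICES = 4  # a power of two, so a bit mask selects "a few bits" of the CRC


def device_by_frame_crc(frame: bytes) -> int:
    """Pick a device from the low bits of a CRC-32 computed over the whole frame."""
    return zlib.crc32(frame) & (NUM_DEVICES - 1)


# Stand-in for the outer headers plus inner IP/TCP headers of ONE connection.
# The bytes are placeholders, not a real encoding.
same_connection_headers = b"outer-tunnel|10.1.1.5>192.0.2.9|tcp 40000>80|"

# Eight packets of that connection; only the payload differs.
frames = [same_connection_headers + b"segment-" + bytes([i]) for i in range(8)]
devices_used = {device_by_frame_crc(f) for f in frames}
print(sorted(devices_used))  # more than one device for a single connection
```

The payload changes from packet to packet, so the CRC bits change too, and packets of one connection end up on several devices.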

How do cluster solutions work in these environments?

In these environments, if CRC-based distribution is used, the switches are really just sprinkling packets across the devices. In these cases, the cluster devices need additional intelligence:
  • Cluster devices should be able to get past the outer headers to reach the inner IP header.
  • Cluster devices should share an understanding of how sessions are distributed among themselves. One simple method is to do what the switches were doing. That is, each device generates the hash on the inner IP header fields (source and destination IP) and uses the hash value to determine whether it is the device that should serve the packet. If it is, it continues processing the packet. If it is not, it hands the packet to the device that owns the hash value.
This can produce a good amount of traffic among the cluster devices. They can use the switch as their backplane to send and receive this traffic. To spare the receiving device from repeating the same work, that is, locating the inner IP header and hashing on its fields again, the sending device can send this information along with the packet.
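The ownership check and hand-off could look roughly like the sketch below. Everything named here (`Handoff`, `flow_hash`, `handle_packet`, the `inner_ip_offset` field) is hypothetical, and parsing the outer headers is assumed to have happened already; the point is only to show the decision each device makes.

```python
import ipaddress
import zlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Handoff:
    """Hypothetical metadata sent along with a packet handed to its owner."""
    inner_ip_offset: int  # where the inner IP header starts in the frame
    flow_hash: int        # hash already computed by the sending device


def flow_hash(src_ip: str, dst_ip: str) -> int:
    a = int(ipaddress.ip_address(src_ip))
    b = int(ipaddress.ip_address(dst_ip))
    low, high = sorted((a, b))  # assumption: same hash for both directions
    return zlib.crc32(low.to_bytes(16, "big") + high.to_bytes(16, "big"))


def handle_packet(my_id: int, num_devices: int, src_ip: str, dst_ip: str,
                  inner_ip_offset: int,
                  handoff: Optional[Handoff] = None
                  ) -> Tuple[str, int, Optional[Handoff]]:
    """Return (action, device_id, handoff) for a packet arriving at device my_id."""
    if handoff is not None:
        # A peer already parsed and hashed this packet; do not repeat the work.
        return ("process", my_id, handoff)
    h = flow_hash(src_ip, dst_ip)
    owner = h % num_devices
    if owner == my_id:
        return ("process", my_id, None)
    # Not ours: forward to the owner and attach what we already worked out.
    return ("forward", owner, Handoff(inner_ip_offset, h))


# A packet sprinkled onto device 0 of a 4-device cluster.
action, target, meta = handle_packet(0, 4, "10.1.1.5", "192.0.2.9", inner_ip_offset=50)
if action == "forward":
    action, target, meta = handle_packet(target, 4, "10.1.1.5", "192.0.2.9", 50, handoff=meta)
print(action, target)
```

Whichever device the switch picks first, the packet is processed by the single device that owns its hash, at the cost of at most one extra hop through the switch.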

As indicated above, there are only a few deployments where the L2 switch cannot distribute based on the inner IP header. If the same cluster solution is offered for all kinds of deployments, it is good for vendors of the cluster solution to provide a configuration option on the cluster that states whether the L2 switch is acting as a plain packet sprinkler or is doing intelligent IP flow-based distribution. If it is packet-based sprinkling, the additional logic in the devices can kick in to figure out the real destination device.
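Such a configuration option might be modeled as follows. The names `SwitchMode`, `parse_mode` and `must_resolve_owner`, and the setting values "flow" and "packet", are invented for this sketch.

```python
from enum import Enum


class SwitchMode(Enum):
    FLOW_BASED = "flow"           # switch hashes on the inner IP header
    PACKET_SPRINKLING = "packet"  # switch distributes per packet, e.g. on CRC bits


def parse_mode(value: str) -> SwitchMode:
    """Turn a configuration string into a SwitchMode (raises ValueError if unknown)."""
    return SwitchMode(value.strip().lower())


def must_resolve_owner(mode: SwitchMode) -> bool:
    """Devices only need the extra owner lookup when the switch sprinkles packets."""
    return mode is SwitchMode.PACKET_SPRINKLING


mode = parse_mode("packet")
print(mode.name, must_resolve_owner(mode))  # PACKET_SPRINKLING True
```

With flow-based distribution the devices skip the extra lookup entirely, so common data center and enterprise deployments pay no cost for the feature.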

