Many applications and the IP stack do not care about the LAG, its links, or their properties, but one application, QoS, does need to worry about link properties, specifically bandwidth (the shaping bandwidth). In an ideal world, even QoS would not need to worry about links and their properties. As we all know, to ensure that packets in a given conversation are not mis-ordered, the distributor function of the LAG module distributes conversations across the links, not individual packets. If there are many conversations compared to the number of links, traffic is likely to be spread roughly equally across the links. But when there are only a few conversations, which by the way is not so uncommon, the traffic can be distributed unequally. That is, some conversations may carry more traffic than others, and if the high-traffic conversations land on a few links, the distribution is unequal. Let me cover QoS and the changes required in QoS to work with LAG.
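To make the per-conversation distribution concrete, here is a minimal sketch in Python. It assumes a conversation is identified by a 5-tuple and that the distributor uses a simple CRC32 hash; the function names `pick_link` and `link_load` are made up for illustration, and real distributors may hash different fields.

```python
import zlib


def pick_link(conversation, num_links):
    """Map a conversation (assumed here to be a 5-tuple) to a link index.

    The hash is stable, so every packet of the same conversation goes to
    the same link and cannot be mis-ordered by the LAG.
    """
    key = "|".join(str(field) for field in conversation).encode()
    return zlib.crc32(key) % num_links


def link_load(conversations, num_links):
    """Sum the bytes carried per link.

    `conversations` maps a 5-tuple to the number of bytes it sent.
    """
    load = {link: 0 for link in range(num_links)}
    for conv, nbytes in conversations.items():
        load[pick_link(conv, num_links)] += nbytes
    return load
```

With only a handful of conversations, one heavy conversation dominates whichever link it hashes to, and `link_load` makes that imbalance visible.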
The LAG distributor normally implements the concept of 'load rebalancing'. Load rebalancing happens in three cases:
- When LAG observes that there is unequal distribution.
- When a new link is added to the LAG instance.
- When an existing link is removed, disabled or broken.
When rebalancing happens, conversations are handled as follows:
- New conversations use the new hash distribution.
- Current conversations can be moved to other links only if the conversation has been idle for X milliseconds, the time after which we know that its earlier packets would have been collected by the collector.
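The idle-time rule for existing conversations can be sketched as below. This is only an illustration under assumptions: the class name `ConversationTable`, the `idle_ms` parameter (the "X milliseconds" above) and the `preferred_link` argument (whatever the current hash distribution selects) are hypothetical, and the sketch does not handle the case where the old link has been removed and the conversation must move immediately.

```python
class ConversationTable:
    """Remembers which link each conversation uses and when it was last seen."""

    def __init__(self, idle_ms):
        self.idle_ms = idle_ms
        self.entries = {}  # conversation -> (link, last_seen_ms)

    def link_for(self, conv, now_ms, preferred_link):
        """Return the link to use for a packet of `conv` arriving at `now_ms`."""
        entry = self.entries.get(conv)
        if entry is None:
            # New conversation: follow the new hash distribution right away.
            link = preferred_link
        else:
            link, last_seen_ms = entry
            idle_for = now_ms - last_seen_ms
            # Existing conversation: move only after it has been idle long
            # enough for earlier packets to have reached the collector.
            if link != preferred_link and idle_for >= self.idle_ms:
                link = preferred_link
        self.entries[conv] = (link, now_ms)
        return link
```

A conversation that keeps sending stays on its old link, so its packets are never reordered by the move; it migrates only after a quiet gap of at least `idle_ms`.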
Now on to redistribution due to unequal utilization of links:
Redistribution can be done in two ways. The first is to change the hash algorithm or the fields used in the hash algorithm. The second is to somehow increase the number of conversations. The second method works only in cases where tunnels (such as IPsec) are the conversations. Increasing the number of tunnels makes a better distribution more likely, because the actual 5-tuple flows are then spread across multiple tunnels. See this link here on how LAG & IPsec work together.
Changing the hash algorithm, or adding or removing fields used by the hash algorithm, can cause mis-ordering. In some deployments, occasional mis-ordering is okay, and in those cases this method can be used. For this method to work, rebalancing should not happen very frequently. Typically the following method is used: if a link's utilization is more than X% (typically 5 to 10%, a configurable parameter) away from the average usage of the trunk, then it is a candidate for redistribution. After a redistribution, do not redistribute again for a configurable number of seconds, so that redistributions are not too frequent.
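The trigger described above might look roughly like the following. It is a sketch, not a reference implementation: the class `RebalanceTrigger` and its parameters are hypothetical names, "X% away from the average" is interpreted here as a difference in percentage points of utilization, and the hold-down default of 30 seconds is an arbitrary placeholder.

```python
class RebalanceTrigger:
    """Decides when a redistribution is allowed (hypothetical helper)."""

    def __init__(self, threshold_pct=10.0, holddown_s=30.0):
        self.threshold_pct = threshold_pct  # the configurable "X%"
        self.holddown_s = holddown_s        # quiet period after a rebalance
        self.last_rebalance_s = None

    def should_rebalance(self, utilization_pct, now_s):
        """`utilization_pct` is a list with one utilization value per link."""
        in_holddown = (
            self.last_rebalance_s is not None
            and now_s - self.last_rebalance_s < self.holddown_s
        )
        if in_holddown:
            return False
        average = sum(utilization_pct) / len(utilization_pct)
        # Any link further than the threshold from the trunk average
        # makes the trunk a candidate for redistribution.
        if any(abs(u - average) > self.threshold_pct for u in utilization_pct):
            self.last_rebalance_s = now_s
            return True
        return False
```

The hold-down timer is what keeps occasional mis-ordering occasional: even if links stay uneven, the hash is changed at most once per `holddown_s` seconds.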
Typically the QoS shaping and scheduling function runs on top of L2 interfaces. The trunk link is given the shaping bandwidth. Shaping is typically implemented using a token bucket algorithm. Whenever tokens are available, the scheduling function is invoked. The scheduling function selects the next packet and sends it out.
A LAG instance acting as an L2 interface has a shaping bandwidth that is the sum of the bandwidths of all its links. If the scheduling decision is taken purely on the basis of the LAG trunk bandwidth, a scheduled packet may get dropped if it goes out on a link that is already completely utilized. This happens when traffic is uneven across the conversations. Rebalancing helps, but it takes some time to rebalance the traffic. Hence the QoS shaping and scheduling function should consider not only the LAG instance bandwidth but also the individual link bandwidth when making a scheduling decision. By considering both, a packet from the high-traffic conversation is at least not scheduled and stays in the queue, thereby avoiding a packet drop.
At the same time, it is not good to under-utilize the other links. In this case, scheduling can move on to other traffic that falls on the under-utilized links.
LAG is an important feature, but it has its own challenges. IPsec and QoS implementations need to work properly with LAG to utilize it effectively.
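A link-aware scheduler of this kind can be sketched with one token bucket for the trunk and one per link. Everything here is an assumption made for illustration: the class names `TokenBucket` and `LagScheduler`, the single FIFO queue, and the idea that each queued packet already knows which link the distributor will pick for it. A real QoS implementation would have multiple queues and priorities.

```python
class TokenBucket:
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last_s = 0.0

    def refill(self, now_s):
        elapsed = now_s - self.last_s
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_s = now_s


class LagScheduler:
    def __init__(self, trunk_bucket, link_buckets):
        self.trunk = trunk_bucket
        self.links = link_buckets  # one bucket per member link
        self.queue = []            # (link, size_bytes) in arrival order

    def enqueue(self, link, size):
        self.queue.append((link, size))

    def dequeue(self, now_s):
        """Send the first packet that both the trunk and its link can carry."""
        self.trunk.refill(now_s)
        for bucket in self.links:
            bucket.refill(now_s)
        blocked = set()  # links whose head packet could not be sent
        for i, (link, size) in enumerate(self.queue):
            if link in blocked:
                continue  # keep per-link order: do not overtake a held packet
            if self.trunk.tokens >= size and self.links[link].tokens >= size:
                self.trunk.tokens -= size
                self.links[link].tokens -= size
                return self.queue.pop(i)
            blocked.add(link)  # packet stays queued instead of being dropped
        return None
```

Packets bound for a saturated link wait in the queue, while the scan continues so that traffic for under-utilized links still goes out. Marking a link as blocked once one of its packets is held back prevents a later, smaller packet on the same link from overtaking it.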