Tuesday, September 28, 2010

Look-aside acceleration & Application Usage scenarios

Performance and flexibility are two factors that determine how applications use look-aside accelerators.  As described in an earlier post, applications use accelerators in synchronous or asynchronous fashion.  In this post, I will give my view of different types of applications and how they use look-aside accelerators.

I assume that all applications run in Linux user space.  I also assume in this post that all applications use HW accelerators by memory-mapping the accelerator registers into user space.  Based on these assumptions, applications can be categorized into the following types:

  • Per-packet processing applications with a dedicated core per user process and HW polling mode:  In this type, the application runs in a user process.  A core or set of cores is dedicated to the process; that is, these cores are not used for anything other than executing this process.  Since a core is dedicated, it can wait on some HW interface until an event is ready to be processed.  In this mode, the Multicore hardware is expected to provide a single interface to wait for events.  The application waits in a loop forever for events; when an event is ready, it takes action based on the event type and then goes back to waiting for new events.  This type is most suitable for per-packet processing applications such as IP forwarding, Firewall/NAT, IPsec, MACsec, SRTP, etc.
    • Per-packet processing applications use look-aside accelerators in asynchronous fashion.  Incoming packets from Ethernet or other L2 interfaces, and the results from the look-aside accelerators, are delivered through the common HW interface.
    • A typical flow looks like this:  when an incoming packet is ready on an Ethernet port, the polling function returns with a 'New packet' event.  The packet is processed in user space; at some point the application decides it needs the HW accelerator, submits the packet to it, and goes back to polling.  The HW accelerator eventually returns the result through the same HW interface.  When the polling function returns with an 'Acceleration result' event, the user process handles the result and may send the packet out on some other Ethernet port.  More packets may be processed by the user process before the acceleration results for earlier packets come back.  Due to this asynchronous operation, cores are utilized well and system throughput is very good (see the event-loop sketch after this list).
    • IPsec, MACsec, and SRTP use crypto algorithms in asynchronous fashion.
    • PPP and IPsec IPCOMP use compression/decompression accelerators in asynchronous fashion.
    • Some portions of DPI use pattern-matching acceleration in asynchronous fashion.
  • Per-packet processing applications with a non-dedicated core and SW polling mode:  This is similar to the type above.  In this type, core(s) are not dedicated to the user process.  Hence HW polling is not used, as it would keep the core from relinquishing control for other work.  SW polling is used instead, typically via the epoll() call.  In this mode, interrupts are delivered using the UIO facility provided by Linux.  When an interrupt is raised because a packet or an accelerator result is ready, UIO wakes up the epoll() call in user space.  When epoll() returns, the application reads the event from the HW interface and executes a different function based on the event type.
    • All per-packet processing applications such as IPsec, SRTP, MACsec, and Firewall/NAT can also work in this fashion.
    • IPsec, MACsec, and SRTP use crypto algorithms in asynchronous fashion.
    • PPP and IPsec IPCOMP use compression/decompression accelerators in asynchronous fashion.
    • Some portions of DPI use pattern-matching acceleration in asynchronous fashion.
  • Stream-based applications:  Stream-based applications normally work at a higher level, away from packet reception and transmission.  For example, proxies/servers work on BSD sockets - the data they receive is TCP stream data, not individual packets.  A crypto file system is another kind of stream application; it works on data, not on packets.  These applications collect data from several packets, and sometimes this data gets transformed, such as packet data being decoded into some other form.  HW accelerators are used on top of this data, in almost all cases in synchronous fashion.  In this type of application, synchronous mode is used in two ways - waiting for the result in a tight loop without relinquishing control, or waiting for the result in a loop while yielding to the operating system.  The first sub-mode (tight-loop mode) is used when the HW acceleration function takes very little time; the second (yield mode) is used when the acceleration takes long.
    • Public-key acceleration such as RSA sign/verify, RSA encrypt/decrypt, DH operations, and DSA sign/verify works in yield mode, as these operations take a significant number of cycles.  Applications that require this acceleration include IKEv1/v2, SSL/TLS-based applications, EAP servers, etc.
    • Symmetric cryptography such as AES and its modes, hashing algorithms, and PRF algorithms would be used in the tight-loop sub-mode, as these operations take fewer cycles.  Note that yielding can take anywhere between 20,000 and 200,000 cycles depending on the number of other ready processes, and that is not an acceptable latency for these operations.  Applications include those based on SSL/TLS, IKEv1/v2, EAP servers, etc.
    • I would put compression/decompression HW accelerator usage in a slightly different sub-mode.  Compression/decompression works in the following fashion for each context (see the pipelined sketch after this list):
      • The software thread issues the operation.
      • It immediately reads any pending result (from previous operations).  Note that the thread does not wait for the result.
      • It works on the result if one is available.
      • The above steps repeat in a loop until there is no more input data.
      • At the end, it waits (in yield mode) until all results are returned by the accelerator.
    • Applications that can use compression accelerators:  HTTP proxy, HTTP server, crypto FS, WAN optimization, etc.
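
To make the SW-polling type concrete, here is a minimal sketch in C of the epoll()-based event loop described above.  The UIO device node, the event structure, and the hw_read_event()/process_* helpers are hypothetical placeholders (stubbed out here so the sketch compiles); a real driver defines its own descriptor formats.

    #include <fcntl.h>
    #include <sys/epoll.h>

    /* Hypothetical event types delivered through the common HW interface. */
    enum hw_event_type { EV_NEW_PACKET, EV_ACCEL_RESULT };
    struct hw_event { enum hw_event_type type; void *desc; };

    /* Stubs standing in for driver-specific descriptor handling. */
    static int hw_read_event(int fd, struct hw_event *ev) { (void)fd; (void)ev; return -1; }
    static void process_packet(void *desc) { (void)desc; /* may submit to accelerator */ }
    static void process_result(void *desc) { (void)desc; /* e.g., transmit the packet */ }

    int main(void)
    {
        int uio_fd = open("/dev/uio0", O_RDONLY);  /* UIO exposes the interrupt as an fd */
        int ep = epoll_create1(0);
        struct epoll_event evt = { .events = EPOLLIN, .data = { .fd = uio_fd } };
        epoll_ctl(ep, EPOLL_CTL_ADD, uio_fd, &evt);

        for (;;) {
            struct epoll_event ready;
            if (epoll_wait(ep, &ready, 1, -1) <= 0)   /* core yields to the OS here */
                continue;
            struct hw_event ev;
            while (hw_read_event(uio_fd, &ev) == 0) { /* drain all pending events */
                if (ev.type == EV_NEW_PACKET)
                    process_packet(ev.desc);          /* may enqueue to the accelerator */
                else
                    process_result(ev.desc);          /* asynchronous completion */
            }
        }
    }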
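
And here is a sketch of the compression/decompression sub-mode above:  the thread keeps submitting input blocks and opportunistically drains completed results without blocking, waiting (in yield mode) only after the last submission.  The accel_* calls are hypothetical stand-ins for a real accelerator driver API, declared extern here since only the control flow matters.

    #include <stdbool.h>
    #include <sched.h>

    /* Hypothetical accelerator driver API for one compression context. */
    extern void accel_submit(void *ctx, const void *in, int len); /* enqueue and return */
    extern bool accel_poll_result(void *ctx, void **out);         /* non-blocking */
    extern bool accel_pending(void *ctx);                         /* results in flight? */
    extern void consume(void *out);                               /* use the output */

    void compress_stream(void *ctx, const void *blocks[], const int lens[], int n)
    {
        for (int i = 0; i < n; i++) {
            accel_submit(ctx, blocks[i], lens[i]); /* 1. issue the operation        */

            void *out;                             /* 2. drain any finished results */
            while (accel_poll_result(ctx, &out))   /*    without waiting for this   */
                consume(out);                      /*    one to complete            */
        }

        while (accel_pending(ctx)) {               /* 3. no more input: now wait,   */
            void *out;
            if (accel_poll_result(ctx, &out))
                consume(out);
            else
                sched_yield();                     /*    yielding between polls     */
        }
    }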
 Any comments?

Sunday, September 26, 2010

LAG, Load Rebalancing & QoS Shaping

The LAG feature exposes only one L2 interface to the IP stack for each LAG instance, hiding all the links in the LAG instance underneath it.  This is good in the sense that the IP stack and other applications remain completely unaware of the number of links being added and removed.

Though many applications and the IP stack don't care about the LAG, its links, and their properties, one application - QoS - does need to worry about link properties, specifically link bandwidth (shaping bandwidth).  In an ideal world, even QoS would not need to worry about links and their properties.  As we all know, to ensure that there is no mis-ordering of packets within a given conversation, the distributor function of the LAG module distributes conversations, not packets, across the links.  If there is a large number of conversations relative to the number of links, equal distribution of traffic across the links is likely.  But when there are a small number of conversations, which by the way is not so uncommon, unequal distribution of traffic is possible:  some conversations carry more traffic than others, and if the high-traffic conversations land on a few links, the distribution is unequal.  Let me cover QoS and the changes required in QoS to work with LAG.

Load Rebalancing:

The LAG distributor normally implements the concept of 'load rebalancing'.  Load rebalancing happens in three cases.
  •  When the LAG observes that there is unequal distribution.
  •  When a new link is added to the LAG instance.
  •  When an existing link is removed, disabled, or broken.
Though adding a new link to, or removing an existing link from, the LAG instance is not the focus of this article, let me give a gist of the issues that need to be taken care of.  The packet mis-ordering issue must be handled well.  When a new link is added, if the hash distribution is changed immediately, some existing conversations might be rebalanced onto other links.  If that is done arbitrarily, the collector may receive packets out of order for a brief amount of time.  To make sure the new link is used effectively, two methods can be used (they can also be used together):
  •  New conversations use the new hash distribution.
  •  Current conversations are moved onto other links only if the conversation has been idle for X milliseconds - a time by which we know its in-flight packets have been collected by the collector.
When a link is no longer active, packet mis-ordering is no longer a big issue.  The traffic has to flow, so the new distribution can take effect immediately, redistributing the conversations that belonged to the old link across the remaining links.

Now on to redistribution due to unequal utilization of links:

Redistribution can be done in two ways.  The first is changing the hash algorithm or the fields fed into it.  The second is to somehow increase the number of conversations.  The second method works only in cases where tunnels (such as IPsec) are the conversations:  by increasing the number of tunnels, there is a good chance of improving the distribution, since the actual 5-tuple flows are then spread over multiple tunnels.  See this link here on how LAG and IPsec work together.

Changing the hash algorithm, or adding/removing fields fed into it, can cause mis-ordering.  In some deployments occasional mis-ordering is okay, and in those cases this method can be used.  To use it, rebalancing should not happen very frequently.  Typically the following method is used:  if a link's utilization is more than X% (typically 5 to 10%, a configurable parameter) away from the average usage of the trunk, that link becomes a candidate for redistribution.  Then stop doing redistribution for a configurable number of seconds to ensure that redistributions are not too frequent.
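
A minimal sketch of that trigger logic, assuming per-link byte counters sampled once per interval; the deviation threshold and the hold-down timer are the configurable parameters mentioned above.

    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    #define MAX_LINKS 8

    struct lag_stats {
        int      num_links;
        uint64_t link_bytes[MAX_LINKS]; /* bytes sent in the last interval */
        double   threshold_pct;         /* e.g., 5.0 .. 10.0, configurable */
        time_t   holddown_until;        /* no rebalancing before this time */
        int      holddown_secs;         /* configurable quiet period */
    };

    /* True if some link deviates from the trunk average by more than
     * threshold_pct and we are outside the hold-down window. */
    static bool should_rebalance(struct lag_stats *t)
    {
        time_t now = time(NULL);
        if (now < t->holddown_until)
            return false;                    /* avoid frequent redistributions */

        uint64_t total = 0;
        for (int i = 0; i < t->num_links; i++)
            total += t->link_bytes[i];
        double avg = (double)total / t->num_links;
        if (avg == 0.0)
            return false;                    /* idle trunk: nothing to balance */

        for (int i = 0; i < t->num_links; i++) {
            double dev = ((double)t->link_bytes[i] - avg) * 100.0 / avg;
            if (dev > t->threshold_pct || dev < -t->threshold_pct) {
                t->holddown_until = now + t->holddown_secs;
                return true;                 /* candidate for redistribution */
            }
        }
        return false;
    }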

QoS:

Typically the QoS shaping and scheduling function runs on top of L2 interfaces.  The trunk link is given the shaping bandwidth.  Shaping is typically implemented using a token bucket algorithm:  whenever tokens are available, the scheduling function is invoked; it selects the next packet and sends it out.

A LAG instance acting as an L2 interface has a shaping bandwidth that is the sum of all its links.  If the scheduling decision is taken purely based on the LAG trunk bandwidth, a scheduled packet may get dropped if it goes out on a link that is already fully utilized.  This happens when traffic is uneven across the conversations.  Rebalancing helps, but it takes some time to rebalance the traffic.  Hence the QoS shaping and scheduling function should consider not only the LAG instance bandwidth but also the individual link bandwidth while making the scheduling decision.  By considering both, a packet from a high-traffic conversation is simply not scheduled and stays in its queue, thereby avoiding a packet drop.
At the same time, it is not good to under-utilize the other links; scheduling can, in this case, move on to other traffic that falls on the under-utilized links.
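
A minimal sketch of such a scheduling check, assuming a classic token bucket per trunk and per link; which link a packet would take comes from the distributor's conversation hash.

    #include <stdbool.h>
    #include <stdint.h>

    struct token_bucket {
        uint64_t tokens;   /* bytes currently available */
        uint64_t rate;     /* bytes added per tick (shaping bandwidth) */
        uint64_t burst;    /* bucket depth in bytes */
    };

    static void tb_refill(struct token_bucket *tb)   /* called once per tick */
    {
        tb->tokens += tb->rate;
        if (tb->tokens > tb->burst)
            tb->tokens = tb->burst;
    }

    /* Schedule a packet only if BOTH the trunk bucket and the bucket of the
     * link chosen by the distributor have room; otherwise leave the packet
     * queued and let the scheduler try a flow mapped to another link. */
    static bool can_schedule(struct token_bucket *trunk,
                             struct token_bucket *link,
                             uint32_t pkt_len)
    {
        if (trunk->tokens < pkt_len || link->tokens < pkt_len)
            return false;              /* would overrun the trunk or the link */
        trunk->tokens -= pkt_len;
        link->tokens  -= pkt_len;
        return true;
    }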

LAG is an important feature, but it has its own challenges.  IPsec and QoS implementations need to work with LAG properly to utilize it effectively.

Comments?

eNodeB and IPsec

The eNodeB secures traffic to the Serving Gateway (SGW) over IPsec tunnels across the backhaul network.  The eNB also creates many tunnels to peer eNBs for X2 and handover traffic.  Though all IPsec-related features are valid in eNB scenarios too, some features are worth mentioning in the eNB context.

LAG and IPsec:

Please see this link here to understand the issues and solutions related to LAG and IPsec in general.  This scenario is very much valid for eNB-to-SGW communication.  Note that in the non-handoff scenario, traffic from all GTP tunnels goes between eNB and SGW on one or a few (with DSCP-based tunnels) IPsec tunnels.  When LAG is used between the eNB and the SGW, the same issue of not utilizing more than one link arises.  Both solutions suggested in the earlier article are valid in this scenario too.  In cases where it is difficult to get multiple public IP addresses for the LAG trunk, scenario 2 - forceful NAT-T - is the only option I can think of.


Capabilities expected in eNB and SGW:

Using LAG effectively requires many tunnels.  It is good to have 1 + (number of links - 1) * 32 IPsec tunnels for good distribution across the links.  User traffic, in this case GTP traffic, should be balanced across these IPsec tunnels.

Typically there are two GTP traffic tunnels for each cell phone user - one created for data traffic and another for voice traffic.  Without LAG, normally two IPsec tunnels are created - one for the data traffic coming/going from/to all cell users, and another for the voice traffic of all cell users.  GTP traffic is distributed across these two IPsec tunnels based on the DSCP value.

Now we have many more IPsec tunnels.  There should be additional logic in the eNB and SGW that distributes the GTP traffic across these multiple IPsec tunnels.  This logic should send the traffic of a given conversation to one IPsec tunnel, and each GTP tunnel's traffic can be viewed as one conversation.  That is, GTP tunnels are distributed across the IPsec tunnels.  One way is to hash on the TEID (Tunnel Endpoint ID) to distribute the traffic across the IPsec tunnels.
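
A minimal sketch of that TEID-based selection; the multiplicative mixer is just an example, any hash with a reasonable spread over TEIDs works.

    #include <stdint.h>

    /* Map a GTP tunnel (identified by its TEID) to one of n IPsec tunnels.
     * All packets of one GTP tunnel pick the same IPsec tunnel, so packet
     * order within the conversation is preserved. */
    static uint32_t ipsec_tunnel_for_teid(uint32_t teid, uint32_t n_tunnels)
    {
        uint32_t h = teid * 2654435761u;  /* Knuth multiplicative hash */
        return h % n_tunnels;
    }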

The IPsec implementation on the eNB and SGW should be capable of creating multiple tunnels, with the number of tunnels configurable.  eNB and SGW implementations should also be capable of bringing up tunnels on demand; that is, they should ensure that this number of tunnels is up and running as long as there is traffic.  Note that all these IPsec tunnel negotiations would have the same selectors, so IPsec implementations should not apply any intelligence that removes old tunnels with the same selectors.

If the persistent-tunnels feature is selected on the SPD rules, then the implementations should ensure that all IPsec tunnels are up and running all the time.

As described in the earlier article, it is necessary that the IPsec implementation be capable of 'red-side fragmentation' so that the LAG always sees a UDP header in every packet, which it requires for its distribution.

DSCP-based IPsec tunnels:

LTE uses a packet-based network for voice, streaming, interactive, and non-interactive data.  Hence it is necessary that IPsec tunnels honor this prioritization, to ensure that voice and other real-time traffic is given priority.  If both data and voice are sent on the same tunnel, there is a possibility of traffic getting dropped by the sequence-number checks done as part of anti-replay processing at the receiver.  Even though both data and voice packets are marked with increasing sequence numbers when encapsulated in the IPsec tunnel, local QoS and QoS in intermediate devices can cause voice traffic to be sent ahead of data traffic - that is, the traffic is reordered.  As you know, the right edge of the receiver's anti-replay window moves with newer sequence numbers.  Due to this, data packets with lower sequence numbers get dropped if they fall below the left edge of the window.  To avoid these unnecessary drops, two methods are used - increase the anti-replay window size, or use different SAs (tunnels) for different kinds of traffic.  The second method is normally used.
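
To see why the reordering causes drops, here is a minimal sketch of the standard anti-replay sliding window (in the style of RFC 4303, with a 64-packet bitmap); a legitimate packet whose sequence number has fallen below the left edge is indistinguishable from a replay and is dropped, which is exactly what happens to delayed data packets sharing an SA with voice.

    #include <stdbool.h>
    #include <stdint.h>

    #define REPLAY_WIN 64                 /* window size in packets */

    struct replay_state {
        uint32_t right_edge;              /* highest sequence number seen */
        uint64_t bitmap;                  /* bit i: right_edge - i was received */
    };

    static bool replay_check_and_update(struct replay_state *st, uint32_t seq)
    {
        if (seq > st->right_edge) {       /* window advances to the new edge */
            uint32_t shift = seq - st->right_edge;
            st->bitmap = (shift >= REPLAY_WIN) ? 1 : (st->bitmap << shift) | 1;
            st->right_edge = seq;
            return true;
        }
        uint32_t offset = st->right_edge - seq;
        if (offset >= REPLAY_WIN)
            return false;   /* below the left edge: dropped even if legitimate */
        if (st->bitmap & (1ULL << offset))
            return false;   /* duplicate sequence number: replay */
        st->bitmap |= 1ULL << offset;
        return true;
    }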

Due to this feature and the LAG feature above, the number of tunnels that need to be created on the eNB and SGW can go up significantly.  Hence both eNB and SGW should have enough memory and computational power to handle that many tunnels.

Persistent tunnels

This feature is necessary to reduce the latency of initial traffic:  tunnels are kept up and running all the time, even when there is no traffic.  It is a good fit when the links are always-on and there is no cost based on traffic volume.


DSCP and ECN Copy settings:


IPsec implementations are expected to copy the DSCP and ECN bits from the inner header to the outer header.  The inner-header DSCP value is set by applications, and this marking should be preserved even when the traffic is tunneled.  This ensures that the nodes between the eNB and SGW also give the traffic the right QoS treatment.  Hence it is necessary to copy the DSCP bits from the inner header to the outer header.

The ECN bits indicate congestion to the peer so that the peer entity can tell the source entity to back off.  TCP has a way to inform the source when the receiver gets IP packets with the CE (congestion experienced) mark in the ECN bits of the IP header.  Tunnel endpoints such as the eNB and SGW should honor this by copying the ECN bits from inner header to outer header while encapsulating, and from outer header to inner header while decapsulating.
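
A minimal sketch of this bit handling, operating on the IPv4 TOS byte (DSCP in the top six bits, ECN in the bottom two).  This is simplified; RFC 6040 defines more cases, such as when the inner packet is not ECN-capable.

    #include <stdint.h>

    #define DSCP_MASK 0xFC   /* top 6 bits of the IPv4 TOS byte */
    #define ECN_MASK  0x03   /* bottom 2 bits */
    #define ECN_CE    0x03   /* congestion experienced */

    /* Encapsulation: the outer header inherits the inner DSCP and ECN bits
     * so intermediate nodes give the tunnel the same QoS treatment. */
    static uint8_t outer_tos_on_encap(uint8_t inner_tos)
    {
        return inner_tos;
    }

    /* Decapsulation: if the outer header picked up a CE mark in transit,
     * reflect it into the inner header so the end host's TCP can react. */
    static uint8_t inner_tos_on_decap(uint8_t inner_tos, uint8_t outer_tos)
    {
        if ((outer_tos & ECN_MASK) == ECN_CE)
            inner_tos = (uint8_t)((inner_tos & DSCP_MASK) | ECN_CE);
        return inner_tos;
    }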

Peer IP address adoption 

The eNodeB gets its IP address from the backhaul provider dynamically, and it is possible that the address is changed by the provider while traffic is flowing.  IPsec tunnels are expected to stay up and running even if the IP address changes on a gateway.  This internet draft discusses a mechanism to adopt the peer gateway's address change.  This feature is expected to be present to ensure that voice traffic does not observe too much jitter and latency:  note that tunnel establishment takes hundreds of milliseconds, as it involves IKE negotiation of the keys, and this can introduce jitter and latency for voice traffic in progress at that time.  Implementations should implement this draft to eliminate jitter and latency issues when the IP address changes on the remote gateway.

IP Fragmentation and Reassembly

Many eNB vendors normally quote performance for UDP traffic without IP fragmentation and reassembly.  Even though this gives one data point, it can be misleading to customers.  Most traffic in the Internet today is TCP, and the same is true in the LTE world.  The TCP MSS is chosen such that a TCP data packet with IP and TCP headers is exactly MTU-sized.  When that traffic undergoes IPsec encapsulation, it is almost certain that packets will need to be fragmented, as they exceed the MTU of the link.  Though the DF-bit facility is available for endpoints to learn the path MTU, it is rarely used by IPv4 endpoints today.  Since packets are fragmented, reassembly is required on the other side.
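
A rough worked example of why this is nearly guaranteed, assuming ESP tunnel mode with AES-CBC and HMAC-SHA1-96; exact overhead varies with the negotiated algorithms.

    #include <stdio.h>

    int main(void)
    {
        const int mtu     = 1500;  /* typical Ethernet MTU */
        const int tcp_pkt = 1500;  /* full-size TCP packet: MSS 1460 + 40 hdrs */

        /* Approximate ESP tunnel-mode overhead (assumed algorithms): */
        const int outer_ip = 20;   /* new outer IPv4 header */
        const int esp_hdr  = 8;    /* SPI + sequence number */
        const int iv       = 16;   /* AES-CBC IV */
        const int icv      = 12;   /* truncated HMAC-SHA1 */
        const int block    = 16;   /* AES block size */

        /* Pad so that (payload + pad + 2-byte ESP trailer) is block-aligned. */
        int pad   = (block - (tcp_pkt + 2) % block) % block;
        int total = tcp_pkt + outer_ip + esp_hdr + iv + pad + 2 + icv;

        printf("ESP packet: %d bytes vs MTU %d -> %s\n", total, mtu,
               total > mtu ? "fragmentation required" : "fits");
        return 0;
    }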

This feature is implemented in IPsec implementations, but I am afraid that many of them, though they show very good IPsec performance on non-fragmented packets, are not optimized for the case where fragmentation and reassembly are required.  Customers need to watch out for this, as a significant amount of traffic will be fragmented and reassembled.

IPv6 Support

IPv6 is fast becoming the choice of service providers and of the LTE core network.  Hence IPv6 support is expected in IPsec implementations.  eNB and SGW must support both IPv4 and IPv6 tunnels, and they should be able to send both IPv4 and IPv6 traffic on IPv4/IPv6 tunnels.

TFC

The Traffic Flow Confidentiality feature is normally given less importance.  I was told that LTE networks require this feature to be implemented in IPsec tunnels so that anybody who gets hold of the backhaul network traffic cannot guess the type of traffic flowing in the tunnels from traffic characteristics such as frequency of traffic, packet sizes, distribution of packets, etc.


AES-GCM

AES-GCM combines both encryption and integrity in one algorithm; hence it is called a combined-mode algorithm.  It is supposed to be about 2x faster than AES-CBC, and to have half the latency of AES-CBC.  Hence it is good both for performance and for the low latency required by voice traffic, and it is becoming a popular algorithm in eNB and SGW.

Validation engineers and customers, I believe, should look for the above features in eNB and SGW.

Comments?

Saturday, September 25, 2010

Link Aggregation and IPsec

Link Aggregation is also called Ethernet trunking or bonding.  This feature is described in 802.3ad; in 2008, it was rolled into the 802.1AX group.

What is LAG:

LAG combines multiple Ethernet ports and exposes them as one link to the upper layers in the system.
It is a Layer 2 concept.  Only the trunk port is assigned IP addresses; the links in the trunk don't have any Layer 3 information.  Only one MAC address is used for the trunk.  The individual MAC addresses of the links don't appear in any communication other than the control protocol (Marker protocol).

How does it work?

LAG contains two components - distributor and collector.
The distributor distributes outgoing traffic across the links that constitute the trunk.  The collector gathers inbound data coming from the different links and funnels it through the trunk port to the rest of the system.
LAG assumes that all links in the trunk are full duplex and point-to-point.
The simplest distribution is packet by packet across the links, based on weights configured on the links, but that can cause packet mis-ordering issues.


What are some critical items to be taken care by the Distributor:

Packet mis-ordering is one of the issues the distributor faces if it distributes traffic blindly on a per-packet basis.  To avoid mis-ordering, distributors are expected to send all packets of a given flow (conversation) on the same link.  First-generation distributors applied a hash on the source and destination IPs and selected the link based on the hash value.  Though this ensures that traffic belonging to one conversation goes on the same link, the distribution may not be even for some workloads.  Second-generation distributors go one step further and include the TCP/UDP ports in the hash.  This gives better distribution, but it can cause mis-ordering when outbound packets are fragments:  only the first fragment has a transport header, so non-initial fragments may hash onto a different link and arrive out of order.  Since fragments are not very common, some deployments accept some level of mis-ordering to get better utilization of the links.
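
A minimal sketch of such a second-generation distributor, with the fragment fallback that causes the mis-ordering described above; the mixing function is an arbitrary example.

    #include <stdbool.h>
    #include <stdint.h>

    struct pkt_keys {
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;  /* valid only when has_l4 is true */
        bool     has_l4;              /* false for non-initial fragments */
    };

    static uint32_t mix(uint32_t h, uint32_t v)
    {
        h ^= v;
        return h * 2654435761u;       /* example multiplicative mixer */
    }

    /* Pick an egress link; all packets of a conversation hash to one link. */
    static int select_link(const struct pkt_keys *k, int n_links)
    {
        uint32_t h = mix(0x9e3779b9u, k->src_ip);
        h = mix(h, k->dst_ip);
        if (k->has_l4)                /* 2nd generation: fold in the ports */
            h = mix(h, ((uint32_t)k->src_port << 16) | k->dst_port);
        /* Non-initial fragments hash on IPs only and may land on a different
         * link than the first fragment - the mis-ordering case above. */
        return (int)(h % (uint32_t)n_links);
    }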

Collector and Packet mis-ordering:  

To ensure that packets are delivered in order, the collector should send packets up in the order they are received on any given link; there is no order to be maintained between packets arriving on different links.  The collector also should ensure that it does not starve any link while receiving packets.

IPsec and LAG:

In some deployments, traffic is always encrypted via IPsec and sent to the remote office.  If one tunnel is used to send the traffic, all traffic going from the local network to the remote gateway contains the same source and destination IP addresses.  If UDP traversal is applied, it also has the same source and destination ports.  Even if there are multiple links in the LAG, the distributor hash falls onto only one link, and the other links go unused.

The same is true for reverse traffic.  Also note that the links in the aggregation group are with the local ISP; the remote gateway, even though under the same administrative control, will not know about the local link aggregation.  That is, balancing incoming traffic across the links is in the hands of the service provider.  It is reasonable to assume that most 802.3ad distributors are configured to use IP addresses and, in some cases, ports.

Since distributors only know the IP addresses and ports of the packets, the links are utilized well in both directions only if there is a large number of flows with different IP addresses and ports.


Solutions

There are two solutions I can think of.

Solution 1:  Using multiple IP addresses on the trunk link.

Create as many tunnels to the remote gateway as there are IP addresses on the trunk link.  As described in the link here, some software should distribute the flows across these IPsec tunnels.  Since each tunnel now has a different source IP address in the outer IP header, the LAG distribution hash may fall onto different links, thereby utilizing the bandwidth well in the outbound direction.  One should ensure that there are enough local IP addresses so that all links are used, and used evenly.

Reverse traffic is also balanced fine, as the service provider switch also sees different (destination) IP addresses.

Getting or assigning multiple public IP addresses for the trunk may not be possible.  In that case, the second solution can be used, though it adds some packet overhead.

Solution 2:  Use forceful NAT-T

Even when NAT is not detected, there are ways to force UDP traversal; that is, ESP packets are sent within a UDP payload.  Create as many tunnels as necessary for good distribution at the LAG level.  Each tunnel would have a different UDP source port.  Some software in the device is expected to balance the traffic across these tunnels, and the LAG would distribute the tunnels across its multiple links.  Reverse traffic is also balanced across links, due to the different destination port values of the tunnels.  Since the LAG distributor is expected to look at the transport header for distribution, it is necessary that there be no fragments.  So it is mandatory that the tunnels are configured with red-side fragmentation, which ensures that fragmentation is done before IPsec encapsulation.


In both solutions, both the remote and local gateways should have some logic to
  • know that multiple tunnels are created for the same selectors for distributing the flows, and
  • know how to distribute different conversations to different tunnels.
This requires more tunnel capacity in the devices.  That should not be a problem, as modern devices have good horsepower and enough memory to create some more tunnels with a peer gateway.


Comments?

Saturday, September 18, 2010

Web Application Firewalls, IPS & Network Anti-Virus - Fixing the performance issues

Security professionals know that intrusion and malware detection now goes beyond looking at a stream of packets.  Detection requires:
  • SSL decryption - many client-side attacks are increasingly hidden in HTTPS connections.  (Check this out)
  • Extracting data from the packets (example:  HTML, Javascript, and different types of files, to detect attacks embedded in the data).  (See this)
  • Decoding the data (such as UTF-8, UTF-16, decompression, etc.).
  • Emulation of the data if it is a script (such as Javascript), to counter evasion techniques used by attackers.
  • Comparing against known signatures or codelets, or applying some kind of heuristics.
This kind of analysis is not possible with stream-based firewalls, IPS, and AV.  It requires collecting the data, and that requires proxies.  If some IPS/AV vendor says they do detection without reassembling and collecting the data, then as an end user you would not be wrong to say that they either miss a lot of intrusions or give too many false positives.

The computational power needed to do the above is very high.  It is not surprising to see less than 10 Mbps of combined IPS and AV performance in devices that give 1 Gbps of firewall and IPsec throughput.  I hear stories of customer disappointment when they turn on the IPS and/or AV functionality in security devices.

Network security analysts are advising companies to enable the full functionality even for traffic originating from trusted networks.  This should not surprise anybody, as the trusted network boundary is shrinking due to the mobility of machines in the trusted network; that is, machines move from trusted to untrusted networks and vice versa (examples:  laptops, iPads, etc.).  These machines may get infected while on an untrusted network and may infect other machines when they are brought back into the corporate network.  That is the reason full protection is now being enabled on security devices.

HTTP is the single protocol that occupies the majority of network bandwidth in many organizations.  HTTP is also an interactive protocol, so any performance issue also impacts the user experience.  Solving the HTTP performance problem not only improves user experience but also increases the performance of the overall system.

Techniques that can be used to improve the performance of HTTP anti-malware and IPS analysis are given below.  End users might look for the following features.

  • Avoid doing duplicate IPS and anti-malware checks:  It is very common that the same resource is requested by the same or multiple users in the organization via HTTP.  Once the network device has done the AV and IPS check on a resource, it should avoid doing the check again.  This requires caching the AV and IPS analysis result and using it when the same resource is requested at a later time (see the verdict-cache sketch after this list).  Of course, the cache entry should have a lifetime so that the resource is re-checked if its content changes; the lifetime can be equal to the expiry time of the resource, which comes along with the HTTP response headers.  If possible, this system can also cache the response itself, which avoids even going to the origin server, thereby saving WAN bandwidth too.  I believe that AV/IPS devices will have HTTP caching going forward.
  • Auto-blacklisting of URIs:  Malware may be served as dynamic content, in which case the caching mechanism above does not work.  More often, malware is served from the same URI.  If the data downloaded from a URI contains malware multiple times, that URI can be blacklisted.  If a request for the same URI comes at a later time, it can be denied without even sending the request to the origin server.  Always make sure that newer blacklist entries are honored by the device.
  • TCP and SSL offload:  Proxies can benefit greatly if some other entity, such as an intelligent PCIe card, takes care of the TCP/IP stack and SSL offload.
  • Implement proxies as per my earlier post.
  • Usage of multicore processors, distributing the load across multiple cores.  Selection of a multicore processor depends on several factors such as cost, number of cores (performance), acceleration features, etc.  Here I am only covering the features that help this kind of processing:
    • Processing power - the higher the processing power, the better the performance.
    • Cache size matters:  Unlike typical firewall/IPsec processing, the amount of code executed in AV/IPS analysis is a lot higher.  Larger L1 and L2/L3 caches store more instructions and go to DDR less often.  Cache for data is also important.
    • Acceleration hardware -
      • Compression/decompression accelerator:  to take care of decompressing the compressed files coming in HTTP responses.
      • SIMD (Single Instruction Multiple Data) based hardware to accelerate:
        • Memory /String operations - Copy, Set
        • Checksum, CRC operations
        • HTML and URL decoding operations.
        • and many more...
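
A minimal sketch of the verdict cache from the first bullet, keyed by URI with an expiry taken from the HTTP response headers; the fixed-size table and direct-mapped lookup are illustrative only.

    #include <string.h>
    #include <time.h>

    #define CACHE_SLOTS 1024
    #define URI_MAX     256

    enum verdict { V_UNKNOWN, V_CLEAN, V_MALWARE };

    struct cache_entry {
        char         uri[URI_MAX];
        enum verdict verdict;
        time_t       expires;  /* from the resource's Expires/Cache-Control */
    };

    static struct cache_entry cache[CACHE_SLOTS];

    static unsigned slot_for(const char *uri)
    {
        unsigned h = 5381;                 /* djb2 string hash */
        while (*uri)
            h = h * 33 + (unsigned char)*uri++;
        return h % CACHE_SLOTS;
    }

    /* Cached verdict, or V_UNKNOWN if absent or expired (re-scan needed). */
    enum verdict verdict_lookup(const char *uri)
    {
        struct cache_entry *e = &cache[slot_for(uri)];
        if (strcmp(e->uri, uri) == 0 && time(NULL) < e->expires)
            return e->verdict;
        return V_UNKNOWN;
    }

    /* Store the scan result; a V_MALWARE entry doubles as a URI-blacklist
     * entry that lets the proxy deny the request without contacting the
     * origin server. */
    void verdict_store(const char *uri, enum verdict v, time_t expires)
    {
        struct cache_entry *e = &cache[slot_for(uri)];
        strncpy(e->uri, uri, URI_MAX - 1);
        e->uri[URI_MAX - 1] = '\0';
        e->verdict = v;
        e->expires = expires;
    }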
Hope it helps..

Saturday, September 4, 2010

LSN A+P Q&A

In A+P mode, what is the need for the CPE to send NATted packets to the provider box (PRR or LSN)?
This is mainly for security reasons.  You can't assume that all CPEs are good citizens.  Here, one public IP address is assigned to multiple CPE devices, each with a different port range for source-port NAT.  If a rogue or misbehaving CPE uses source ports beyond its assigned range, it can disturb the traffic of some other CPE.  To ensure that this does not happen, all packets are sent to a centralized box in the provider network.  The provider box validates the source ports of the CPE's outgoing connections and transmits them onto the Internet only if the source port is one of the ports assigned to that CPE.  The provider box (PRR) maintains a table keyed by IPv6 address (which the CPE uses as the source IP of the tunnel) containing the allocated ports; the PRR refers to this table to validate the source ports of connections.
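
A minimal sketch of that PRR-side check, assuming the table maps each CPE's IPv6 tunnel source address to one allocated port range; real deployments may allocate several or randomized ranges.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    struct cpe_alloc {
        uint8_t  cpe_v6[16];  /* CPE's IPv6 tunnel source address */
        uint16_t port_lo;     /* first port of the allocated range */
        uint16_t port_hi;     /* last port of the allocated range  */
    };

    /* Drop any packet whose NATted source port is outside the range
     * assigned to the CPE identified by the tunnel source address. */
    static bool prr_validate(const struct cpe_alloc *table, int n,
                             const uint8_t tun_src_v6[16], uint16_t src_port)
    {
        for (int i = 0; i < n; i++) {
            if (memcmp(table[i].cpe_v6, tun_src_v6, 16) == 0)
                return src_port >= table[i].port_lo &&
                       src_port <= table[i].port_hi;
        }
        return false;         /* unknown CPE: drop */
    }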

Can a rogue CPE mount a DoS attack on another CPE if it knows the IPv6 address and port allocations of the victim CPE?

In theory, it is possible.  A rogue CPE may not be able to get hold of the victim CPE's traffic, but it can mount a DoS attack:  it can disable portions of the victim CPE's communication by using up all of its ports.  I believe it is necessary that each CPE authenticate itself to the PRR before sending traffic over the IPv6 tunnel.  It is not there today, but it is natural to expect, in my view.

One proposal I saw can make this DoS attack difficult:  if the PRR assigns random ports to a CPE rather than a fixed range, it becomes difficult for a rogue CPE to determine the exact ports used by other CPEs.

Having said that, a rogue CPE can, if it wants, mount a DoS attack in such a way that it uses up the ports of different CPE devices, if it knows their IPv6 addresses.  So it is good to have authentication support built into creating the IPv6 tunnel.

I personally believe that IPsec IPv6 tunnels with IKEv2 would be the right fit.  This increases the processing requirements, but it is secure.  IPsec allows transport of IPv4 packets over an IPv6 tunnel, in addition to IPv6 packets in an IPv6 tunnel.  It provides not only authentication of the CPE device but also security for the traffic between the CPE device and the PRR.  It can also retain the QoS characteristics of the different packets between CPE and LSN.  If data confidentiality is not required, ESP with authentication only can be used, which is less expensive as far as computational processing on the LSN device is concerned.

In 3GPP, Femto access points (CPE devices) already have an IPsec tunnel with IKEv2 to the SeGW (ePDG).  The same tunnel can be used to transport IPv4 packets to the ePDG, if the ePDG is equipped with the PRR and LSN functionality.

Are there any CPE devices or smartphones supporting LSN & A+P functionality?

I don't have this information.  I saw some internet postings from Android-based phone vendors asking about LSN.  So I believe it is on vendors' radar, but I am not sure whether anybody has solutions in the market.  As I indicated in my last post, Cisco has LSN functionality on the service provider side in its portfolio.  A10 Networks also has some service provider boxes supporting LSN.

Thursday, September 2, 2010

IPv6 - finally? Is LSN becoming the transition technology?

Some articles in the recent past indicate that IPv6 is being adopted by vendors and service providers.

This article here indicates that Cisco has an LSN solution for service providers.  As described here, LSN provides a gradual transition from IPv4 to IPv6.  I like the phrase used by Cisco in the article on IPv6 transition - preserve, prepare and prosper:  preserve the IPv4 infrastructure while preparing for the IPv4/IPv6 transition (LSN), and then move to the prosper phase with complete IPv6.

This article here says that Comcast is going to test LSN in Q3 2010.  This is very good news, showing that IPv6 is being adopted by service providers.

As I indicated in my previous article, LSN by itself is not good enough for some subscribers.  The A+P addition, in my view, is very much required.

I am not seeing much activity in the CPE device market on supporting A+P functionality.  I guess it is natural that it will happen soon.  I also see some internet drafts which add options to DHCPv6 and PPPv6 to assign the PRR address.  I am yet to see any internet drafts on allocating/de-allocating public IPv4+port pairs dynamically.  If anybody has come across any specifications on this, please comment.