Saturday, July 17, 2010

Large Scale NAT with DS-Lite & A+P

Dual-Stack Lite (DS-Lite) and Address plus Port (A+P) assignment are the two most important mechanisms that ISPs are adopting to keep smart phones, residential CPE devices and femto CPEs connected to the IPv4 Internet.

Both mechanisms are meant to keep the IPv6 transition smooth as the Internet becomes IPv6-addressable over time.

ISPs are facing a shortage of public IPv4 addresses. Demand has increased with the popularity of the Internet in general and the explosive growth of smart phones in particular. Many ISPs are no longer in a position to give a dynamic public IP address to every CPE device and smart phone. Note that CPE devices and smart phones are now always on, so a dynamic IP address is effectively becoming static.

Until the world moves to IPv6, the only option is to share IPv4 addresses across multiple subscribers.

Many mobile service providers already give only private IP addresses to smart phones. This trend is now extending to CPE devices as well, given their explosive growth. ISPs maintain mega NAT boxes that translate the traffic from CPE devices and smart phones using a limited number of public IP addresses. CPE devices already do their own NAT between the IP addresses they assign to local machines in the LAN and the IP address provided by the ISP. The result is double NAT. Though it works in the majority of cases, it has some limitations that can be problematic for end customers, and hence for the ISP's business.
  • Connectivity can be lost if the dynamic private IP address assigned by the ISP falls inside the private subnet that the CPE is configured to assign to local machines.
  • Though not an immediate concern, bigger ISPs might have more customers than the private IP address space can cover. With the 10.x.x.x network, an ISP can address at most 2^24 (about 16.7 million) subscribers. With smart phones expected to number around 120M in 2012, it is possible that an ISP will not have enough unique private IP addresses to assign.
  • Applications that require a special ALG will not work unless both NAT devices (the CPE and the Carrier NAT) support that ALG. The ISP's Carrier NAT box may not entertain proprietary ALGs or may support only a few ALGs.
  • Peer-to-peer applications in which two internal machines need to communicate with each other may not work in double NAT scenarios (hairpin scenarios).
  • Many peer-to-peer applications expect the same IP address and port to be used for SNAT even when the destination machines differ. If the Carrier NAT does not support this, many peer-to-peer applications may not work.
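
The first limitation above, an ISP-assigned private address colliding with the subnet the CPE hands out on the LAN, can be checked mechanically. Here is a minimal sketch using Python's standard ipaddress module. The function name and the sample addresses are made up for illustration and do not come from any specification.

```python
import ipaddress

def wan_collides_with_lan(wan_address: str, lan_subnet: str) -> bool:
    """Return True if the ISP-assigned WAN address falls inside the LAN subnet."""
    wan = ipaddress.ip_address(wan_address)
    lan = ipaddress.ip_network(lan_subnet, strict=False)
    return wan in lan

# Size of the 10.0.0.0/8 private block mentioned in the second bullet.
TEN_SLASH_EIGHT_SIZE = ipaddress.ip_network("10.0.0.0/8").num_addresses

if __name__ == "__main__":
    # Hypothetical case: the ISP hands out 192.168.1.37 and the CPE serves 192.168.1.0/24.
    print(wan_collides_with_lan("192.168.1.37", "192.168.1.0/24"))
    print(TEN_SLASH_EIGHT_SIZE)
```

A CPE could run a check like this whenever it gets a new WAN lease and renumber its LAN if the two ranges overlap.
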
Large Scale NAT (LSN) solves some of the above limitations by doing NAT in only one place. In this model, the CPE is given an IPv6 address on its WAN interface. When IPv6 machines communicate with IPv6 destinations, no IPv4 is involved and everything works as usual. When an IPv4 machine in the private network communicates with the public IPv4 Internet, its packets are tunneled to the LSN box sitting in the provider network, with IPv6 used to carry the IPv4 packets between the CPE and the LSN. The LSN box in the provider network does the NAPT, which eliminates double NAT. The LSN also copes with overlapping private IP addresses among subscribers by using the IPv6 tunnel endpoint address as one of the identification parameters of each NAT entry. It still has the ALG problem: if the LSN does not support the ALG for some application, that application never works. The LSN also needs to be a high-performance box with respect to throughput, latency and jitter. Performance can be addressed with multiple LSN boxes, but the ALG problem is a serious obstacle to adoption of this technology. Hosting an application also becomes tough, because port forwarding is controlled at the Carrier NAT rather than at the CPE gateway as we are all accustomed to.
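
To make the overlapping-address point concrete, here is a rough sketch of a NAT table keyed on the CPE's IPv6 tunnel endpoint in addition to the usual private address and port. The class name, method names and the simple sequential port allocator are my own assumptions for illustration. A real LSN would also need timeouts, per-protocol port pools and several public addresses.

```python
import ipaddress
from typing import Dict, Optional, Tuple

# Inside key: (CPE tunnel endpoint, protocol, private IPv4 address, private port)
InsideKey = Tuple[ipaddress.IPv6Address, str, ipaddress.IPv4Address, int]

class LsnTable:
    """Toy NAPT table for a single public IPv4 address (hypothetical design)."""

    def __init__(self, public_ip: str, first_port: int = 1024, last_port: int = 65535):
        self.public_ip = ipaddress.IPv4Address(public_ip)
        self._next_port = first_port
        self._last_port = last_port
        self._outbound: Dict[InsideKey, int] = {}
        self._inbound: Dict[Tuple[str, int], InsideKey] = {}

    def translate_outbound(self, tunnel_src: str, proto: str,
                           src_ip: str, src_port: int) -> Tuple[ipaddress.IPv4Address, int]:
        """Map a tunneled private flow to (public address, public port)."""
        key = (ipaddress.IPv6Address(tunnel_src), proto,
               ipaddress.IPv4Address(src_ip), src_port)
        port = self._outbound.get(key)
        if port is None:
            if self._next_port > self._last_port:
                raise RuntimeError("public port pool exhausted")
            port = self._next_port
            self._next_port += 1
            self._outbound[key] = port
            self._inbound[(proto, port)] = key
        return self.public_ip, port

    def translate_inbound(self, proto: str, dst_port: int) -> Optional[InsideKey]:
        """Find the tunnel and private endpoint for a packet from the Internet."""
        return self._inbound.get((proto, dst_port))
```

Because the tunnel endpoint is part of the key, two subscribers who both use 192.168.1.10:5000 end up with separate entries and separate public ports.
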

The A+P (Address plus Port) specifications provide the flexibility of doing NAT at the CPE. Here, multiple CPEs are given the same public IP address but different port ranges. The CPE NAT is expected to use only its assigned ports for source port NAT. The CPE can decide not to NAT some connections, in which case the LSN in the provider network does the NAT. Based on what different people report, I believe a CPE needs only a few ports thanks to a feature called 'Dense NAT': the same source port can be used for multiple connections as long as the 5-tuple on the external realm differs between them. Some web sites using AJAX make many connections at the same time; I believe there are sites that make 60+ connections at once. A range of 128 ports is good enough for many cases. This means that, even without LSN, one IP address can be shared by hundreds of subscribers: the 65,536 ports of an address divide into 512 blocks of 128 ports, somewhat fewer if the well-known ports are held back. With A+P alone, an ISP can therefore multiply the customer base served by its public IPv4 addresses several hundred times. Since NAT is done at the CPE, all the facilities of current CPE boxes remain possible: port forwarding, per-CPE ALGs and port triggering. Having said that, it has its own limitations - IPsec without UDP NAT traversal does not work, and ICMP Echo Request/Reply needs a little more care because it has no port concept.
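
The port arithmetic and the 'Dense NAT' idea can be sketched in a few lines. This is only an illustration under my own assumptions: the class and function names are invented, the port range 4096-4223 is an arbitrary 128-port block, and the allocator simply picks the lowest port that is still free towards a given destination. It does not try to keep the same port for one internal socket across destinations, which some peer-to-peer applications want.

```python
from typing import Dict, Set, Tuple

PORTS_PER_ADDRESS = 65536  # size of the 16-bit TCP/UDP port space

def blocks_per_address(block_size: int) -> int:
    """How many port blocks of the given size fit in one public address."""
    return PORTS_PER_ADDRESS // block_size

# Flow: (protocol, private ip, private port, remote ip, remote port)
Flow = Tuple[str, str, int, str, int]

class DenseNat:
    """Toy CPE NAT confined to an assigned port range (hypothetical design)."""

    def __init__(self, public_ip: str, first_port: int, last_port: int):
        self.public_ip = public_ip
        self.ports = range(first_port, last_port + 1)
        self._flows: Dict[Flow, int] = {}
        # External tuples in use: (protocol, public port, remote ip, remote port).
        # The public address is constant, so it is left out of the tuple.
        self._external: Set[Tuple[str, int, str, int]] = set()

    def map_outbound(self, proto: str, src_ip: str, src_port: int,
                     dst_ip: str, dst_port: int) -> int:
        """Return the public source port for a flow, reusing ports when safe."""
        flow = (proto, src_ip, src_port, dst_ip, dst_port)
        if flow in self._flows:
            return self._flows[flow]
        for port in self.ports:
            external = (proto, port, dst_ip, dst_port)
            if external not in self._external:
                self._external.add(external)
                self._flows[flow] = port
                return port
        raise RuntimeError("no free port in the assigned range for this destination")
```

With this scheme the range only runs out when more than 128 simultaneous connections go to the same remote address and port, which is why a small block goes a long way.
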

I believe Comcast is in the advanced stages of deploying LSN. Many ISPs will need to install some kind of solution very soon. In my view, the solution will be a combination of LSN and A+P. ISPs will differentiate subscribers using the following subscriptions.
  • Subscribers requiring static public IP address.
    • Subscribers hosting any servers on standard ports which are expected to be reached via their own Domain name.
  • Subscribers requiring dynamic public IP address.
    • Subscribers hosting servers on standard ports with DDNS.
  • Subscribers with shared public IP address and dedicated ports for SNAT.
    • Subscribers hosting games or hosting servers on non-standard ports.
  • Subscribers with shared public IP address and shared ports (LSN).
    • Subscribers just requiring outbound access.
CPE devices and smart phones need some support if they are to take advantage of LSN and A+P. While going through the specifications, I wondered why packets already NATted by the CPE need to go through the IPv6 tunnel to the PRR (Port Range Router). I think the reason is to let the PRR verify that the CPE really did NAT with the IP address and ports allocated to it. This check ensures that CPE devices behave well and do not damage the connectivity of other CPE devices.

Let us see the kind of changes required in CPE devices.

Features expected in CPE:

  • CPE must support IPv6 addressing on its WAN Interfaces. 
  • CPE must learn the LSN IPv6 address through DHCP extensions or PPP extensions, or must allow it to be configured statically.
  • If the CPE supports A+P, it should also learn the public IPv4 address and port range(s) from DHCP, PPP or local configuration. In some cases, it can request more port ranges dynamically when its ports are getting exhausted. If it no longer needs a port range, it should be able to release it back to the PRR.
  • CPE must be able to provide IPv6 addresses to the IPv6 capable hosts in its internal network.
  • CPE must be able to provide IPv4 private addressing to the local hosts in its internal network.
  • CPE must be able to tunnel packets from private IP hosts to the LSN in provider network.
  • In case of A+P, it should have the intelligence to figure out which connections are to be NATted at the CPE and which ones can be left to the LSN.
  • CPE can optionally provide A+P to local hosts that are A+P aware, in which case the CPE also acts as a local PRR.
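
The tunneling requirement in the list above amounts to wrapping each IPv4 packet in an IPv6 header whose next-header field is 4 (IPv4 encapsulated in IP). Here is a minimal sketch of that wrapping with Python's struct module. The function names and the hop limit of 64 are assumptions for illustration; a real CPE does this in its forwarding path and also has to deal with MTU and fragmentation.

```python
import ipaddress
import struct
from typing import Tuple

NEXT_HEADER_IPV4 = 4  # IPv6 next-header value meaning "payload is an IPv4 packet"
IPV6_HEADER_LEN = 40

def encapsulate(ipv4_packet: bytes, cpe_v6: str, lsn_v6: str, hop_limit: int = 64) -> bytes:
    """Prepend a fixed IPv6 header so the IPv4 packet can travel CPE -> LSN."""
    src = ipaddress.IPv6Address(cpe_v6).packed
    dst = ipaddress.IPv6Address(lsn_v6).packed
    first_word = 6 << 28  # version 6, traffic class 0, flow label 0
    header = struct.pack("!IHBB", first_word, len(ipv4_packet),
                         NEXT_HEADER_IPV4, hop_limit)
    return header + src + dst + ipv4_packet

def decapsulate(ipv6_packet: bytes) -> Tuple[ipaddress.IPv6Address, bytes]:
    """Return (tunnel source address, inner IPv4 packet) at the far end."""
    first_word, length, next_header, _hop = struct.unpack("!IHBB", ipv6_packet[:8])
    if first_word >> 28 != 6 or next_header != NEXT_HEADER_IPV4:
        raise ValueError("not an IPv4-in-IPv6 packet")
    source = ipaddress.IPv6Address(ipv6_packet[8:24])
    inner = ipv6_packet[IPV6_HEADER_LEN:IPV6_HEADER_LEN + length]
    return source, inner
```

The tunnel source address returned by decapsulate is exactly what the far end can use to tell subscribers apart.
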
Features expected in PRR:
  • It should be able to do Address+Port Management via signaling protocols (DHCP, PPP or web based management).
  • It should figure out which packets need to go through the LSN and send those packets to the LSN.
  • It should ensure that the packets are NATted by CPE with its delegated addresses and ports. If not, it should discard the packets.
  • It should provide facilities such as the WCCP protocol for security checks (AV, AS, IPS etc.).
  • It should terminate IPv6 tunnels and should be prepared to set up IPv6 tunnels to the LSN.
  • It should be scalable, with good algorithms to:
    • Search tunnel for incoming packets from the CPE.
    • Search Address+Port based entries for packets coming from Internet to identify the CPE device and hence the tunnel.
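
The two lookups and the source check in the list above can be sketched as follows. I am assuming fixed-size, aligned port blocks so that a port maps to its block by integer division and both lookups become a single dictionary access. The class, its method names and the 128-port block size are illustrative assumptions, not something taken from the A+P drafts.

```python
from typing import Dict, NamedTuple, Optional, Tuple

class Delegation(NamedTuple):
    tunnel: str       # IPv6 tunnel endpoint of the CPE
    public_ip: str
    first_port: int
    last_port: int

class PortRangeRouter:
    """Toy PRR bookkeeping with fixed-size port blocks (hypothetical design)."""

    def __init__(self, block_size: int = 128):
        self.block_size = block_size
        # (public ip, block index) -> delegation
        self._by_block: Dict[Tuple[str, int], Delegation] = {}

    def delegate(self, tunnel: str, public_ip: str, block_index: int) -> Delegation:
        """Hand one port block of a shared public address to a CPE."""
        key = (public_ip, block_index)
        if key in self._by_block:
            raise ValueError("block already delegated")
        first = block_index * self.block_size
        delegation = Delegation(tunnel, public_ip, first, first + self.block_size - 1)
        self._by_block[key] = delegation
        return delegation

    def tunnel_for_inbound(self, dst_ip: str, dst_port: int) -> Optional[str]:
        """Packets from the Internet: find the CPE tunnel from address + port."""
        d = self._by_block.get((dst_ip, dst_port // self.block_size))
        return d.tunnel if d else None

    def outbound_allowed(self, tunnel: str, src_ip: str, src_port: int) -> bool:
        """Packets from a CPE: accept only its own delegated address and ports."""
        d = self._by_block.get((src_ip, src_port // self.block_size))
        return d is not None and d.tunnel == tunnel
```

A packet for which outbound_allowed returns False would be discarded, which is the misbehaving-CPE check described earlier.
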
Features expected in LSN:
  • It should be able to terminate large number of tunnels.
  • It should be able to maintain large number of NAT entries.
  • It should be able to work with CPE devices having same private IP address space.
  • It should be able to do stateful failover if a device fails.
  • It should support popular application ALGs such as:
    • FTP, RTSP, SIP, H.323, MGCP, PPTP, L2TP and more.
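
To show why ALGs matter so much, here is a rough sketch of what an FTP ALG does with an active-mode PORT command: the private address and port carried inside the payload have to be replaced with the public ones picked by the NAT. The function name and sample addresses are made up for illustration. A real ALG also has to adjust TCP sequence numbers when the line length changes and to open the matching NAT entry for the data connection.

```python
import re

# "PORT h1,h2,h3,h4,p1,p2" where the port is p1 * 256 + p2.
_PORT_RE = re.compile(r"^PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\r\n$")

def rewrite_ftp_port(line: str, public_ip: str, public_port: int) -> str:
    """Replace the private endpoint in an FTP PORT command with the public one."""
    if not _PORT_RE.match(line):
        return line  # not a PORT command, pass it through untouched
    octets = public_ip.split(".")
    high, low = divmod(public_port, 256)
    return "PORT {},{},{},{},{},{}\r\n".format(*octets, high, low)
```

Without this rewrite the FTP server would try to connect back to 192.168.x.x, which is the kind of failure meant above when an ALG is missing.
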
I hope it helps.