Overview:
Stateful inspection firewalls open temporary holes to allow data connections based on information it reads from the control connection. Some protocols such as FTP, SIP, RSTP, H.323, MGCP open a connection and exchange IP address, port information to peer end point for data transfer. Ports that are exchanged in control connections are not well known ports and they are ephemeral. Due to this, administrators can't configure firewall rules to allow these connections without allowing everything. Application Level Gateways (ALGs) are software modules within firewall interpret the protocol packets by extracting the ephemeral port information and open temporary holes to allow data connections to pass through the firewall between protocol end points. Since each protocol is different, multiple ALG modules are required - one for each protocol.
ALGs also do the address and port translation in the protocol data if firewall supports NAT functionality. If the IP address or ports are specified in ASCII form, there is a big possibility where the data length of the packet changes after translation. In case of TCP based ALG, this results into sequence number modifications in TCP header. Firewalls typically take care of maintaining the delta sequence numbers and modify further packets with this delta in both "Sequence number" and "Ack number" fields to ensure the integrity is maintained with client and sever end points of the connection. It is also important to note that ALGs modify different packets during the life of session and firewall software is expected to keep updating the delta sequence numbers appropriately. It is also to be noted that firewalls need to keep the history of delta numbers with respect to original sequence numbers to apply appropriate delta in case of retransmitted packets which are older.
To apply translation on the data, it is required that the ALG has complete PDU. In some protocols such as H.323 and SIP, this can be large. If there is congestion in the network, the end point does not send the PDU in one TCP packet and requires acknowledgment to send rest of PDU. Due to this, newer generation of firewalls send the acknowledgment to make the end point send rest of protocol data.
Many routers change the TCP MSS value of SYN and SYN+ACK packets of transit traffic to lower value whenever there is multimedia traffic to ensure that VOIP packets do not get stuck. As we all know, Voice traffic is delay sensitive and it should be transmitted as soon as possible. If routers has slow link then it takes significant time to transmit 1500 byte packets. If link bandwidth is 256kbps, it takes around 45msec to transmit 1500 byte packet. If WAN controller of the router chooses 1500 byte packet and if VOIP packet comes right after that, then VOIP packet may need to wait upto 45msec there by increasing the delay of real time traffic. By lowering the TCP MSS value, the size of TCP packets generated by end points can be controlled. Broadband routers setting the value of MSS value of transit TCP packets to 256 bytes are quite common. In these cases, the protocol data of complex protocols requiring ALG comes in many packets. Firewall and ALGs ensure to extract the relevant data for opening holes and translation even protocol data is coming in multiple TCP packets.
As discussed before, ALGs open temporary holes - pin holes. If ALGs are not implemented well, attackers can make control connection and send PDUs with data such a way that pin holes are created to access internal critical services. Also attacker can DoS the firewall by sending large number of PDUs which creates large number of pin holes there by causing service disruption to genuine users/connections.
ALG implementation can become very complex. Vulnerabilities increase with complexity. Buffer overflows, boundary conditions are typical problems associated with complex protocol implementation. That is one place validation should concentrate on.
Many protocols specifications (standards) don't specify maximum length of protocol messages - especially text based protocols such as SIP, HTTP etc.. Protocol implementations (end points and ALGs) typically assume the typical sizes while allocating buffers to buffer the data and don't allow the traffic if it exceeds this limit. Since these sizes are not universally adopted by different implementations, this could pose interoperability problems if ALG implementation assumption of size is different from end point implementations. This is one area validation should concentrate on. In my view ALG implementations should not assume any size restrictions for the PDUs which are not interpreted for its operation. For PDUs that are needed to be buffered, this size restriction should be as maximum as it can be. Some times the protocol messages is prepended with size information. In those cases, ALG implementations must allocate the buffer based on this size information. Validation should concentrate to ensure that ALG does not impose any problems in functionality.
As said before, ALGs also do the translations in the protocol data. ALG implementation should ensure that the translations happen for all three cases - Source NAT, Destination NAT and Source & Destination NAT. Many times validation Engineers concentrate on testing using one session. Many times problems related to NAT can't be found if only one session is used. Validation Engineers should test the ALG based firewall implementations with multiple sessions.
Recommendations:
As you can see, validation testing of ALGs is not as simple as running standard applications on both ends of firewall device. For example, running standrad FTP client and Server applications on two sides of firewall device and ensuring the file transfer succeeds is necessary, but not enough validation of FTP ALG. I recommend validation engineers to consider following for each ALG before certifying.
Functional testing:
- Test with standard applications: Make a list of popular applications. Ensure that different combination of applications as client and server succeed in following cases. Configure firewall to allow control connections (initial connections) only.
- Without NAT
- Source NAT : With clients behind internal network and servers in external network.
- with NAT IP address whose length in dotted decimal form is more than that of source IP address.
- with NAT IP address whose length in dotted decimal form is equal to length of source IP address.
- With NAT IP address whose length in dotted decimal form is less than that of source IP address.
- Destination NAT: With servers in internal network and clients in external network.
- DNAT IP address in dotted decimal is equal to length of Destination IP address in dotted decimal form.
- DNAT IP address is more in length than that of destination IP address in dotted decimal form.
- DNAT IP address is less in length than that of destination IP adderss in dotted decimal form.
- Source NAT and destination NAT together: With servers in internal network and clients in external network.
- Explore all options of applications and ensure that all options work with above NAT combination.
- Ensure that Private IP address (client IP address in case of SNAT, Destination IP address in case of DNAT) does not appear on the packets after translation. This requires capturing the packets and searching for IP address in both binary and dotted decimal form.
- Understand the protocol and get familiar with messages and fields and their lengths. If size is not mentioned in the protocol specification, get familiar with realistic maximum length of messages and fields. Test to ensure that ALG does not drop messages when messages with maximum sized fields are sent.
- Ensure that ALGs perform well even when there is temporary packet loss. FragRoute tool can be used to drop the packets. Ensure to test with all NAT combinations.
- Ensure that ALGs perform well when the TCP packet sizes are smaller than protocol messages. FragRoute tool can be used to change the TCP packet sizes. Ensure to test with all NAT combinations.
- Ensure that ALGs perform well when TCP packets with smaller size and reordered. FragRoute tool can be used to reorder the TCP segments. Ensure to test with all NAT combinations.
Negative Testing: This testing is required to ensure that systems don't crash when invalid packets are sent.
- Ensure that ALGs don't misbehave (crash or lockup) when protocol messages and fields of different lengths and values are sent. Make a note of all messages in the protocol specifications. Ensure to send messages of different length by writing your own client and server protocol or instrument the existing client and server implementation. I prefer later to reduce the effort required to do this kind of testing. It is relatively simple to generate messages and fields with different lengths and values for the first message of protocol. But for messages that are deep down the protocol require successful initial protocol phase. Hence I prefer going with instrumenting the existing open source client/server protocol implementations.
- Go through some of the common vulnerabilities found in client and server implementations by searching through the CERT repository. Since ALG is also interpreting the protocol messages, ensure that these kinds of vulnerabilities are not present in the ALG.
Stress testing: This is one important step to ensure that system under test can cope up with capacity specified. Also, it is important to ensure that system performs as per performance requirements.
- Use IXIA/SmartBits kind of tools to simulate large number of client and servers to ensure that system works as specified with respect to capacity.
- Use IXIA/SmartBits to test the connection rate and ensure it satisfies the specifications of the box.
- Use IXIA/SmartBits to test the throughput requirements.
- Use IXIA/SmartBits to test throughput and connection rate combination requirements.
- Repeat above test cases for 12 hours to ensure that the system is stable.