Thursday, December 30, 2010

What are Traffic Monitoring Enabler Switches?

Traffic Monitoring Enabler Switches (TMES) are increasingly being deployed in Enterprise, Data Center and Service provider environments.

Need for TMES:

Traffic monitoring devices are increasingly becoming a requirement for networks in Enterprise, Data Center and Service provider environments.  Multiple types of monitoring devices are being deployed in networks.
  • Traffic Monitoring for intrusion detection:  Security is a very important aspect of Enterprise networks.  Intrusion detection is one of the components of comprehensive network security.  IDS devices listen to the traffic passively and perform intrusion analysis on it.  Intrusion attempts and intrusion events are reported to the administrators for out-of-band action.  IDS devices can also be configured to send TCP resets when an intrusion is detected, to stop any more traffic on that TCP connection.  They can also be configured to block certain traffic for a certain amount of time by informing local firewall devices.
  • Surveillance:  Due to government regulations, all important data needs to be recorded.  Surveillance monitoring devices also listen to the traffic passively and record it in persistent storage for later interpretation.  They also provide the capability to recreate sessions, such as Email conversations, file transfer conversations, and Voice and video conversations, from the recorded traffic.  Some surveillance devices can also recreate conversations at run time.
  • Network Visibility:  These monitoring devices capture the traffic passively and provide complete visibility into it.  They provide capabilities such as 'Identification of applications such as P2P, IM, Social networking and many more' and 'Bandwidth usage of different applications and networks'.  This analysis gives network administrators the information they need to maintain networks and bandwidth so that Enterprise critical applications always work.
  • Traffic Trace:  Traffic trace devices help network administrators find bottlenecks in different network segments.  These devices tap the traffic at multiple locations in the network and provide a trace capability for finding issues such as misconfiguration of network devices, choke points etc.
Network administrators face the following challenges when deploying multiple monitoring devices.
  • Few SPAN ports in existing L2 switch infrastructure:  Many L2 switch vendors provide one or at the most two SPAN ports.  L2 switches replicate the packets to SPAN ports.  Since there are only two SPAN ports at the most, only two types of monitoring devices can be connected.  This is one big limitation network administrators face.
  • Multiple network tap points:  In a complex network infrastructure, there are multiple points where monitoring is required.  Placing multiple monitoring devices at each point is too expensive.  Network administrators would like to use the same monitoring devices to capture traffic at multiple locations.
  • Capacity limitations of monitoring devices:  With increasing bandwidth in the networks, it is possible that one monitoring device may not be able to cope with the traffic.  Administrators would like to use multiple monitoring devices of the same type to capture the traffic, with some external component load balancing the sessions across them.
  • High Capacity Monitoring devices:  There could be instances where a monitoring device can take more load than one tap point produces.  In these cases, one monitoring device can take load from several tap points.  Administrators look for a facility to aggregate the traffic from multiple points to one or a few monitoring devices of the same type.
  • Non-Switch capture points:  A network administrator may want to monitor traffic at a point where there are no switches - Router to Server, Wireless LAN Access Point to Access Concentrator etc.  Since there is no switch, there are no SPAN ports.  Network administrators look for some mechanism such as Inline TAP functionality to capture the traffic for monitoring.
What is TMES?

A TMES is a switch with monitoring-enabling intelligence that allows multiple monitoring devices of different types to be connected without any major changes to the existing network infrastructure.

This device taps the traffic from SPAN ports of existing switches in the network and directs the traffic to the attached monitoring devices.

TMES allows:
  • Centralization of monitoring devices.
  • Filtering of the traffic.
  • Balancing of traffic to multiple monitoring devices of a given type.
  • Replication of traffic to different types of monitoring devices.
  • Aggregation of traffic from multiple points to same set of monitoring devices.
  • Truncation of data.
  • Data manipulation and masking of the sensitive content of the traffic being sent to monitoring devices.
  • Inline TAP functionality to allow capture points where there are no SPAN ports.
  • Time stamp functionality.
  • Stripping off Layer 2 and Tunnel headers that are unrecognized by monitoring devices.
  • Conditioning of the burst traffic going to the monitoring devices.
Centralization of Monitoring Devices: 

Without TMES, monitoring devices need to be placed at different locations in the network.  With TMES, TAP points are connected to the TMES and the monitoring devices are connected only to TMES ports.

Filtering of Traffic:

This feature of TMES filters out traffic that is unwanted by a given monitoring device.  Monitoring devices normally listen for traffic in promiscuous mode.  That is, the monitoring device gets all the traffic that is on the wire.  But not all of that traffic is interesting to the monitoring device.  Typically the monitoring device itself does the filtering, which costs it valuable cycles in receiving the traffic (interrupts) and then filtering it.  TMES takes this load off the monitoring device and thereby increases its effective capacity.

Filtering should not be restricted to unicast traffic.  It should also be available for multicast and broadcast packets.
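As an illustration only, a filter of this kind can be sketched in Python as a list of rules that are checked before a packet is forwarded to a monitoring port.  The rule fields and function names below are hypothetical and not taken from any TMES product; a real device would do this matching in hardware.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class FilterRule:
    """Hypothetical rule: forward packets destined to dst_net, optionally
    narrowed by IP protocol number and destination port."""
    dst_net: str
    protocol: Optional[int] = None
    dst_port: Optional[int] = None

    def matches(self, dst_ip: str, protocol: int, dst_port: int) -> bool:
        if ip_address(dst_ip) not in ip_network(self.dst_net):
            return False
        if self.protocol is not None and protocol != self.protocol:
            return False
        if self.dst_port is not None and dst_port != self.dst_port:
            return False
        return True


def should_forward(rules, dst_ip, protocol, dst_port):
    """Forward to the monitoring device only if at least one rule matches."""
    return any(r.matches(dst_ip, protocol, dst_port) for r in rules)


# Example: HTTP to 10.0.0.0/8 plus all IPv4 multicast (224.0.0.0/4).
rules = [FilterRule("10.0.0.0/8", protocol=6, dst_port=80),
         FilterRule("224.0.0.0/4")]
```

Because multicast and broadcast destinations are just addresses, the same rule type covers them without a special case.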

Balancing of Traffic to Multiple Monitoring Devices of a Given Type:

If the amount of traffic that needs to be recorded is very high, then multiple monitoring devices will be deployed.  TMES allows multiple monitoring devices to share the load.  TMES load balances sessions (not packets) across the monitoring devices based on the performance of each device.  By balancing on sessions, TMES ensures that all the traffic of a given connection goes to one monitoring device.
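One common way to keep a whole session on one device is to hash a direction-independent session key.  The Python sketch below assumes that approach; the function names are mine, and it ignores the performance-based weighting mentioned above and simply spreads sessions evenly.

```python
import zlib


def session_key(src_ip, src_port, dst_ip, dst_port, protocol):
    """Build a key that is identical for both directions of a connection
    by sorting the two endpoints before joining them."""
    lo, hi = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{protocol}".encode()


def pick_monitor(monitors, src_ip, src_port, dst_ip, dst_port, protocol):
    """Map a session to one monitoring device (unweighted, for illustration)."""
    key = session_key(src_ip, src_port, dst_ip, dst_port, protocol)
    return monitors[zlib.crc32(key) % len(monitors)]
```

With this key, a client-to-server packet and the matching server-to-client reply land on the same monitoring device, which is what session reassembly on that device needs.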

Replication of Traffic

When there are different types of monitoring devices, each type is expected to get the traffic.  As discussed above, traditional L2 switches have at the most two SPAN ports.  TMES is expected to replicate the traffic once for each monitoring device type and send each copy to the corresponding monitoring devices.


Combining the replication feature with load balancing:  Assume that a deployment requires the traffic to be sent to two types of monitoring devices - IDS and Surveillance.  This deployment requires 6Gbps of traffic to be analyzed and recorded.  If each IDS and Surveillance device can handle only 2Gbps, then the deployment requires 3 IDS devices and 3 Surveillance devices.  In this case, TMES is expected to produce two copies of the original traffic - one for the IDS devices and another for the Surveillance devices.  TMES is then expected to balance the first copy across the 3 IDS devices and the second copy across the 3 Surveillance devices.
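The arithmetic and the fan-out in this example can be sketched as follows.  This is only an illustration: the device names and the CRC-based selection are assumptions, not how any particular TMES does it.

```python
import math
import zlib


def devices_needed(total_gbps, per_device_gbps):
    """Number of devices of one type needed to absorb the total load."""
    return math.ceil(total_gbps / per_device_gbps)


def replicate_and_balance(session_key: bytes, groups):
    """Return one target device per monitoring type for this session:
    one copy per type, each copy balanced within its own group."""
    h = zlib.crc32(session_key)
    return {mtype: devices[h % len(devices)]
            for mtype, devices in groups.items()}


# 6 Gbps of traffic, 2 Gbps per device -> 3 devices of each type.
n = devices_needed(6, 2)
groups = {
    "ids": [f"ids-{i}" for i in range(n)],
    "surveillance": [f"surv-{i}" for i in range(n)],
}
```

Each packet of a session is therefore sent twice, once to the chosen IDS device and once to the chosen Surveillance device.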

Aggregation of traffic from multiple points to same set of monitoring devices

As discussed in 'Centralization of Monitoring devices', traffic from different locations of the network can be tapped.  TMES is expected to provide multiple ports to receive the traffic from multiple locations in the network, filter the traffic, and replicate and balance it across monitoring devices.  It is possible that the traffic of a given connection passes through multiple capture points, so duplicate traffic may arrive at the TMES.  It is also possible that the duplicated traffic goes to the same monitoring device, which might get confused by it.  To avoid this scenario, TMES is expected to mark the packets based on the incoming port on the TMES (that is, the capture point), for example by adding a different VLAN ID per capture point or by adding an IP option.  This allows the monitoring device to distinguish traffic of the same connection seen at different capture points.
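As a rough sketch of the VLAN marking idea, the Python below inserts an 802.1Q tag after the two MAC addresses of an Ethernet frame.  The port-to-VLAN mapping is hypothetical, the sketch assumes the incoming frame is untagged, and it does not deal with the frame check sequence.

```python
import struct

# Hypothetical mapping from TMES ingress port to a marker VLAN ID.
PORT_TO_VLAN = {1: 101, 2: 102, 3: 103}


def tag_with_capture_point(frame: bytes, vlan_id: int) -> bytes:
    """Insert an 802.1Q tag (TPID 0x8100, priority 0) carrying vlan_id
    between the source MAC and the original EtherType."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    tag = struct.pack("!HH", 0x8100, vlan_id)
    return frame[:12] + tag + frame[12:]


def mark(frame: bytes, ingress_port: int) -> bytes:
    """Mark a frame with the VLAN ID assigned to its capture point."""
    return tag_with_capture_point(frame, PORT_TO_VLAN[ingress_port])
```

The monitoring device can then treat (VLAN ID, connection) as the key, so the same connection seen at two capture points shows up as two distinct streams.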




Truncation or Slicing of packets


Some monitoring device types, such as traffic measuring devices, don't require the complete packet data.  Slicing each packet down to a smaller size increases the performance of those monitoring devices.  TMES devices are expected to provide this functionality before sending the packets to the monitoring devices.  The truncate value is relative to the payload of TCP, UDP etc.  Some monitoring devices are only interested in headers up to layer 4; in this case, the truncate value can be 0.  Some monitoring devices may expect to see a few bytes of payload.  TMES devices are expected to make the truncate value configurable.

Truncation of packet content should not be reflected in the length field of the IP header.  The length should be kept intact so that monitoring devices can figure out the original data length even though they receive truncated packets.
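Here is a minimal Python sketch of slicing, assuming plain IPv4 over untagged Ethernet carrying TCP or UDP.  Anything else is passed through unchanged.  The function name and the pass-through policy are my assumptions.

```python
ETH_LEN = 14  # untagged Ethernet header


def slice_frame(frame: bytes, keep_payload: int = 0) -> bytes:
    """Keep Ethernet + IPv4 + TCP/UDP headers plus keep_payload bytes of
    L4 payload. Header fields, including the IP total length, are left
    untouched so the original size is still visible to the monitor."""
    if len(frame) < ETH_LEN + 20 or frame[12:14] != b"\x08\x00":
        return frame                       # not IPv4 over plain Ethernet
    ihl = (frame[ETH_LEN] & 0x0F) * 4      # IPv4 header length in bytes
    proto = frame[ETH_LEN + 9]
    l4 = ETH_LEN + ihl                     # offset of the L4 header
    if proto == 6:                         # TCP: header length from data offset
        if len(frame) < l4 + 13:
            return frame
        l4_hdr = (frame[l4 + 12] >> 4) * 4
    elif proto == 17:                      # UDP: fixed 8-byte header
        l4_hdr = 8
    else:
        return frame
    return frame[: l4 + l4_hdr + keep_payload]
```

A truncate value of 0 leaves only the headers up to layer 4, and a small positive value keeps the first few payload bytes.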



Data  masking 

Depending on the type of monitoring device, an administrator may want to mask sensitive information such as credit card numbers, user names and passwords so that it is not recorded.  TMES is expected to provide this functionality by pattern matching and masking the content, thereby removing privacy concerns.
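A very simplified Python sketch of pattern-based masking follows.  The 16-digit pattern is only an illustrative stand-in for a card number, and the sketch works on a single payload, so it would miss a pattern that is split across two packets.

```python
import re

# Illustrative pattern: exactly 16 consecutive ASCII digits.
CARD_RE = re.compile(rb"(?<!\d)\d{16}(?!\d)")


def mask_payload(payload: bytes) -> bytes:
    """Overwrite each match with 'X' characters of the same length, so the
    payload size does not change."""
    return CARD_RE.sub(lambda m: b"X" * len(m.group(0)), payload)
```

Masking in place with same-length filler keeps the packet size unchanged, which avoids any need to touch TCP sequence numbers.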

TMES might also support Data Replacement (DR).  The DR feature might change the size of the data.  Though this is not a big issue for UDP sessions, it requires a good amount of handling for TCP connections.  TCP sequence numbers count bytes, not packets, so any change in the data size requires a sequence number update.  The update is needed not only in the affected packet, but also in all later packets of the session.  Similarly, the ACK sequence number of packets in the reverse direction should be updated before they are sent to the monitoring devices.

When the DR feature is combined with the 'Replication' feature, the sequence number delta can be different for each replicated copy.  The delta sequence number update is required so that monitoring devices see sequence numbers that are consistent with the data they receive.

Some TMES vendors treat this as part of their DPI feature.
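The bookkeeping behind the delta update can be sketched roughly as below, with one instance kept per TCP direction and per replicated copy.  The class is hypothetical and makes simplifying assumptions: segments arrive in order, there are no retransmissions, and only the emitted values wrap at 2^32.  TCP checksums would also need to be recomputed, which is not shown.

```python
import bisect

MOD = 2 ** 32  # TCP sequence numbers are 32-bit


class SeqRewriter:
    """Tracks how much the forward byte stream has grown or shrunk."""

    def __init__(self):
        self._ends = []    # original seq just past each modified segment
        self._deltas = []  # cumulative size change after that segment
        self._delta = 0

    def rewrite_seq(self, seq, old_len, new_len):
        """Sequence number to put on a forward segment sent to the monitor."""
        new_seq = (seq + self._delta) % MOD
        if new_len != old_len:
            self._delta += new_len - old_len
            self._ends.append(seq + old_len)
            self._deltas.append(self._delta)
        return new_seq

    def rewrite_ack(self, ack):
        """ACK number to put on a reverse segment: shift it by the change
        accumulated in the forward data that this ACK covers."""
        i = bisect.bisect_right(self._ends, ack)
        delta = self._deltas[i - 1] if i else 0
        return (ack + delta) % MOD
```

Keeping a history of deltas rather than a single running total matters when an ACK arrives for older data after later segments have already been modified.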


Inline TAP functionality:



Many places in the network might not have an L2 switch from which traffic can be obtained through SPAN ports.  If the traffic needs to be monitored at those points, one choice is to place an L2 switch there and pass the traffic to the monitoring devices through its SPAN ports.  If there are many capture points, multiple L2 switches need to be placed.  To avoid this, Inline TAP functionality is expected in the TMES.  Two ports of the TMES are required to tap the traffic at each such capture point.  These two ports act as an L2 switch while the traffic is replicated for the monitoring devices.  Basically, TMES is expected to act as an L2 switch for these capture points.  Since there are many capture points, the TMES essentially becomes a multi-switch device, with each logical switch having two ports.

Time Stamping of Packets

Analysis of traffic that was recorded across multiple monitoring devices is a common requirement.  It means that the recording devices should share the same clock reference, so that the analysis engine knows the order in which the packets were received.  At times, it is not practical to assume that the monitoring devices have expensive clock synchronization mechanisms.  Since the TMES is the central location that receives the traffic and redirects it to the monitoring devices, TMES is expected to add a time stamp to each packet that is sent to the monitoring devices.

The IP protocol provides an option called 'Internet Timestamp'.  With this option, TMES fills in its IP address and a timestamp expressed in milliseconds since midnight UT.
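For reference, the option bytes can be built as in the Python sketch below, following the layout in RFC 791 (option type 68, flag 1 meaning each timestamp is preceded by the address of the device that registered it).  The function names are mine.  Inserting the option into a packet also requires updating the IP header length, total length and header checksum, which is not shown here.

```python
import socket
import struct
from datetime import datetime, timezone


def ms_since_midnight_ut(now: datetime) -> int:
    """Milliseconds since midnight UT; 'now' must be timezone-aware."""
    t = now.astimezone(timezone.utc)
    return (t.hour * 3_600_000 + t.minute * 60_000
            + t.second * 1_000 + t.microsecond // 1_000)


def build_timestamp_option(tmes_ip: str, now: datetime) -> bytes:
    """Internet Timestamp option with a single (address, timestamp) entry."""
    opt_type = 68   # Internet Timestamp
    length = 12     # 4 bytes of option header + 4 address + 4 timestamp
    pointer = 13    # one past the filled entry, so the option is full
    oflw_flag = 0x01  # overflow count 0, flag 1 (address + timestamp)
    return struct.pack("!BBBB4sI", opt_type, length, pointer, oflw_flag,
                       socket.inet_aton(tmes_ip), ms_since_midnight_ut(now))
```

Note that this option only gives millisecond resolution and wraps every day, so the analysis engine has to combine it with the capture date.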

Stripping of L2 and Tunnel headers


Many monitoring devices don't understand complicated L2 headers such as MPLS and PPPoE, or tunnel headers such as PPTP (GRE), GTP (in case of wireless core networks), L2TP-Data, IP-in-IP, IPv6 in IPv4 (Teredo, 6-to-4, 6-in-4) and many more.  Monitoring devices are primarily interested in the inner packets.  TMES devices are expected to provide stripping functionality and deliver basic IP packets to the monitoring devices.  Since monitoring devices expect to see some known L2 header, TMES devices are typically expected to strip off tunnel headers and complicated L2 headers while keeping the Ethernet header intact.  If an Ethernet header is not present, TMES devices are expected to add a dummy Ethernet header so that the monitoring device can receive the packet.
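To show the idea for the simplest tunnel type, the Python sketch below strips the outer IPv4 header of an IP-in-IP packet (IP protocol number 4) and keeps the original Ethernet header.  It assumes untagged Ethernet and an IPv4 inner packet; the other encapsulations listed above each need their own parsing.

```python
ETH_LEN = 14  # untagged Ethernet header


def strip_ipip(frame: bytes) -> bytes:
    """If the frame carries IPv4-in-IPv4, drop the outer IPv4 header and
    return Ethernet header + inner IPv4 packet; otherwise return it as is."""
    if len(frame) < ETH_LEN + 20 or frame[12:14] != b"\x08\x00":
        return frame                        # not IPv4 over plain Ethernet
    outer_ihl = (frame[ETH_LEN] & 0x0F) * 4
    if frame[ETH_LEN + 9] != 4:             # 4 = IP-in-IP encapsulation
        return frame
    # The inner packet is also IPv4, so EtherType 0x0800 remains correct.
    return frame[:ETH_LEN] + frame[ETH_LEN + outer_ihl:]
```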


Traffic Conditioning

Monitoring devices are normally rated for a certain number of Mbps.  At times, there could be bursts in the traffic, even though the overall average traffic rate is within the device rating.  To avoid packet drops due to bursts, TMES is expected to condition the traffic going to the monitoring device.
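Conditioning of this kind can be thought of as a shaper that delays packets instead of dropping them.  The Python sketch below computes the departure time of each packet for a given output rate.  It is only a model: the class name is hypothetical and it assumes an unbounded buffer, whereas a real TMES has finite buffering.

```python
class Shaper:
    """Smooths bursts by spacing packets at the rated speed of the monitor."""

    def __init__(self, rate_bps: float):
        self.rate_bps = rate_bps
        self.next_free = 0.0  # time at which the output link is idle again

    def schedule(self, arrival: float, size_bytes: int) -> float:
        """Return the time (in seconds) at which this packet starts to be
        sent towards the monitoring device."""
        start = max(arrival, self.next_free)
        self.next_free = start + size_bytes * 8 / self.rate_bps
        return start
```

For example, two 1000-byte packets arriving back to back at time 0 on an 8000 bps output leave at 0.0 and 1.0 seconds, while a packet arriving later on an idle link leaves immediately.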

Players :

I came across a few vendors who provide solutions meeting most of the above requirements.

Gigamon :  http://www.gigamon.com/
Anue Systems:  http://www.anuesystems.com/
NetOptics:  http://www.netoptics.com


I believe this market is yet to mature and there is a lot of upside potential.

There is a strong need for monitoring devices and hence the need for TMES will only go up in the coming years.
