Thursday, April 15, 2010

SMB Evasions by attackers - Tips to prevent them in IDS/IPS devices

DCE RPC packets can also arrive over SMB.  A previous article discussed some of the DCE RPC evasion techniques used by attackers and ways to detect attacks despite them.  Since DCE RPC messages can be carried in SMB packets, it is important to understand some of the evasion techniques attackers use on the SMB protocol itself. 


DCE RPC messages are predominantly embedded in SMB messages such as the SMB_COM_READ response, SMB_COM_WRITE, and their ANDx versions. DCE RPC messages are also sent in SMB_COM_TRANSACTION command and response messages. Note that understanding these evasion techniques is useful not only for detecting attacks on DCE RPC based applications, but also on CIFS (SMB) itself. 


Protocol details of SMB are described very well here.  


Many IDS/IPS devices don't have protocol intelligence for the SMB and DCE RPC protocols. IDS/IPS systems that depend on generic pattern matching can be bypassed by attackers with simple evasion (obfuscation) techniques.  Let us examine some of these evasion techniques.


1. ANDx messages:


As indicated above, several commands have ANDx versions. Any ANDx command/response has the following structure in the packet after the SMB header (the AndX fields are common to all ANDx commands; the fields from FID onward are specific to SMB_COM_READ_ANDX).




SMB_Parameters
  {
  UCHAR  WordCount;
  Words
    {
    UCHAR  AndXCommand;
    UCHAR  AndXReserved;
    USHORT AndXOffset;
    USHORT FID;
    ULONG  Offset;

    USHORT MaxCountOfBytesToReturn;
    USHORT MinCountOfBytesToReturn;
    ULONG  Timeout;
    USHORT Remaining;
    ULONG OffsetHigh (optional);
    }
  }
SMB_Data
  {
  USHORT ByteCount;
  }





Variable size of SMB_Parameters: Many IDS/IPS devices assume that the WordCount is constant for a given ANDx message.  For example, for the SMB_COM_READ_ANDX message, WordCount is assumed to be 10 words, but it can be 12 words when the 64-bit offset (OffsetHigh) is present.  IDS/IPS devices that assume 10 words and interpret the data on that basis would get the detection wrong.  An attacker can deliberately set WordCount to 12 words even though OffsetHigh is 0.  IDS/IPS devices must interpret WordCount to locate the data section. 
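
To make this concrete, here is a minimal parsing sketch in C (the buffer layout follows the structure above; the helper name is illustrative, not from any IDS product). It derives the start of SMB_Data from WordCount rather than hard-coding 10 words:

#include <stdint.h>
#include <stddef.h>

/* Locate SMB_Data for a READ_ANDX message by honoring WordCount. 'params'
 * points just past the 32-byte SMB header; 'len' is the bytes remaining.
 * Returns a pointer to the ByteCount field of SMB_Data, or NULL if the
 * buffer is too short. */
static const uint8_t *smb_data_start(const uint8_t *params, size_t len)
{
    if (len < 1)
        return NULL;
    uint8_t word_count = params[0];       /* 10 usually, 12 with OffsetHigh */
    size_t  words_len  = (size_t)word_count * 2;
    if (len < 1 + words_len + 2)          /* +2 for ByteCount itself */
        return NULL;
    return params + 1 + words_len;        /* SMB_Data.ByteCount */
}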


Multiple ANDx messages under one SMB message (with one SMB Header): 


Many IDS/IPS devices assume that there is only one command (or response) in an SMB message, but the SMB protocol allows multiple ANDx commands (or responses) in a single message.  Each command/response has its own SMB_Parameters and SMB_Data blocks. Attackers can place the malicious command/response as a non-first command/response in the SMB message to bypass detection by security devices. IDS/IPS devices must interpret AndXCommand to determine whether more commands/responses are present in the message; AndXCommand is set to 0xFF when there are no further commands. A chain-walking sketch follows the out-of-order discussion below. 


Filler between ANDx messages:  The AndXOffset field indicates where the next command is in the SMB message. Since the offset is explicit, it is legal at the protocol level to place additional filler data between ANDx commands. An attacker can take advantage of this and insert data to confuse security devices. Security devices that assume all commands are adjacent to each other will fail to detect the attacks.  Security devices must be aware of this and interpret AndXOffset exactly as end systems do.


Out-of-order ANDx messages:  ANDx commands can refer to data that appears in the SMB message before the ANDx header. Note that AndXOffset is an offset from the beginning of the SMB header, so it can point to any place in the SMB message.  This is tricky for IDS/IPS devices because they need to buffer the complete SMB message before analyzing it, which increases memory requirements, but doing so is necessary to mitigate this evasion technique.
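
Here is a minimal chain-walking sketch in C covering all three ANDx evasions above (the 32-byte header size and field offsets follow the structure shown earlier; function names are illustrative). It follows AndXCommand/AndXOffset the way an end system does, so filler and out-of-order placement are handled, and a hop cap stops crafted offset loops:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define SMB_HDR_LEN 32   /* fixed SMB header size */

static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Walk an AndX chain. 'msg' points at the SMB header, 'len' is the whole
 * message length, 'first_cmd' is the Command field from the SMB header.
 * The walk honors AndXOffset (relative to the start of the SMB header),
 * which is exactly how end systems find the next block, so filler between
 * blocks and out-of-order placement do not break it. */
static void walk_andx_chain(const uint8_t *msg, size_t len, uint8_t first_cmd)
{
    size_t off = SMB_HDR_LEN;           /* first parameter block */
    uint8_t cmd = first_cmd;
    int hops = 0;                       /* cap to stop crafted offset loops */

    while (hops++ < 16) {
        if (off + 5 > len)              /* WordCount + AndX{Command,Reserved,Offset} */
            return;
        uint8_t  word_count = msg[off];
        uint8_t  next_cmd   = msg[off + 1];
        uint16_t next_off   = rd_le16(msg + off + 3);

        printf("command 0x%02x at offset %zu, %u words\n",
               (unsigned)cmd, off, (unsigned)word_count);
        /* ... inspect this command's parameters and data here ... */

        if (next_cmd == 0xFF)           /* 0xFF: no further commands */
            return;
        if (next_off < SMB_HDR_LEN || next_off >= len || next_off == off)
            return;                     /* malformed offset */
        cmd = next_cmd;
        off = next_off;
    }
}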


2.  Transaction Messages


Transaction command messages have the following structure. Responses have a similar structure, but some fields are absent, so be careful when analyzing commands versus responses. 




SMB_Parameters
  {
  UCHAR  WordCount;
  Words
    {
    USHORT TotalParameterCount;
    USHORT TotalDataCount;
    USHORT MaxParameterCount;
    USHORT MaxDataCount;
    UCHAR  MaxSetupCount;
    UCHAR  Reserved1;
    USHORT Flags;
    ULONG  Timeout;
    USHORT Reserved2;
    USHORT ParameterCount;
    USHORT ParameterOffset;
    USHORT DataCount;
    USHORT DataOffset;
    UCHAR  SetupCount;
    UCHAR  Reserved3;
    USHORT Setup[SetupCount];
    }
  }
SMB_Data
  {
  USHORT ByteCount;
  Bytes
    {
    SMB_STRING Name;
    UCHAR      Pad1[];
    UCHAR      Trans_Parameters[ParameterCount];
    UCHAR      Pad2[];
    UCHAR      Trans_Data[DataCount];
    }
  }



Fragmentation:  If the application payload is bigger than the MaxBufferSize negotiated during the setup phase, the payload is divided across multiple SMB messages, with the first message carrying the SMB_COM_TRANSACTION command/response and subsequent messages sent as SMB_COM_TRANSACTION_SECONDARY.   Attackers take advantage of this to evade security devices that do not reassemble data sent across multiple transaction messages, even when the real application data is less than MaxBufferSize.  Security devices must confirm that all messages have arrived before reassembling, by checking TotalParameterCount and TotalDataCount: if the ParameterCount and DataCount of all transaction messages (with the same PID, MID, TID, and UID in the SMB header) add up to TotalParameterCount and TotalDataCount with distinct ParameterOffset and DataOffset values, the security device can assume that all fragments have been received. Note that some attackers try to fool security devices by sending duplicate SMB messages.  Security devices that blindly add the ParameterCount and DataCount of all matching SMB messages until they reach TotalParameterCount and TotalDataCount, without checking for unique ParameterOffset and DataOffset values, can be bypassed by sending duplicate SMB messages (see the reassembly sketch below).


Out-of-order transaction fragments:  As seen above, ParameterOffset and DataOffset indicate the position of the parameter and data sections of the message within the overall application payload. End SMB systems honor these values while reassembling, so the order in which fragments arrive does not matter.  Security devices that assume the fragments arrive in order can be evaded by an attacker sending the messages in a different order. A reassembly-accounting sketch that handles both duplicates and out-of-order fragments follows.
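
A minimal reassembly-accounting sketch in C (one tracker per PID/MID/TID/UID tuple; initializing total_data from the first fragment's TotalDataCount is assumed to happen elsewhere, and ParameterCount/ParameterOffset would be tracked the same way):

#include <stdint.h>
#include <stdbool.h>

#define MAX_FRAGS 64

/* One tracker per transaction, keyed by the PID/MID/TID/UID tuple in the
 * SMB header. total_data is set from TotalDataCount when the first
 * fragment arrives. */
struct txn_tracker {
    uint16_t total_data;
    uint16_t seen_off[MAX_FRAGS];   /* DataOffsets accepted so far */
    int      nfrags;
    uint32_t bytes;                 /* unique bytes accounted */
};

/* Add one fragment; returns true once every byte has been seen. Fragments
 * are accepted in any order. A repeated DataOffset is treated as a
 * duplicate and does not advance the count, which defeats the
 * duplicate-message evasion. Real code should also reject overlapping
 * ranges, not just identical offsets. */
static bool txn_add_fragment(struct txn_tracker *t,
                             uint16_t data_off, uint16_t data_cnt)
{
    for (int i = 0; i < t->nfrags; i++)
        if (t->seen_off[i] == data_off)
            return t->bytes == t->total_data;   /* duplicate: ignore */

    if (t->nfrags < MAX_FRAGS) {
        t->seen_off[t->nfrags++] = data_off;
        t->bytes += data_cnt;
    }
    return t->bytes == t->total_data;
}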



Sunday, April 11, 2010

Detecting the Malware & Intrusions inside SSL/TLS Connections - Tips

I can't stress enough the need for detecting and preventing attacks on client applications (desktops/laptops/smartphones).  Network security devices certainly have ways to detect attacks on clear HTTP connections, but not when the attacks are carried inside HTTPS connections. Attackers may rely on this: they can host malware on HTTPS sites and invite innocent users to browse them through social networking mechanisms.

To detect attacks in HTTPS and other SSL/TLS connections, network security devices must act as a benign man-in-the-middle and obtain the clear data by decrypting the traffic. Typically servers authenticate themselves to clients using X.509 certificates, and clients authenticate to servers using a username/password combination.   Network security devices protecting servers can do this easily: the administrator uploads the server certificate (and private key) to the device, which then terminates SSL connections on behalf of the servers and analyzes the clear data.  But when network security devices protect client machines, the server certificates are not under the administrator's control.  On top of that, there is a large number of servers on the Internet that the client machines access.

Since network security devices need access to the clear data, the only choice is to terminate the SSL connections even when they are going to servers on the Internet.  One way a network security device (NSD) can do this is by dynamically creating a certificate signed by a local CA certificate (a self-signed certificate).  There are three main considerations these NSDs should address:
  • User experience at the browser should not suffer on per page/site basis.
  • Performance of NSD should not be overly degraded.
  • Security of the network should not be compromised.
Let us revisit how these requirements are met after discussing the approach first.

Flow of the Connection:
  • NSDs should act as SSL/TLS Proxies.
  • NSD terminates TCP Connection made by client.
  • NSD makes TCP connection to the Server.
  • NSD should also proceed with the SSL connection to the server (the server authenticates itself via its X.509 certificate).
    • The NSD must have all well-known CAs configured as factory defaults. It should also have facilities to upload new CA certificates and delete existing ones.
    • It is good to survey the popular CAs in browsers' default repositories and ensure that those CA certificates are present in the NSD for authenticating the original servers.
    • If server authentication fails (there could be multiple reasons: no matching CA certificate, an expired certificate, a revoked certificate, etc.), the NSD should convey this to the client with the exact reason and all available details in a vendor certificate extension, and it can also send this information to the browser (if the connection is HTTPS). The information can also be communicated to the administrator, who can then apply any possible remedy, such as installing new CA certificate(s).  Whether or not server authentication succeeds, the dynamic certificate creation step below is still required.
  • NSD is expected to have pre-configured certificates for authenticating itself to clients, along with a self-signed CA certificate.
    • The NSD updates a pre-configured certificate with information from the certificate received from the server in the previous step.  Fields such as the subject name, extensions, serial number, and validity period need to be copied from the server certificate.   If the certificate is being created to report a server authentication failure, only the subject name is copied, since the serial number and validity period may be the actual culprits behind the failure.
    • The issuer name should be the subject name of the self-signed CA certificate.
    • The NSD signs the certificate with the self-signed CA certificate's private key.
    • The result is put in a cache; if the cache is full, an old, currently unused entry is removed.
    • Note that the above operations can be skipped if a cache entry matching the subject name, serial number, validity period, and issuer name of the received server certificate is found.
  • Now the NSD terminates the SSL connection on the client side.
    • The browser may object the first time because it does not recognize the self-signed CA certificate. The administrator of the organization is expected to instruct employees to install this self-signed CA certificate in their browsers. Once it is accepted, the browser will not complain again.
  • Once the SSL connections are established on both the server and client sides, the NSD sees the clear traffic and can do everything it does with clear connections to detect attacks and protect client machines. A certificate-cloning sketch follows.
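
A minimal sketch of the dynamic certificate step using OpenSSL (error handling, extension copying, and the cache are omitted; ca_cert, ca_key, and the reusable proxy_key pair are assumed to be loaded elsewhere, and SHA-1 was the typical signature digest at the time of writing):

#include <openssl/x509.h>
#include <openssl/evp.h>

/* Clone the identity of 'server' (the certificate received from the real
 * server) into a new certificate signed by the NSD's self-signed CA.
 * 'proxy_key' is a pre-generated key pair reused across connections, so
 * no key generation happens per site. */
static X509 *clone_server_cert(X509 *server, X509 *ca_cert,
                               EVP_PKEY *ca_key, EVP_PKEY *proxy_key)
{
    X509 *c = X509_new();
    if (c == NULL)
        return NULL;

    X509_set_version(c, 2);                                  /* X.509 v3 */
    X509_set_serialNumber(c, X509_get_serialNumber(server));
    X509_set_subject_name(c, X509_get_subject_name(server)); /* keep subject */
    X509_set_issuer_name(c, X509_get_subject_name(ca_cert)); /* our CA */
    X509_set_notBefore(c, X509_get_notBefore(server));       /* copy validity */
    X509_set_notAfter(c, X509_get_notAfter(server));
    X509_set_pubkey(c, proxy_key);                           /* NSD's key */

    if (!X509_sign(c, ca_key, EVP_sha1())) {
        X509_free(c);
        return NULL;
    }
    return c;
}
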
 Now let us revisit how the main considerations are addressed.

User experience at the browser should not suffer on a per-page/site basis:  The only thing the end user needs to do in the browser is accept the self-signed CA certificate provided by the NSD one time.  Since the NSD keeps the subject name intact, the browser does not keep alerting the user for every site he/she visits.


Performance of the NSD should not be overly degraded:  The NSD does not generate a certificate/private-key pair for every SSL site users visit.  It is expected to keep some number of pre-generated certificate/private-key pairs with placeholder subject names and serial numbers.  On a per-connection basis, these are updated with values from the server certificate received during the SSL handshake, and then signing occurs.  By avoiding generation of public/private key pairs, performance does not degrade dramatically.  The new certificate/private-key pair is also cached; only when there is no space in the cache might a pair be evicted, so subsequent HTTPS connections to the same server do not undergo the above process.


Security of the network should not be compromised:  There could be concerns about security, but it is not all that bad.  The NSD itself authenticates the server against trusted CA certificates. It also provides the original certificate's information (mainly subject name, issuer name, and serial number) in the dynamically generated certificate as part of an extension, so users can see it when they view the received certificate in their browsers.  Clear data is visible within the NSD device, but not on the wire, so there should be no worry about information disclosure.

Also, administrators are normally given whitelist configuration of two kinds: a whitelist of client IP addresses and a whitelist of destination IP addresses (the administrator can also configure domain names).  If a connection comes from an IP address on the client whitelist, the NSD is not expected to apply the MITM mechanism, that is, the SSL connection should not be terminated. Similarly, if the destination IP falls in the whitelist of destination IP addresses, the SSL connection should not be terminated either.  This facility gives administrators an additional tool to satisfy their end users by doing SSL MITM proxying selectively.

If you are a buyer or already have a security product,  ask your vendor whether the device can protect you from attacks that are hidden in SSL connections.

Residential CPE Devices - Conditional DHCP Server, IPv6 Prefix Inheritance from WAN and Network Label

Even though there is a huge number of IP addresses in the IPv6 world, residential gateway (RG) environments still get their IPv6 prefixes from ISPs dynamically.  There are advantages to doing this: home users don't have to worry about renumbering their routers and the internal machines in the home LAN when they switch to a new service provider, and in general it reduces the amount of configuration needed on the router.

As explained briefly in the article, the WAN interfaces of the CPE are configured to get IP prefixes from the service provider.  These prefixes are programmed automatically into the CPE's DHCP servers, which in turn assign IP addresses from these prefixes to LAN machines, media servers, NAS servers, VOIP terminals, etc. 

In the IPv4 case, this is done somewhat differently.  Service providers don't provide the IP addresses needed for local LAN machines via the WAN interface; the home user is expected to configure a private IP address range in the DHCP server, and outgoing traffic undergoes NAT using the public IP address given through the WAN connection.

In the IPv4 world, different IP address ranges (pools) can be assigned to the DHCP server, which provides addresses from different pools based on conditions such as the values of the DHCP User Class and Vendor Class Identifier options. This is done to identify different types of devices in the LAN so that CPE functions such as security and QoS can give them differential treatment.  For example, VOIP TA boxes can be served IP addresses from a separate pool.  That pool can then be used in QoS rules to give higher priority to traffic coming from VOIP boxes when forwarding onto bandwidth-constrained WAN interfaces. The administrator (home user) configures both the conditional DHCP pools and the QoS policy rules, as in the sketch below.
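
A hypothetical sketch of the conditional selection in C (pool names, address ranges, and the "VOIP-TA" vendor-class match are made up for illustration, not taken from any particular DHCP server implementation):

#include <stdint.h>
#include <string.h>

/* Two pools: VOIP terminal adapters (matched on DHCP option 60, the
 * Vendor Class Identifier) get the "voip" range, everything else gets
 * the default range. */
struct dhcp_pool {
    const char *name;
    uint32_t    first_ip, last_ip;   /* host-order IPv4 addresses */
};

static struct dhcp_pool pools[] = {
    { "voip",    0xC0A80A0A, 0xC0A80A14 },   /* 192.168.10.10 - .20  */
    { "default", 0xC0A80164, 0xC0A801C8 },   /* 192.168.1.100 - .200 */
};

static struct dhcp_pool *select_pool(const char *vendor_class)
{
    if (vendor_class && strstr(vendor_class, "VOIP-TA"))
        return &pools[0];            /* conditional pool for VOIP boxes */
    return &pools[1];                /* common pool */
}

A QoS rule can then simply reference the "voip" pool's address range to prioritize traffic from those devices.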

In the IPv6 world, there is no NAT, and the administrator does not configure the DHCP address pools, whether the general pool or conditional pools; these prefixes are inherited from the prefixes dynamically assigned by the service provider. How does the administrator configure security or QoS functions to treat traffic from different types of LAN machines differently, if he/she does not know a priori which IP addresses get assigned to which types of devices?

Fortunately, there is a way.  Many security and QoS policy rules take not only literal IP addresses as source or destination IPs, but also named objects: network objects. 

How does this work?
  • CPE software should provide a facility for administrators to enter network object names in the DHCP common and conditional pools.
    • The CPE software is expected to create this object when it is configured.
    • When the WAN interface gets dynamic IP prefixes from the ISP, it informs the LAN devices to inherit the prefixes and program the DHCP pools.  As part of this, the CPE software divides the WAN prefix into multiple sub-prefixes, assigns each sub-prefix to DHCP pools across the LAN devices, and is expected to program the resulting prefixes into the corresponding network object records.
    • When the related WAN interface loses its connection, the CPE software is expected to remove the IP prefixes from the network objects too.
  • The CPE's security and QoS functions need the ability to take network object names in the source and destination IP fields of their policy rules; a data-model sketch follows this list.
    • Since the network objects are programmed with the right IP prefixes, the security and QoS functions can provide differential traffic treatment.
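
A hypothetical data-model sketch in C of what the points above imply (names and field choices are illustrative):

#include <stdint.h>

/* Each DHCPv6 pool carries the name of a network object; the prefix-
 * inheritance code keeps that object in sync, so security/QoS rules
 * written against the name keep working when the delegated prefix
 * changes. */
struct ipv6_prefix {
    uint8_t addr[16];
    uint8_t plen;
};

struct network_object {
    char               name[32];    /* referenced by firewall/QoS rules */
    struct ipv6_prefix prefix;
    int                valid;       /* 0 once the WAN prefix is withdrawn */
};

struct dhcpv6_pool {
    struct ipv6_prefix     prefix;  /* sub-prefix carved from the WAN prefix */
    char                   netobj_name[32];
    struct network_object *netobj;
};

/* Called when the delegated WAN prefix is assigned (up=1) or lost (up=0). */
static void pool_prefix_update(struct dhcpv6_pool *p,
                               const struct ipv6_prefix *sub, int up)
{
    if (up) {
        p->prefix         = *sub;
        p->netobj->prefix = *sub;   /* rules using the name follow along */
        p->netobj->valid  = 1;
    } else {
        p->netobj->valid  = 0;      /* remove the prefix from the object */
    }
}
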
So, don't forget to add a 'network object record' to the DHCPv6 server pools when defining the data model.

Sunday, April 4, 2010

Network Services deployments in Data Centers

Application Delivery Controllers (ADCs) are already part of many data centers where multiple servers serve content to end users. ADCs balance incoming connections across the servers for high availability and to ensure the load is shared.

Let us see where they fit in the data center architectures listed here.

You might have heard the term 'network services layer'.  This layer contains network security devices, ADCs, and WAN optimization devices.  Many traditional data centers don't have this layer, but newer data centers do.  It works in conjunction with the core switching layer. 

Core switches are expected to be configured to pass selected traffic to the network services layer.  No change is expected in the access switch layer or the SAN switch layer.  One can add security at these layers too, but I guess it will be some time before data center administrators add any additional security at the access level.

Before getting into the details of the capabilities of core switches and the devices in the network services layer, let us first look at the addressing of typical data centers:

Many times private IP addresses are used for servers.  Though a public IP address can be assigned to a server if there is only one for a domain name, or in the case of DNS load balancing, my observation is that most often only private IP addresses are assigned.  There are several reasons:
  • If there is a need in future to expand to multiple servers for load or high-availability reasons, no changes are necessary beyond some configuration in the ADCs and bringing up the new server (or virtual machine).
  • Many times, servers need to communicate with other internal servers such as database servers, SANs, application servers, etc.  A private network gives the comfort of security and reduces the number of public IP addresses needed.
  • For a given domain name, multiple services may need to be exposed, and each service may require a different physical server or virtual machine.  Private IP addressing, with the ADC translating incoming connections to the right private IP address, facilitates this. If a public IP address were used, all services would have to be hosted on the same physical server or virtual machine.
Typically, private data centers have one domain name and multiple services; public data centers have multiple domain names and multiple services in each domain.  Each domain name + service combination may have multiple servers serving the content/users.    You might have heard the term 'server farm'.  A given server farm is identified by an IP address (the resolved IP of the domain name) and a service (e.g., port 80, 25, etc.). The server farm is configured with the IP addresses of the real servers serving the content.  The ADC selects the least-loaded server dynamically upon receiving a client connection and awards the connection to the selected real server.  In recent times, real servers are subdivided into multiple subsets, and each server farm carries additional configuration (rules) to select the subset; the least-loaded server is then selected from that subset.  One example deployment may put all videos, images, and static content on some servers (set 1) while other servers serve the dynamic content (set 2).   The ADC can be configured with rules to select set 1 for URLs containing certain file extensions and set 2 for everything else. I hope the server farm concept is clear; a small selection sketch follows.
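
A small selection sketch in C following the set 1/set 2 example above (server IPs, the extension list, and "least loaded = fewest active connections" are illustrative assumptions):

#include <stddef.h>
#include <string.h>

/* Static content (by file extension in the URL) goes to set 1, everything
 * else to set 2; the least-loaded real server in the chosen set wins. */
struct real_server { const char *ip; int active_conns; };

static struct real_server set_static[]  = { {"10.0.0.11", 0}, {"10.0.0.12", 0} };
static struct real_server set_dynamic[] = { {"10.0.0.13", 0}, {"10.0.0.14", 0} };

static int url_is_static(const char *url)
{
    static const char *ext[] = { ".jpg", ".png", ".css", ".js", ".mp4" };
    for (unsigned i = 0; i < sizeof(ext) / sizeof(ext[0]); i++)
        if (strstr(url, ext[i]))
            return 1;
    return 0;
}

static struct real_server *pick_server(const char *url)
{
    struct real_server *set = url_is_static(url) ? set_static : set_dynamic;
    size_t n = 2;                      /* both sets have two servers here */
    struct real_server *best = &set[0];
    for (size_t i = 1; i < n; i++)     /* least loaded = fewest connections */
        if (set[i].active_conns < best->active_conns)
            best = &set[i];
    best->active_conns++;
    return best;
}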

It is important to understand how ADCs are deployed first:

Small data centers don't have a two-tier switch architecture; just a simple ADC is good enough.  It acts as both switch and load balancer.  ADCs have an L2 switch and one or more Ethernet MACs; the Ethernet MACs connect to the network with the WAN links, and the servers connect to the L2 switch.  Basically, in simple deployments the ADC acts as the access-layer switch on the server side and as the core router/switch on the core network side.  Let us call this the 'simple data center'.

In complex or public data centers, two-tier architectures are required. In this case, core switches are configured to pass traffic coming from the core network to the ADCs, and traffic from the server network to the ADCs, for load balancing purposes.  Traffic from the server network may reach the ADCs without any special configuration in the core switches if the client-to-server packets were translated with SNAT: because of the SNAT, server-to-client traffic has the ADC's IP address as its destination, so the packets flow to the ADCs on their own.  Data centers are complex because they handle large amounts of traffic and/or have large numbers of server farms.  Because of this, one ADC is sometimes not enough to take the load of all server farms; multiple ADC devices are used in those cases, with each ADC handling the traffic of a few server farms.  In these cases, the core switches are expected to provide facilities to segment the traffic on a per-server-farm basis and redirect it to the ADCs, which balance it across the servers within each farm.  To give an example: if a data center has 100 server farms and 10 ADCs, the core switch should be capable of segmenting the traffic 100 ways and passing 10 sets of traffic to the appropriate ADCs.   This is typically achieved via VLANs.  

As discussed above, each server farm is identified on incoming traffic (from the core network) by a public IP address (the resolved IP of the domain name) and a port.  If there are 100 server farms, then 100 VLAN IDs are required.  Core switches can be configured to generate VLAN IDs based on the incoming traffic; they have this capability in the form of 'rules'.  A rule is created with selectors (in this case destination IP and destination port) and an action of 'redirect' with a VLAN ID and the port on which to transmit.  When traffic matches such a rule, the switch adds the VLAN ID to the packet and transmits it on the port indicated in the rule, as in the sketch below.
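
A hypothetical sketch of such a rule in C (field choices are illustrative; real core switches implement the match in hardware):

#include <stdint.h>
#include <stdbool.h>

/* Redirect rule: selectors on destination IP/port, action = add a VLAN
 * tag and transmit on a given physical port, mirroring the VLAN11/PR11
 * style rules in the example that follows. */
struct redirect_rule {
    uint32_t dst_ip;     /* 0 = ANY */
    uint16_t dst_port;   /* 0 = ANY */
    uint16_t vlan_id;    /* VLAN tag to add */
    uint8_t  out_port;   /* physical port to transmit on */
};

static bool rule_matches(const struct redirect_rule *r,
                         uint32_t dst_ip, uint16_t dst_port)
{
    return (r->dst_ip == 0 || r->dst_ip == dst_ip) &&
           (r->dst_port == 0 || r->dst_port == dst_port);
}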

Let us take the example of a public data center.  Assume there are two domain names: www.example1.com (public IP address P1) and www.example2.com (public IP address P2).   www.example1.com has two services, port 80 and port 25; www.example2.com has one service, port 80.   The Example1 company wants four servers (P11, P12, P13, P14) to serve port 80 content and one server for port 25.  Example2 has two servers (P21, P22) to serve port 80 content.  It was decided to use ADC1 and ADC2, with ADC1 handling the two server farms of www.example1.com and ADC2 handling the www.example2.com server farm.   Assume also that the ADCs are deployed in two-arm mode: on one arm client-to-server traffic lands, and on the other arm server-to-client traffic lands.  Assume VLAN11 and VLAN12 are for the client and server traffic of www.example1.com, and VLAN21 and VLAN22 for the client and server traffic of www.example2.com.  Finally, assume physical ports PR11 & PR12 and PR21 & PR22 connect to ADC1 and ADC2 respectively. 

In the above deployment, the core switch cluster is expected to be configured with the following rules:
  • On ports that are towards the Internet:
    • Source IP: ANY, Destination IP: P1, Destination Port: 80 or 25 --> Add VLAN11 and send on PR11.
    • Source IP: ANY, Destination IP: P2, Destination Port: ANY --> Add VLAN21 and send on PR21.
  • On ports that are towards the servers:
    • Source IP: P11 subnet, Destination IP: ANY, Source Port: 80 or 25 --> Add VLAN12 and send on PR12.
    • Source IP: P21 subnet, Destination IP: ANY --> Add VLAN22 and send on PR22.
ADCs/server load balancing devices need the capability to instantiate VLAN interfaces based on VLAN ID.

Network security devices and WAN optimization devices can also be deployed in a similar way.  It is good if all functions are available in the same device, since that requires only one set of VLAN configuration in the switch. If the devices are independent of each other, the switch needs multiple VLANs configured to pass traffic from one device to another.  If routing is allowed, then one device can have a route to the next and the switch will pass the traffic between them as any switch does. 

In the above example, ADC1 handles two server farms, but it is also possible to divide those two farms across two different ADCs even though they belong to the same domain; as long as the service is different, it should be possible.  It is also possible to use one ADC for multiple domain names.  In that case, ADCs or network security devices provide 'virtual instances' to ensure that the traffic of the two domain names stays independent and isolated. I am too lazy to type in the switch configuration required for these two setups, but I think you get the picture :-)

Data Center Switch requirements for new Data Center Architectures

Traditionally, data centers have three tiers of switches: core switches, aggregation switches, and access switches.
  • Core switches: These connect to the network attached to the WAN links. This is the switch tier farthest from the servers.
  • Access switches: These are also called top-of-rack switches.  The servers (web servers, email servers, application servers, database servers, and the others for which the data center is built) connect to the ports of these switches. 
  • Aggregation switches: The aggregation layer is the intermediate switch layer sandwiched between the core and access layers; it aggregates the traffic between them.  Note that there can be a lot of traffic among servers (specifically among application, web, and database servers). This traffic need not be seen by the core switches; it only needs to pass among the access-layer switches.  The aggregation layer keeps every switch from seeing all traffic: core switches see only the traffic going to or coming from the WAN/corporate network, and the aggregation layer also reduces the traffic among access-layer switches.
It was necessary to have three tiers in earlier data center architectures because:
  • A large number of physical machines serving the content requires a large number of Ethernet ports. Due to the poor port density of earlier switches, multiple access-layer switches were necessary, and multiple switches mean much more traffic crossing between access-layer switches. One more tier of switches enables good throughput by eliminating a mesh of access-layer switches for inter-switch traffic.
What are some of the changes in data centers? One big change is the collapse of three tiers into two: the aggregation layer is disappearing.  Let us see what is driving this change.
  • Virtualization technology is reducing the number of physical machines, which implies fewer ports are needed.
  • Traffic on each port is increasing: virtualization and multicore processors enable multiple applications on one physical machine.  It is not uncommon to see a requirement of multi-gigabit traffic on a single port.
  • 10G ports (and, in future, 40G/100G ports) are facilitating a unified fabric for both kinds of traffic, application traffic and SAN traffic, thus reducing the number of ports and interconnects.
These technologies reduce cost by cutting equipment, interconnects, power, and cooling. They also reduce maintenance, cutting cost further.

What kind of features would one expect in the switches of new data centers?
  • Traffic latency should be very low:  Eliminating the aggregation layer itself reduces latency, but that is not good enough for SAN traffic and video and voice workloads.  Non-blocking or cut-through switching is expected in order to support real-time traffic such as video and voice.   Traditionally, switches oversubscribe bandwidth; that is, they cannot receive and transmit traffic on all ports simultaneously at full port bandwidth, so packets get blocked.  Non-blocking switches are expected to send and receive traffic equal to (number of ports) * (per-port bandwidth): with ten 1G ports, the switch should be able to receive 10G of traffic and send 10G of traffic.  
    • 802.1Qbb (Priority-based Flow Control):  When there is congestion at the receiving node, an 802.3x pause frame is normally generated, which pauses all traffic for some time. This standard allows pause frames to be generated per 802.1p priority level, letting high-priority traffic keep flowing. Switches are expected to honor and generate these kinds of frames; a frame-layout sketch appears at the end of this list.
    • 802.1Qaz (Enhanced Transmission Selection):  This standard allows bandwidth allocation for different priority levels or groups of priority levels.  It lets bandwidth reserved for higher priorities be consumed by lower-priority traffic when there is no higher-priority traffic.  SAN traffic would typically go at the higher priority levels. This feature is also expected in data center switches.
    • 802.1Qau (Congestion Notification):  This standard allows congestion to be signaled back toward the traffic source; the end node receiving a congestion notification applies rate limiting to its outgoing traffic.  This feature is also expected in data center switches.
  • Port Density should be high.
  • Multi-path support is required. I am not sure whether there is any standard at this time, but spanning tree is not used in these cases as it provides only one path. 
  • VEPA support will eventually be required; with VEPA, the switch may need to support C-VLANs and P-VLANs.
  • Support for a large number of VLANs is required to work with other network services such as ADCs, WAN optimization, and network security (firewall, IPS, IPsec VPN, etc.). 
  • The ability to redirect traffic based not only on L2 and L3 fields but also on L4 fields such as TCP/UDP source and destination ports. 
  • The switch architecture should work with VM migration from one physical server to another.
  • Public data center networks require a virtual-instance concept within the switches to reuse VLANs (across different subscribers), owing to the limited number of VLAN IDs.
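
As mentioned under 802.1Qbb above, here is a layout sketch of a Priority-based Flow Control frame in C (a MAC control frame with EtherType 0x8808 and opcode 0x0101; the packed attribute is GCC-specific and the struct is for illustration only):

#include <stdint.h>

/* 802.1Qbb PFC frame: a MAC control frame whose enable vector says which
 * of the eight 802.1p priorities each pause time applies to, so one
 * priority can be paused while the others keep flowing. */
struct pfc_frame {
    uint8_t  dst[6];        /* 01:80:C2:00:00:01, MAC control multicast */
    uint8_t  src[6];
    uint16_t ethertype;     /* 0x8808 = MAC control */
    uint16_t opcode;        /* 0x0101 = PFC */
    uint16_t prio_enable;   /* bit n set => time[n] is valid */
    uint16_t time[8];       /* pause time per priority, in quanta */
} __attribute__((packed));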