Sunday, April 6, 2008

UTM Message logging capabilities

UTM devices are multi-function security devices. Their functions include firewall, IPsec VPN, SSL VPN, IPS, anti-virus and anti-spam, and some devices also include a web application firewall. Because each security function is different, one would expect the information in logs to differ across security functions, and even across sub-functions within one security function. Logs are generated for many reasons: not only to report policy violations, intrusion detections, and virus or spam detections, but also to record session information, configuration changes, login failures, system errors, system warnings and so on. The information carried by each of these kinds of log message is different. To make analysis by external log analyzers easier, each family of log messages should have its own format; that is, each family should define its own keywords to represent the values of its parameters. There could be many message families.

Each message family contains multiple types of logs. For example, the intrusion message family could contain many types of logs: typically each IPS signature corresponds to one type of log, so a signature-based IPS has as many log types as it has signatures. Each type of log message is typically identified by a message ID.

UTM devices generate logs during their operation. A large number of incidents may generate the same type of log many times. The messaging system component of a UTM device provides multiple controls for the administrator to control logs. Some of the controls are:
  • Message ID level enable/disable control: If a message ID is disabled, log incidents with that message ID are not processed; that is, they are neither stored nor exported to external log collectors.
  • Message ID based log frequency control: It allows logs, but limits the number of logs passed on for processing. Typically it takes two parameters: 'log threshold count' and 'log threshold time'. Within each 'log threshold time' window, at most one log is processed for every 'log threshold count' logs generated. For example, if the log threshold count is 100 and the log threshold time is 5 minutes, and more than 200 but fewer than 300 logs are generated within 5 minutes, then three logs are emitted: the 1st, the 101st and the 201st.
  • 5-tuple based log frequency control: Message ID based log frequency control works well for non-network logs such as system errors and warnings, but it is not good enough for connections. If the same intrusion is detected in traffic going to multiple victims, the administrator would like to see at least one instance of the intrusion for each victim. With message ID based log frequency control, not all victims will be reported if the intrusions happen within the 'log threshold time'. To avoid this, 5-tuple based log frequency control is required. 'Source IP', 'Destination IP', 'Protocol', 'Source Port' and 'Destination Port' constitute the 5-tuple, and each item in the tuple can be enabled or disabled. The 'log threshold count' and 'log threshold time' described above apply separately to each distinct combination of values of the enabled 5-tuple items. For example, if the administrator enables 'Source IP' and 'Destination IP' for a given message ID, then the logs generated for each source/destination address pair are throttled independently according to 'log threshold count' and 'log threshold time'. With this, the log system does not miss attacks and other events happening on individual victim machines, and it does not miss reporting individual attackers either.
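
The two frequency controls above can be sketched with one small class. This is only an illustration under stated assumptions, not the iGateway implementation: the class name `LogThrottle`, its parameters, and the fixed-window interpretation (the counter restarts once 'log threshold time' has elapsed since the window began) are all hypothetical. With no tuple fields enabled the key is just the message ID; enabling fields makes the same thresholds apply separately per combination of values.

```python
from dataclasses import dataclass, field
from typing import Dict, Mapping, Tuple

# The five connection fields that may take part in the throttling key.
TUPLE_FIELDS = ("sip", "dip", "protocol", "sport", "dport")


@dataclass
class LogThrottle:
    """Hypothetical throttle: one counter per (message ID, enabled tuple values)."""
    threshold_count: int                  # 'log threshold count'
    threshold_time: float                 # 'log threshold time', in seconds
    enabled_fields: Tuple[str, ...] = ()  # subset of TUPLE_FIELDS; empty = message ID only
    # key -> [window start time, number of logs seen in this window]
    _windows: Dict[tuple, list] = field(default_factory=dict, init=False)

    def allow(self, mid: int, conn: Mapping[str, object], now: float) -> bool:
        """Return True if this log should be processed (stored or exported)."""
        key = (mid,) + tuple(conn.get(name) for name in self.enabled_fields)
        window = self._windows.get(key)
        if window is None or now - window[0] >= self.threshold_time:
            window = [now, 0]             # start a new window for this key
            self._windows[key] = window
        emit = window[1] % self.threshold_count == 0
        window[1] += 1
        return emit


# Message ID based control: 250 logs of one message ID within 5 minutes.
per_message = LogThrottle(threshold_count=100, threshold_time=300.0)
emitted = [n for n in range(1, 251) if per_message.allow(1001, {}, now=float(n))]
print(emitted)  # [1, 101, 201]

# 5-tuple based control with source and destination IP enabled.
per_victim = LogThrottle(100, 300.0, enabled_fields=("sip", "dip"))
victim_a = {"sip": "10.0.0.5", "dip": "192.0.2.10"}
victim_b = {"sip": "10.0.0.5", "dip": "192.0.2.11"}
first = [
    per_victim.allow(1001, victim_a, 0.0),
    per_victim.allow(1001, victim_a, 1.0),
    per_victim.allow(1001, victim_b, 2.0),
]
print(first)  # [True, False, True]
```

The second example shows the point of the 5-tuple key: the first log for each victim gets through even though both share a message ID and arrive within the same time window.
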
In multi-function security devices like UTM, there could be many message IDs; it is no surprise to see on the order of 1000 of them. Controlling each message ID individually would be a nightmare for administrators, so UTM devices need to provide log control at a higher level than the message ID. That is where sub-families of message IDs come in handy. A sub-family is nothing but a group of message IDs. This grouping exists mainly for log control and does not define the message format. The 'Enable/Disable' and '5-tuple based log frequency control' settings can be specified per sub-family, and all message IDs in the sub-family inherit them. Of course, administrators can still set controls on individual message IDs when they decide that some message IDs should not inherit the controls of their sub-family.
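
A minimal sketch of this inheritance, assuming a simple lookup in which a per-message-ID override, if present, wins over the sub-family setting. The names `LogControl`, `effective_control`, the sub-family name and the message IDs are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LogControl:
    """Hypothetical bundle of the controls that a sub-family or message ID can carry."""
    enabled: bool = True
    threshold_count: int = 1
    threshold_time: float = 0.0           # seconds
    tuple_fields: Tuple[str, ...] = ()    # enabled 5-tuple items


# Controls configured once per sub-family.
SUBFAMILY_CONTROLS: Dict[str, LogControl] = {
    "ips-web-attacks": LogControl(True, 100, 300.0, ("sip", "dip")),
}

# Each message ID belongs to one sub-family.
MESSAGE_SUBFAMILY: Dict[int, str] = {1001: "ips-web-attacks", 1002: "ips-web-attacks"}

# Optional per-message-ID overrides set by the administrator.
MESSAGE_OVERRIDES: Dict[int, LogControl] = {1002: LogControl(enabled=False)}


def effective_control(mid: int) -> LogControl:
    """Return the override for this message ID, else its sub-family's control."""
    if mid in MESSAGE_OVERRIDES:
        return MESSAGE_OVERRIDES[mid]
    return SUBFAMILY_CONTROLS[MESSAGE_SUBFAMILY[mid]]


print(effective_control(1001).threshold_count)  # 100, inherited from the sub-family
print(effective_control(1002).enabled)          # False, overridden for this message ID
```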

Though many aspects of this article are based on Intoto iGateway product family concepts, they are generic and valid for any security product. The typical flow of logs is:
  • Logs are generated by applications such as firewall, IPS, AV/AS, WAF etc.
  • The log throttling system discards some logs and passes on the rest for processing, based on the controls configured by the administrator.
  • Logs are then stored, exported, or both.
  • Log analyzers (local or external) analyze logs and create reports. They also provide 'search' functions based on different criteria.
Logs are exported using either 'syslog' or 'email'. iGateway UTM also has a facility to store logs locally in a database (PostgreSQL).
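
The flow above can be sketched as a small pipeline. This is an illustration only: `run_pipeline`, the `allow` callback and the sinks are hypothetical names, and in-memory lists stand in for the local database and the syslog or email exporters.

```python
from typing import Callable, Dict, Iterable, List

Log = Dict[str, object]


def run_pipeline(
    logs: Iterable[Log],
    allow: Callable[[Log], bool],
    sinks: List[Callable[[Log], None]],
) -> int:
    """Pass each generated log through throttling, then hand it to every sink."""
    processed = 0
    for log in logs:
        if not allow(log):        # discarded by the administrator's controls
            continue
        for sink in sinks:        # e.g. local storage, syslog export, email export
            sink(log)
        processed += 1
    return processed


stored: List[Log] = []            # stands in for the local database
exported: List[str] = []          # stands in for a syslog or email exporter
disabled_mids = {2002}            # message IDs the administrator has disabled

generated = [{"mid": 1001}, {"mid": 2002}, {"mid": 1001}]
count = run_pipeline(
    generated,
    allow=lambda log: log["mid"] not in disabled_mids,
    sinks=[stored.append, lambda log: exported.append(f"mid={log['mid']}")],
)
print(count, exported)  # 2 ['mid=1001', 'mid=1001']
```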

Exported logs are sent in a format that lets log analyzers easily extract the values of different fields. WELF (WebTrends Enhanced Log Format) is one format that became quite popular. Though the iGateway product family uses WELF syntax, it does not use the keywords specified by WELF, as they are incomplete. Some of the rules it follows are:
  • Each log forms one line, and each line contains multiple fields.
  • Each field has the form keyword=value. Keywords are defined by the corresponding message family. Fields are separated from each other by one or more spaces.
  • Keywords and values should not contain spaces, and must not contain quotes or '=' characters. If a value needs to contain spaces, the value string must be enclosed in double quotes.
  • Mandatory keywords across all message families:
    • time: Date and time in double quotes.
    • priority: Priority of the message. A value from 1 to 7.
    • id: Identity of the device sending the logs. It is configured by the administrator on each device.
    • mtype: Message family.
    • mid: Message ID.
  • Generic keywords: There are some generic keywords which are valid across multiple message families. Though these are not mandatory, these generic keywords must be used wherever they are needed.
    • vsg: Virtual instance.
    • fromzone: Zone in which the connection originates.
    • tozone: Zone in which the connection terminates.
    • userid: User name, if this information is known.
    • usergroup: User group.
    • sip: Source IP address in dotted decimal form.
    • dip: Destination IP address in dotted decimal form.
    • protocol: Protocol of the connection as an integer.
    • sport: Source port.
    • dport: Destination port.
  • Other keywords are specific to each message family.
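
The formatting rules above can be sketched as a small formatter and parser. This is a hedged illustration, not the iGateway exporter: the function names `format_log` and `parse_log` and the sample values (device id `utm-01`, message family `ips`, message ID 1001) are hypothetical, and the sketch assumes quotes and '=' are rejected even inside double-quoted values.

```python
import shlex
from typing import Dict, Mapping

MANDATORY = ("time", "priority", "id", "mtype", "mid")


def _check_token(text: str, what: str) -> None:
    """Reject characters that the rules forbid in keywords and values."""
    if '"' in text or "'" in text or "=" in text:
        raise ValueError(f"{what} {text!r} contains a quote or '='")


def format_log(fields: Mapping[str, object]) -> str:
    """Build one keyword=value log line, mandatory keywords first."""
    missing = [key for key in MANDATORY if key not in fields]
    if missing:
        raise ValueError(f"missing mandatory keywords: {missing}")
    if not 1 <= int(fields["priority"]) <= 7:
        raise ValueError("priority must be between 1 and 7")
    ordered = list(MANDATORY) + [key for key in fields if key not in MANDATORY]
    parts = []
    for key in ordered:
        value = str(fields[key])
        _check_token(key, "keyword")
        _check_token(value, "value")
        if " " in key:
            raise ValueError(f"keyword {key!r} contains a space")
        if key == "time" or " " in value:
            value = f'"{value}"'          # time is always quoted; so are values with spaces
        parts.append(f"{key}={value}")
    return " ".join(parts)


def parse_log(line: str) -> Dict[str, str]:
    """Split a log line back into fields, as a log analyzer might."""
    # shlex.split keeps a double-quoted value together and drops the quotes.
    return dict(token.split("=", 1) for token in shlex.split(line))


line = format_log({
    "time": "2008-04-06 10:15:00", "priority": 3, "id": "utm-01",
    "mtype": "ips", "mid": 1001,
    "sip": "10.0.0.5", "dip": "192.0.2.10", "protocol": 6,
    "sport": 40000, "dport": 80,
})
print(line)
fields = parse_log(line)
print(fields["time"], fields["dip"])
```

Because every field is a self-describing keyword=value pair, the parser needs no knowledge of the message family to split a line; only the interpretation of family-specific keywords depends on `mtype`.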
