Thursday, May 1, 2008

TR-069 Protocol and applicability in Network security devices - Opinion

Recently a security software developer asked me a question. He wanted to know whether TR-069 protocol is suitable for managing network security devices. Further he wanted to know what enhancements I would like to see in TR-069 protocol, if any.

I am sure that quite a bit of thought had gone into defining the RPC methods and their usage. It is certainly easier to create new data models and software associated within device and ACS implementations when you compare with SNMP based management.

TR-069 is certainly suitable for network security application configuration. But there are some points developers should be aware of.

AddObject method:
This is one thing I have difficulty in understanding the intentions behind in defining 'AddObject' method in TR-069 protocol. 'AddObject' method does not take any parameter values. To add a row in a table, it requires two methods from ACS - AddObject followed by SetParameterValues method. Due to this, it has some complications and additional logic in TR-069 implementations in security devices.

Configuration of many security functions such as firewall, IPsec, AntiX and IPS consists of multiple rules (rows). Each row has its identification and value of these identification parameter (Let us call it as Instance Key) are typically configured by administrators. This identification parameter must be unique within the table. Some configuration databases identify the rows by integers and some identify by strings. Once the record is created, security software does not allow changes to identification values. Of course, other parameters of the rule (row) are allowed to be changed.

TR-069 defines its own identification for each row in the table called instance number. Instance number is of integer type. Instance number is returned in AddObject response. This can't be used to identify the record in a table as far as security applications are concerned. I guess this instance number is mostly for TR-069 protocol. Since record name/ID (Key) is not part of AddObject, record can't be created in the security applications upon reception of AddObject by the device. It needs to wait until SetParameterValues method is sent by ACS with the record identification value.

Though this is not a major hurdle of using TR-069 in configuring security applications, but it could have been avoided if AddObject allows setting up of parameter values.

I don't see any reason why separate instance number is required as it is defined in TR-069. By avoiding instance number and replace with application Instance Key, it provides many advantages:

  • Device does not need to maintain the state between "Add Object" and "Set Parameter Values" RPC methods to create the instance (row) in the applications.
  • Device complexity increases as it needs to maintain the mapping between instance numbers and application instance key values even across reboots to maintain consistency between devices and ACS.
  • ACS does not need to maintain the mapping of instances with each device instance numbers. Since ACS is expected to manage thousands of devices, it needs to maintain this run time mapping information which limits the ACS scalability.

I like to see the enhancement the TR-069 protocol where it avoids instance number approach for the rows. As a matter of fact, local management engines don't have special instance numbers for each row of tables. One of the parameters itself used as the key to identify the instances in the table. With that in mind, I like to see following approach.

  • One of the parameters in each table object in data model can be identified as the instance key. It requires only one parameter to identify the instance at that level due to tree structure of the data model and nested table objects. Table object is already identified uniquely due to upper (parent) table objects. Due to this, one parameter is good enough to identify the instance within the table object. By having one of data model parameters as the key parameter, ACS also can identify the row by this parameter value.
  • Today configuration transaction is limited to one SetParameterValues method. Adding a row is outside the transaction today. To make the adding instance also as part of transaction, row creation along with its parameters and values should be made part of SetParameterValues RPC method and eliminate AddObject RPC method.
  • Any modifications to the instance at later time need to send the key value along with the parameter names. Existing instance number position in the parameter name can be replaced with the key value.

Mandatory parameters in one RPC method:

Security application makes some configuration parameters 'must to have'. Some of these configuration parameters can have default values, but not all. My experience shows that many mandatory parameters don't have any default values. Due to this, security application software typically expect all the mandatory parameters values as part of its record creation. As discussed before, actual object creation in security software module is done when the first 'SetParameterValue' RPC method is received (after AddObject). So, it is expected that ACS sends SetParameterValues method with all mandatory parameters and its values. TR-069 protocol does not specify any rules in regards to this. Due to this, ACS systems have a choice of sending these parameters in multiple RPC methods. It complicates the device implementation, where it needs to wait until all mandatory parameters are received. This is not practical as device does not know when to give up waiting for the mandatory parameters. I believe, there is an understanding that ACS systems always send most of the parameters' (including mandatory parameters) values in one RPC method.

Since this information is not documented, you might often hear from your QA guys that it is not working as expected. I wish that TR-069 specification clarifies this clearly. By the way, if the approach as indicated in above section (under AddObject method section) is implemented, then this is a mute point.

Default Values for Parameters

TR-069 data model guidelines don't clearly specify that all optional parameters of table objects and normal parameters must have default values.

Many security devices and I am sure other devices also have requirement of resetting to factory defaults upon user action. In addition, some devices provide option for administrators/users to set all/some optional parameters to default values on existing configuration. To facilitate this, I think TR-069 data model definition must mandate setting up the default values for non-mandatory parameters.

Factory Defaults

Many device applications are shipped with default configuration. Many devices also support resetting the device to factory defaults via both hardware and software means. TR-069/TR-104 does not specify the ways to define the factory defaults. Due to this, when device resets its configuration to factory defaults, ACS does not know the configuration of device until it reads the configuration from the device by traversing through the data model template tree.

An additional benefit of defining factory defaults in a standard fashion also helps device to do the factory reset from central place (TR-069 client in device) without letting each application to do its own factory reset.

Validations

TR-069/TR-104 data model definition facilitates the parameter value validations such as data type, possible enumeration values, min and maximum length in case of string and base64 data types, range of values in case of integers. But, there are no validation definitions on number of instances that can be created. Many applications limit the number of instances on per table object basis. Having this number known to ACS facilitates the validation at the ACS itself. Note that this limit might be different from one device type to another type. Some generic software application tune this number based on amount of memory available on the device it is being installed. Some times, this limit is also configurable by the local administrator.

TR-104 mandates the definition of "Number of Instances" parameter for each table object. In addition, mandating one more parameter "Max Instances Supported" for each table object would sole above problem.

Nested Table Objects and Transactions

TR-069 protocol dictates that all parameters within the "Set Parameter Values" method either set in entirety or reject every thing. For all practical purposes, a configuration transaction is limited to one instance of the RPC method. Device implementations ensures this by checking with each application using "Test" functions of applications first for all configuration parameters and then followed by setting the configuration in the applications if all "test" functions are successful - basically tests followed by sets as in SNMP world. Applications' "test" functions also has its own limitations, especially if nested table object instance parameters are being checked if parent table object instance is not yet available in the applications, even though it is available in the same transaction. Since "Set" functions are expected to be called only after all "test" functions are successful, parent table instance would not have been created in the applications. Hence "test" function of nested table object instance would return FAILURE. To avoid complicated "test" functions, I feel that TR-069/TR-104 should put guideline for ACS developers to avoid setting parameters of nested table object instances along with parent table object instances in the same RPC method if it is followed by appropraite "AddObject" methods.

Nested Table Objects and Special cases

Even though it is true for all cases that the nested table instance can't exist without parent table object instance, in some cases it is required that instances of some table objects are created with atleast one instance of child table object. This is true specifically in security rules such as firewall ACL rules and IPSec SPD rules. ACL and IPSec SPD can be represented as table objects in the data model. These rules contain 5-tuple selectors with Source IP, destination IP represented by multiple sets of network objects. That is, source IP (and destination IP) field of the ACL rule is a table object to represent many networks. Administrator would be allowed to add more networks to the "source IP" and "destination IP" tables at later time once ACL rule is created. But when ACL rule is created, security application expect at least one network in "source IP" and "destination IP" fields of ACL rule object.

Today data model does not allow this kind of relationship. Due to this, ACS developers don't know about this dependency. If not taken care of by ACS then the rule creation will be rejected by device continuously.

It is necessary that data model definition allows specifying this special relationship. One simple proposal is to indicate for each table object on whether at least one instance of this is needed to create instance of immediate parent object.

Ordered Lists

Rules in many security applications such as firewall, IPsec VPN, Web application firewall etc.. are ordered. That is, the rules are searched from top rule to bottom rule for matching in the table object and hence the rules must be ordered in the list with top rule being the highest priority rule and bottom rule being the lowest priority rule. Administrators, normally in the course of administration need to add rules in between the list, top of the list, bottom of the list or change the priority of the rules in the list. TR-069 does not provide any RPC methods to change the priority of the rules. Due to this, the data models defining ordered list are expected to have 'priority' parameter. Administartor changes value of this parameter to change the priority of the rules. Devices are expected to form the ordered list based on the value of priority parameter acorss multiple instances in the table object. Though it works fine in cases where the configuration changed rarely, it poses performance penatlies on the device. Let us get into this details.

  • Security applications revalidates run time states every time there is a change in the rule list. Revalidation can be costly in systems where millions of states are created. For example, firewall application in Enterprise/Carrier market segments support more than million sessions where each session corresponds to TCP/UDP/IP connection. If the rules are modified or their order is changed, firewall application is expected to revalidate the sessions and remove sessions that are no longer valid with respect to new rule base.

If a rule needs to be added in between, then the priority value of rules below the current rule need to be changed by at least 1 value less than the current values to accomodate the rule. That is, if there are 500 rules and if one rule is added at position 250, then bottom 250 rules would undergo change. This will raise 250 more additional parameters being modified. This results to 250 changes in the rule base in device. So, there would be 250 revalidations of Millions of sessions. To avoid this performance penatly, it is necessary that each ordere list (table object) has one parameter outside the table object "Revalidate now". ACS sets this value at the end of all priority parameter values in SetParameterValues method. When this parameter is set to 1, device is expected to initiate revalidate process. It avoids revaldation logic in the device for each rule change. This parameter value is not expected to be persistent. Its purpose is to initiate revalidation action. Reading of this parameter should always give value 0 even after it is set.

ACS also needs to know the objects that are ordered in nature. ACS also needs to know the parameter name used to order the list. If ACS does not know this information, then the administartor is forced to change the priority value of all 250 rules in above example manually. That is not some thing which administrators (users) would enjoy. ACS, by knowing this informaton, can change the priority values itself based simple user actions internally and communicate with device appropriately. ACS at the end of any action on the ordered list should send set the "Revalidate Now" parameter. To faciliate this, data model defintion should identify the ordered lists. My proposal is to introduce an attribute "Ordered list" to the table objects. This attribute will not be present for non ordered table objects.

Above guidelines on data model work fine for existing TR-069 protocol. Future TR-069 enhancements needs to think of differnet approach for ACS scaling by introducing new commands in "SetParameterValues" RPC method. SetParameterValues RPC method as explained above also need to take 'Add' command in addition to implicit 'Modify' command supported today. My proposal is to introduce "Move" command for changing the order of instances in the table. "Move" command takes the identifcation value of target row and command attributes such as "first", "last", "before", "after". If the command attribute value is one of "before" or "after", then it also takes the relative row identificatin value. "First" value indicates to put the target row in the beginning of the list. "Last" value put the target row at the end of the list. "Before" keeps the target row immediately before the relative row and "After" keeps the row immediately after the relative row. This proposal eliminates the need for having "priority" parameter. It also reduces the need for sending multiple parameters when the row is added in between. Note that "Revalidate Now" parameter need to be still present if more than one row is being added or more than one "moves" are initiated by administrator.

Log Export

One of main features of network security devices is logging. Detection/Protection of internal resources as important as notifying administrators of any violations. When central management is used to configure network security devices, it is also expected that central management console provides facilities to show alerts, notifications, analyze logs and generate reports. TR-069 protocol does not specify any protocol elements to send the log messages to ACS.

You may need to device your own protocol (such as syslog over IPsec) for sending logs. Also, you need to work with ACS vendors to corelate the logs with configuration.

No comments: