Saturday, June 27, 2009

Linux Open source applications - Porting considerations for developers

You have proprietary software running on Linux and need to integrate it with open source applications. Let us look at what developers need to keep in mind.

  1. Selection of the open source package:  In some cases there are several open source projects for a given application.  Consider the following when choosing one project over another:
    • Is the open source project actively maintained?
      • Check the number of releases made so far.
      • Latest release date
      • Consistency of releases.
      • Activity in the mailing list.
      • Number of developers maintaining the project.
      • Roadmap of features.
      • Usage of this project in commercial or other open source projects.
    • Code License
      • Is this code GPL or BSD licensed? 
      • Are you planning to add a significant number of features to the open source application?  If you are, you are better off choosing BSD-licensed code.
      • If it is GPL code, are the include/library files LGPL?  If they are not LGPL, including a header file in your code or linking with the library contaminates your code with the GPL.
      • If there is no BSD or LGPL code, developers need to be careful in using the package to ensure that their proprietary code is not contaminated with the GPL.  If the GPL code is modified and distributed, developers have no choice but to make the modified code public. The integration of the open source package with the rest of the software must still ensure that the rest of the code is not contaminated.  How to take care of this is described below.
Once the open source package is selected based on the above criteria, the next step is porting the software to your platform and integrating it with the rest of the software.

Porting of the open source software to your platform:  Most open source packages come with a GNU configure script. For more details about the configure script, please see here.  If you are compiling for a different target system, make sure to pass the right target type to the configure script (for Autoconf-generated scripts, the system the package will run on is normally given with the --host option).  Sometimes the configure script is not written for your target type; in that case, choose the closest one and modify the configure script to suit your requirements.  Also enable or disable features as required by passing the right arguments to the configure script. Depending on the open source package, this may take a day or two.
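
As a rough illustration of the configure step, here is a minimal Python sketch that assembles a configure command line. It assumes an Autoconf-generated script that accepts the conventional --host, --prefix, --enable-* and --disable-* options; the helper name configure_command, the target triplet and the feature names are made up for the example and will differ for a real package.

```python
import shlex


def configure_command(host, enable=(), disable=(), prefix=None):
    """Build the argument list for an Autoconf-style ./configure run.

    The option spellings follow the usual Autoconf conventions; which
    features a given package accepts is listed by `./configure --help`.
    """
    args = ["./configure", f"--host={host}"]
    if prefix is not None:
        args.append(f"--prefix={prefix}")
    args += [f"--enable-{name}" for name in enable]
    args += [f"--disable-{name}" for name in disable]
    return args


if __name__ == "__main__":
    cmd = configure_command(
        host="arm-linux-gnueabi",  # hypothetical cross-compilation target
        enable=["ipv6"],           # hypothetical feature names
        disable=["docs"],
        prefix="/usr",
    )
    # Print the command for review instead of running it blindly.
    print(shlex.join(cmd))
```

Keeping the option list in one place like this makes it easier to rebuild the package the same way each time a new upstream release is ported.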

Integration with the management system:  Many open source packages take their configuration from a file.  When the package is started, it reads the configuration file and uses it for its operations.  But any serious product provides a user interface to the administrator, such as CLI, Web GUI, SNMP, NetConf or TR-069, using proprietary configuration & management software.  The open source package is expected to be configured via this user interface. Typical activities involved in integrating with the configuration & management system are:
  • Understanding the configuration of the open source package.
  • Creating the data model and associated GUI.
  • Creating backend logic in the configuration & management system to convert data model elements into a format the open source package understands and store them in the file. The backend logic should also have a specific way to tell the open source daemon to re-read the file so that the changed configuration takes effect.
    • It is often good to store the configuration data in your own format in addition to converting it and storing it in the config file for the open source daemon.  Storing it in your own format helps in:
      • Reading the data model instance without reading the configuration file.
      • Storing the user-entered configuration along with the rest of the configuration of the proprietary software.  This helps with import and export of the complete device configuration. Otherwise, import and export operations involve many files, which is more complex.
      • Creating configuration audit logs with the right information easily, specifically when some part of the configuration is modified. A modification log is expected to show both old and new values.  The new value is known from user input, but the old value needs to be retrieved. Retrieval is easy and fast if the configuration is maintained in your own format in memory.
      • Synchronizing the 'diff' configuration with the participant devices in high availability environments.  In HA environments, configuration created on the master device is expected to be sent to all participant devices.  Since participant devices already have some configuration, the master device is expected to send only the 'diff'.  Maintaining this configuration in memory in your own format, like the rest of the configuration, keeps your logic consistent.
      • In short, use the config file of the open source daemon only as a way to communicate with the daemon, and maintain the configuration in your own format as you do for your proprietary software.
    • If the open source package is GPL, ensure that this backend logic does not include any files from the open source package and does not link with any GPL libraries. The backend logic must be developed purely from an understanding of the config file format. Otherwise, GPL contamination is possible and you could be forced to release your configuration & management software publicly.
  • Showing run-time statistics in your configuration & management system:  Statistics counters are incremented by the open source daemon as part of its operation. Many open source daemons do little with statistics other than update them; they expect somebody else to fetch them. The configuration & management system is expected to show these statistics to users via the CLI/GUI/SNMP interfaces.  To get hold of the counters, some logic needs to be added to the open source daemon.  If the daemon uses a 'select/poll' style event loop, create a new socket (a Unix domain socket or a loopback socket) and bind it appropriately.  Define a message header that lets clients request specific information (a command type).  Add logic to wait on the socket (via poll), act on the command and send the response.  If the open source package is synchronous (that is, it has no poll or select loop), create a new thread that waits on the domain/loopback socket and acts on the commands.  The client side of this mechanism lives in your configuration & management system: the backend logic for the statistics user interface communicates with the open source daemon synchronously over the domain/loopback socket.
  • Integration with logging/alert software:  Every product has a mechanism to show the events that happened during its operation. Logging/alert software normally formats the logs and stores them in an SQL database or sends them via syslog or email to management software. Many open source packages do have a mechanism to log events, but unfortunately there is no standard: each open source package does this differently.  This integration is needed to give administrators a uniform look and feel irrespective of where events originate.  One good thing is that many open source applications funnel all events through a small, fixed number of API functions.  These API functions need to be changed to send the events to your logging/alert system.  To avoid GPL contamination, ensure that your logging/alert software is a daemon by itself, and add only the minimal glue needed to bridge the open source logging API to your logging/alert system. Please note that this glue logic becomes open source, so be prepared to release it when others request it.
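
To make the statistics channel from the list above more concrete, here is a minimal Python sketch of the request/response exchange. Everything in it is an assumption made for illustration: the 4-byte header layout (command type plus payload length), the command and status codes, and the function names. A real daemon would bind a Unix domain or loopback socket and add it to its existing poll set; the sketch uses a connected datagram socket pair so that it runs on its own.

```python
import select
import socket
import struct

# Hypothetical wire format: 2-byte command type (or status in a reply)
# followed by a 2-byte payload length, both in network byte order.
HEADER = struct.Struct("!HH")
CMD_GET_COUNTER = 1  # request payload: counter name in UTF-8
STATUS_OK = 0
STATUS_ERROR = 1


def handle_request(message, counters):
    """Daemon side: decode one request and build the encoded reply."""
    if len(message) < HEADER.size:
        return HEADER.pack(STATUS_ERROR, 0)
    command, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    if command == CMD_GET_COUNTER:
        name = payload.decode("utf-8", errors="replace")
        if name in counters:
            body = struct.pack("!Q", counters[name])
            return HEADER.pack(STATUS_OK, len(body)) + body
    return HEADER.pack(STATUS_ERROR, 0)


def poll_once(sock, counters, timeout_ms=0):
    """One pass of a poll-based event loop over the statistics socket."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    for _fd, _event in poller.poll(timeout_ms):
        request = sock.recv(4096)
        if request:
            sock.send(handle_request(request, counters))


def encode_request(name):
    """Client side: build a 'get counter' request."""
    payload = name.encode("utf-8")
    return HEADER.pack(CMD_GET_COUNTER, len(payload)) + payload


def decode_response(reply):
    """Client side: return the counter value, or None on an error reply."""
    status, _length = HEADER.unpack_from(reply)
    if status != STATUS_OK:
        return None
    return struct.unpack_from("!Q", reply, HEADER.size)[0]


if __name__ == "__main__":
    # A socket pair stands in for the bound Unix domain socket.
    daemon_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    counters = {"packets_in": 42}
    client_end.send(encode_request("packets_in"))
    poll_once(daemon_end, counters, timeout_ms=1000)
    print(decode_response(client_end.recv(4096)))  # prints 42
```

Only the daemon-side pieces (handle_request and the poll hook) would live inside the open source package. The client-side helpers belong to the proprietary backend and need nothing from the package beyond knowledge of the message format.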

