Monday, April 18, 2011

Hibernate Versus JDBC

A long time ago I worked on a Java project, a Centralized Management System for managing multiple networking devices. I remember a big debate at the time over whether to use Hibernate for Java object persistence or to use JDBC directly. We finally decided to use Hibernate.

I happened to come across a document that discusses when to use Hibernate and lists the advantages and disadvantages of using it. I felt that it captures many of the relevant points, so I thought of sharing it.

Please find it here: http://www.mindfiresolutions.com/mindfire/Java_Hibernate_JDBC.pdf

One point for developers to consider when using Hibernate:
  • Different parts of an application need different fields from a database table, or from a set of related tables, to be persistent. Don't try to put every possible field from the database into one single Java class. Some database fields are needed far more often than others. In those cases, you are better off with multiple persistent classes: one for the frequently used fields and another for the rarely used fields. With one single persistent class, you may use more heap memory than necessary. You know your application. Decide on the number of persistent classes based on the type of data, how often it is needed, and so on. For example, data that is needed only for auditing, which is an occasional activity, may warrant a separate persistent class so that the object can be instantiated and discarded on demand.
Features I like in Hibernate (JPA):

  • It separates the database organization from the business logic. Thanks to object-relational mapping, business logic always deals with Java classes. The mapping of Java class members to fields in database tables is defined in an XML file, so a change in the database organization typically requires a change only in that XML file.
  • Changing the database server from one vendor to another, during development or during a field upgrade, has little or no impact.
  • The automatic versioning and timestamping feature of Hibernate ensures that a thread working from older data does not unintentionally overwrite newer data in the database.
The link above has a lot more information on the benefits.
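
The version check behind the last feature can be illustrated without Hibernate itself. The following is a minimal plain-Java sketch, not Hibernate code: VersionedRow and VersionedStore are hypothetical names, and the in-memory map stands in for a table with a version column. An update succeeds only if the version the writer read earlier is still the stored version.

```java
import java.util.HashMap;
import java.util.Map;

/** A stored row: a value plus the version number that guards it. */
final class VersionedRow {
    final String value;
    final int version;

    VersionedRow(String value, int version) {
        this.value = value;
        this.version = version;
    }
}

/** Hypothetical in-memory table that applies an optimistic version check on update. */
final class VersionedStore {
    private final Map<Long, VersionedRow> rows = new HashMap<>();

    synchronized void insert(long id, String value) {
        rows.put(id, new VersionedRow(value, 0));
    }

    synchronized VersionedRow read(long id) {
        return rows.get(id);
    }

    /**
     * Writes newValue only if the caller read the current version.
     * Returns false for a stale write and leaves the stored row untouched.
     */
    synchronized boolean update(long id, String newValue, int versionRead) {
        VersionedRow current = rows.get(id);
        if (current == null || current.version != versionRead) {
            return false;
        }
        rows.put(id, new VersionedRow(newValue, current.version + 1));
        return true;
    }
}
```

If two threads read the same row and both try to update it, the second update returns false because the version has moved on, so the older data never silently overwrites the newer data.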


Sunday, April 17, 2011

Cloud Multi-Tenancy Support - Authentication, Authorization and Auditing requirements

Authentication and Authorization requirements are common to all web-based applications. Authentication is the process by which the application takes the user's credentials, typically through a 'log-in' form, and verifies the user against a user database (LDAP, RDBMS, RADIUS etc.). Authorization is the process by which the application allows access to different parts of the application based on the type of user accessing it. In a typical application, authorization consists of two steps: assigning roles to users when they are created in the user database (LDAP, RADIUS etc.), and mapping parts of the application (resources) to roles, giving each role a set of permissions on the mapped resources. The resources and the related permissions for different roles are together called 'Access Control'.

Many web application development frameworks have minimized the amount of development needed for Authentication and Authorization. For example, the spring-security component of the spring framework (http://static.springsource.org) takes care of the Authentication & Authorization requirements of many web applications. That is, for many web-based applications, developers only need to concentrate on the business logic and can leave Authentication and Authorization to spring-security.

The Authentication & Authorization requirements of cloud applications, mainly SaaS applications, can't be met by spring-security (version 3.0.5) as is. But spring-security provides multiple hooks for extending it.

It is good to first know the typical requirements of cloud applications with respect to Authentication and Authorization, and that is what this post concentrates on. I will try to discuss how to bridge the gaps in spring-security to enable cloud applications in future posts.

Authentication Requirements

Like any web application, a cloud application requires some of its resources to be accessible only to certain users. Authentication is the process in which the application takes the user's credentials through a 'login' form and validates the user against a pre-created user database. Cloud applications differ from traditional web applications in that a cloud application provides services to multiple tenants. That is, cloud application resources are instantiated for each tenant. Rather than having a separate server and a separate application instance for each tenant, a cloud application serves multiple tenants using one server and one application. Typically, this is achieved using a 'Tenant ID' in the application and in the database tables. When a single-tenant application is converted to a multi-tenant application, one would see a 'Tenant ID' introduced into each database table schema. Note that a tenant is not equivalent to a user. A tenant is typically a company to which the SaaS application provides services. Each tenant has its own set of users, typically employees, who access its resources in the cloud application.

Some of the characteristics of Cloud applications with respect to Authentication:

  • Tenant identification: In many cloud applications, the email address is used as the login ID. An email address consists of a user name and a domain name. The domain name is typically used to identify the tenant instance. If the email address is xyz@example.com, then xyz is the user name and example.com is the domain name.
  • Each tenant has its own authentication database (user database): Cloud application vendors are increasingly allowing their customers (tenants) to provide authentication server information. This eliminates the need for companies (tenants) to duplicate their user accounts in the cloud application. The cloud application is expected to look up the appropriate authentication server information from the domain name of the login and validate the user by communicating with that authentication server. Since tenants might use different types of authentication servers, the cloud application is required to support multiple authentication protocols.
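
As a rough illustration of tenant identification from the login ID, the sketch below splits an email-style login into user name and domain name and looks the domain up in a table. TenantResolver, its methods and the tenant IDs are hypothetical names chosen for this example; the in-memory map is only a stand-in for an application-specific database table, and it allows several domains to point at the same tenant.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical resolver: maps the domain part of a login email to a tenant ID. */
final class TenantResolver {
    // Stand-in for an application-specific "domain name -> tenant ID" table.
    private final Map<String, String> tenantByDomain = new HashMap<>();

    void register(String domain, String tenantId) {
        tenantByDomain.put(domain.toLowerCase(Locale.ROOT), tenantId);
    }

    /** Returns the lower-cased domain part, or empty if the login is not of the form user@domain. */
    static Optional<String> domainOf(String login) {
        if (login == null) {
            return Optional.empty();
        }
        int at = login.lastIndexOf('@');
        if (at <= 0 || at == login.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(login.substring(at + 1).toLowerCase(Locale.ROOT));
    }

    /** Returns the tenant ID for the login, or empty if the domain is unknown. */
    Optional<String> resolveTenant(String login) {
        // Optional.map yields an empty Optional when the lookup returns null.
        return domainOf(login).map(tenantByDomain::get);
    }
}
```

With example.com registered for a tenant, resolveTenant("xyz@example.com") yields that tenant's ID, while a login with an unregistered domain or without an '@' yields an empty result that the login flow can reject.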
Any authentication framework for cloud applications should satisfy the following requirements (this list goes beyond what is available in spring-security version 3.0.5):
  •  Ability to identify the tenant from the login credentials
    • From the domain name of the email address. Note that multiple domain names might belong to one tenant ID. That is, one should not assume a one-to-one correspondence between domain name and tenant ID. Multiple domain names for a given tenant ID can arise for several reasons: after a merger, the domain names of the merged companies may continue to exist for a few months; or multiple divisions of a company may each be represented by their own domain name within the tenant.
      • Ability to identify the tenant ID using application specific mapping table - Domain name versus tenant ID.
    • As a separate POST variable indicating the tenant ID.
  • Ability to get the authentication server information from the RDBMS based on the tenant ID. This information is stored in an application-specific RDBMS table keyed by tenant ID.
    • A tenant might have multiple authentication servers. One can apply different strategies to select the authentication server. Some useful strategies are:
      • Round-robin: At times, a tenant has multiple authentication servers either for load distribution or for high availability. In those cases, this strategy might be chosen. If it is, servers are chosen in round-robin fashion.
      • Until authentication is successful: Multiple authentication servers might be provided in order of priority. If this strategy is chosen, authentication servers are tried in that order until authentication is successful or until all authentication servers have been tried.
      • Sub-domain based authentication server: If this strategy is chosen, the set of authentication servers is chosen based on the domain name of the email address. That is, the domain name is used not only to determine the tenant ID but also to get the list of authentication servers. One of the above strategies can then be applied to this list.
  • Ability to support multiple authentication protocols: As discussed above, a cloud application vendor cannot require its customers (tenants) to use one kind of authentication server. Though LDAP and Active Directory (which can also be accessed using LDAP) are the most commonly used, the framework must allow cloud applications to work with multiple authentication protocols. Protocol-related information, such as the X.509 certificate and certificate chain in the case of SSL-based protocols, is expected to come from the application-specific tables in the RDBMS. This information can be stored along with the authentication servers' information.
    • LDAPv3 with/without SSL: I believe SSL is a must, as one would not like to send user credentials in the clear on the wire.
    • CAS (Centralized Authentication Service)
    • SAMLv2 (an IdP has recently become a common choice for authentication).
    • OpenID
    • And more ... (Though RADIUS is a popular protocol, I don't see it being used that often, maybe because RADIUS is based on UDP and has related security concerns.)
  • Firewall traversal while communicating with authentication servers: Some companies (tenants), though they would like cloud applications to authenticate their users against the company authentication database, don't want to open a firewall hole for cloud applications to connect to their servers. Some company administrators are wary of creating an inbound hole in their firewall but are fine with creating an outbound hole. In those cases, the framework is required to provide a proxy that can be installed on the company premises behind the firewall. The proxy sits between the cloud application and the internal authentication servers and maintains a persistent connection to the cloud application server at all times. The cloud application, rather than communicating with the authentication server directly, sends the authentication messages to the proxy over the established connection; the proxy in turn connects to the authentication server and relays the messages to it. Similarly, the proxy receives the messages from the authentication server and forwards them to the cloud application server over the pre-established connection.
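
The round-robin and until-successful strategies from the list above can be sketched as follows. This is only an illustration under stated assumptions: AuthServerSelector and its method names are hypothetical, servers are represented by plain strings, and the actual authentication attempt (for example an LDAP bind) is passed in as a predicate rather than implemented.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical selection strategies over a tenant's configured authentication servers. */
final class AuthServerSelector {
    private final List<String> servers;                      // in priority order
    private final AtomicInteger next = new AtomicInteger();  // round-robin cursor

    AuthServerSelector(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers configured");
        }
        this.servers = servers;
    }

    /** Round-robin: each call returns the next server, wrapping at the end of the list. */
    String nextRoundRobin() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    /**
     * Until-successful: tries servers in priority order and returns the first one
     * for which the attempt succeeds, or empty once every server has been tried.
     */
    Optional<String> firstSuccessful(Predicate<String> attempt) {
        for (String server : servers) {
            if (attempt.test(server)) {
                return Optional.of(server);
            }
        }
        return Optional.empty();
    }
}
```

For the sub-domain based strategy, the same class could be instantiated once per domain name, with the server list looked up from the domain rather than from the tenant ID alone.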

Authorization Requirements

Traditional single-tenant web applications can work with the authorization functionality provided by spring-security. Method/class-based authorization and page-level authorization are good enough for many single-tenant web applications. spring-security provides a mechanism to associate a method, page or class with granted authorities. As discussed earlier, each user is associated with roles in the authentication database. As part of the authentication process, in addition to validating the user against the authentication database, spring-security also gets the authorities (roles) of the user and keeps them in the security context. Later, during the authorization phase, spring-security uses this information.

As I understand it, later versions of spring-security introduced the concept of ACLs and domain objects. This seems promising. Using this framework, application business logic can create an ACL, which consists of a 'Domain Object', 'Roles' and 'Permissions'. The domain object is a reference to the application's business-level object. 'Roles' is the list of roles that are allowed to access the domain object, and 'Permissions' is a set of bits indicating the permissions given to those roles on the domain object. Though this seems promising for cloud applications, it still does not meet their requirements.

Characteristics of Cloud applications with respect to Authorization:
  • There are multiple types of users in cloud applications:
    • Users of the cloud application vendor: These users typically administer the cloud application as a whole. There are different users with different roles to operate different parts of the application.
    • Users of tenants: Cloud applications, as discussed above, support multiple tenants. Each tenant has multiple employees. Access to different resources of the application is permitted based on the type and authority of each employee. Each tenant might define different roles. That is, role names might not be the same across tenants in the system.
  • Each user is configured with roles in the authentication database.
  • Support for collaboration: Some (not all) cloud applications require a resource of one tenant to be accessible to other tenants. Hence some cloud applications provide an ACL for a given domain object that covers not only the owner tenant ID but also other tenants. That is, the ACL is expected to consist of multiple rules for a given domain object, with each rule having a tenant ID and permissions.
Any authorization framework for cloud applications should satisfy the following requirements:
  •  Ability for application to specify (add/delete/modify) the ACL for needed domain objects:
    • As discussed, each ACL record in the ACL table corresponds to one domain object. To allow multi-tenancy, the domain object needs to be represented by the tenant ID, the application name of the domain and the domain object identification. Each ACL itself consists of multiple rules - Access Control (AC) rules. They can be represented in a separate table, with each row corresponding to one rule. A rule consists of a 'Tenant ID' or 'Group of tenant IDs', the 'Roles of that tenant ID' that can access the domain object, and the 'Permissions' on the domain object. For collaboration-based cloud applications, the tenant ID in a rule can be a tenant other than the owner of the ACL's domain object. To simplify the configuration of collaboration-related AC rules, tenant IDs may be grouped, and groups can be reused across multiple AC rules of different ACLs.
    • Ability for business logic to find out whether the authenticated principal and his/her authorities can access the domain object that is being accessed. 
    • Performance considerations: Since many domain objects can have ACLs, the ACL table can become very big. It may be necessary to keep a reference to the ACL in the domain object. This could avoid a costly query of the ACL table.
  • Ability to inherit the ACL from the parent domain object: 
    • Cloud applications have many different resources, and the configuration of ACLs and associated rules can become very complex. Inheritance of ACLs can help reduce administrative overhead. A domain object that is being configured with an ACL can inherit the ACLs of its parent, its grandparent or any other resource. It does not seem good to allow inheritance of ACLs from domain objects of other tenants. To allow inheritance, the ACL for a domain object can be configured with an 'Inherit from' list that identifies each parent domain object by the application name of the domain and the domain object identification. In addition to inheriting from other ACLs, an ACL can also have its own access control rules.
    • Due to inheritance, there could be multiple ACLs to search for a given domain object: its own ACL and the ACLs of the domain objects it inherits from, which may be several. Also, the domain objects from which ACLs are inherited may themselves inherit ACLs from others. With multiple ACLs, the order in which they are searched for permissions is important. It is always good to check the object's own ACL first, then the first level of inheritance, then the second level, and so on.
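
The ACL structure and lookup order described above can be sketched in plain Java. This is not the spring-security ACL API: AccessRule, Acl, the READ/WRITE bits and the method names are hypothetical, tenant groups are left out, and the sketch assumes rules only grant permissions (there are no deny rules), so the first matching grant wins.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** One access control rule: a tenant, a role of that tenant, and a permission bit mask. */
final class AccessRule {
    final String tenantId;
    final String role;
    final int permissions;

    AccessRule(String tenantId, String role, int permissions) {
        this.tenantId = tenantId;
        this.role = role;
        this.permissions = permissions;
    }
}

/** Hypothetical ACL for one domain object, with an optional "inherit from" list. */
final class Acl {
    static final int READ = 1;
    static final int WRITE = 2;

    private final List<AccessRule> rules = new ArrayList<>();
    private final List<Acl> parents = new ArrayList<>();

    Acl addRule(String tenantId, String role, int permissions) {
        rules.add(new AccessRule(tenantId, role, permissions));
        return this;
    }

    Acl inheritFrom(Acl parent) {
        parents.add(parent);
        return this;
    }

    /** Checks this ACL first, then first-level parents, then their parents, and so on. */
    boolean isAllowed(String tenantId, String role, int wanted) {
        List<Acl> level = new ArrayList<>();
        level.add(this);
        Set<Acl> seen = new HashSet<>();  // guards against inheritance cycles
        while (!level.isEmpty()) {
            List<Acl> nextLevel = new ArrayList<>();
            for (Acl acl : level) {
                if (!seen.add(acl)) {
                    continue;  // already searched via another path
                }
                for (AccessRule rule : acl.rules) {
                    if (rule.tenantId.equals(tenantId) && rule.role.equals(role)
                            && (rule.permissions & wanted) == wanted) {
                        return true;
                    }
                }
                nextLevel.addAll(acl.parents);
            }
            level = nextLevel;
        }
        return false;
    }
}
```

A rule whose tenant ID differs from the owner of the domain object is how collaboration is expressed here; a real implementation would also need to enforce the restriction on inheriting from other tenants' domain objects.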
Auditing Requirements

Configuration changes to important resources of an application must be logged for later auditing. Since cloud applications have multiple tenants, the tenant ID must be logged along with the configuration audit logs.
Configuration audit logging must include information about the exact changes that were applied to the resource. Adding an application record must put all of its information (field names and values) in the audit log. Deleting an application record must produce an audit log with the same information as an 'Add'. A modify operation should produce an audit log with both the old information and the new information.

It is also important that the audit log contains the date and time at which the update happened, the user name of the user who changed the configuration, and the role that was used to grant the permission.
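
One possible shape for such an audit record is sketched below. AuditEntry, its factory methods and the log line format are all hypothetical; the sketch only shows that each entry carries the tenant ID, user name, role and timestamp, and that add, delete and modify keep the field names and values described above.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical audit log entry for one configuration change. */
final class AuditEntry {
    enum Operation { ADD, MODIFY, DELETE }

    final Instant timestamp;
    final String tenantId;
    final String userName;
    final String role;
    final Operation operation;
    final Map<String, String> oldValues;  // empty for ADD
    final Map<String, String> newValues;  // empty for DELETE

    private AuditEntry(Instant timestamp, String tenantId, String userName, String role,
                       Operation operation, Map<String, String> oldValues,
                       Map<String, String> newValues) {
        this.timestamp = timestamp;
        this.tenantId = tenantId;
        this.userName = userName;
        this.role = role;
        this.operation = operation;
        // Defensive copies so later changes to the caller's maps cannot alter the log entry.
        this.oldValues = Collections.unmodifiableMap(new LinkedHashMap<>(oldValues));
        this.newValues = Collections.unmodifiableMap(new LinkedHashMap<>(newValues));
    }

    static AuditEntry added(Instant when, String tenantId, String userName, String role,
                            Map<String, String> record) {
        return new AuditEntry(when, tenantId, userName, role, Operation.ADD,
                Collections.<String, String>emptyMap(), record);
    }

    static AuditEntry deleted(Instant when, String tenantId, String userName, String role,
                              Map<String, String> record) {
        return new AuditEntry(when, tenantId, userName, role, Operation.DELETE,
                record, Collections.<String, String>emptyMap());
    }

    static AuditEntry modified(Instant when, String tenantId, String userName, String role,
                               Map<String, String> before, Map<String, String> after) {
        return new AuditEntry(when, tenantId, userName, role, Operation.MODIFY, before, after);
    }

    /** Renders the entry as a single log line. */
    String format() {
        return timestamp + " tenant=" + tenantId + " user=" + userName + " role=" + role
                + " op=" + operation + " old=" + oldValues + " new=" + newValues;
    }
}
```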

Internet Resources on Multi-tenancy and spring-security:

I have found two valuable resources related to multi-tenancy and spring-security. They give a good picture of how spring-security can be enhanced to support multi-tenancy. Though they address some of the requirements mentioned above, they don't satisfy all of them. But they give a very good understanding of how spring-security can be customized, and this knowledge can be used to customize spring-security for the above requirements.

Securing a multitenant SaaS application 
Extend Spring Security to Protect Multi-tenant SaaS Applications