Monday, January 4, 2010

Data Redundancy Elimination across files

HTTP, CIFS and other application-level proxies typically apply DRE (Data Redundancy Elimination) within the scope of the file being requested.  That is, if the content changes under the same file name, these proxies do DRE effectively.  As part of the request, the client proxy sends signature information for the copy of the file it already has.  The server proxy fetches the new file from the origin server and uses those signatures to work out the delta.  The delta is then sent to the client proxy, which in turn rebuilds the new content from the old file and the delta it received.  That is, the scope of the delta information is limited to the scope of a given file.
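
The exchange described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not how any particular proxy implements it: blocks are fixed-size and aligned, SHA-256 is used as the signature, and the names `make_signature`, `make_delta` and `apply_delta` are hypothetical. Real products generally use content-defined or rolling-hash chunking so that an insertion does not shift every later block.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


def split_blocks(data, size=BLOCK_SIZE):
    """Cut data into fixed-size, aligned blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def make_signature(old):
    """Client proxy: one digest per block of the file copy it already holds."""
    return [hashlib.sha256(b).hexdigest() for b in split_blocks(old)]


def make_delta(signature, new):
    """Server proxy: compare the new file against the client's signature."""
    index = {digest: i for i, digest in enumerate(signature)}
    delta = []
    for block in split_blocks(new):
        digest = hashlib.sha256(block).hexdigest()
        if digest in index:
            delta.append(("ref", index[digest]))  # client already has this block
        else:
            delta.append(("data", block))         # must be sent in full
    return delta


def apply_delta(old, delta):
    """Client proxy: rebuild the new file from the old file plus the delta."""
    old_blocks = split_blocks(old)
    out = bytearray()
    for kind, value in delta:
        out += old_blocks[value] if kind == "ref" else value
    return bytes(out)
```

With an old file of `b"aaaabbbbcccc"` and a new file of `b"aaaaXXXXcccc"`, only the middle block travels as data and the other two are sent as references into the old file. Note that the signature covers exactly one file, which is the limitation discussed next.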

It is my observation that enterprise users, when they modify documents, presentation files and the like, tend to make a new copy and give it a new file name.  That is, they version their changes by keeping separate files.  One might argue that this is a bad way of keeping versions, but it happens more frequently than one can imagine.  In those cases, file-based delta information does not work.  One may argue that these instances are a small share of the traffic, but they are not insignificant.

A good WAN optimization device should be able to do DRE on blocks across files. I am not sure whether WAN optimization vendors are doing this already, but as an end user, you should look for this feature.
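
One way to picture DRE across files is a block cache keyed by content digest rather than by file name, kept on both proxies. The sketch below is a minimal illustration of that idea and not a description of any vendor's product. It assumes fixed-size blocks, SHA-256 digests, and that both caches see the same traffic in the same order so they stay in sync. The `BlockCache` class and its methods are hypothetical names.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


class BlockCache:
    """Per-proxy store of blocks, indexed by digest and not by file name."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def encode(self, data):
        """Sending side: replace blocks already seen, in any file, by their digest."""
        message = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()
            if digest in self.blocks:
                message.append(("ref", digest))
            else:
                self.blocks[digest] = block
                message.append(("data", block))
        return message

    def decode(self, message):
        """Receiving side: store new blocks and expand references from the cache."""
        out = bytearray()
        for kind, value in message:
            if kind == "data":
                self.blocks[hashlib.sha256(value).digest()] = value
                out += value
            else:
                out += self.blocks[value]
        return bytes(out)
```

If the server side encodes `report_v1` and later a renamed copy `report_v2` that differs in one block, the second transfer carries only that one block as data, even though the file name is new. A real device would also need cache eviction and a way to recover when the two caches drift apart, which this sketch leaves out.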

