Friday, October 7, 2011

Improving decryption performance in Apache Santuario 1.5

Andreas Veithen alerted me some months back to a performance problem in Apache Santuario when decrypting messages. The issue emerged when profiling Dennis Sosnoski's test code for measuring WS-Security performance across different web services stacks (see the original article here).

The test scenario involves deploying an Apache CXF 2.4.2 endpoint in Tomcat and repeatedly testing the "signencr" invocation defined in the article (WS-Security signing of body and headers, with timestamp and encryption of body) using a CXF client. Two types of test-runs were executed: 1000 "large" messages at 0.2 density in one run, and 10000 "small" messages at 0.05 density in another. Profiling the client with a sampling profiler showed that deserializing a decrypted XML String into a DOM element accounted for around 20% of the total execution time for all WS-Security processing!

The way the default deserializing algorithm works in Apache Santuario 1.4.x is to parse the decrypted XML String into a new Document object, which is then imported into the existing Document. As Apache Xerces defers the creation of Node objects, the import operation triggers the full expansion of the DOM tree.
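To make the parse-then-import approach concrete, here is a minimal sketch using only the standard JAXP and DOM APIs. It is not the Santuario code itself: the class name ImportNodeSketch and the deserialize() helper are made up for illustration, and the real implementation also has to deal with namespace context, which this sketch ignores.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of Apache Santuario.
class ImportNodeSketch {

    // Parse the decrypted XML into a brand new Document, then copy the
    // resulting tree into the Document that held the encrypted data.
    static Node deserialize(String decryptedXml, Document target) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();

        // Step 1: a second, temporary Document is built from the String.
        Document parsed = db.parse(new InputSource(new StringReader(decryptedXml)));

        // Step 2: a deep import copies every node into the target Document.
        // With a deferred DOM, this walk forces all nodes to be created.
        return target.importNode(parsed.getDocumentElement(), true);
    }

    public static void main(String[] args) throws Exception {
        Document target = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Node imported = deserialize("<body><item>hello</item></body>", target);
        System.out.println(imported.getNodeName());
    }
}
```

The imported node belongs to the target Document but is not yet attached to it; the caller still has to insert it in place of the encrypted element.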

There are a number of alternatives to the DocumentBuilder/importNode approach used in Santuario 1.4.x. The first approach is to use a Transformer object to transform the Source (XML String) into a placeholder Node belonging to the existing Document. This approach avoids having to explicitly import the nodes back to the existing Document. The second approach is to use the streaming API available in JDK 1.6 (not an option for Santuario 1.5, which must compile against JDK 1.5).
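The Transformer alternative can be sketched along the following lines, again with only standard JAXP classes. This is an illustration under assumptions rather than the TransformSerializer source: the class name, the deserialize() helper and the "placeholder" element name are hypothetical, and the sketch uses whatever Transformer implementation the JDK supplies, with no namespace handling.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration class, not part of Apache Santuario.
class TransformSketch {

    // Run an identity transform from the XML String straight into a
    // placeholder element owned by the existing Document, so no separate
    // Document is built and no importNode() call is needed.
    static Node deserialize(String decryptedXml, Document target) throws Exception {
        // The placeholder is created by the target Document, so everything
        // appended beneath it is owned by that Document from the start.
        Element placeholder = target.createElement("placeholder");

        // newTransformer() with no stylesheet gives an identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.transform(
                new StreamSource(new StringReader(decryptedXml)),
                new DOMResult(placeholder));

        // The deserialized content is now the child of the placeholder.
        return placeholder.getFirstChild();
    }

    public static void main(String[] args) throws Exception {
        Document target = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Node result = deserialize("<body><item>hello</item></body>", target);
        System.out.println(result.getNodeName());
    }
}
```

The design point is that the result tree is created directly by the existing Document, which is what removes the import step.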

Here are some (ad-hoc) test results. The first results show the total time for each test-run using both algorithms:
  • Large Messages:
    • Document Serializer: 119.46s
    • Transform Serializer: 115.68s
  • Small Messages:
    • Document Serializer: 222.32s
    • Transform Serializer: 216.76s
The next results show the time spent in the Serializer.deserialize() operation as a percentage of the total WS-Security processing time:
  • Large Messages:
    • Document Serializer: 19.92%
    • Transform Serializer: 18.04%
  • Small Messages:
    • Document Serializer: 24.54%
    • Transform Serializer: 18.36%
The Serializer interface is now public, and a different implementation can be set on XMLCipher. Two implementations are provided in the code: DocumentSerializer (the default algorithm) and TransformSerializer. If anyone is interested in running experiments of their own, the StreamSerializer algorithm is available here. The TransformSerializer implementation is not the default, as it requires Xalan to work properly and this library is optional in Santuario 1.5.

Do you have any suggestions on how this could be improved further? Clearly, deserializing a decrypted XML String into a DOM node still takes far longer than it should. A fully StAX-based approach for XML Security would surely offer much improved performance - this is under development and planned for next year.
