Improve Rome performance: from JDOM to StAX
By ericlemerdy, Wednesday 9 May 2007, 02:19 - Métier
Rome is an open-source Java API for working with several syndication feed formats. It uses JDOM to parse and generate XML. This study looks at how Rome could use StAX instead of JDOM, and at what cost.
Problem
Rome 0.9 uses JDOM to parse and generate the supported XML syndication feeds. JDOM is known to be simple for programmers to use, but not especially efficient. Now that Java 6 is out, anyone can use StAX to generate and parse XML in an efficient and simple way.
So I asked myself: "Why is this amazing Rome project not using such an easy-to-use API?"
Motivations
A quick search through their dev mailing list told me that they do not want to optimize prematurely (in July 2004, the question came up about switching to DOM4J). The usual argument is that early optimizations rarely pay off in the final version of a product and mostly make the code harder to understand. I totally respect their point of view as the main developers.
But for me, the power of open source lies in the fact that it costs nothing but people's involvement. So if somebody wants to know whether StAX is as simple as JDOM and whether it improves Rome's performance, it is up to them to find out (or to pre-evaluate it, as I am doing here). Moreover, Rome is probably close to its final version now, and its API should not change much.
Preliminary consequences
- Rome no longer depends on JDOM
- Rome becomes dependent on Java 6, or on the StAX API plus an implementation
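To see what the second consequence means in practice, here is a minimal sketch that asks the standard StAX factories which implementation they resolved to. On Java 6 and later this should be the implementation bundled with the JDK, while on older JVMs a separate StAX API jar and an implementation would have to be on the classpath. The class name `StaxProviderCheck` is hypothetical and is not part of Rome.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

// Hypothetical helper, not part of Rome: reports which StAX implementation is in use.
class StaxProviderCheck {

    // Class name of the parser factory chosen by the standard StAX lookup.
    static String inputFactoryName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    // Class name of the writer factory chosen by the standard StAX lookup.
    static String outputFactoryName() {
        return XMLOutputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("StAX input factory:  " + inputFactoryName());
        System.out.println("StAX output factory: " + outputFactoryName());
    }
}
```

Because the calling code only sees `XMLInputFactory` and `XMLOutputFactory`, it would not have to name a specific implementation.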
Protocol
Everything was done with Eclipse, using Rome 0.9.
- Download the Rome source code
- Import it into Eclipse (it is natively an Eclipse project)
- 942 errors should appear. They are due to the unsatisfied JDOM library dependency.
- Note that Rome's internal architecture is friendly. Only one branch depends on JDOM: com.sun.syndication.io.* does (51 classes), whereas com.sun.syndication.feed.* does not (65 classes).
- All unit test classes depend on JDOM (com.sun.syndication.unittest.*, 32 classes).
- Source analysis
  - Writing: the interfaces ask implementors to generate a JDOM Document, whereas StAX outputs directly to a stream.
  - Reading: it is easy to avoid passing a Document around.
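The writing side of this analysis can be illustrated with a small sketch. It is not Rome code: the class `StaxFeedWriterSketch`, its methods, and the reduced RSS-like structure (a channel title plus item titles) are assumptions made for the example. The point it tries to show is that each element goes straight to the `Writer` as it is produced, so no Document is ever built.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch, not Rome's API: streams a feed to a Writer without a Document.
class StaxFeedWriterSketch {

    // Writes a minimal RSS-like feed: one channel title and one <item> per entry title.
    static void write(String channelTitle, List<String> itemTitles, Writer out)
            throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        try {
            xml.writeStartDocument();
            xml.writeStartElement("rss");
            xml.writeAttribute("version", "2.0");
            xml.writeStartElement("channel");
            writeTextElement(xml, "title", channelTitle);
            for (String itemTitle : itemTitles) {
                xml.writeStartElement("item");
                writeTextElement(xml, "title", itemTitle);
                xml.writeEndElement(); // item
            }
            xml.writeEndElement(); // channel
            xml.writeEndElement(); // rss
            xml.writeEndDocument();
            xml.flush();
        } finally {
            xml.close(); // releases the StAX writer; the underlying Writer stays open
        }
    }

    // Writes <name>text</name>; writeCharacters escapes '&', '<' and '>'.
    private static void writeTextElement(XMLStreamWriter xml, String name, String text)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text);
        xml.writeEndElement();
    }

    // Convenience wrapper for the example: returns the XML as a String.
    static String writeToString(String channelTitle, List<String> itemTitles) {
        StringWriter out = new StringWriter();
        try {
            write(channelTitle, itemTitles, out);
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write feed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeToString("Rome news", Arrays.asList("First entry", "Second entry")));
    }
}
```

With JDOM, the equivalent code would first assemble Element objects into a Document and only then serialize it. Here the memory needed does not grow with the number of items already written.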
Conclusions
- Writing: the interface has to change. The com.sun.syndication.io.WireFeedOutput class (a wrapper around com.sun.syndication.io.WireFeedGenerator) has to stop working with a Document. Changes in the deeper-level classes are easier. There will no longer be a Document in memory, just the feed "business objects" to be written.
- Reading: the interface has to change. The com.sun.syndication.io.WireFeedInput class (a wrapper around com.sun.syndication.io.WireFeedParser) should no longer deal with a JDOM Document. The other changes are transparent. There will no longer be an in-memory Document during parsing.
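For the reading side, a sketch under the same assumptions: the class `StaxFeedReaderSketch` and its methods are hypothetical, and the feed is reduced to item titles. It pulls events from an `XMLStreamReader` and fills plain Java objects directly, which is roughly what a StAX-based parser behind WireFeedInput could do instead of receiving a JDOM Document.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not Rome's API: reads item titles without building a Document.
class StaxFeedReaderSketch {

    // Returns the title of every <item>, in document order.
    static List<String> readItemTitles(Reader in) throws XMLStreamException {
        XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(in);
        List<String> titles = new ArrayList<String>();
        boolean insideItem = false;
        try {
            while (xml.hasNext()) {
                int event = xml.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = xml.getLocalName();
                    if ("item".equals(name)) {
                        insideItem = true;
                    } else if (insideItem && "title".equals(name)) {
                        // getElementText consumes the text and the closing </title>.
                        titles.add(xml.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "item".equals(xml.getLocalName())) {
                    insideItem = false;
                }
            }
        } finally {
            xml.close(); // releases the StAX reader; the underlying Reader stays open
        }
        return titles;
    }

    // Convenience wrapper for the example: parses a feed held in a String.
    static List<String> readItemTitlesFromString(String feedXml) {
        try {
            return readItemTitles(new StringReader(feedXml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        String feedXml = "<rss version=\"2.0\"><channel><title>Rome news</title>"
                + "<item><title>First entry</title></item>"
                + "<item><title>Second entry</title></item></channel></rss>";
        System.out.println(readItemTitlesFromString(feedXml));
    }
}
```

The channel-level title is skipped because it is not inside an item. A real parser would track more context than a single boolean, but the principle stays the same: only the business objects are kept in memory.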
Next, I will do the work and evaluate the performance gains with Eclipse TPTP on the unit tests. If the results are satisfying, I will consider providing a patch to the Rome development team.