Martin Bravenboer. Connecting XML Processing and Term Rewriting with Tree Grammars. Institute of Information and Computing Sciences, Utrecht University, The Netherlands. Master's thesis INF/SCR-04-08, November 20, 2003. (ps, pdf)


The widespread acceptance of XML for exchanging tree-like data between software tools has resulted in a growing interest in tools for binding XML to the native data structures of a language, and in the design of dedicated languages for processing XML.

In the Stratego/XT project, the components of a transformation system operate on term representations of programs. These representations are encoded in the ATerm format, which has a more explicit structure than XML. Most of these components are implemented in Stratego, a transformation language that supports a separation of rewrite rules and rewriting strategies.

We have developed a set of tools that unifies XML processing and term rewriting. The different needs of XML processing applications are served by providing several levels of term representations of XML. The toolkit consists of syntax-, document-, and data-oriented representations of XML and the tools for manipulating them. In modular and reusable tools we apply the basic principles of tree and hedge grammars to obtain a natural, more explicitly structured representation of an XML document in a term.
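To make the idea of a term representation of XML concrete, the following minimal sketch encodes a small XML fragment as a term in a textual, ATerm-like notation. It is an illustration only: the constructor names `Element`, `Attribute`, and `Text`, and the helpers `quote` and `to_term`, are assumptions made for this example and are not the term signature defined in the thesis. The sketch also drops insignificant whitespace, which a syntax-oriented representation would presumably preserve.

```python
import xml.etree.ElementTree as ET


def quote(s):
    """Render a Python string as a double-quoted term string literal."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_term(elem):
    """Encode an element as a hypothetical Element(name, attrs, children) term."""
    # Attributes are sorted by name so the output is deterministic.
    attrs = ", ".join(
        f"Attribute({quote(k)}, {quote(v)})" for k, v in sorted(elem.attrib.items())
    )
    children = []
    # Text directly inside the element, before its first child.
    if elem.text and elem.text.strip():
        children.append(f"Text({quote(elem.text.strip())})")
    for child in elem:
        children.append(to_term(child))
        # Text that follows a child element (ElementTree stores it as `tail`).
        if child.tail and child.tail.strip():
            children.append(f"Text({quote(child.tail.strip())})")
    return f"Element({quote(elem.tag)}, [{attrs}], [{', '.join(children)}])"


doc = ET.fromstring('<p class="note">Hello <b>world</b>!</p>')
print(to_term(doc))
```

In this encoding the mixed content of `p` becomes an ordered list of `Text` and `Element` subterms, which is the kind of explicit structure that term rewriting tools can match on.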

Our tools enable interoperability of XML and ATerm tools. All term representations of XML can be transformed using Stratego. This enables the implementation of complex XML transformations using the strategic rewriting paradigm and the generic traversals of Stratego.
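The separation of rewrite rules from traversal strategies can be sketched outside Stratego as well. The Python fragment below imitates the effect of Stratego's `bottomup(try(r))` idiom: a generic traversal visits every subterm, children first, and a separately defined rule rewrites only the subterms it matches. The tuple-based term encoding and the names `rename_b` and `doc` are assumptions made for this illustration; they are not taken from the thesis.

```python
def bottomup(rule, term):
    """Apply `rule` to every subterm, children first.

    Terms are (constructor, [arguments]) tuples, lists of terms, or strings.
    A rule returns a new term, or None when it does not apply, in which case
    the term is kept unchanged (the role of `try` in Stratego).
    """
    if isinstance(term, tuple):
        cons, args = term
        term = (cons, [bottomup(rule, a) for a in args])
    elif isinstance(term, list):
        term = [bottomup(rule, a) for a in term]
    result = rule(term)
    return term if result is None else result


def rename_b(term):
    """Rewrite rule: turn <b> elements into <strong> elements."""
    if isinstance(term, tuple) and term[0] == "Element" and term[1][0] == "b":
        return ("Element", ["strong"] + term[1][1:])
    return None  # the rule does not apply to this term


# Hypothetical term for <p>Hello <b>world</b></p>: Element(name, attrs, children)
doc = ("Element", ["p", [], [
    ("Text", ["Hello"]),
    ("Element", ["b", [], [("Text", ["world"])]]),
]])

rewritten = bottomup(rename_b, doc)
print(rewritten)
```

The rule knows nothing about how the document is traversed, and the traversal knows nothing about XML, so either can be replaced independently.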