Saturday, January 24, 2009

A DSL to output XML, in Java

Yesterday, I had to output some simple XML in Java. I couldn't find any decent way of doing this out there. Writing with SAX sucks, and DOM didn't look much better either. I just println'ed the thing out and went on my way.

Today I revisited the problem. I wanted to easily express the XML tree structure and evaluate it as something, a String for instance. I came up with a neat solution heavily based on generics.

Here is a usage example of it:



XmlBuilder xmlBuilder = new XmlBuilder();
String s = Xml.newDocument(xmlBuilder)
    .child("root")
        .child("child")
            .child("sub_child").attr("key", "value")
            .close()
        .close()
    .close();

System.out.println(s);

Note that calling close() usually returns a node (which can itself be closed), but the last call returns a String directly!

This is the output:



<?xml version="1.0"?>
<root>
<child>
<sub_child key="value">
</sub_child>
</child>
</root>


Here is the simplified crux of the idea:



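public class Node<T> {
    private final String name;
    private final T returnValue;      // what close() hands back: the parent node, or the final result
    private final XmlHandler<?> handler;

    Node(String name, T returnValue, XmlHandler<?> handler) {
        this.name = name;
        this.returnValue = returnValue;
        this.handler = handler;
    }
    //...

    // A child remembers this node as the value to return when it closes.
    Node<Node<T>> child(String label) {
        return new Node<Node<T>>(label, this, handler);
    }

    // Ends this element and steps back out one nesting level.
    T close() {
        //...
        handler.endNode(name);
        return returnValue;
    }
}
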
The magic part is the pair of methods child() and close(). Node<T> means "when close() is called, return a T". Imagine we want to output XML as a string. We want the root node to be of type Node<String>, so that when it closes, we get a String back. Right. Now pay attention to the child() method. It returns a Node<Node<T>>, meaning "here is a node that, when it closes, returns you to this node" (and we pass "this" as the returnValue of the new node).

This means that the compiler can figure out the nesting level of each Node. The root Node is just Node<String>. All children of that are of type Node<Node<String>>. Children of those children are Node<Node<Node<String>>>, and so on. Each time close() is invoked, one level of nesting goes away, like peeling an onion, until we get back to the result we want.
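
To make the peeling visible, here is a minimal self-contained sketch that spells out the type at each level. It is not the original source: this Node writes straight into a StringBuilder instead of going through a handler, and it takes a Supplier for the value to return on close() (both are assumptions made so the sketch can run on its own and so the root can hand back the finished String).

```java
import java.util.function.Supplier;

// Simplified stand-in for the post's Node<T>; the StringBuilder and the
// Supplier are assumptions of this sketch, not the original design.
final class Node<T> {
    private final String name;
    private final Supplier<T> onClose; // produces the value close() returns
    private final StringBuilder out;

    Node(String name, Supplier<T> onClose, StringBuilder out) {
        this.name = name;
        this.onClose = onClose;
        this.out = out;
        out.append('<').append(name).append('>');
    }

    // The child's close() hands back this node, hence Node<Node<T>>.
    Node<Node<T>> child(String label) {
        return new Node<Node<T>>(label, () -> this, out);
    }

    T close() {
        out.append("</").append(name).append('>');
        return onClose.get();
    }
}

final class NestingDemo {
    static String build() {
        StringBuilder out = new StringBuilder();

        // Each level of nesting adds one layer to the static type.
        Node<String> root = new Node<String>("root", out::toString, out);
        Node<Node<String>> child = root.child("child");
        Node<Node<Node<String>>> subChild = child.child("sub_child");

        // Each close() peels one layer off again.
        Node<Node<String>> backAtChild = subChild.close();
        Node<String> backAtRoot = backAtChild.close();

        // String tooEarly = backAtChild.close(); // rejected: this close() returns Node<String>
        return backAtRoot.close();
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

The intermediate variables are only there to show the types; chained calls go through exactly the same sequence without naming them.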

XmlHandler is similar to SAX's ContentHandler, but simplified. Here it is:


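import java.util.Map;

public interface XmlHandler<T> {
    // Called when an element opens, with its attributes.
    void startNode(String name, Map<String, String> attributes);
    // Called when the element closes.
    void endNode(String name);

    // The value produced once the whole document has been handled.
    T result();
}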


The above library can be used with any implementation of XmlHandler; producing a String is only one viable use case. One could just as easily build a DOM, simply by writing another XmlHandler implementation. XmlBuilder in the example above is an XmlHandler<String>, which is why the root node is of type Node<String>. We could equally implement an XmlHandler that produces a DOM document, and the root node would then be parameterized by that document type instead.
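
As a rough illustration of that last point, a DOM-producing handler might look like the sketch below. DomHandler and DomHandlerDemo are hypothetical names, and the Map<String, String> attribute type is an assumption; the DOM calls themselves are the standard javax.xml.parsers and org.w3c.dom APIs. The interface is repeated so the sketch compiles on its own.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Same shape as the interface in the post; the attribute map type is assumed.
interface XmlHandler<T> {
    void startNode(String name, Map<String, String> attributes);
    void endNode(String name);
    T result();
}

// Hypothetical handler that builds a DOM tree instead of a String.
final class DomHandler implements XmlHandler<Document> {
    private final Document document;
    // Stack of currently open DOM nodes; the Document itself sits at the bottom.
    private final Deque<org.w3c.dom.Node> open = new ArrayDeque<>();

    DomHandler() {
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        open.push(document);
    }

    @Override
    public void startNode(String name, Map<String, String> attributes) {
        Element element = document.createElement(name);
        attributes.forEach(element::setAttribute);
        open.peek().appendChild(element); // attach to the innermost open node
        open.push(element);
    }

    @Override
    public void endNode(String name) {
        open.pop();
    }

    @Override
    public Document result() {
        return document;
    }
}

final class DomHandlerDemo {
    static Document build() {
        DomHandler handler = new DomHandler();
        handler.startNode("root", Collections.emptyMap());
        handler.startNode("child", Collections.singletonMap("key", "value"));
        handler.endNode("child");
        handler.endNode("root");
        return handler.result();
    }

    public static void main(String[] args) {
        System.out.println(build().getDocumentElement().getTagName());
    }
}
```

The demo drives the handler by hand; in the DSL, the Node objects would make the same startNode() and endNode() calls on the user's behalf.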

To conclude, this is a nice DSL for expressing XML trees, combined with an abstract handler to process the tree. A point to highlight is that this modelling lets the compiler know the nesting depth of each node, so it knows, for example, how many consecutive close() calls I can make. I must call close() the right number of times to get back to the result I want, or else a type error will occur: too few and I am left holding a Node rather than a String, too many and I am calling close() on a String.

Still, this kind of type safety doesn't automatically translate to valid XML, because the user could store a node somewhere and close it twice. We would only find that out at runtime.
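
A small sketch of that failure mode follows. The closed flag and the exception are additions of this sketch, not something the original code is known to do, and GuardedNode is a hypothetical, stripped-down node that writes into a StringBuilder. The point is that the second close() type-checks perfectly well and is only caught when it runs.

```java
import java.util.function.Supplier;

// Hypothetical node with a runtime guard against closing twice.
final class GuardedNode<T> {
    private final String name;
    private final Supplier<T> onClose;
    private final StringBuilder out;
    private boolean closed; // runtime state the type system knows nothing about

    GuardedNode(String name, Supplier<T> onClose, StringBuilder out) {
        this.name = name;
        this.onClose = onClose;
        this.out = out;
        out.append('<').append(name).append('>');
    }

    GuardedNode<GuardedNode<T>> child(String label) {
        return new GuardedNode<GuardedNode<T>>(label, () -> this, out);
    }

    T close() {
        if (closed) {
            throw new IllegalStateException("<" + name + "> was already closed");
        }
        closed = true;
        out.append("</").append(name).append('>');
        return onClose.get();
    }
}

final class DoubleCloseDemo {
    static String closeTwice() {
        StringBuilder out = new StringBuilder();
        GuardedNode<String> root = new GuardedNode<String>("root", out::toString, out);

        // Storing the node in a variable is what makes the mistake possible.
        GuardedNode<GuardedNode<String>> child = root.child("child");
        child.close(); // fine: returns root

        try {
            child.close(); // compiles, but the element is already closed
            return "no error";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(closeTwice());
    }
}
```

Without a guard like this, the second call would simply emit a second closing tag and the output would be malformed.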

All in all, that was a good exercise in Java generics. I would give the complete source if there was an easy way to attach it here, but I guess the margin is too narrow. :-)