XML::Simple is good enough for simple tasks (e.g. config files), but if you want to get down to business you might prefer XML::LibXML.
The distribution is both an XML parser (and validator) and an XML generator. However, the roles are not clearly separated and the methods are not orthogonal. The docs are OK, but unwieldy – there are many classes, and the methods are listed in the synopsis instead of the TOC. This means you have to grep or search in the browser to get to a method’s description, and the method you’re looking for might be hidden in one of the many classes, between other methods that are unrelated to your purpose. In this series I’ll try to point out and illustrate the methods for the most common tasks.
What’s what
In the XML::LibXML namespace we have:
- ::Document – this will hold everything, help you to set the XML version, encoding and compression, create nodes, set the root node and, last but not least, validate and serialize your document into a string
- ::Node – base class for XML nodes; you never create a node directly (you mostly create elements). It does, however, provide many methods that you can use on any kind of node
- ::Element – this class represents an XML element, inherits the methods from ::Node and is your workhorse for XML building
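To make the relationship between these classes concrete, here is a minimal sketch (the variable names are mine, chosen for illustration) that checks that an element created through the document is also a ::Node, so the ::Node methods are available on it:

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc  = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $elem = $doc->createElement( 'x' );

# The object is blessed into the ::Element class ...
print ref( $elem ), "\n";

# ... and ::Element inherits from ::Node
print $elem->isa( 'XML::LibXML::Node' ) ? "is a node\n" : "not a node\n";

# nodeName is a ::Node method, called here on an element
print $elem->nodeName, "\n";
```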
First Example
    use XML::LibXML;

    my $version  = '1.0';   # is there any other?
    my $encoding = 'UTF-8'; # cause it's the standard, that's why!
    my $doc = XML::LibXML::Document->new( $version, $encoding ); # my XML DOM

    # Element creation
    my $x = $doc->createElement( 'x' );   # <x/> element
    my $y = $doc->createElement( 'y' );   # <y/> element

    # Attributes
    $x->setAttribute( 'lang', 'en' );     # <x lang="en" />
    $y->setAttribute( 'baz', 1 );         # <y baz="1" />

    # XML structure and text nodes
    $x->appendChild( $y );                # <x><y/></x>
    $y->appendTextNode( 'foo' );          # <y>foo</y>
    $x->appendTextChild( 'name', 'bar' ); # <x><name>bar</name></x>

    $doc->setDocumentElement( $x );       # make <x/> the root element

    print $doc->toString( 1 );            # serialize it
This is the result:
    <?xml version="1.0" encoding="UTF-8"?>
    <x lang="en">
      <y baz="1">foo</y>
      <name>bar</name>
    </x>
What happened?
The first step is to create a DOM document object (line 5). I like to think of it as a context object for all things XML. Then I create two nodes (lines 8, 9). You can create them using the DOM object or directly from their class (XML::LibXML::Element->new).
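As a small sketch of the two ways to create an element (the variable names are only illustrative), both calls below produce an empty <x/> element:

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );

# Created through the document object
my $via_doc = $doc->createElement( 'x' );

# Created directly from the class, without a document
my $via_class = XML::LibXML::Element->new( 'x' );

print $via_doc->toString,   "\n";   # <x/>
print $via_class->toString, "\n";   # <x/>
```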
After that I add an attribute to each node (lines 12, 13). setAttribute is a short-cut method: it creates an attribute node with the given name and content and appends it to the node. You can also do this step by step, using the XML::LibXML::Attr class.
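The step-by-step variant might look like the sketch below. It assumes the attribute node is created through the document with createAttribute and attached with setAttributeNode; the variable names are mine.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $x   = $doc->createElement( 'x' );

# Long-hand: build the attribute node first, then attach it to the element
my $attr = $doc->createAttribute( 'lang', 'en' );   # an XML::LibXML::Attr object
$x->setAttributeNode( $attr );

print $x->toString, "\n";                # <x lang="en"/>
print $x->getAttribute( 'lang' ), "\n";  # en
```

For a single attribute the short-cut is clearly less typing; the long-hand form is mostly useful when you already have an attribute node in hand.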
Then I arrange the nodes so that <y/> is the child node of <x/> (line 16). A text node is created and appended to <y/> with another short-cut method (line 17). I use another short-cut to create two nodes: a child element node (<name/>) and a text node as its child, and append them to <x/>.
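To show what the two short-cuts save you, here is a sketch of the same structure built long-hand with createTextNode and appendChild (attributes left out for brevity, variable names are illustrative):

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $x   = $doc->createElement( 'x' );
my $y   = $doc->createElement( 'y' );
$x->appendChild( $y );

# Long-hand for $y->appendTextNode( 'foo' )
$y->appendChild( $doc->createTextNode( 'foo' ) );

# Long-hand for $x->appendTextChild( 'name', 'bar' )
my $name = $doc->createElement( 'name' );
$name->appendChild( $doc->createTextNode( 'bar' ) );
$x->appendChild( $name );

print $x->toString, "\n";   # <x><y>foo</y><name>bar</name></x>
```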
The last step in building my XML is to declare <x/> as the document element (the root element), i.e. the top-most element in the document (line 20).
Then I serialize and print the whole thing out. The parameter passed to toString determines how compact the serialization is:
- 0 – no line breaks or indentation
- 1 – line breaks and indentation for nested element nodes
- 2 – line breaks for element and text nodes and indentation for nested element nodes (it’s like 1, but text nodes are printed on separate lines from element nodes)
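The easiest way to see the difference is to serialize the same document with all three values. The sketch below uses a deliberately tiny document and a hypothetical %output hash to keep the results around:

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $x   = $doc->createElement( 'x' );
$x->appendTextChild( 'y', 'foo' );      # <x><y>foo</y></x>
$doc->setDocumentElement( $x );

# Serialize the same document once per format value
my %output;
for my $format ( 0, 1, 2 ) {
    $output{$format} = $doc->toString( $format );
    print "--- format $format ---\n", $output{$format};
}
```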
One last note: the order of the steps does not matter. I could’ve added the attributes last and set the document element first thing, and the end result would be the same. Well, at least as long as you don’t look up nodes before they are added. For example, you can’t refer to <x/> with the $doc->documentElement method before you add it as the document element.
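Here is a short sketch of that caveat, with the steps deliberately reordered: the root is set first and the attribute is added afterwards. Before setDocumentElement is called, documentElement has nothing to return.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
my $x   = $doc->createElement( 'x' );

# No root element has been set yet, so the lookup comes back undefined
my $before = $doc->documentElement;
print defined $before ? "root set\n" : "no root yet\n";

# Set the root first, add the attribute afterwards
$doc->setDocumentElement( $x );
$x->setAttribute( 'lang', 'en' );

print $doc->documentElement->toString, "\n";   # <x lang="en"/>
```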
