The XML C parser and toolkit of Gnome

Upgrading 1.x code

Incompatible changes:

libxml2, version 2 of libxml, is the first version to introduce serious backward-incompatible changes. The main goals were:

  • A general cleanup. A number of mistakes inherited from the very early versions couldn't be changed due to compatibility constraints, for example the "childs" element in the nodes.
  • Uniformization of the various nodes, at least for their header and link parts (doc, parent, children, prev, next). The goal is a simpler programming model and an easier task for DOM implementors.
  • Better conformance to the XML specification. For example, version 1.x had a heuristic to try to detect ignorable white space. As a result, the SAX events generated were ignorableWhitespace() while the spec requires characters() in that case. This also means that a number of DOM nodes containing blank text, which were not present before, may now populate the DOM tree.
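
To make the second goal concrete, here is a minimal sketch of what a uniform set of link fields buys you. The uni_node type and the uni_* helpers are hypothetical stand-ins written for this page, not the real libxml2 structures; only the field names (doc, parent, children, prev, next) come from the list above.

```c
#include <stddef.h>

/* Hypothetical node kinds, for illustration only. */
enum uni_kind { UNI_DOCUMENT, UNI_ELEMENT, UNI_TEXT, UNI_COMMENT };

/* Stand-in for the shared header: every kind of node carries the
 * same link fields, whatever its kind. NOT the real xmlNode. */
struct uni_node {
    enum uni_kind kind;
    struct uni_node *doc;
    struct uni_node *parent;
    struct uni_node *children;
    struct uni_node *prev;
    struct uni_node *next;
};

/* Resets all links; a document node points to itself as its doc. */
static void uni_init(struct uni_node *node, enum uni_kind kind)
{
    node->kind = kind;
    node->doc = (kind == UNI_DOCUMENT) ? node : NULL;
    node->parent = node->children = node->prev = node->next = NULL;
}

/* Appends child at the end of parent's children list. */
static void uni_add_child(struct uni_node *parent, struct uni_node *child)
{
    struct uni_node *last = parent->children;

    child->doc = parent->doc;
    child->parent = parent;
    child->prev = NULL;
    child->next = NULL;
    if (last == NULL) {
        parent->children = child;
        return;
    }
    while (last->next != NULL)
        last = last->next;
    last->next = child;
    child->prev = last;
}

/* Counts a node and all its descendants. The walk never looks at
 * the node kind: the shared link fields are enough. */
static int uni_count(const struct uni_node *node)
{
    int n = 1;
    const struct uni_node *cur;

    if (node == NULL)
        return 0;
    for (cur = node->children; cur != NULL; cur = cur->next)
        n += uni_count(cur);
    return n;
}
```

Building a document node with a comment and an element as children, and a text node under the element, then calling uni_count() on the document visits all four nodes with a single loop, which is the kind of simplification the uniform header is meant to allow.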

How to fix libxml-1.x code:

So client code of libxml designed to run with version 1.x may have to be changed to compile against version 2.x of libxml. Here is a list of changes that I have collected. They may not be sufficient, so if you find other changes that are required, drop me a mail:

  1. The package name has changed from libxml to libxml2, and the library name is now -lxml2. There is a new xml2-config script which should be used to select the right parameters for libxml2.
  2. The node childs field has been renamed children, so s/childs/children/g should be applied (the probability of having "childs" anywhere else is close to 0).
  3. The document no longer has a root field; it has been replaced by children, and usually you will get a list of elements here. For example, a Dtd element for the internal subset and its declaration may be found in that list, as well as processing instructions or comments found before or after the document root element. Use xmlDocGetRootElement(doc) to get the root element of a document. Alternatively, if you are sure not to reference DTDs nor have PIs or comments before or after the root element, s/->root/->children/g will probably do it.
  4. The white space issue. This one is more complex: except in the special case of validating parsing, the line breaks and spaces usually used for indenting and formatting the document content become significant. So they are reported by SAX, and if you are using the DOM tree, corresponding nodes are generated. Two approaches can be taken:
    1. The lazy one: use the compatibility call xmlKeepBlanksDefault(0), but be aware that you are relying on a special (and possibly broken) set of heuristics of libxml to detect ignorable blanks. Don't complain if it breaks or makes your application not 100% clean w.r.t. its input.
    2. The Right Way: change your code to accept possibly insignificant blank characters, or have your tree populated with weird blank text nodes. You can spot them using the convenience function xmlIsBlankNode(node), which returns 1 for such blank nodes.

    Note also that with the new default the output functions don't add any extra indentation when saving a tree, in order to be able to round trip (read and save) without inflating the document with extra formatting chars.

  5. The include path has changed to $prefix/libxml/ and the includes themselves use this new prefix in include instructions. If you are using (as expected) the
    xml2-config --cflags

    output to generate your compile commands, this will probably work out of the box.

  6. xmlDetectCharEncoding takes an extra argument indicating the length in bytes of the head of the document available for character detection.
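
For SAX users taking the Right Way of item 4, the check boils down to something like the sketch below. It is deliberately kept free of libxml headers so that it compiles on its own: is_blank_run(), on_characters() and struct text_stats are hypothetical names, and the callback only mimics the shape of a SAX characters handler (user data, buffer, length). Tree users should simply call xmlIsBlankNode(node) instead.

```c
#include <stddef.h>

/* Returns 1 if the len bytes at ch are all XML white space
 * (space, tab, line feed, carriage return), 0 otherwise.
 * An empty run counts as blank. */
static int is_blank_run(const unsigned char *ch, int len)
{
    int i;

    if (ch == NULL || len <= 0)
        return 1;
    for (i = 0; i < len; i++) {
        if (ch[i] != 0x20 && ch[i] != 0x09 &&
            ch[i] != 0x0A && ch[i] != 0x0D)
            return 0;
    }
    return 1;
}

/* Hypothetical application state, for illustration only. */
struct text_stats {
    int significant_runs;   /* runs carrying real character data */
    int blank_runs;         /* runs made only of formatting blanks */
};

/* Same shape as a SAX characters callback. Blank runs are accepted
 * and counted apart instead of being treated as content. */
static void on_characters(void *ctx, const unsigned char *ch, int len)
{
    struct text_stats *stats = (struct text_stats *) ctx;

    if (is_blank_run(ch, len))
        stats->blank_runs++;
    else
        stats->significant_runs++;
}
```

Whether a blank run can really be dropped depends on your document type (think of mixed content), so the sketch only classifies the runs and leaves the decision to the application.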

Ensuring both libxml-1.x and libxml-2.x compatibility

Two new versions of libxml (1.8.11) and libxml2 (2.3.4) have been released to allow a smooth upgrade of existing libxml v1 code while retaining compatibility. They offer the following:

  1. similar include naming: one should use #include <libxml/...> in both cases.
  2. similar identifiers defined via macros for the child and root fields: respectively xmlChildrenNode and xmlRootNode
  3. a new macro LIBXML_TEST_VERSION which should be inserted once in the client code
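
The idea behind the macros of item 2 can be sketched as follows. This is an illustration of the technique only, not a copy of the libxml headers: struct demo_node, DEMO_OLD_LAYOUT, demoChildrenNode and count_children() are hypothetical names. With the real headers you just write node->xmlChildrenNode and doc->xmlRootNode and let each library version resolve them.

```c
#include <stddef.h>

/* Stand-in for a tree node; NOT the real xmlNode. Compile with
 * -DDEMO_OLD_LAYOUT to get the 1.x-style field name. */
struct demo_node {
#ifdef DEMO_OLD_LAYOUT
    struct demo_node *childs;      /* old field name */
#else
    struct demo_node *children;    /* new field name */
#endif
    struct demo_node *next;
};

/* One alias, defined once per layout, hides the rename. */
#ifdef DEMO_OLD_LAYOUT
#define demoChildrenNode childs
#else
#define demoChildrenNode children
#endif

/* Client code only ever uses the alias, so the same source
 * compiles against either layout. */
static int count_children(const struct demo_node *node)
{
    int n = 0;
    const struct demo_node *cur;

    if (node == NULL)
        return 0;
    for (cur = node->demoChildrenNode; cur != NULL; cur = cur->next)
        n++;
    return n;
}
```

Since the alias expands to a plain field name, it works for both reading and assigning the field, which is why a simple search and replace on the old field names is enough.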

So the roadmap to upgrade your existing libxml applications is the following:

  1. install the libxml-1.8.8 (and libxml-devel-1.8.8) packages
  2. find all occurrences where the xmlDoc root field is used and change it to xmlRootNode
  3. similarly find all occurrences where the xmlNode childs field is used and change it to xmlChildrenNode
  4. add a LIBXML_TEST_VERSION macro somewhere in your main() or in the library init entry point
  5. recompile and check compatibility; it should still work
  6. change your configure script to look first for xml2-config and fall back to xml-config. Use the --cflags and --libs output of the command as the include and linking parameters needed to use libxml.
  7. install libxml2-2.3.x and libxml2-devel-2.3.x (libxml-1.8.y and libxml-devel-1.8.y can be kept simultaneously)
  8. remove your config.cache, relaunch your configuration mechanism, and recompile; if steps 2 and 3 were done right it should compile as-is
  9. test that your application is still running correctly. If not, this may be due to extra empty nodes caused by formatting spaces being kept in libxml2, contrary to libxml1; in that case insert the compatibility call xmlKeepBlanksDefault(0) in your code before calling the parser (next to LIBXML_TEST_VERSION is a fine place).

Following those steps should work. It worked for some of my own code.

Let me put some emphasis on the fact that there are far more changes from libxml 1.x to 2.x than the ones you may have to patch for. The overall code has been considerably cleaned up and the conformance to the XML specification has been drastically improved too. Don't take those changes as an excuse not to upgrade; it may cost a lot in the long term ...

Daniel Veillard