Towards an XML-free future for the digital humanities
I was pleasantly surprised that my talk on the AustESE (Australian scholarly editing) infrastructure went down so well with the audience. Less surprising perhaps was their negative reaction to my suggestion that there might be a life for the digital humanities outside of XML. XML has for a long time (since 1998 at least) defined what the digital humanities were all about, and so to cultivate the creation of an alternative that would overcome its fundamental limitations may indeed seem like heresy. Not only does practically every tool in DH depend on XML (TEI Guidelines, XSLT, XQuery, XPath, Oxygen, etc.) but also the skills of digital humanists are based on those same technologies. To suggest that XML may not be the way forward seems to imply two unpalatable consequences:
- all the texts we have encoded so far may have to be redone
- all the tools we have developed on top of XML would have to be thrown away
This seems crazy, as well as heretical. But let me explain why I think it is not.
In answer to the first objection, a fully featured import facility would overcome any fears that encodings would have to be revised. The ability to ‘round-trip’ the data back to XML (albeit with some loss) would also quell fears of ‘lock-in’ to a possibly unstable alternative.
In answer to the second objection, the skills of digital humanists and all other technicians evolve continuously. We are at the mercy of the software industry, and learn whatever tools they offer us to do our work. What I am suggesting is that we instead devise our own tools to do our specialised job far better. As an added bonus such a suite of tools would be under our control and not subject to commercial whims.
The industrial future of XML
XML was created by the W3C with help from Microsoft, who saw it as a way of implementing web services. Messages would be passed from the client to the server about actions that the service could or would perform. Since then, the ‘bloated, opaque and insane complexity’ of XML web services, as Tim Bray put it, has led many technologists to reject them in favour of a simpler noun-based methodology called REST. REST in a nutshell treats a service as if it were composed of static web pages. ‘Get me this’, ‘here, have that’ or ‘delete this’ is what REST services are all about. Although originally designed to work with XML, REST services are increasingly being crafted with pure JSON, a much simpler encoding strategy that is gaining some powerful advocates. How much longer programmers will support XML remains unknown; it’s very deeply entrenched. But that they will eventually replace it with something simpler can hardly be in doubt. And when they do, the tools on which we rely will cease to be maintained and will soon die. With Microsoft rapidly moving towards a predominantly mobile desktop metaphor based on JSON, HTML5 and JavaScript, there seems to be no room for old-style ‘enterprisey’ XML in the future that is rushing towards us.
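To make the ‘get me this’, ‘here, have that’ and ‘delete this’ idea concrete, here is a minimal sketch in Python. It is not a real web service: the in-memory store, the handle function, the /poems/1 path and the sample record are all invented for illustration, and the status codes are only one plausible choice. It simply shows resources treated as nouns, acted on by a handful of verbs, and the same record written once as JSON and once as XML.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical in-memory "service": each resource is a noun addressed by a path.
store = {}

def handle(method, path, body=None):
    """Dispatch a REST-style request and return (status, JSON text or None)."""
    if method == "GET":        # 'get me this'
        if path in store:
            return 200, json.dumps(store[path])
        return 404, None
    if method == "PUT":        # 'here, have that'
        store[path] = json.loads(body)
        return 200, None
    if method == "DELETE":     # 'delete this'
        existed = store.pop(path, None) is not None
        return (204 if existed else 404), None
    return 405, None           # any other verb is not supported in this sketch

# The same invented record in both encodings.
record_json = '{"title": "Example poem", "lines": 14}'
record_xml = "<record><title>Example poem</title><lines>14</lines></record>"

def xml_to_dict(xml_text):
    """Flatten a one-level XML record into a dict; every value stays a string."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}
```

Calling handle("PUT", "/poems/1", record_json) stores the record, and a following GET on the same path returns it as JSON text. Note that json.loads keeps 14 as a number, whereas the XML version needs a schema or extra code to say that the lines element holds an integer rather than a string.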