Extensible Markup Language (XML) has quickly established itself as a viable technology with a huge range of real-world applications. One of the main reasons for its importance and wide acceptance is that it offers a working solution to one of the key problems faced by software developers and computer users alike: the exchange of data between incompatible systems. Many applications store data in their own proprietary binary formats, which other software cannot readily interpret. When data is exported in XML format, it becomes a known quantity, independent of the environment in which it originated.
Adobe’s PDF format is another example of a platform-independent data format which has gained wide acceptance. When a document is saved as a PDF file, its format is set in stone: it can be viewed and printed with its layout and formatting intact, without the need for the software which created the original file. However, whereas the PDF format concerns itself primarily with presenting information, XML is used to describe and encapsulate the information itself.
Though XML itself is still fairly new, the idea behind it is not. Work on Standard Generalized Markup Language (SGML) began back in the 1970s as an attempt to create an application-independent method of describing data. SGML is a text-based language which adds markup to data in order to describe that data. An SGML document contains both data and a set of rules defining the structure of the data. SGML is a pretty complex language and, unlike XML, has never become mainstream. In the early 1990s, SGML was used to develop HTML and, in the late 1990s, it also served as the basis for the development of XML. So, basically, XML is a restricted form of SGML.
XML has already proved itself to be an excellent medium for storing, describing and transporting data, particularly over the web. It offers developers flexibility, clarity and simplicity. An XML document resembles an HTML document and is built from the same kind of human-readable tags. However, the tags used to mark up an HTML document are predetermined: only a fixed set of tags can legitimately be used. XML allows you to create your own markup language and define the tags which are legitimate for your data. It does this by means of a schema document, which can itself be an XML document. The schema document defines the vocabulary and grammar which may be used within the XML document containing your data.
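To make the idea of inventing your own tags concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The catalogue vocabulary (catalogue, book, title, price), the sample values and the read_catalogue function are all hypothetical, invented purely for illustration; nothing in XML itself predefines them.

```python
import xml.etree.ElementTree as ET

# A hypothetical vocabulary for a book catalogue. None of these tag or
# attribute names are predefined by XML; they were invented for this sketch.
CATALOGUE_XML = """<?xml version="1.0"?>
<catalogue>
  <book id="b1">
    <title>First Example Book</title>
    <price currency="GBP">24.99</price>
  </book>
  <book id="b2">
    <title>Second Example Book</title>
    <price currency="GBP">39.50</price>
  </book>
</catalogue>
"""


def read_catalogue(xml_text):
    """Parse the catalogue and return one plain dictionary per <book>."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        price = book.find("price")
        books.append({
            "id": book.get("id"),
            "title": book.findtext("title"),
            "price": float(price.text),
            "currency": price.get("currency"),
        })
    return books


books = read_catalogue(CATALOGUE_XML)
for entry in books:
    print(entry["id"], entry["title"], entry["price"], entry["currency"])
```

Because the document is plain text with self-describing tags, any program with an XML parser can read it, whatever software produced it.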
Because you can invent all the rules when creating or generating XML documents, you never have to twist and force your data into a container which was not designed to hold it. You design tags which reflect the nature of your information; you create a schema document which defines the hierarchical structure of that information; and you specify the type of data each element within your document is permitted to contain. In short, if you end up with an XML document which is not suitable for holding your information, you have only yourself to blame!
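As a rough illustration of a schema that is itself an XML document, the sketch below assumes W3C XML Schema (XSD) as the schema language, which the text above does not specify. The order vocabulary and the declared_types function are hypothetical. Note that Python's standard library can parse the schema like any other XML document but does not validate documents against it; actual validation would need a validating parser or a third-party library.

```python
import xml.etree.ElementTree as ET

# Namespace of W3C XML Schema, assumed here as the schema language.
XS = "http://www.w3.org/2001/XMLSchema"

# A hypothetical schema: an <order> holds a customer name, a quantity
# and a shipping date, in that sequence, each with a declared data type.
SCHEMA_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer" type="xs:string"/>
        <xs:element name="quantity" type="xs:positiveInteger"/>
        <xs:element name="shipDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def declared_types(xsd_text):
    """Return {element name: declared type} for elements with a type attribute."""
    root = ET.fromstring(xsd_text)
    types = {}
    # The schema is ordinary XML, so it can be walked like any other document.
    for element in root.iter(f"{{{XS}}}element"):
        if "type" in element.attrib:
            types[element.get("name")] = element.get("type")
    return types


types = declared_types(SCHEMA_XSD)
for name, type_name in types.items():
    print(name, "must contain", type_name)
```

The nesting of the declarations mirrors the hierarchy of the data, and each type attribute states what an element is permitted to contain.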