What’s the big deal with XML?

XML (eXtensible Markup Language) is one of those initialisms that remains in the background of the tech industry. It pops up in almost every major office software suite (did you ever wonder what the x stands for in .docx?). It shows up in all the websites that use XHTML (about 34% of the websites out there) and in each RSS feed that you read. And, if you know any technical writers, they probably won’t shut up about DITA (Darwin Information Typing Architecture).

So, what makes XML so special? Well, the answer depends on who you are and how you use XML. Writers like XML because it gives them full control of their text. Developers like XML because they can easily repurpose content. And, designers like it because they can apply stylistic changes on the fly. But, before we explain these benefits in too much detail, we should probably explain what XML is.

What is XML?

XML stands for eXtensible Markup Language. Like HTML, XML is a markup language. Markup is the information that tells a computer how to read content, such as text. For example, this website uses HTML markup (actually, a combination of HTML, CSS, PHP, and Markdown) to tell your web browser where to display text, what font to use, and when to display a word in bold. Without some sort of markup, it would be almost impossible for your computer to properly display content in web browsers, desktop publishing programs, or even applications like games.

Unlike HTML, XML focuses on what the data is rather than how the data looks. That is, the markup in XML tells the computer the content type, and a separate style sheet tells the computer how to display the content. In HTML, much of the markup tells the computer how to display the content. So, if the fonts are set in the markup itself and you want to change the font in all of the headings on your HTML webpage, you have to change each one individually. In XML, you can simply change the entry that specifies the heading font in the style sheet.

Another notable difference is that XML doesn’t use predefined tags (tags are units of markup). HTML markup is the same everywhere. So, if you use an <h1> tag in HTML, an HTML reader automatically knows that it is a top-level heading. In XML, a schema or style sheet is needed to give the tag its meaning, but you can also name the tag anything you want. For example, you could name it <section_heading>, <page_heading>, or even <the_only_heading_you_will_ever_need>. This is what puts the eXtensible in eXtensible Markup Language (XML). You can create as many tags as you want, which means that you can make your content as nuanced or lightweight as you need.
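To make that freedom a little more concrete, here is a rough Python sketch that hands a document full of made-up tag names to the standard library's xml.etree.ElementTree parser. The tag names come from the examples above; the <page> wrapper and the heading text are assumptions invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: none of these tag names belongs to a
# predefined vocabulary; the author simply made them up.
DOC = """<page>
  <page_heading>Welcome</page_heading>
  <the_only_heading_you_will_ever_need>About XML</the_only_heading_you_will_ever_need>
</page>"""

root = ET.fromstring(DOC)

# The parser accepts any well-formed tag name and exposes it unchanged.
tags = [child.tag for child in root]
headings = {child.tag: child.text for child in root}

print(tags)
print(headings["page_heading"])
```

The parser neither knows nor cares what the names mean. Giving them meaning is the job of whatever schema, style sheet, or program consumes the document.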

So, why is XML so great?

Extensibility

XML emerged as a response to the inflexibility and inefficiency of HTML. The standard on which XML is based, SGML (Standard Generalized Markup Language), was deemed too complex, so XML was made to simplify it. XML ditches the unnecessary elements of SGML while still retaining the key principles that the markup needs to describe the content. By abandoning predetermined tags and unnecessary elements and allowing content creators the flexibility to create their own tags, XML offers a lightweight and flexible solution for transporting information, especially text.

Extensibility also makes XML easier to use and maintain than other markup languages. Authors can intuitively name tags based on their functions. Consider the following example.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative element names -->
<letter>
  <salutation>
    Dear Mr. Hughes,
  </salutation>
  <body>
    I so enjoyed your last film and do hope that you make another.
  </body>
</letter>

Based on the names used in the tags, you can infer the basic structure of the document and the type of content that each tag contains. Furthermore, you or a new employee could return to this document several years down the road and still figure out what the tags mean.
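The same inference can be done by a program. The sketch below, which is only one possible approach, walks a small letter document and prints nothing but the tag names, indented by depth. The element names (<letter>, <salutation>, <body>) are assumptions chosen to be descriptive, and outline is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter; the element names are assumptions chosen
# to describe the content they hold.
LETTER = """<letter>
  <salutation>Dear Mr. Hughes,</salutation>
  <body>I so enjoyed your last film and do hope that you make another.</body>
</letter>"""


def outline(element, depth=0):
    """Return one indented line per element, showing only its tag name."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


structure = outline(ET.fromstring(LETTER))
print("\n".join(structure))
```

Even with all of the text stripped away, the outline still reads like a table of contents for the document.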

Strictness

XML is also far stricter than HTML. The rigid syntax required by XML results in smaller, faster, and lighter browsers. For example, XML is stricter than HTML in the following cases:

  • Case sensitivity: HTML is not case sensitive, while XML is.
    A <Paragraph> tag is not the same as a <paragraph> tag.

  • End tags: HTML lets you get away with not closing elements, while XML doesn’t.
    A <p> tag won’t work without the corresponding end tag (</p>).

  • Quotation marks: HTML lets you write attribute values without quotation marks, while XML doesn’t.
    <object width=100> won’t work, but <object width="100"> will.

  • Nesting: HTML lets you overlap elements, while XML doesn’t.
    <b><i>Cattle</b></i> won’t work, but <b><i>Cattle</i></b> will.
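The four rules above can be checked with a short Python sketch that uses the standard library's xml.etree.ElementTree parser. The fragments mirror the examples in the list; is_well_formed is a hypothetical helper name, and the <doc> wrapper is an assumption added so that the missing end tag sits inside a complete document.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


checks = {
    "mismatched case": "<Paragraph>text</paragraph>",
    "missing end tag": "<doc><p>text</doc>",
    "unquoted attribute": "<object width=100></object>",
    "overlapping tags": "<b><i>Cattle</b></i>",
    "quoted attribute": '<object width="100"></object>',
    "proper nesting": "<b><i>Cattle</i></b>",
}

results = {name: is_well_formed(xml) for name, xml in checks.items()}
for name, ok in results.items():
    print(f"{name}: {'accepted' if ok else 'rejected'}")
```

A parser like this one stops at the first violation instead of guessing what the author meant, which is exactly the behavior the list describes.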

In the past, some Internet browsers devoted up to 50% of their code to correcting the mistakes or inconsistencies in HTML content. Because XML imposes a more rigid set of rules, browsers and other programs that process text (often called parsers) no longer need to account for mistakes in the markup. Paying a little more attention when writing is a small price to pay for better overall performance.

In addition to being strict about syntax, XML can also enforce rules about the structure of your documents through a DTD or schema. For example, if you were creating an employee database, you could create an <employee> element that must contain <first_name> and <last_name> elements. Doing so ensures that the required information is included and that all unnecessary information, such as an employee number, is excluded.
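In practice, rules like these usually live in a DTD or an XML schema. Python's standard library parser does not validate against either, so the sketch below uses a small hand-written check as a stand-in. The validate_employee function, the REQUIRED tuple, and the sample records are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed rule, taken from the employee example in the text:
# exactly one of each required child, and nothing else.
REQUIRED = ("first_name", "last_name")


def validate_employee(xml_text):
    """Return a list of problems; an empty list means the record passes."""
    employee = ET.fromstring(xml_text)
    problems = []
    if employee.tag != "employee":
        problems.append(f"unexpected root element: {employee.tag}")
    child_tags = [child.tag for child in employee]
    for name in REQUIRED:
        if child_tags.count(name) != 1:
            problems.append(f"expected exactly one <{name}>")
    for tag in child_tags:
        if tag not in REQUIRED:
            problems.append(f"element not allowed: <{tag}>")
    return problems


good = "<employee><first_name>Ada</first_name><last_name>Lovelace</last_name></employee>"
bad = "<employee><first_name>Ada</first_name><employee_number>7</employee_number></employee>"

print(validate_employee(good))
print(validate_employee(bad))
```

The first record passes with no problems. The second is flagged twice: once for the missing <last_name> and once for the <employee_number> that the rule does not allow.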

Easy Data Exchange

Although extensibility and simplicity added to its early success, XML’s greatest claim to fame is that it allows authors to easily publish the same content to different media. Because XML concentrates only on the content type, the content remains independent of the medium. So, authors can write and edit a document in XML and publish it to a website, a user’s manual, and a helpdesk script. This facet of XML is often touted as single-source, multi-target publishing. In fact, content reuse has catapulted DITA, an XML standard maintained by an OASIS Technical Committee, to the forefront of the technical writing world.
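Here is a minimal sketch of the single-source idea in Python, assuming a made-up <topic> document. The tag names are hypothetical and are not real DITA elements, and to_html and to_text are illustrative helpers rather than part of any publishing tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the tag names are invented for this sketch.
SOURCE = """<topic>
  <title>Resetting your password</title>
  <step>Open the login page.</step>
  <step>Click the reset link.</step>
</topic>"""


def to_html(topic):
    """Render the topic as an HTML heading followed by a numbered list."""
    items = "".join(f"<li>{escape(s.text)}</li>" for s in topic.findall("step"))
    return f"<h1>{escape(topic.findtext('title'))}</h1><ol>{items}</ol>"


def to_text(topic):
    """Render the same topic as a plain-text script."""
    lines = [topic.findtext("title").upper()]
    lines += [f"{n}. {s.text}" for n, s in enumerate(topic.findall("step"), start=1)]
    return "\n".join(lines)


topic = ET.fromstring(SOURCE)
html_page = to_html(topic)
script = to_text(topic)

print(html_page)
print(script)
```

Both outputs come from the same source. Editing a step in the XML changes the web page and the script together, with no copying and pasting between formats.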

Furthermore, XML relies on free open standards, such as the XML 1.0 Specification, so it avoids the bulk, complexity, and inaccessibility of proprietary data formats, such as those used in the older versions of desktop publishing applications like Word. XML content and markup are stored as text that authors can edit directly. Even when using an XML editor, such as FrameMaker, authors can still output the XML text to make changes or transfer the content to another format.

Conclusion

So, back to the question at hand: What makes XML so special? Well, it’s right there in the name: extensibility. Writers and developers are constantly finding new ways to use XML to accomplish their goals in a variety of media formats, and XML supports the freedom they need to do it. That extensibility, coupled with tag names that are intuitive, rules that are strict in both syntax and structure, and content that is easily distributed to multiple channels, means that XML will continue to be a staple of the IT community for years to come.