Using XML in Static and Dynamic Web Pages, Part I
DOROTHY J. HOSKINS
Web developers can add new functionality to their pages with XML and XSL. Early adopters face a steep learning curve, but tools will be available soon to ease the pain.
XML is a behind-the-scenes technology; if it's working right, it's transparent to users. You have probably browsed pages that have XML behind them and didn't even know it. That's the way it should be.
In this article, I'll outline some basic concepts, then discuss some browser-support issues that affect XML development.
Basic Concepts of Using XML for Web Pages
The basic concept behind using XML in Web sites is that the content (text information, graphics, links, etc.) can be handled independently of its presentation (display, or look and feel) and can be manipulated according to its structure. For example, consider a product information page. It contains the following information: product name, catalog part number, overview, list of features, list of benefits, table of specifications, pricing, company logos and contact information, and product artwork.
As HTML, the information can only be described as it will be formatted for presentation, using the set of standard HTML tags (after the <html>, <head> and <body> tags):

    <p>The Wonderful Healthee Widget comes in your choice of colors and finishes:</p>
    <ul>
      <li>colors: carrot, celery, tomato, eggplant, or butternut</li>
      <li>finishes: satiny, nubbly, burlap</li>
    </ul>
    <img src="colorswatch.gif" alt="colors of the Healthee Widget" />
    <img src="finishes.jpg" alt="finishes of the Healthee Widget" />

and so on. Structurally, HTML provides a nested hierarchy with a very limited set of structural possibilities: <html> contains <head> and <body>, <body> contains <ul>, which contains <li>, and so on.
With XML, the tags (elements) can tell us what the information means:

    <product name="Healthee Widget" manu="Healthee Products, Inc.">
      <description>
        <p>The Wonderful Healthee Widget comes in your choice of colors and finishes:</p>
        <colors>
          <color>carrot</color>
          <color>celery</color>
          <color>tomato</color>
          <color>eggplant</color>
          <color>butternut</color>
        </colors>
        <finishes>
          <finish>satiny</finish>
          <finish>nubbly</finish>
          <finish>burlap</finish>
        </finishes>
        <illus type="infographic">
          <img src="colorswatch.gif" alt="colors of the Healthee Widget"/>
        </illus>
        <illus type="infographic">
          <img src="finishes.jpg" alt="finishes of the Healthee Widget"/>
        </illus>
      </description>
    </product>

The names of elements in the XML file hierarchy can be meaningful and precise. For example, this set of XML element names clearly indicates that carrot is a color, so it won't be mistaken for a vegetable.
Another key property of XML is that you can locate elements by their position in the hierarchy of the file. So when you search within an XML file, you can ask for a specific element by name and location, which makes XML content data-like and not just text-like.
The syntax is similar to that of file paths: document/product/description/colors/color retrieves a specific part (or node) of the XML file, just as c:/windows/programs/WinWord.exe retrieves a specific file. This introduces a whole set of processing capabilities for manipulating XML as trees of nodes in memory, which is fundamental to its power for building a variety of views of the same XML content. For example, for our sample XML file, here are three of many possible views:
View 1: show each product, its description paragraph, its illustration, colors and finishes in a separate block followed by a horizontal rule
View 2: show just the product names and associated illustrations in a list of links
View 3: show all colors for every product in a table so that users can locate products with matching or complementary colors
These views are created by applying XSL (eXtensible Stylesheet Language) stylesheets to the XML.
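The path-style lookup and the views described above can be sketched outside a browser. The following is only an illustration in Python (using the standard library's ElementTree, not XSL); the <catalog> root element and the second product are invented for the example, and the first product is a shortened version of the sample above.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: the <catalog> root and the "Sample Gadget" product
# are assumptions made for this sketch, not part of the article's sample.
CATALOG = """
<catalog>
  <product name="Healthee Widget" manu="Healthee Products, Inc.">
    <description>
      <colors>
        <color>carrot</color>
        <color>celery</color>
      </colors>
      <illus type="infographic">
        <img src="colorswatch.gif" alt="colors of the Healthee Widget"/>
      </illus>
    </description>
  </product>
  <product name="Sample Gadget" manu="Example Co.">
    <description>
      <colors>
        <color>tomato</color>
      </colors>
      <illus type="infographic">
        <img src="gadget.gif" alt="colors of the Sample Gadget"/>
      </illus>
    </description>
  </product>
</catalog>
"""

root = ET.fromstring(CATALOG)

# Path-style lookup: every <color> found at product/description/colors/color.
all_colors = [c.text for c in root.findall("product/description/colors/color")]

# View 2: product names paired with their illustration files.
names_and_art = [
    (p.get("name"), p.find("description/illus/img").get("src"))
    for p in root.findall("product")
]

# View 3: one (product, color) row per color, ready to render as a table.
color_rows = [
    (p.get("name"), c.text)
    for p in root.findall("product")
    for c in p.findall("description/colors/color")
]

print(all_colors)
print(names_and_art)
print(color_rows)
```

The same source tree feeds all three results; only the path expression and the way the nodes are combined change from one view to the next.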
Building Blocks: XML Content, XSL Styling
XML element names make it easy to understand what the content means. XML deliberately avoids specifying anything about how it should look. To be displayed, XML must either be transformed into HTML before serving, or parsed by the browser and combined with an XSL file.
Getting the XML content to display in a browser requires mapping the elements in the XML file to HTML tags. For example, you can map the <description> to a <div> and the <colors><color> section to a <ul><li> structure. The rules for the mapping are contained in the XSL file, with statements that match up parts of the two files. [Since the syntax of XSL is complex, I'll skip over most of it and give references at the end of this article. -djh]
When you develop the XSL mapping, you have opportunities for filtering the XML. If you don't map an element in the source XML to an output, it is omitted from the HTML output. And if you want to reorganize the XML, you can use <xsl:sort order="ascending"/> or similar statements in the XSL. You can also output the content in a completely different order in the HTML, for example, putting the product picture before the product name. You can write the XSL file so that it includes all of the HTML code for formatting each block of output (fonts, colors, sizes, and placement of the HTML output), or you can link to a CSS stylesheet.
Styling in the Browser (client-side)
XSL for client-side formatting is used by the browser application after it calls an XML parser. The browser builds a representation of the XML structure, locates the XSL file that it references, applies the XSL rules and outputs the HTML. (This sounds like a lot of work, but in fact the process is very fast and users do not experience any unusual delays in viewing the HTML pages.) XSL can be applied statically, by hard-coding a link reference to the XSL file inside the XML file itself, or dynamically, by using JavaScript or VBScript (for example, to combine a source XML file with an XSL file when someone clicks a button or makes a selection from a dropdown list).
Parser and Browser Problems
The different versions of browsers currently in use vary in their support of XML. As the W3C standards for XSL evolved, the different browsers have had different parsers bundled with them.
For the IE browsers, IE 6 uses the MSXML 3 parser, but earlier IE browsers (those that had some support for XML) use older XML parsers and so can't process everything now being coded. The only solution to this problem is to provide server-side processing of the XML by XSL, unless your project permits you to force users to download a newer parser or use IE 6.
Styling before Sending HTML (server-side)
When the XML will be used in browsers that don't support XML (pre-IE 4 and Netscape 4.7), or if you can't make users download the most recent parser, you can process the XML to create HTML before it's served to the browser. Typically, this is what's going on with .asp and .jsp pages. The user never receives XML; the XML has already been converted to HTML. Therefore, the browser support and XML parser version issues don't arise. Server-side processing requires a different set of programming skills than regular HTML page composition. If you're going to venture into server-side XML/XSL, you'll need to learn how to write .asp (or .jsp or .php) pages.
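To make the server-side idea concrete, here is a minimal sketch of the mapping described earlier in this section (<description> to <div>, <colors><color> to <ul><li>), done before anything reaches the browser. It is written in Python with hand-coded rules rather than as an XSL stylesheet, so it stands in for what an XSL processor would do; the function name product_to_html, the <h2> heading, and the class attribute are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

PRODUCT_XML = """
<product name="Healthee Widget" manu="Healthee Products, Inc.">
  <description>
    <p>The Wonderful Healthee Widget comes in your choice of colors and finishes:</p>
    <colors>
      <color>carrot</color>
      <color>celery</color>
      <color>tomato</color>
      <color>eggplant</color>
      <color>butternut</color>
    </colors>
    <finishes>
      <finish>satiny</finish>
      <finish>nubbly</finish>
      <finish>burlap</finish>
    </finishes>
  </description>
</product>
"""


def product_to_html(xml_text: str) -> str:
    """Map <description> to <div> and <colors>/<color> to <ul>/<li>."""
    product = ET.fromstring(xml_text)
    description = product.find("description")
    parts = ['<div class="description">']
    # Output order need not follow source order: the name attribute comes first.
    parts.append(f"<h2>{escape(product.get('name'))}</h2>")
    parts.append(f"<p>{escape(description.findtext('p'))}</p>")
    # Sort the colors ascending, as a sort statement in the XSL would.
    colors = sorted(c.text for c in description.findall("colors/color"))
    parts.append("<ul>")
    parts.extend(f"<li>{escape(color)}</li>" for color in colors)
    parts.append("</ul>")
    # <finishes> is deliberately left unmapped, so it is filtered out.
    parts.append("</div>")
    return "\n".join(parts)


html_page = product_to_html(PRODUCT_XML)
print(html_page)
```

Because the browser receives only the resulting HTML, it makes no difference which XML parser, if any, the user has installed.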
Summing Up and Moving Forward
There are many things you will need before you embark on using XML in Web pages. The basic building blocks are: a well-formed XML document (which does not have to be validated by a DTD, but usually should be); an XSL stylesheet that processes the XML to create the output you want; CSS to control the visual formatting of HTML; and, if you are providing interactive controls for the viewer, JavaScript or other code for the controls.
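The first building block, well-formedness, is easy to check mechanically. The sketch below, in Python, simply tries to parse the text and reports whether the parser accepted it; it does no DTD validation, and the helper name is_well_formed is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML; no DTD validation is done."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<colors><color>eggplant</color><color>butternut</color></colors>"
# A missing slash: the tag after "eggplant" opens a new element
# instead of closing the current one.
bad = "<colors><color>eggplant<color><color>butternut</color></colors>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

A single missing slash is enough to make the whole document unparseable, which is why a check like this belongs at the start of any XML workflow.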
However, this effort can reap enormous benefits. After you have a body of content in XML files, those XML files can be searched across, combined, and manipulated in far more powerful ways than anything standard HTML permits. At this time, XML is typically used for content management or data-driven Web sites where the XML development is part of a larger-scale project to streamline publishing efforts and create cross-media workflows.
If you feel that this is far more work than you want to undertake, remember that the whole world wants to use XML and tools will be forthcoming. You can look at XML, learn about XSL, and assume that within the next 18 months tools for working with it will be much more mature than they are at present. As FrontPage, Dreamweaver and Flash get more XML-oriented, hand-coding XML and XSL will be as quaint as writing your HTML in Notepad. But as the early pioneers of HTML found, developers who learn what XML is and how it works with XSL will be well positioned to make the most of the tools when they become available.
In Part II, I will cover some basic methods for making XML work with dynamic Web pages using XSL, CSS, and JavaScript.
For More Information
There are many resources for learning about XML and XSL. Some of my favorites are DevX for how-tos and code samples (http://www.devx.com/, http://www.devx.com/xml/morearticles.asp), PlanetPublish for XML tools (http://www.planetpublish.com/), and the XML Cover Pages for basic references on SGML, XML, XSL, etc. (http://www.oasis-open.org/cover/sgml-xml.html). For more information about browser capabilities in handling XML, see Steve Franklin's "Common Browser Implementation Issues" at http://www.webreview.com/browsers/browser_implementation.shtml and the "Browser XML Display Support Chart" by Simon St. Laurent at http://www.xml.com/pub/a/2000/05/03/browserchart/, although this last one is now a little outdated.
Dorothy Hoskins is President and CEO of Textenergy LLC, a firm devoted to methods and tools for transforming documents for cross-media publishing. Technical documentation and catalog publishing are the primary types of content with which Textenergy works. You can contact Ms. Hoskins via e-mail at info@textenergy.com.
Second / Third Quarter 2002 (Volume 5, #2) Copyright © 1998, 2002 Society for Technical Communication