XSLT processor vendors are free to add their own private extensions to the language. The XSLT specification even specifies how they should indicate if an extension element or extension function is supported by their implementation.
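In the stylesheet, certain namespaces can be designated as extension namespaces with the extension-element-prefixes attribute on the stylesheet element (or with the xsl:extension-element-prefixes attribute on a literal result element). Elements in those namespaces are treated as extension elements: they are handled by the processor's own implementation instead of being copied to the output as literals.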
If the stylesheet author wants to know if the processor supports a certain extension element, the function element-available() can be called with the element name as the parameter. If the processor supports this element, the function should return true.
The same information can be retrieved about extension functions using the function-available() function.
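The checks described above can be sketched in a stylesheet as follows. This is only an illustration: the ext namespace URI, the ext:log element and the ext:timestamp() function are hypothetical names invented for the example and do not belong to any real processor.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ext="http://example.com/hypothetical-extensions"
    extension-element-prefixes="ext">

  <xsl:template match="/">
    <result>
      <!-- Use the (hypothetical) extension element only if the processor has it -->
      <xsl:choose>
        <xsl:when test="element-available('ext:log')">
          <ext:log>Transformation started</ext:log>
        </xsl:when>
        <xsl:otherwise>
          <xsl:comment>ext:log is not available in this processor</xsl:comment>
        </xsl:otherwise>
      </xsl:choose>

      <!-- Guard the (hypothetical) extension function in the same way -->
      <xsl:if test="function-available('ext:timestamp')">
        <stamp><xsl:value-of select="ext:timestamp()"/></stamp>
      </xsl:if>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Because both extensions are guarded by a test, a processor that does not know them should still run the stylesheet and simply take the fallback branch.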
When Microsoft released Internet Explorer 5.0, it wanted to ship with it an XML parser that conformed as much as possible to all XML-related standards at that time. XSLT was at that time still a part of the XSL working draft. The XSLT support in IE5 is based on the transformations chapter in the working draft of December 1998. They did quite a good job, but the specification moved on, split itself in two, and by now the MSXML implementation is a very weak and non-compliant version of the now final recommendation of XSLT 1.0. This IE5 implementation of MSXML is version 2.0.
Microsoft has announced that it will support the full specification in a future release. By the time this book is available, at least a developer's preview (called MSXML 2.6) will be available. This preview implements the standards much better, but a lot still remains to be done. More information can be found at http://msdn.microsoft.com/downloads/webtechnology/xml/msxml.asp.
The new implementation will support both the W3C XSLT 1.0 recommendation and the MSXML 2.0 implementation. Which implementation is used depends on the namespace of the stylesheet elements. The MSXML 2.0 implementation uses the namespace http://www.w3.org/TR/WD-xsl.
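As a sketch of how the namespace selects the implementation, these are the root elements of two separate stylesheets. The first uses the working draft namespace understood by MSXML 2.0, the second the namespace of the final XSLT 1.0 recommendation. The template bodies are left out.

```xml
<!-- Stylesheet 1: processed with the MSXML 2.0 (December 1998 working draft) rules -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <!-- templates go here -->
</xsl:stylesheet>

<!-- Stylesheet 2: processed with the XSLT 1.0 recommendation rules -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- templates go here -->
</xsl:stylesheet>
```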
In Appendix D, you can see for each element if it is supported in IE5 (the MSXML 2.0 library). Here we will try to give you a notion of what is unsupported, what is ill-supported and what works fine.
MSXML 2.0 does a good job on:
• literal elements and attributes
• the element, attribute and comment elements
• the choose, when and otherwise elements
• the for-each element
• the if element
Some elements can be used in most cases, but fail to support more complex uses or certain attributes. These include:
• apply-templates: you cannot use the mode attribute.
• template: you cannot use the name attribute or the mode attribute. The priority rules are not implemented (see the section entitled 'What if Several Templates Match?').
• processing-instruction: this element is called pi in MSXML 2.0.
• stylesheet: IE5 does not support any of the attributes for the stylesheet element. Note that the version attribute is defined as required in XSLT.
• value-of: disable-output-escaping is not supported. See below for undocumented tricks to do this anyway.
The XPath expressions that can be used in many places in XSLT are only partially implemented. Basically, only the abbreviated (shorthand) notation is supported. For details, see the XPath section earlier in this chapter.
The following elements are not supported in MSXML 2.0:
• apply-imports, import, include
• attribute-set (and the related attributes)
• param, with-param, variable
• sort (MSXML 2.0 has implemented attributes on some elements to allow sorting)
• text (MSXML 2.0 has an undocumented cdata element that does more or less the same)
• transform (which is the same thing as stylesheet anyway)
• a whole bunch of top-level elements
Although this is a fairly long list, most of these unimplemented elements are the kind you will rarely use anyway. Some of them, however, are dearly missed.
Most of the specified additional functions that can be used in XSLT are unsupported in IE5. At the same time, MSXML 2.0 features some functions that can be very useful in overcoming these shortcomings.
There are some unsupported standard functions:
• generate-id(): MSXML 2.0 has a function called uniqueID() that can do the same.
• format-number(): MSXML 2.0 has a formatNumber() function that works almost identically, except for localization using a decimal-format element.
• current(): IE5 has a very powerful context() function that can be used to do the same. context(-1) is equivalent to current().
Although MSXML 2.0 has some limitations compared to the full XSLT specification, it is still a very useful transformation tool. When using it, there are some problems that all developers stumble into. The developer community has been looking for solutions and work-arounds for almost two years now. These are a few of the most important ones.
If you have an XML document containing a piece of text that should appear literally in the output, you can run into trouble. The XML parser and XSLT processor will replace some characters with XML entities to keep the output well-formed. That is fine, but sometimes we don't care whether the output is well-formed; we just want that exact string to appear in the output. The output is not supposed to be XML anyway, since it might be HTML. XSLT allows us to do so by using the disable-output-escaping attribute on the value-of and text elements. IE5 does not support disable-output-escaping, but it does allow the use of an undocumented attribute: no-entities='true' on the eval element. We can use this to generate unescaped content, for example, using the following code:
with the following template:
This would generate this output:
Note that this is not well-formed XML, but that was exactly what we were trying to do. It also means that we must be very careful when using this feature. Note also that this feature is undocumented, so Microsoft might remove it from future versions just when you least expect it.
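For comparison, the standard XSLT 1.0 mechanism mentioned earlier, disable-output-escaping, can be sketched as follows. This is not the IE5 technique; it only illustrates the attribute that a conforming processor offers. The source element name snippet is a hypothetical name chosen for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- 'snippet' is an assumed element holding text that must be emitted as-is -->
  <xsl:template match="snippet">
    <div>
      <!-- Characters such as < and & are written out without being escaped -->
      <xsl:value-of select="." disable-output-escaping="yes"/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Note that the attribute takes the values yes and no, and that a processor is allowed to ignore it when it does not serialize the result itself.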
IE5 does not support modes and calling templates by name, but it does allow something else: locally scoped templates. These can be included as a child element of an apply-templates element and the processor will try to use this template before any of the globally scoped templates. Look at this sample:
The stylesheet has a template defined for use with Author elements. It generates a b element with the Author element's content in it. The root template performs two apply-templates actions on all authors in the source document. The first one will match on the template for Author elements and output the following:
The second apply-templates element has a template defined locally. This local template also matches the selected nodes, so the second apply-templates element will generate:
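A minimal sketch of such an arrangement, written against the MSXML 2.0 working draft namespace, could look like this. The i element produced by the local template and the //Author selection are assumptions made for the illustration, and the sketch has not been checked against every IE5 release.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">

  <xsl:template match="/">
    <authors>
      <!-- First pass: no local template, so the global Author template is used -->
      <xsl:apply-templates select="//Author"/>

      <!-- Second pass: the locally scoped template is tried before the global one -->
      <xsl:apply-templates select="//Author">
        <xsl:template match="Author">
          <i><xsl:value-of select="."/></i>
        </xsl:template>
      </xsl:apply-templates>
    </authors>
  </xsl:template>

  <!-- Globally scoped template: wraps the author's content in a b element -->
  <xsl:template match="Author">
    <b><xsl:value-of select="."/></b>
  </xsl:template>

</xsl:stylesheet>
```

With this layout, each author would appear once in bold from the first pass and once in italics from the second, which gives an effect comparable to what modes provide in XSLT 1.0.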
Let's have a look at some more examples to demonstrate the use of XSLT. In the last part of this section, we will look at using XSLT to style an XML document in HTML. There will be more examples there. Here we will cover examples that are not HTML-related, but targeted to converting one XML dialect into another. This will be a very common case in business-to-business e-commerce, where XML documents containing orders, inventories, product descriptions, etc., are sent automatically and converted on the fly to a format that is suitable for the target system.
Think of a system that retrieves product descriptions from several suppliers to present users in the organization with a coherent view of all available products. Some of these suppliers will have their product range available in an XML format. In an ideal world, an agreement could be made with all suppliers about the format used for delivering the data. Unfortunately, in the real world suppliers will not be willing to do that, so the user will have to settle for what he can get. Some will conform to an industry standard but, in the end, transformation from some other format to the one that is required will be necessary.
The format that can be natively imported by our application looks like this:
The XML descriptions we receive from Clippers Inc look like this:
We want to transform this delivered format into our native format using XSLT. We could create a stylesheet for the transformation like this:
Let's have a look at the sample little by little. There is only one template, matching the root. This template contains a framework for the output document. The Product element and its ID child element are inserted as literals. The value of the ID element is fetched from the source document, by inserting the value of the product-reference attribute from the source. The same thing is done for the name. We create a name element with literals and insert a value from the source document in it. Note that we chose to use the short name from the source and discard the long name. The Product_category element is hard-coded. We expect only products in this category from this supplier.
Now comes the hard part. The supplier information is not provided in this case. Some suppliers will, some will not. We could choose to hard-code the supplier information in the stylesheet. But that would force us to update the stylesheet every time the supplier changes its address or we get a new contact person. We decided to store all supplier information in our own format in one file. While transforming the document, the processor does a lookup in the supplier_lookup.xml document and copies a whole fragment from that document to the destination document using copy-of.
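The approach described above, literal result elements filled from the source plus a lookup in a second document, can be sketched roughly as follows. Only the product-reference attribute, the Product, ID, name and Product_category elements and the file name supplier_lookup.xml come from the text. The source element names (product, short-name), the category value and the layout of the lookup file (supplier elements with an id attribute) are assumptions made for the illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- One template matching the root builds the whole output document -->
  <xsl:template match="/">
    <Product>
      <!-- ID taken from the product-reference attribute of the (assumed) product element -->
      <ID><xsl:value-of select="product/@product-reference"/></ID>

      <!-- Use the short name and discard the long name -->
      <name><xsl:value-of select="product/short-name"/></name>

      <!-- Hard-coded: only one category is expected from this supplier (value assumed) -->
      <Product_category>Clippers</Product_category>

      <!-- Copy the supplier fragment from our own lookup file (structure assumed) -->
      <xsl:copy-of
          select="document('supplier_lookup.xml')//supplier[@id = 'clippers']"/>
    </Product>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the supplier details in the lookup file means that a change of address or contact person is made in one place, without touching the stylesheet.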
Our second example is for a publishing company; all books are stored in a giant XML document (in fact it is stored in a database, but this database allows access to the data as if it were an XML document). A fragment of this document looks like:
Note how the second book has several authors. For making an overview of the most successful authors, the publisher wants to transform this huge books file to something like this:
Authors will be ranked by the total number of copies of books sold, and this should also determine their position in the document. So, the best selling author in the books document should be the highest on the list. This can be accomplished by this stylesheet:
Some things in this stylesheet are worthy of further comment. First, note how the sum() and count() functions are used, both in the author template for calculating the number of publications and total number sold for each author, and in the sort element within the apply-templates element. Note how the current() function is used to match the author-ref elements to the author elements they refer to. An interesting thing to note is that the current() function within the apply-templates element refers to the current context after selecting the new set; that is, in the sort element it refers to the node being sorted, not to the context node of the enclosing template.
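A sketch of a stylesheet along these lines is shown below. The path /publisher/books/book and the author-ref/@ref attribute are taken from the discussion of keys later in this section. Everything else is an assumption made for the illustration: that authors are stored as /publisher/authors/author with an id attribute and a name child, that each book has a sold child holding the number of copies sold, and the names of the output elements.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <bestsellers>
      <xsl:apply-templates select="publisher/authors/author">
        <!-- Inside xsl:sort, current() is the author being sorted -->
        <xsl:sort
            select="sum(/publisher/books/book[author-ref/@ref = current()/@id]/sold)"
            data-type="number" order="descending"/>
      </xsl:apply-templates>
    </bestsellers>
  </xsl:template>

  <xsl:template match="author">
    <author>
      <name><xsl:value-of select="name"/></name>
      <!-- Number of books that reference this author -->
      <publications>
        <xsl:value-of
            select="count(/publisher/books/book[author-ref/@ref = current()/@id])"/>
      </publications>
      <!-- Total copies sold over all of this author's books -->
      <total-sold>
        <xsl:value-of
            select="sum(/publisher/books/book[author-ref/@ref = current()/@id]/sold)"/>
      </total-sold>
    </author>
  </xsl:template>

</xsl:stylesheet>
```

The predicate needs current() because inside the square brackets the context node is a book, so a plain @id would look for an id attribute on the book instead of on the author.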
If the source document is large, this stylesheet will probably take a long time to process. Many calculations are done in counting and summing the nodes. In these counting actions, a lot of searching is done on books that have an author-ref element with a certain ref attribute. We could also implement this using a key. If the processor is optimized for using keys, this will speed things up significantly (but I don't know of any such processor at the time of writing). Even if it doesn't give us a performance gain (it still might in the future), our code becomes somewhat cleaner. Then the stylesheet would look like this. See if you can figure it out.
At the beginning of the document, we added an xsl:key element. It is called 'books-by-author'. The key will give us direct access to a set of nodes from the source document. With the match attribute we specify which nodes we want to be able to access. In our case, we want access through the key to all book elements in the document (match="/publisher/books/book"). With the use attribute we specify the key value we want to use to access a book element. In our case this is the ref attribute on the author-ref child element(s) of the book (use="author-ref/@ref").
Now if we use the key() function anywhere in the stylesheet like this: key('books-by-author', 'rh')
This will return a node set containing all book elements that have an author-ref child element with ref="rh". Effectively these are all books by Robert Heinlein. Using this, we could simplify some of the expressions in the stylesheet significantly.
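To illustrate the simplification, here is a sketch of a key-based stylesheet. The key declaration follows the attributes quoted above. The authors/author structure, the id attribute, the sold element and the output element names are assumptions made for the illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every book by the ref attribute of its author-ref children -->
  <xsl:key name="books-by-author"
           match="/publisher/books/book"
           use="author-ref/@ref"/>

  <xsl:template match="/">
    <bestsellers>
      <xsl:apply-templates select="publisher/authors/author">
        <!-- The current node here is the author being sorted, so @id is its id -->
        <xsl:sort select="sum(key('books-by-author', @id)/sold)"
                  data-type="number" order="descending"/>
      </xsl:apply-templates>
    </bestsellers>
  </xsl:template>

  <xsl:template match="author">
    <author>
      <name><xsl:value-of select="name"/></name>
      <publications>
        <xsl:value-of select="count(key('books-by-author', @id))"/>
      </publications>
      <total-sold>
        <xsl:value-of select="sum(key('books-by-author', @id)/sold)"/>
      </total-sold>
    </author>
  </xsl:template>

</xsl:stylesheet>
```

Each long predicate on /publisher/books/book is replaced by a single key() call, and current() is no longer needed because the key value is evaluated with the author as the context node.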