https://perfectxml.com/XQuery.asp
If our sample data is in a file named books.xml, then the following query returns the entire document:

doc("books.xml")

A dynamic error is raised if the doc() function is not able to locate the specified document or the collection() function is not able to locate the specified collection.
Locating Nodes: Path Expressions

In XQuery, path expressions are used to locate nodes in XML data. XQuery’s path expressions are derived from XPath 1.0 and are identical to the path expressions of XPath 2.0. The functionality of path expressions is closely related to the underlying data model. We start with a few examples that convey the intuition behind path expressions, then define how they operate in terms of the data model.

The most commonly used operators in path expressions locate nodes by identifying their location in the hierarchy of the tree. A path expression consists of a series of one or more steps, separated by a slash, /, or double slash, //. Every step evaluates to a sequence of nodes. For instance, consider the following expression:

doc("books.xml")/bib/book

This expression opens books.xml using the doc() function and returns its document node, uses /bib to select the bib element at the top of the document, and uses /book to select the book elements within the bib element. This path expression contains three steps. The same books could have been found by the following query, which uses the double slash, //, to select all of the book elements contained in the document, regardless of the level at which they are found:

doc("books.xml")//book
Predicates are Boolean conditions that select a subset of the nodes computed by a step expression. XQuery uses square brackets around predicates. For instance, the following query returns only authors for which last="Stevens" is true:
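A minimal sketch of such a predicate, assuming the sample books.xml uses the bib/book/author/last structure described in this chapter:

```xquery
(: Assumption: authors are children of book, and last is a child of author. :)
(: The predicate keeps only the author elements whose last child equals "Stevens". :)
doc("books.xml")/bib/book/author[last = "Stevens"]
```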
If a predicate contains a single numeric value, it is treated like a subscript. For instance, the following expression returns the first author of each book:
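One way to write this, again assuming the bib/book/author layout of the sample data:

```xquery
(: The numeric predicate [1] is applied once per book, so each book contributes its own first author. :)
doc("books.xml")/bib/book/author[1]
```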
Note that the expression author[1] will be evaluated for each book. If you want the first author in the entire document, you can use parentheses to force the desired precedence:
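A sketch of the parenthesized form, under the same assumptions about the sample data:

```xquery
(: The parentheses build the full sequence of authors first; [1] then picks the first item of that whole sequence. :)
(doc("books.xml")/bib/book/author)[1]
```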
Now let’s explore how path expressions are evaluated in terms of the data model. The steps in a path expression are evaluated from left to right. The first step identifies a sequence of nodes using an input function, a variable that has been bound to a sequence of nodes, or a function that returns a sequence of nodes. Some XQuery implementations also allow a path expression to start with a / or //. Such paths start with the root node of a document, but how this node is identified is implementation-defined.

For each / in a path expression, XQuery evaluates the expression on the left-hand side and returns the resulting nodes in document order; if the result contains anything that is not a node, a type error is raised. After that, XQuery evaluates the expression on the right-hand side of the / once for each left-hand node, merging the results to produce a sequence of nodes in document order; if the result contains anything that is not a node, a type error is raised. When the right-hand expression is evaluated, the left-hand node for which it is being evaluated is known as the context node. The step expressions that may occur on the right-hand side of a / are the following:
Working from left to right, XQuery first evaluates the input function, doc("books.xml"), returning the document node, which becomes the context node for evaluating the expression on the right side of the first slash. This right-hand expression is bib, a NameTest that returns all elements named bib that are children of the context node. There is only one bib element, and it becomes the context node for evaluating the expression book, which first selects all book elements that are children of the context node and then filters them to return only the first book element.

Up to now, we have not defined the // operator in terms of the data model. The formal definition of this operator is somewhat complex; intuitively, the // operator is used to give access to all attributes and all descendants of the nodes in the left-hand expression, in document order. The expression doc("books.xml")//bib matches the bib element at the root of our sample document, doc("books.xml")//book matches all the book elements in the document, and doc("books.xml")//@year matches all the year attributes in the document. The // is formally defined using full axis notation: // is equivalent to /descendant-or-self::node()/. For each node from the left-hand expression, the // operator takes the node itself, each attribute node, and each descendant node as a context node, then evaluates the right-hand expression. For instance, consider the following expression:

doc("books.xml")/bib//author[1]

The first step returns the document node, the second step returns the bib element, the third step (which is not visible in the original query) evaluates descendant-or-self::node() to return the bib element and all nodes descended from it, and the fourth step selects the first author element for each context node from the third step. Since only book elements contain author elements, this means that the first author of each book will be returned.

In the examples we have shown so far, NameTest uses simple strings to represent names.
NameTest also supports namespaces, which distinguish names from different vocabularies. Suppose we modify our sample data so that it represents titles with the title element from the Dublin Core, a standard set of elements for bibliographical data [DC]. The namespace URI for the Dublin Core is http://purl.org/dc/elements/1.1/. Here is an XML document containing one simple book, in which the title element is taken from Dublin Core: In this data, xmlns:dcx="http://purl.org/dc/elements/1.1/" declares the prefix "dcx" as a synonym for the full namespace, and the element name dcx:title uses the prefix to indicate this is a title element as defined in the Dublin Core. The following query finds Dublin Core titles:
The first line declares the namespace dc as a synonym for the Dublin Core namespace. Note that the prefix used in the document differs from the prefix used in the query. In XQuery, the name used for comparisons consists of the namespace URI and the “local part,” which is title for this element.

Wildcards allow queries to select elements or attributes without specifying their entire names. For instance, a query might want to return all the elements of a given book, without specifying each possible element by name. In XQuery, this can be done with the following query:

The * wildcard matches any element, whether or not it is in a namespace. To match any attribute, use @*. To match any name in the namespace associated with the dc prefix, use dc:*. To match any title element, regardless of namespace, use *:title.
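The four wildcard forms can be compared side by side. The sketch below is illustrative only: the wrapper element name matches is hypothetical, the namespace declaration uses the syntax of the final XQuery 1.0 Recommendation (terminated by a semicolon), and the sample books.xml is assumed to contain at least one book.

```xquery
declare namespace dc = "http://purl.org/dc/elements/1.1/";

(: Assumption: books.xml contains at least one book element. :)
let $book := (doc("books.xml")//book)[1]
return
  <matches>{
    $book/@*,       (: every attribute of the book; attributes must come first in the content :)
    $book/*,        (: every child element, whether or not it is in a namespace :)
    $book/dc:*,     (: every child element in the Dublin Core namespace :)
    $book/*:title   (: every child named title, regardless of namespace :)
  }</matches>
```

Nodes matched by more than one wildcard are copied into the result more than once, which makes the overlap between the forms easy to see.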
Creating Nodes: Element, Attribute, and Document Constructors

In the last section, we learned how to locate nodes in XML documents. Now we will learn how to create nodes. Elements, attributes, text nodes, processing instructions, and comments can all be created using the same syntax as XML. For instance, here is an element constructor that creates a book:

As we have mentioned previously, the document node does not have explicit syntax in XML, but XQuery provides an explicit document node constructor. The query document { } creates an empty document node. Let’s use a document node constructor together with other constructors to create an entire document, including the document node, a processing instruction for stylesheet linking, and an XML comment:
Constructors can be combined with other XQuery expressions to generate content dynamically. In an element constructor, curly braces, { }, delimit enclosed expressions, which are evaluated to create open content. Enclosed expressions may occur in the content of an element or the value of an attribute. For instance, the following query might be used in an interactive XQuery tutorial to teach how element constructors work:
Here is the result of executing the above query for our sample data:
Enclosed expressions in element constructors permit new XML values to be created by restructuring existing XML values. Here is a query that creates a list of book titles from the bibliography:
The output of this query follows:
Namespace declaration attributes in element constructors have the same meaning they have in XML. We previously showed the following Dublin Core example as XML text—but it is equally valid as an XQuery element constructor, and it treats the namespace declaration the same way:
Computed element and attribute constructors are an alternative syntax that can be used as the XML-style constructors are, but they offer additional functionality that is discussed in this section. Here is a computed element constructor that creates an element named title, with the content "Harold and the Purple Crayon". Inside the curly braces, constants are represented using XQuery’s native syntax, in which strings are delimited by double or single quotes.
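A minimal sketch of the constructor just described; the element name and content are taken from the text above:

```xquery
(: Computed element constructor: the keyword element, the element name, then the content in braces. :)
(: The string literal could equally be written with single quotes. :)
element title { "Harold and the Purple Crayon" }
```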
Here is a slightly more complex constructor that creates nested elements and attributes using the computed constructor syntax:
The preceding example uses literals for the names of elements. In a computed element or attribute constructor, the name can also be an enclosed expression that must have the type QName, which represents an element or attribute name. For instance, suppose the user has written a function that takes two parameters, an element name in English and a language, and returns a QName that has been translated to the desired language. This function could be used in a computed element constructor as follows:
The result of the above query is
In constructors, if sequences of whitespace characters occur in the boundaries between tags or enclosed expressions, with no intervening non-whitespace characters, then the whitespace is known as boundary whitespace. Implementations may discard boundary whitespace unless the query specifically declares that space must be preserved using the xmlspace declaration, a declaration that can occur in the prolog. The following query declares that all whitespace in element constructors must be preserved:
The output of the above query is
If the xmlspace declaration is absent, or is set to strip, then boundary whitespace is stripped:
Combining and Restructuring Nodes

Queries in XQuery often combine information from one or more sources and restructure it to create a new result. This section focuses on the expressions and functions most commonly used for combining and restructuring XML data.

FLWOR Expressions

FLWOR expressions, pronounced “flower expressions,” are one of the most powerful and common expressions in XQuery. They are similar to the SELECT-FROM-WHERE statements in SQL. However, a FLWOR expression is not defined in terms of tables, rows, and columns; instead, a FLWOR expression binds variables to values in for and let clauses, and uses these variable bindings to create new results. A combination of variable bindings created by the for and let clauses of a FLWOR expression is called a tuple.

For instance, here is a simple FLWOR expression that returns the title and price of each book that was published in the year 2000:

This query binds the variable $b to each book, one at a time, to create a series of tuples. Each tuple contains one variable binding in which $b is bound to a single book. The where clause tests each tuple to see if $b/@year is equal to “2000,” and the return clause is evaluated for each tuple that satisfies the conditions expressed in the where clause. In our sample data, only Data on the Web was written in 2000, so the result of this query is
The name FLWOR is an acronym, standing for the first letter of the clauses that may occur in a FLWOR expression:
The for and let Clauses

Every clause in a FLWOR expression is defined in terms of tuples, and the for and let clauses create the tuples. Therefore, every FLWOR expression must have at least one for or let clause. It is extremely important to understand how tuples are generated in FLWOR expressions, so we will start with a series of artificial queries that show this in detail for various combinations of for clauses and let clauses.

We have already shown an example that binds one variable in a for clause. The following query creates an element named tuple in its return clause to show the tuples generated by such a query:

In this example, we bind $i to the expression (1, 2, 3), which constructs a sequence of integers. XQuery has a very general syntax, and for clauses or let clauses can be bound to any XQuery expression. Here is the result of the above query, showing how the variable $i is bound in each tuple:
Note that the order of the items bound in the tuple is the same as the order of the items in the original expression (1, 2, 3). A for clause preserves order when it creates tuples.

A let clause binds a variable to the entire result of an expression. If there are no for clauses in the FLWOR expression, then a single tuple is created, containing the variable bindings from the let clauses. The following query is like the previous query, but it uses a let clause rather than a for:

The result of this query contains only one tuple, in which the variable $i is bound to the entire sequence of integers:
If a let clause is used in a FLWOR expression that has one or more for clauses, the variable bindings of let clauses are added to the tuples generated by the for clauses. This is demonstrated by the following query:
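A small sketch of this combination. The element names tuple, i, and j are illustrative choices, not taken from the sample data:

```xquery
(: The for clause creates one tuple per integer. :)
(: The let clause adds the same binding of $j, the whole sequence, to every tuple. :)
for $i in (1, 2, 3)
let $j := (1, 2, 3)
return
  <tuple>
    <i>{ $i }</i>
    <j>{ $j }</j>
  </tuple>
```

Three tuple elements are produced, one per value of $i, and each carries the complete sequence bound to $j.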
Here is a query that combines for and let clauses in the same way as the previous query:
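A sketch of a query with this shape, assuming the books.xml structure used in this chapter. The book, title, and count element names follow the result shown in Listing 1.3:

```xquery
(: for binds $b to one book per tuple; let binds $c to all authors of that book. :)
for $b in doc("books.xml")//book
let $c := $b/author
return
  <book>{ $b/title, <count>{ count($c) }</count> }</book>
```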
This query lists the title of each book together with the number of authors. Listing 1.3 shows the result when we apply it to our bibliography data.
<book>
  <title>TCP/IP Illustrated</title>
  <count>1</count>
</book>
<book>
  <title>Advanced Programming in the UNIX Environment</title>
  <count>1</count>
</book>
<book>
  <title>Data on the Web</title>
  <count>3</count>
</book>
<book>
  <title>The Economics of Technology and Content for Digital TV</title>
  <count>0</count>
</book>

If more than one variable is bound in the for clauses of a FLWOR expression, then the tuples contain all possible combinations of the items to which these variables are bound. For instance, the following query shows all combinations that include 1, 2, or 3 combined with 4, 5, or 6:
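A sketch of a query that generates these combinations. The element names tuple, i, and j are illustrative:

```xquery
(: Two for variables: every value of $i is paired with every value of $j. :)
for $i in (1, 2, 3),
    $j in (4, 5, 6)
return
  <tuple>
    <i>{ $i }</i>
    <j>{ $j }</j>
  </tuple>
```

Nine tuples are generated, with $j varying fastest for each value of $i.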
Here is the result of the above query:
The set of all possible combinations of values drawn from several sets is called a Cartesian cross-product. The tuples preserve the order of the original sequences, in the order in which they are bound. In the previous example, note that the tuples reflect the values of each $i in the original order; for a given value of $i, the values of $j occur in the original order. In mathematical terms, the tuples generated in a FLWOR expression are drawn from the ordered Cartesian cross-product of the items to which the for variables are bound.

The ability to create tuples that reflect combinations becomes particularly interesting when combined with where clauses to perform joins. The following sections illustrate this in depth. But first we must introduce the where and return clauses.

The where Clause

A where clause eliminates tuples that do not satisfy a particular condition. A return clause is only evaluated for tuples that survive the where clause. The following query returns only books whose prices are less than $50.00:

Here is the result of this query:
A where clause can contain any expression that evaluates to a Boolean value. In SQL, a WHERE clause can only test single values, but there is no such restriction on where clauses in XQuery. The following query returns the title of books that have more than two authors:
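A sketch of such a query, assuming the books.xml structure used in this chapter:

```xquery
(: The where clause tests a whole sequence (all authors of the book) through count(). :)
for $b in doc("books.xml")//book
where count($b/author) > 2
return $b/title
```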
Here is the result of the above query:
The order by Clause

The order by clause sorts the tuples before the return clause is evaluated in order to change the order of results. For instance, the following query lists the titles of books in alphabetical order:

The for clause generates a sequence of tuples, with one title node in each tuple. The order by clause sorts these tuples according to the value of the title elements in the tuples, and the return clause returns the title elements in the same order as the sorted tuples. The result of this query is
The order by clause allows one or more orderspecs, each of which specifies one expression used to sort the tuples. An orderspec may also specify whether to sort in ascending or descending order, how expressions that evaluate to empty sequences should be sorted, a specific collation to be used, and whether stable sorting should be used (stable sorting preserves the relative order of two items if their values are equal). Here is a query that returns authors, sorting in reverse order by the last name, then the first name:
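A sketch of a query with two descending orderspecs, assuming each author element has last and first children as in the sample data:

```xquery
(: The first orderspec sorts by last name; the second breaks ties by first name. :)
(: Both use descending to reverse the default ascending order. :)
for $a in doc("books.xml")//author
order by $a/last descending, $a/first descending
return $a
```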
The result of this query is shown in Listing 1.4.
<author>
  <last>Suciu</last>
  <first>Dan</first>
</author>
<author>
  <last>Stevens</last>
  <first>W.</first>
</author>
<author>
  <last>Stevens</last>
  <first>W.</first>
</author>
<author>
  <last>Buneman</last>
  <first>Peter</first>
</author>
<author>
  <last>Abiteboul</last>
  <first>Serge</first>
</author>

The order by clause may specify conditions based on data that is not used in the return clause, so there is no need for an expression to return data in order to use it to sort. Here is an example that returns the titles of books, sorted by the name of the first author:
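One possible formulation, assuming the books.xml structure used in this chapter. The variable name $a1 matches the name used in the discussion of this example:

```xquery
(: $a1 is the first author of each book; it is the empty sequence for a book that has only editors. :)
for $b in doc("books.xml")//book
let $a1 := $b/author[1]
order by $a1/last, $a1/first
return $b/title
```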
The result of this query is
The first book in this list has editors, but no authors. For this book, $a1/last and $a1/first will both return empty sequences. Some XQuery implementations always sort empty sequences as the greatest possible value; others always sort empty sequences as the least possible value. The XML Query Working Group decided to allow vendors to choose which of these orders to implement because many XQuery implementations present views of relational data, and relational databases differ in their sorting of nulls. To guarantee that an XQuery uses the same sort order across implementations, specify “empty greatest” or “empty least” in an orderspec if its expression can evaluate to an empty sequence.

Two books in our data are written by the same author, and we may want to ensure that the original order of these two books is maintained. We can do this by specifying a stable sort, which maintains the relative order of two items if the comparison expressions consider them equal. The following query specifies a stable sort, and requires empty sequences to be sorted as least:

This query returns the same result as the previous one, but is guaranteed to do so across all implementations.

Collations may also be specified in an order by clause. The following query sorts titles using a U.S. English collation:

Most queries use the same collation for all comparisons, and it is generally too tedious to specify a collation for every orderspec. XQuery allows a default collation to be specified in the prolog. The default collation is used when the orderspec does not specify a collation. Here is a query that sets http://www.example.com/collations/eng-us as the default collation; it returns the same results as the previous query:
When sorting expressions in queries, it is important to remember that the / and // operators sort in document order. That means that an order established with an order by clause can be changed by expressions that use these operators. For instance, consider the following query:
This query does not return the authors’ last names in alphabetical order, because the / in $authors/last sorts the last elements in document order. This kind of error generally occurs with let bindings, not with for bindings, because a for clause binds each variable to a single value in a given tuple, and returning children or descendants of a single node does not lead to surprises. The following query returns authors’ last names in alphabetical order:
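A sketch of a query that avoids the problem, assuming the author/last/first structure of the sample data:

```xquery
(: Each tuple holds a single author, so $a/last yields that author's own last element. :)
(: The path step therefore cannot reorder the sorted tuples. :)
for $a in doc("books.xml")//author
order by $a/last, $a/first
return $a/last
```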
The return Clause

We have already seen that a for clause or a let clause may be bound to any expression, and a where clause may contain any Boolean expression. Similarly, any XQuery expression may occur in a return clause. Element constructors are an extremely common expression in return clauses; for instance, the following query uses an element constructor to create price quotes:

Listing 1.5 shows the result of the above query.
<quote>
  <title>TCP/IP Illustrated</title>
  <price>65.95</price>
</quote>
<quote>
  <title>Advanced Programming in the UNIX Environment</title>
  <price>65.95</price>
</quote>
<quote>
  <title>Data on the Web</title>
  <price>39.95</price>
</quote>
<quote>
  <title>The Economics of Technology and Content for Digital TV</title>
  <price>129.95</price>
</quote>

Element constructors can be used in a return clause to change the hierarchy of data. For instance, we might want to represent an author’s name as a string in a single element, which we can do with the following query:
Here is the result of the above query:
Another application might want to insert a name element to hold the first and last name of the author—after all, an author does not consist of a first and a last! Here is a query that adds a level to the hierarchy for names:
Here is one author’s name taken from the output of the above query:
This section has discussed the most straightforward use of for and return clauses, and it has shown how to combine FLWOR expressions with other expressions to perform common tasks. More complex uses of for clauses are explored later in separate sections on joins and positional variables.
The Positional Variable at

The for clause supports positional variables, which identify the position of a given item in the expression that generated it. For instance, the following query returns the titles of books, with an attribute that numbers the books:

Here is the result of this query:
In some data, position conveys meaning. In tables, for instance, the row and column in which an item is found often determine its meaning. For instance, suppose we wanted to create data from an XHTML web page that contains the table shown in Table 1.2.
TABLE 1.2 Table from an XHTML Web Page
The XHTML source for this table is shown in Listing 1.2. In this table, every entry in the same column as the Title header is a title, every entry in the same column as the Publisher header is a publisher, and so forth. In other words, we can determine the purpose of an entry if we can determine its position as a column of the table, and relate it to the position of a column header. Positional variables make this possible. Since XHTML is XML, it can be queried using XQuery. Listing 1.7 shows a query that produces meaningful XML from the above data, generating the names of elements from the column headers.
let $t := doc("bib.xhtml")//table[1]
for $r in $t/tbody/tr
return
  <book>
    {
      for $c at $i in $r/td
      return element { lower-case(data($t/thead/tr/td[$i])) } { string($c) }
    }
  </book>

Note the use of a computed element constructor that uses the column header to determine the name of the element. Listing 1.8 shows the portion of the output this query generates for the partial data shown in Table 1.2.
<book>
  <title>TCP/IP Illustrated</title>
  <publisher>Addison-Wesley</publisher>
  <price>65.95</price>
  <year>1994</year>
</book>
<book>
  <title>Advanced Programming in the Unix Environment</title>
  <publisher>Addison-Wesley</publisher>
  <price>65.95</price>
  <year>1992</year>
</book>
| Value Comparison Operator | General Comparison Operator |
| --- | --- |
| eq | = |
| ne | != |
| lt | < |
| le | <= |
| gt | > |
| ge | >= |

for $b in doc("books.xml")//book
where xs:decimal($b/price) gt 100.00
return $b/title
If the data were governed by a W3C XML Schema that declared price to be a decimal, this cast would not have been necessary. In general, if the data you are querying is meant to be interpreted as typed data, but there are no types in the XML, value comparisons force your query to cast when doing comparisons; general comparisons are more loosely typed and do not require such casts. This problem does not arise if the data is meant to be interpreted as string data, or if it contains the appropriate types.

for $b in doc("books.xml")//book
where $b/author/last eq "Stevens"
return $b/title
The reason for the error is that many books have multiple authors, so the expression $b/author/last returns multiple nodes. The following query uses =, the general comparison that corresponds to eq, to return books for which any author’s last name is equal to Stevens:
for $b in doc("books.xml")//book
where $b/author/last = "Stevens"
return $b/title
There are two significant differences between value comparisons and general comparisons. The first is illustrated in the previous query. Like value comparisons, general comparisons apply atomization to both operands, but instead of requiring each operand to be a single atomic value, the result of this atomization may be a sequence of atomic values. The general comparison returns true if any value on the left matches any value on the right, using the appropriate comparison.

for $b in doc("books.xml")//book
where $b/price = 100.00
return $b/title
In this query, 100.00 is a decimal, and the = operator casts the price to decimal as well. When a general comparison tests a pair of atomic values and one of these values is untyped, it examines the other atomic value to determine the required type to which it casts the untyped operand:
for $b in doc("books.xml")//book
where $b/author/first = "Serge"
and $b/author/last = "Suciu"
return $b
The result of this query may be somewhat surprising, as Listing 1.17 shows.
Listing 1.17
Surprising Results
<book year = "2000">
  <title>Data on the Web</title>
  <author>
    <last>Abiteboul</last>
    <first>Serge</first>
  </author>
  <author>
    <last>Buneman</last>
    <first>Peter</first>
  </author>
  <author>
    <last>Suciu</last>
    <first>Dan</first>
  </author>
  <publisher>Morgan Kaufmann Publishers</publisher>
  <price>39.95</price>
</book>

Since this book does have an author whose first name is “Serge” and an author whose last name is “Suciu,” the result of the query is correct, but it is surprising. The following query expresses what the author of the previous query probably intended:
for $b in doc("books.xml")//book,
$a in $b/author
where $a/first="Serge"
and $a/last="Suciu"
return $b
Comparisons using the = operator are not transitive. Consider the following query:
let $a := ( <first>Jonathan</first>, <last>Robie</last> ),
$b := ( <first>Jonathan</first>, <last>Marsh</last> ),
$c := ( <first>Rodney</first>, <last>Marsh</last> )
return
<out>
<equals>{ $a = $b }</equals>
<equals>{ $b = $c }</equals>
<equals>{ $a = $c }</equals>
</out>
Remember that = returns true if there is a value on the left that matches a value on the right. The output of this query is as follows:
<out>
<equals>true</equals>
<equals>true</equals>
<equals>false</equals>
</out>
Node comparisons determine whether two expressions evaluate to the same node. There are two node comparisons in XQuery, is and is not. The following query tests whether the most expensive book is also the book with the greatest number of authors and editors:
let $b1 := for $b in doc("books.xml")//book
order by count($b/author) + count($b/editor)
return $b
let $b2 := for $b in doc("books.xml")//book
order by $b/price
return $b
return $b1[last()] is $b2[last()]
This query also illustrates the last() function, which returns the number of items in the sequence being filtered. Because a numeric predicate selects the item at that position, $b1[last()] returns the last node in $b1.

for $b in doc("books.xml")//book
let $a := ($b/author)[1],
$sa := ($b/author)[last="Abiteboul"]
where $a << $sa
return $b
In our sample data, there are no such books.
for $l in distinct-values(doc("books.xml")//(author | editor)/last)
order by $l
return <last>{ $l }</last>
Here is the result of the above query:
<last>Abiteboul</last>
<last>Buneman</last>
<last>Gerbarg</last>
<last>Stevens</last>
<last>Suciu</last>
The fact that the union operator always returns nodes in document order is sometimes quite useful. For instance, the following query sorts books based on the name of the first author or editor listed for the book:
for $b in doc("books.xml")//book
let $a1 := ($b/author union $b/editor)[1]
order by $a1/last, $a1/first
return $b
The intersect operator takes two node sequences as operands and returns a sequence containing all the nodes that occur in both operands. The except operator takes two node sequences as operands and returns a sequence containing all the nodes that occur in the first operand but not in the second operand. For instance, the following query returns a book with all of its children except for the price:
for $b in doc("books.xml")//book
where $b/title = "TCP/IP Illustrated"
return
<book>
{ $b/@* }
{ $b/* except $b/price }
</book>
The result of this query contains all attributes of the original book and all elements—in document order—except for the price element, which is omitted:
<book year = "1994">
<title>TCP/IP Illustrated</title>
<author>
<last>Stevens</last>
<first>W.</first>
</author>
<publisher>Addison-Wesley</publisher>
</book>
let $b := doc("books.xml")//book
let $avg := avg( $b//price )
return $b[price > $avg]
For our sample data, Listing 1.18 shows the result of this query.
Listing 1.18 Result of Query for Books More Expensive Than Average
<book year = "1999">
  <title>The Economics of Technology and Content for Digital TV</title>
  <editor>
    <last>Gerbarg</last>
    <first>Darcy</first>
    <affiliation>CITI</affiliation>
  </editor>
  <publisher>Kluwer Academic Publishers</publisher>
  <price>129.95</price>
</book>
Note that price is the name of an element, but avg() is defined for atomic values, not for elements. In XQuery, if the type of a function argument is an atomic type, then the following conversion rules are applied. If the argument is a node, its typed value is extracted, resulting in a sequence of values. If any value in the argument sequence is untyped, XQuery attempts to convert it to the required type and raises an error if it fails. A value is accepted if it has the expected type.
Other familiar functions in XQuery include numeric functions like round(), floor(), and ceiling(); string functions like concat(), string-length(), starts-with(), ends-with(), substring(), upper-case(), lower-case(); and casts for the various simple types. These are all covered in [XQ-FO], which defines the standard function library for XQuery; they need no further coverage here since they are straightforward.
XQuery also has a number of functions that are not found in most other languages. We have already covered distinct-values() and the input functions doc() and collection(). Two other frequently used functions are not() and empty(). The not() function is used in Boolean conditions; for instance, the following returns books where no author’s last name is Stevens:
for $b in doc("books.xml")//book
where not(some $a in $b/author satisfies $a/last="Stevens")
return $b
The empty() function reports whether a sequence is empty. For instance, the following query returns books that have authors, but does not return the one book that has only editors:
for $b in doc("books.xml")//book
where not(empty($b/author))
return $b
The opposite of empty() is exists(), which reports whether a sequence contains at least one item. The preceding query could also be written as follows:
for $b in doc("books.xml")//book
where exists($b/author)
return $b
XQuery also has functions that access various kinds of information associated with a node. The most common accessor functions are string(), which returns the string value of a node, and data(), which returns the typed value of a node. These functions require some explanation. The string value of a node includes the string representation of the text found in the node and its descendants, concatenated in document order. For instance, consider the following query:
string((doc("books.xml")//author)[1])
The result of this query is the string "Stevens W." (The exact result depends on the whitespace found in the original document—we have made some assumptions about what whitespace is present.)
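The dependence on whitespace can be seen in a small Python sketch that models the string value as the concatenation of all descendant text in document order. The two author fragments are hypothetical; they differ only in one space between the child elements.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """Concatenate all descendant text nodes in document order."""
    return "".join(element.itertext())


# Same content, with and without whitespace between the child elements.
compact = ET.fromstring("<author><last>Stevens</last><first>W.</first></author>")
spaced = ET.fromstring("<author><last>Stevens</last> <first>W.</first></author>")

print(repr(string_value(compact)))  # 'StevensW.'
print(repr(string_value(spaced)))   # 'Stevens W.'
```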
for $b in doc("books.xml")/bib/book
where some $ba in $b/author satisfies
($ba/last=$l and $ba/first=$f)
order by $b/title
return $b/title
This code returns the titles of books written by a given author whose first name is bound to $f and whose last name is bound to $l. But you have to read all of the code in the query to understand that. Placing it in a named function makes its purpose clearer:
define function books-by-author($last, $first)
as element()*
{
for $b in doc("books.xml")/bib/book
where some $ba in $b/author satisfies
($ba/last=$last and $ba/first=$first)
order by $b/title
return $b/title
}
XQuery allows functions to be recursive, which is often important because XML itself allows recursive structures. For instance, suppose a book chapter may consist of sections, which may be nested. The query in Listing 1.19 creates a table of contents that contains only the sections and their titles and mirrors the structure of the original document.
Listing 1.19 Query to Create a Table of Contents
define function toc($book-or-section as element())
  as element()*
{
  for $section in $book-or-section/section
  return
    <section>
      {
        $section/@* ,
        $section/title ,
        toc($section)
      }
    </section>
}
<toc>
  {
    for $s in doc("xquery-book.xml")/book
    return toc($s)
  }
</toc>
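The recursion in Listing 1.19 can be sketched in Python as follows. This is only a model of the idea, not a translation of XQuery semantics; the inline document is a hypothetical stand-in for xquery-book.xml, and for simplicity only the first title of each section is copied.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for xquery-book.xml.
BOOK_XML = """<book>
  <title>Data on the Web</title>
  <section id="intro"><title>Introduction</title>
    <p>Some text.</p>
    <section><title>Audience</title><p>Some text.</p></section>
  </section>
  <section id="syntax"><title>Syntax</title><p>Some text.</p></section>
</book>"""


def toc(book_or_section):
    """Return one <section> entry per child section, recursing into nested ones."""
    entries = []
    for section in book_or_section.findall("section"):
        entry = ET.Element("section", dict(section.attrib))  # copy attributes
        title = section.find("title")
        if title is not None:
            ET.SubElement(entry, "title").text = title.text   # copy the title
        entry.extend(toc(section))                            # nested sections
        entries.append(entry)
    return entries


root = ET.Element("toc")
root.extend(toc(ET.fromstring(BOOK_XML)))
print(ET.tostring(root, encoding="unicode"))
```

The paragraph content is dropped, while the nesting of sections is preserved.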
If two functions call each other, they are mutually recursive. Mutually recursive functions are allowed in XQuery.
Variable Definitions
A query can define a variable in the prolog. Such a variable is available at any point after it is declared. For instance, if access to the titles of books is used several times in a query, it can be provided in a variable definition:
define variable $titles { doc("books.xml")//title }
To avoid circular references, a variable definition may not call functions that are defined prior to the variable definition.
Listing 1.20 Module Declaration for a Library Module
module "http://example.com/xquery/library/book"
define function toc($book-or-section as element())
  as element()*
{
  for $section in $book-or-section/section
  return
    <section>
      {
        $section/@* ,
        $section/title ,
        toc($section)
      }
    </section>
}
Functions and variable definitions in library modules are namespace-qualified. Any module can import another module using a module import, which specifies the URI of the module to be imported. It may also specify the location where the module can be found:
import module "http://example.com/xquery/library/book"
at "file:///c:/xquery/lib/book.xq"
The location is not required in an import, since some implementations can locate modules without it. Implementations are free to ignore the location if they have another way to find modules.
import module namespace b = "http://example.com/xquery/library/book"
at "file:///c:/xquery/lib/book.xq"
<toc>
{
for $s in doc("xquery-book.xml")/book
return b:toc($s)
}
</toc>
When a module is imported, both its functions and its variables are made available to the importing module.
define function outtie($v as xs:integer) as xs:integer external
define variable $v as xs:integer external
XQuery does not specify how such functions and variables are made available by the external environment, or how function parameters and arguments are converted between the external environment and XQuery.
define function reverse($items)
{
let $count := count($items)
for $i in 0 to $count - 1
return $items[$count - $i]
}
reverse( 1 to 5)
This function uses the to operator, which generates sequences of integers. For instance, the expression 1 to 5 generates the sequence 1, 2, 3, 4, 5. The reverse function takes this sequence and returns the sequence 5, 4, 3, 2, 1. Because this function does not specify a particular type for its parameter or return, it could also be used to return a sequence of some other type, such as a sequence of elements. Specifying more type information would make this function less useful.
define function is-document-element($e as element())
as xs:boolean
{
if ($e/.. instance of document-node())
then true()
else false()
}
All the built-in XML Schema types are predefined in XQuery, and these can be used to write function signatures similar to those found in conventional programming languages. For instance, the query in Listing 1.21 defines a function that computes the nth Fibonacci number and calls that function to create the first ten values of the Fibonacci sequence.
Listing 1.21 Query to Create the First Ten Fibonacci Numbers
define function fibo($n as xs:integer)
{
  if ($n = 0)
  then 0
  else if ($n = 1)
  then 1
  else (fibo($n - 1) + fibo($n - 2))
}
let $seq := 1 to 10
for $n in $seq
return <fibo n="{$n}">{ fibo($n) }</fibo>
Listing 1.22 shows the output of that query.
Listing 1.22 Results of the Query in Listing 1.21
<fibo n="1">1</fibo>
<fibo n="2">1</fibo>
<fibo n="3">2</fibo>
<fibo n="4">3</fibo>
<fibo n="5">5</fibo>
<fibo n="6">8</fibo>
<fibo n="7">13</fibo>
<fibo n="8">21</fibo>
<fibo n="9">34</fibo>
<fibo n="10">55</fibo>
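As a cross-check of the results in Listing 1.22, the same recursive definition can be written in Python. This is a plain sketch of the algorithm; the formatting of the output elements is done with ordinary string formatting rather than element constructors.

```python
def fibo(n: int) -> int:
    """Naive recursive Fibonacci, mirroring the XQuery function."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibo(n - 1) + fibo(n - 2)


# Equivalent of: for $n in 1 to 10 return <fibo n="{$n}">{ fibo($n) }</fibo>
lines = [f'<fibo n="{n}">{fibo(n)}</fibo>' for n in range(1, 11)]
print("\n".join(lines))
```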
Schemas and Types
On several occasions, we have mentioned that XQuery can work with untyped data, strongly typed data, or mixtures of the two. If a document is governed by a DTD or has no schema at all, then documents contain very little type information, and queries rely on a set of rules to infer an appropriate type when they encounter values at run-time. For instance, the following query computes the average price of a book in our bibliography data:
avg( doc("books.xml")/bib/book/price )
Since the bibliography does not have a schema, each price element is untyped. The avg() function requires a numeric argument, so it converts each price to a double and then computes the average. The conversion rules are discussed in detail in a later section. The implicit conversion is useful when dealing with untyped data, but prices are generally best represented as decimals rather than floating-point numbers. Later in this chapter we will present a schema for the bibliography in order to add appropriate type information. The schema declares price to be a decimal, so the average would be computed using decimal numbers.
define function books-by-author($author)
{
for $b in doc("books.xml")/bib/book
where some $ba in $b/author satisfies
($ba/last=$author/last and $ba/first=$author/first)
order by $b/title
return $b/title
}
Because this function does not specify what kind of element the parameter should be, it can be called with any element at all. For instance, a book element could be passed to this function. Worse yet, the query would not return an error, but would simply search for books containing an author element that exactly matches the book. Since such a match never occurs, this function always returns the empty sequence if called with a book element.
Listing 1.23 Schema Import and Type Checking
import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
default element namespace = "urn:examples:xmp:bib"
define function books-by-author($a as element(author))
  as element(title)*
{
  for $b in doc("books.xml")/bib/book
  where some $ba in $b/author satisfies
    ($ba/last=$a/last and $ba/first=$a/first)
  order by $b/title
  return $b/title
}
In XQuery, a type error is raised when the type of an expression does not match the type required by the context in which it appears. For instance, given the previous function definition, the function call in the following expression raises a type error, since an element named book can never be a valid author element:
for $b in doc("books.xml")/bib/book
return books-by-author($b)
All XQuery implementations are required to detect type errors, but some implementations detect them before a query is executed, and others detect them at run-time when query expressions are evaluated. The process of analyzing a query for type errors before a query is executed is called static typing, and it can be done using only the imported schema information and the query itself; no data is needed for static typing. In XQuery, static typing is an optional feature, but an implementation that supports static typing must always detect type errors statically, before a query is executed.
Listing 1.24 Assigning a Namespace Prefix in Schema Imports
import schema namespace b = "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
define function books-by-author($a as element(b:author))
  as element(b:title)*
{
  for $b in doc("books.xml")/b:bib/b:book
  where some $ba in $b/b:author satisfies
    ($ba/b:last=$a/b:last and $ba/b:first=$a/b:first)
  order by $b/b:title
  return $b/b:title
}
When an element is created, it is immediately validated if there is a schema definition for its name. For instance, the following query raises an error because the schema definition says that a book must have a price:
import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
default element namespace = "urn:examples:xmp:bib"
<book year="1994">
<title>Catamaran Racing from Start to Finish</title>
<author><last>Berman</last><first>Phil</first></author>
<publisher>W.W. Norton &amp; Company</publisher>
</book>
The schema import feature reduces errors by allowing queries to specify type information, but these errors are not caught until data with the wrong type information is actually encountered when executing a query. A query processor that implements the static typing feature can detect some kinds of errors by comparing a query to the imported schemas, which means that no data is required to find these errors. Let’s modify our query somewhat and introduce a spelling error—$a/first is misspelled as $a/firt in Listing 1.25.
Listing 1.25 Query with a Spelling Error
import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
default element namespace = "urn:examples:xmp:bib"
define function books-by-author($a as element(author))
  as element(title)*
{
  for $b in doc("books.xml")/bib/book
  where some $ba in $b/author satisfies
    ($ba/last=$a/last and $ba/first=$a/firt)
  order by $b/title
  return $b/title
}
An XQuery implementation that supports static typing can detect this error, because it has the definition for an author element, the function parameter is identified as such, and the schema says that an author element does not have a firt element. In an implementation that has schema import but not static typing, a query would actually have to call the function before the error would be raised.
However, in the following path expression, only the names of elements are stated:
doc("books.xml")/bib/book
XQuery allows element tests and attribute tests, node tests that are similar to the type declaration used for function parameters. In a path expression, the node test element(book) finds only elements with the same type as the globally declared book element, which must be found in the schemas that have been imported into the query. By using this instead of the name test book in the path expression, we can tell the query processor the element definition that will be associated with $b, which means that the static type system can guarantee us that a $b will contain title elements; see Listing 1.26.
Listing 1.26 Type Tests in Path Expressions
import schema "urn:examples:xmp:bib" at "c:/dev/schemas/eg/bib.xsd"
default element namespace = "urn:examples:xmp:bib"
define function books-by-author($a as element(author))
  as element(title)*
{
  for $b in doc("books.xml")/bib/element(book)
  where some $ba in $b/author satisfies
    ($ba/last=$a/last and $ba/first=$a/first)
  order by $b/title
  return $b/title
}
Sequence Types
The preceding examples include several queries in which the names of types use a notation that can describe the types that arise in XML documents. Now we need to learn that syntax in some detail. Values in XQuery, in general, are sequences, so the types used to describe them are called sequence types. Some types are built in and may be used in any query without importing a schema into the query. Other types are defined in W3C XML Schemas and must be imported into a query before they can be used.
Built-in Types
If a query has not imported a W3C XML Schema, it still understands the structure of XML documents, including types like document, element, attribute, node, text node, processing instruction, comment, ID, IDREF, IDREFS, etc. In addition to these, it understands the built-in W3C XML Schema simple types.
Table 1.4 lists the built-in types that can be used as sequence types.
In the notation for sequence types, occurrence indicators may be used to indicate the number of items in a sequence. The character ? indicates zero or one items, * indicates zero or more items, and + indicates one or more items. Here are some examples of sequence types with occurrence indicators:
element()+ One or more elements
xs:integer? Zero or one integers
document-node()* Zero or more document nodes
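The counting rule behind the occurrence indicators can be sketched in Python. The function name is hypothetical, and the sketch checks only the number of items, not their types. The case with no indicator (exactly one item) is not stated above and is added here for completeness.

```python
def matches_occurrence(items, indicator=""):
    """Check only the cardinality part of a sequence type."""
    count = len(items)
    if indicator == "?":   # zero or one items
        return count <= 1
    if indicator == "*":   # zero or more items
        return True
    if indicator == "+":   # one or more items
        return count >= 1
    if indicator == "":    # no indicator: exactly one item
        return count == 1
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")


print(matches_occurrence([1, 2, 3], "*"))  # True
print(matches_occurrence([], "?"))         # True
print(matches_occurrence([], "+"))         # False
```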
Sequence Type Declaration | What It Matches |
element() | Any element node |
attribute() | Any attribute node |
document-node() | Any document node |
node() | Any node |
text() | Any text node |
processing-instruction() | Any processing instruction node |
processing-instruction("xml-stylesheet") | Any processing instruction node whose target is xml-stylesheet |
comment() | Any comment node |
empty() | An empty sequence |
item() | Any node or atomic value |
QName | An instance of a specific XML Schema built-in type, identified by the name of the type; e.g., xs:string, xs:boolean, xs:decimal, xs:float, xs:double, xs:anyType, xs:anySimpleType |
Listing 1.27 An Imported Schema for Bibliographies
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:bib="urn:examples:xmp:bib"
    targetNamespace="urn:examples:xmp:bib"
    elementFormDefault="qualified">
  <xs:element name="bib">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="bib:book" minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element ref="bib:creator" minOccurs="1" maxOccurs="unbounded"/>
        <xs:element name="publisher" type="xs:string"/>
        <xs:element name="price" type="currency"/>
        <xs:element name="year" type="xs:gYear"/>
      </xs:sequence>
      <xs:attribute name="isbn" type="bib:isbn"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="creator" type="person" abstract="true" />
  <xs:element name="author" type="person" substitutionGroup="bib:creator"/>
  <xs:element name="editor" type="personWithAffiliation" substitutionGroup="bib:creator"/>
  <xs:complexType name="person">
    <xs:sequence>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="first" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="personWithAffiliation">
    <xs:complexContent>
      <xs:extension base="person">
        <xs:sequence>
          <xs:element name="affiliation" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:simpleType name="isbn">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{9}[0-9X]"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="currency">
    <xs:restriction base="xs:decimal">
      <xs:pattern value="\d+.\d{2}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
Here is an example of a bibliography element that conforms to this new definition:
<bib xmlns="urn:examples:xmp:bib">
<book isbn="0201563177">
<title>Advanced Programming in the Unix Environment</title>
<author><last>Stevens</last><first>W.</first></author>
<publisher>Addison-Wesley</publisher>
<price>65.95</price>
<year>1992</year>
</book>
</bib>
We do not teach the basics of XML Schema here; readers who do not know XML Schema should consult the XML Schema Primer [SCHEMA]. However, to understand how XQuery leverages the type information found in a schema, we need to know what the schema says. Here are some aspects of the previous schema that affect the behavior of examples used in the rest of this chapter:
<xs:simpleType name="isbn">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{9}[0-9X]"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="currency">
<xs:restriction base="xs:decimal">
<xs:pattern value="\d+.\d{2}"/>
</xs:restriction>
</xs:simpleType>
<xs:element name="creator" type="person" abstract="true" />
<xs:element name="author" type="person" substitutionGroup="bib:creator"/>
<xs:element name="editor" type="personWithAffiliation" substitutionGroup="bib:creator"/>
Listing 1.28 Content Model for the Book Element
<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element ref="bib:creator" minOccurs="1" maxOccurs="unbounded"/>
      <xs:element name="publisher" type="xs:string"/>
      <xs:element name="price" type="currency"/>
      <xs:element name="year" type="xs:gYear"/>
    </xs:sequence>
    <xs:attribute name="isbn" type="bib:isbn"/>
  </xs:complexType>
</xs:element>
The following elements are globally declared: bib, book, creator, author, editor. The type of the bib and book elements is “anonymous,” which means that the schema does not give these types explicit names.
All of the named types in this schema are global; in fact, in XML Schema, all named types are global.
Now let us explore the sequence type notation used to refer to constructs imported from the above schema. The basic form of an element test has two parameters: the name of the element and the name of the type:
element(creator, person)
To match an element, both the name and the type must match. The name will match if the element’s name is creator or in the substitution group of creator; thus, in the above schema, the names author and editor would also match. The type will match if it is person or any other type derived from person by extension or restriction; thus, in the above schema, personWithAffiliation would also match. The second parameter can be omitted; if it is, the type is taken from the schema definition. Because the schema declares the type of creator to be person, the following declaration matches the same elements as the previous declaration:
element(creator)
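The two-part matching rule for element tests (name via substitution groups, type via derivation) can be modelled with a few dictionaries in Python. This is only an illustrative sketch: the tables are transcribed by hand from the schema in Listing 1.27, and all function and variable names are hypothetical.

```python
# Facts transcribed from the schema in Listing 1.27.
SUBSTITUTION_HEAD = {"author": "creator", "editor": "creator"}
ELEMENT_TYPE = {
    "creator": "person",
    "author": "person",
    "editor": "personWithAffiliation",
}
BASE_TYPE = {"personWithAffiliation": "person"}


def name_matches(name, test_name):
    """True if name is test_name or is in its substitution group."""
    while name is not None:
        if name == test_name:
            return True
        name = SUBSTITUTION_HEAD.get(name)
    return False


def type_matches(type_name, test_type):
    """True if type_name is test_type or is derived from it."""
    while type_name is not None:
        if type_name == test_type:
            return True
        type_name = BASE_TYPE.get(type_name)
    return False


def element_test(name, type_name, test_name, test_type=None):
    """Model of element(test_name, test_type); the type defaults to the schema's."""
    if test_type is None:
        test_type = ELEMENT_TYPE[test_name]
    return name_matches(name, test_name) and type_matches(type_name, test_type)


# An editor element matches element(creator, person) on both name and type.
print(element_test("editor", "personWithAffiliation", "creator", "person"))
# element(creator) takes the type person from the schema declaration.
print(element_test("author", "person", "creator"))
```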
In XML Schema, element and attribute definitions may be local, available only within a specific element or type. A context path may be used to identify a locally declared element or attribute. For instance, the following declaration matches the locally declared price element, which is found in the globally declared book element:
element(book/price)
Although this form is generally used to match locally declared elements, it will match any element whose name is price and which has the same type as the price element found in the globally declared book element. A similar form is used to match elements or attributes in globally defined types:
element(type(person)/last)
The same forms can be used for attributes, except that (1) attributes never have substitution groups in XML Schema; (2) attributes are not nillable in XML Schema; and (3) the attribute name is preceded by the @ symbol in the XQuery syntax. For instance, the following declaration matches attributes named price of type currency:
attribute(@price, currency)
The following declaration matches attributes named isbn of the type found for the corresponding attribute in the globally declared book element:
attribute(book/@isbn)
Table 1.5 summarizes the declarations made available by importing the schema shown in Listing 1.27.
element(n, person nillable)
Sequence Type Declaration | What It Matches |
element(creator, person) | An element named creator of type person |
element(creator) | Any element named creator of type person, the type declared for creator in the schema. |
element(*, person) | Any element of type person. |
element(book/price) | An element named price of type currency—the type declared for price elements inside a book element. |
element(type(person)/last) | An element named last of type xs:string—the type declared for last elements inside the person type. |
attribute(@price, currency) | An attribute named price of type currency. |
attribute(book/@isbn) | An attribute named isbn of type isbn—the type declared for isbn attributes in a book element. |
attribute(@*, currency) | Any attribute of type currency. |
bib:currency | A value of the user-defined type currency |
<n xsi:nil="true" />
import schema namespace bib="urn:examples:xmp:bib"
define function discount-price($b as element(bib:book))
as xs:decimal
{
0.80 * $b//bib:price
}
It might be called in a query as follows:
for $b in doc("books.xml")//bib:book
where $b/bib:title = "Data on the Web"
return
<result>
{
$b/bib:title,
<price>{ discount-price($b) }</price>
}
</result>
In the preceding query, the book element passed to the function exactly matches the declared type of the parameter. XQuery also defines some conversion rules that are applied if the argument does not exactly match the type of the parameter. If the type of the argument does not match and cannot be converted, a type error is raised. One important conversion rule is that the value of an element can be extracted if the expected type is an atomic type and an element is encountered. This is known as atomization. For instance, consider the query in Listing 1.29.
Listing 1.29 Atomization
import schema namespace bib="urn:examples:xmp:bib"
define function discount-price($p as xs:decimal)
  as xs:decimal
{
  0.80 * $p
}
for $b in doc("books.xml")//bib:book
where $b/bib:title = "Data on the Web"
return
  <result>
    {
      $b/bib:title,
      <price>{ discount-price($b/bib:price) }</price>
    }
  </result>
When the typed value of the price element is extracted, its type is bib:currency. The function parameter expects a value of type xs:decimal, but the schema imported into the query says that the currency type is derived from xs:decimal, so it is accepted as a decimal.
In general, the typed value of an element is a sequence. If any value in the argument sequence is untyped, XQuery attempts to convert it to the required type and raises a type error if it fails. For instance, we can call the revised discount-price() function as follows:
let $w := <foo>12.34</foo>
return discount-price($w)
In this example, the foo element is not validated, and contains no type information. When this element is passed to the function, which expects a decimal, the function first extracts the value, which is untyped. It then attempts to cast 12.34 to a decimal; because 12.34 is a legitimate lexical representation for a decimal, this cast succeeds. The last conversion rule for function parameters involves type promotion: If the parameter type is xs:double, an argument whose type is xs:float or xs:decimal will automatically be cast to the parameter type; if the parameter type is xs:float, an argument whose type is xs:decimal will automatically be cast to the parameter type.
import schema namespace bib="urn:examples:xmp:bib"
define function discount-price($p as element(bib:book/bib:price))
as xs:decimal
{
0.80 * $p
}
If the price element had an anonymous type, this would be the only way to indicate a price element of that type. Since our schema says a price element has the type bib:currency, the preceding function is equivalent to this one:
import schema namespace bib="urn:examples:xmp:bib"
define function discount-price($p as element(bib:price, bib:currency))
as xs:decimal
{
0.80 * $p
}
The same conversion rules that are applied to function arguments are also applied to function return values. Consider the following function:
define function decimate($p as element(bib:price, bib:currency))
as xs:decimal
{
$p
}
In this function, $p is an element named bib:price of type bib:currency. When it is returned, the function applies the function conversion rules, extracting the value, which is an atomic value of type bib:currency, then returning it as a valid instance of xs:decimal, from which its type is derived.
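The conversion rules described in this section (casting of untyped values, acceptance of values whose type is derived from the required type, and a type error otherwise) can be modelled roughly in Python. Everything here is a hypothetical stand-in: UntypedAtomic marks a value extracted from an unvalidated element, Currency plays the role of bib:currency, and Python's Decimal plays the role of xs:decimal. Type promotion to float or double is not modelled.

```python
from decimal import Decimal, InvalidOperation


class UntypedAtomic(str):
    """Stand-in for a value extracted from an unvalidated node."""


class Currency(Decimal):
    """Stand-in for bib:currency, which is derived from xs:decimal."""


def to_decimal_argument(value):
    """Apply the conversion rules for a parameter declared as xs:decimal."""
    if isinstance(value, UntypedAtomic):
        # Untyped values are cast to the required type; failure is a type error.
        try:
            return Decimal(str(value))
        except InvalidOperation:
            raise TypeError(f"cannot cast {value!r} to xs:decimal") from None
    if isinstance(value, Decimal):
        # Values of the required type, or of a derived type, are accepted as-is.
        return value
    raise TypeError(f"{type(value).__name__} is not acceptable as xs:decimal")


def discount_price(p):
    return Decimal("0.80") * to_decimal_argument(p)


print(discount_price(UntypedAtomic("12.34")))  # untyped value from <foo>12.34</foo>
print(discount_price(Currency("65.95")))       # typed value of a validated price
```

A value such as UntypedAtomic("abc") makes the cast fail, which the sketch reports as a TypeError, mirroring the type error raised by XQuery.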
xs:date("2000-01-01")
Constructor functions check a value to make sure that the argument is a legal value for the given type and raise an error if it is not. For instance, if the month had been 13, the constructor would have raised an error.
xs:string( 12345 )
Some types can be cast to each other; others cannot. The set of casts that will succeed can be found in [XQ-FO]. Constructor functions are also created for imported simple types; this is discussed in the section on imported schemas.
import schema namespace bib="urn:examples:xmp:bib"
bib:isbn("012345678X")
The constructor functions for types check all the facets for those types. For instance, the following query raises an error because the pattern in the type declaration says that an ISBN number may not end with the character Y:
import schema namespace bib="urn:examples:xmp:bib"
bib:isbn("012345678Y")
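The facet check performed by the bib:isbn constructor can be imitated in Python with a regular expression. This is a sketch, with make_isbn as a hypothetical name; XML Schema patterns must match the whole value, so fullmatch is used rather than search.

```python
import re

# Pattern facet copied from the isbn simple type in the schema.
ISBN_PATTERN = re.compile(r"[0-9]{9}[0-9X]")


def make_isbn(value: str) -> str:
    """Return the value if it satisfies the pattern facet, else raise an error."""
    if ISBN_PATTERN.fullmatch(value) is None:
        raise ValueError(f"{value!r} is not a valid isbn")
    return value


print(make_isbn("012345678X"))  # accepted
try:
    make_isbn("012345678Y")     # rejected: Y is not allowed in the last position
except ValueError as err:
    print("error:", err)
```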
Listing 1.30 Declaring the Type of a Variable
import schema namespace bib="urn:examples:xmp:bib"
for $b in doc("books.xml")//bib:book
let $authors as element(bib:author)+ := $b//bib:author
return
  <result>
    {
      $b/bib:title,
      $authors
    }
  </result>
Since the schema for a bibliography allows a book to have editors but no authors, this query will raise an error if such a book is encountered. If a programmer simply assumed all books have authors, using a typed variable might identify an error in a query.
The instance of Operator
The instance of operator tests an item for a given type. For instance, the following expression tests the variable $a to see if it is an element node:
$a instance of element()
As you recall, literals in XQuery have types. The following expressions each return true:
<foo/> instance of element()
3.14 instance of xs:decimal
"foo" instance of xs:string
(1, 2, 3) instance of xs:integer*
() instance of xs:integer?
(1, 2, 3) instance of xs:integer+
The following expressions each return false:
3.14 instance of xdt:untypedAtomic
"3.14" instance of xs:decimal
3.14 instance of xs:integer
Type comparisons take type hierarchies into account. For instance, recall that the isbn type is derived from xs:string. The following query returns true:
import schema namespace bib="urn:examples:xmp:bib"
bib:isbn("012345678X") instance of xs:string
Listing 1.31 Function Using the typeswitch Expression
define function wrapper($x as xs:anySimpleType)
  as element()
{
  typeswitch ($x)
    case $i as xs:integer
      return <wrap xsi:type="xs:integer">{ $i }</wrap>
    case $d as xs:decimal
      return <wrap xsi:type="xs:decimal">{ $d }</wrap>
    default
      return error("unknown type!")
}
wrapper( 1 )
The case clause tests to see if $x has a certain type; if it does, the case clause creates a variable of that type and evaluates the associated return clause. The error function is a standard XQuery function that raises an error and aborts execution of the query. Here is the output of the query in Listing 1.31:
<wrap xsi:type="xs:integer">1</wrap>
In this example, 1 is both an integer and a decimal, since xs:integer is derived from xs:decimal in XML Schema, so the first matching case clause is evaluated.
Listing 1.32 Using typeswitch to Implement Simple Polymorphism
import schema namespace bib="urn:examples:xmp:bib"
define function pay-creator(
  $c as element(bib:creator),
  $p as xs:decimal)
{
  typeswitch ($c)
    case $a as element(bib:author)
      return pay-author($a, $p)
    case $e as element(bib:editor)
      return pay-editor($e, $p)
    default
      return error("unknown creator element!")
}
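The first-matching-case rule used by typeswitch in Listings 1.31 and 1.32 can be modelled in Python with an ordered list of cases. The classes below are hypothetical stand-ins: Python's int is not derived from its decimal type, so the XML Schema hierarchy (xs:integer derived from xs:decimal) is modelled explicitly.

```python
class XsDecimal:
    """Stand-in for xs:decimal."""

    def __init__(self, value):
        self.value = value


class XsInteger(XsDecimal):
    """Stand-in for xs:integer, which is derived from xs:decimal."""


def wrapper(x):
    # Cases are tried in order; the first one whose type matches wins.
    cases = [(XsInteger, "xs:integer"), (XsDecimal, "xs:decimal")]
    for cls, type_name in cases:
        if isinstance(x, cls):
            return f'<wrap xsi:type="{type_name}">{x.value}</wrap>'
    raise TypeError("unknown type!")  # the default clause


print(wrapper(XsInteger(1)))
print(wrapper(XsDecimal("3.14")))
```

If the two cases were listed in the opposite order, an integer would be reported as xs:decimal, because it matches both and only the first match is used.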
The treat as Expression
The treat as expression asserts that a value has a particular type, and raises an error if it does not. It is similar to a cast, except that it does not change the type of its argument; it merely examines it. Together, treat as and instance of can be used to write the function shown in Listing 1.33, which has the same functionality as the function in Listing 1.32.
Listing 1.33 Using treat as and instance of to Implement Simple Polymorphism
import schema namespace bib="urn:examples:xmp:bib"
define function pay-creator(
  $c as element(bib:creator),
  $p as xs:decimal)
{
  if ($c instance of element(bib:author))
  then pay-author($c treat as element(bib:author), $p)
  else if ($c instance of element(bib:editor))
  then pay-editor($c treat as element(bib:editor), $p)
  else error("unknown creator element!")
}
In general, typeswitch is preferable for this kind of code, and it also provides better type information for processors that do static typing.
Implicit Validation and Element Constructors
We have already discussed the fact that validation of the elements constructed in a query is automatic if the declaration of an element is global and is found in a schema that has been imported into the query. Elements that do not correspond to a global element definition are not validated. In other words, element construction uses XML Schema’s lax validation mode. The query in Listing 1.34 creates a fully validated book element, with all the associated type information.
Listing 1.34 Query That Creates a Fully Validated Book Element
import schema namespace bib="urn:examples:xmp:bib"
<bib:book isbn="0201633469">
  <bib:title>TCP/IP Illustrated</bib:title>
  <bib:author>
    <bib:last>Stevens</bib:last>
    <bib:first>W.</bib:first>
  </bib:author>
  <bib:publisher>Addison-Wesley</bib:publisher>
  <bib:price>65.95</bib:price>
  <bib:year>1994</bib:year>
</bib:book>
Because element constructors validate implicitly, errors are caught early, and the types of elements may be used appropriately throughout the expressions of a query. If the element constructor in Listing 1.34 had omitted a required element or misspelled the name of an element, an error would be raised.
Relational programmers are used to writing queries that return tables with only some columns from the original tables that were queried. These tables often have the same names as the original tables, but a different structure. Thus, a relational programmer is likely to write a query like the following:
import schema namespace bib="urn:examples:xmp:bib"
for $b in doc("books.xml")//bib:book
return
<bib:book>
{
$b/bib:title,
$b//element(bib:creator)
}
</bib:book>
This query raises an error, because the bib:book element that is returned has a structure that does not correspond to the schema definition. Validation can be turned off using a validate expression, as shown in Listing 1.35, which uses skip.
Listing 1.35 Using validate to Disable Validation
import schema namespace bib="urn:examples:xmp:bib"
for $b in doc("books.xml")//bib:book
return
  validate skip
  {
    <bib:book>
      {
        $b/bib:title,
        $b//element(bib:creator)
      }
    </bib:book>
  }
The validate expression can also be used to specify a validation context for locally declared elements or attributes. For instance, the price element is locally declared:
import schema namespace bib="urn:examples:xmp:bib"
validate context bib:book
{
<bib:price>49.99</bib:price>
}
If an element’s name is not recognized, it is treated as an untyped element unless xsi:type is specified. For instance, the following query returns a well-formed element with untyped content, because the bib:mug element is not defined in the schema:
import schema namespace bib="urn:examples:xmp:bib"
<bib:mug>49.99</bib:mug>
A query can specify the type of an element using the xsi:type attribute; in this case, the element is validated using the specified type:
import schema namespace bib="urn:examples:xmp:bib"
<bib:mug xsi:type="xs:decimal">49.99</bib:mug>
If a locally declared element is not wrapped in a validate expression that specifies the context, it will generally be treated as a well-formed element with untyped content, as in the following query:
import schema namespace bib="urn:examples:xmp:bib"
<bib:price>49.99</bib:price>
To prevent errors like this, you can set the default validation mode to strict, which means that all elements must be defined in an imported schema, or an error is raised. This is done in the prolog. The following query raises an error because the bib:price element is not recognized in the global context:
import schema namespace bib="urn:examples:xmp:bib"
validation strict
<bib:price>49.99</bib:price>
The validation mode may be set to lax, which is the default behavior, strict, as shown above, or skip if no validation is to be performed in the query.