Because the document type declaration specifies the root element, this must be the first element the parser encounters. If any element other than the one named in the DOCTYPE line appears first, the document is invalid. Listing 3-1 shows a very simple XHTML 1.0 document. The DOCTYPE is html (not xhtml), so the document body begins with <html ...> and ends with </html>.

Listing 3-1  Simple XHTML 1.0 Document with XML Prolog and Document Body

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
    PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "">
<html xmlns="" xml:lang="en" lang="en">
  <head>
    <title>XHTML 1.0</title>
  </head>
  <body>
    <h1>Simple XHTML 1.0 Example</h1>
    <p>See the <a href="">DTD</a>.</p>
  </body>
</html>

Markup, Character Data, and Parsing

An XML document contains text characters that fall into two categories: they are part either of the document markup or of the data content. The data content is usually called character data, which simply means all text that is not part of the markup. In other words, XML text consists of intermingled character data and markup. Let's revisit an earlier fragment.

<Address>
  <Street>123 Milky Way</Street>
  <City>Columbia</City>
  <State>MD</State>
  <Zip>20777</Zip>
</Address>

The character data comprises the four strings 123 Milky Way, Columbia, MD, and 20777; the markup comprises the start and end tags for the five elements Address, Street, City, State, and Zip. Note that this is similar, but not identical, to what we previously called content. For example, although each chunk of character data is the content of a particular element, the content of the Address element is all of its child elements. We can think of each piece of character data as belonging directly to the element that contains it and indirectly to Address. (In fact, in some