XML/Web Services 100 Interview Questions - Part 1 of 5
Preparing for an interview? Or just want to refresh your XML skills? Here are 100 questions on XML and related technologies.
This first part presents 20 questions to assess the basic concepts. Part II will focus on XML implementation questions specific to the Microsoft platform, Part III on Java XML, Part IV on Web services concepts, and finally Part V on some advanced design/implementation questions.
Please note that the answers and opinions expressed in this article are based solely on our experience. If you disagree or have any questions, we welcome your comments by email.
December 31, 2002
1. What is XML, and why is it gaining such momentum?
XML, or Extensible Markup Language, is an excellent way of representing data in a structured format. Its most popular application is data exchange. Like HTML, XML is a textual, tag-based "markup" language. However, unlike HTML, which has a fixed set of tags and focuses on "presentation", XML has no fixed set of tags and is all about "data". We can create our own tags and write data inside them in an XML document. XML data is structured hierarchically, and many "parsers" are available that make it easy to get at the data values.
In addition to data exchange, XML is used for various other purposes, such as content management, XML-based configuration files, eBusiness, document publishing, application integration, and most notably XML-based messaging, or Web services.
Some of the reasons behind XML's success include:
- The ability to define and use our own tags makes XML "extensible", and self-describing.
- XML's textual nature makes it highly portable, allowing us to send and receive data from one platform to another without platform-specific problems, as long as the document's character encoding is declared correctly.
- The availability of many free XML parsers and processors makes it really easy to create, read, and manage XML documents.
- As mentioned earlier, XML is all about data. Separating "presentation" from the actual "content" has many benefits, including the flexibility of providing/presenting data to multiple destinations/devices.
- The availability of various other standards surrounding XML (such as XSLT, XPath, and XML Schema), and support for these standards in various toolkits/APIs.
- XML is a "standard" backed by the W3C and supported by major vendors.
2. Take any example, such as Employees, or Customers, and write a sample XML document.
<?xml version="1.0" encoding="UTF-8" ?>
<!--ABCCorpUSA Employees -->
The above sample XML document illustrates some good practices, such as writing the XML declaration line with the encoding attribute and using namespaces, and it also shows that you know about comments, attributes, etc.
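A minimal sketch of what such an Employees document might look like follows, embedded in a short Python script that parses it. Only the abc prefix, the Employees and Emp element names, and the EmpID attribute come from this article (see question 13); the namespace URI, the Name and HireDate children, and all values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the namespace URI, child elements and values are
# invented for illustration only.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8" ?>
<!--ABCCorpUSA Employees -->
<abc:Employees xmlns:abc="urn:example:abccorp:employees">
  <abc:Emp EmpID="1001">
    <abc:Name>Jane Doe</abc:Name>
    <abc:HireDate>2001-05-14</abc:HireDate>
  </abc:Emp>
  <abc:Emp EmpID="1002">
    <abc:Name>John Smith</abc:Name>
    <abc:HireDate>2002-11-02</abc:HireDate>
  </abc:Emp>
</abc:Employees>
"""

# Map the prefix used in queries to the namespace URI.
NS = {"abc": "urn:example:abccorp:employees"}

# Parse from bytes so the encoding declaration is honoured by the parser.
root = ET.fromstring(SAMPLE_XML.encode("utf-8"))

# Collect the EmpID attribute of every Emp child of the document element.
emp_ids = [emp.get("EmpID") for emp in root.findall("abc:Emp", NS)]
print(root.tag, emp_ids)
```

The parser reports element names in "{namespace-uri}local-name" form, which is why queries go through the NS prefix map.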
3. What is the difference between a well-formed and a valid XML document?
The W3C, in the XML specification, has defined certain rules that need to be followed while creating XML documents. Examples of such rules include: having exactly one root element, having an end-tag for each start-tag, using single or double quotes for attribute values, and so on. If an XML document follows all these rules, it is said to be a well-formed document, and XML parsers can be used to parse and process it.
Document Type Definitions (DTDs) or XML Schemas can be used to define the structure and content of a specific class of XML documents. This includes the parent-child relationship details, attribute lists, data type information, value restrictions, etc. In addition to the well-formedness rules, if an XML document also follows the rules specified in the associated DTD/Schema, it is said to be a valid XML document. All valid XML documents are well-formed; but the reverse is not always true, that is, well-formed XML documents do not necessarily have to be valid.
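The well-formedness rules can be tried out with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which checks well-formedness only and does not validate against a DTD or Schema (a validating parser is needed for that). The helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One root element, matching tags, quoted attribute value.
good = "<Employees><Emp EmpID='1'>Jane</Emp></Employees>"
# The Emp start-tag has no matching end-tag.
no_end_tag = "<Employees><Emp EmpID='1'>Jane</Employees>"
# More than one root element.
two_roots = "<Emp/><Emp/>"
# Attribute value is not quoted.
unquoted = "<Emp EmpID=1/>"

for label, text in [("good", good), ("no_end_tag", no_end_tag),
                    ("two_roots", two_roots), ("unquoted", unquoted)]:
    print(label, is_well_formed(text))
```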
4. What is the W3C and what is its role in the development of the XML family of technologies?
The World Wide Web Consortium (W3C), founded by Tim Berners-Lee (inventor of the Web), is an Internet standards body that creates recommendations such as HTML and XML. It has fewer than 100 full-time staff and about 500 member organizations, including corporations, government bodies, and universities. A well-defined process is followed to take an idea and turn it into a recommendation, which then requires sufficient vendor and developer support to become a successful Web technology.
W3C has divided the Web-related work into five domains (Source: W3C Web site):
- Architecture Domain: develops the underlying technologies of the Web, such as URI, DOM, XML, Web Services, etc.
- Document Formats Domain: works on formats and languages, such as HTML, CSS, MathML, SVG, etc.
- Interaction Domain: to improve user interaction with the Web, and to facilitate single Web authoring to benefit users and content providers alike; includes speech, multimedia, multimodal, etc.
- Technology and Society Domain: to develop Web infrastructure to address social, legal, and public policy concerns. Activities include Semantic Web (RDF), Platform for Privacy Preferences (P3P), XML Signature (xmldsig), XML Encryption, XML Key Management, etc.
- Web Accessibility Initiative (WAI): to lead the Web to its full potential, which includes promoting a high degree of usability for people with disabilities and for people not familiar with the technology. This group defines guidelines for browsers, multimedia players, etc.
5. If XML is all about "text" data, is it possible to include binary data (such as an image) in an XML document? If yes, how?
Yes, it is possible to include binary data as part of an XML document. The binary data needs to be hex or base64 encoded. Most parsers and XML tools offer encoding/decoding support for binary data. For example, with Microsoft's XML parser (MSXML), if a node's data type is set to bin.base64, the parser automatically does the base64 encoding/decoding.
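The base64 approach can be sketched in Python with the standard base64 and xml.etree.ElementTree modules. Here the encoding and decoding are done by hand rather than by the parser, and the Photo element, its encoding attribute, and the sample bytes are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for real image bytes; includes values that are not legal XML text.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

# Encode the bytes as base64 text and store them as element content.
photo = ET.Element("Photo", {"encoding": "base64"})  # hypothetical names
photo.text = base64.b64encode(raw).decode("ascii")
xml_text = ET.tostring(photo, encoding="unicode")
print(xml_text)

# The receiver parses the XML and decodes the text back into bytes.
decoded = base64.b64decode(ET.fromstring(xml_text).text)
print(decoded == raw)
```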
6. Is it possible to include international characters (such as Japanese or Chinese) inside an XML document? If yes, how?
Yes. XML is a Unicode-based text standard and supports international characters. The encoding attribute in the XML declaration line (<?xml …?>) must match the encoding in which the document is actually saved.
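As a rough illustration, the Python sketch below serializes the same Japanese name twice: once as UTF-8, and once as US-ASCII, where the serializer falls back to numeric character references for characters the encoding cannot represent. Both forms parse back to the same text. The Emp and Name element names and the sample name are assumptions, and the xml_declaration argument of ET.tostring requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee record with a Japanese name.
emp = ET.Element("Emp")
name = ET.SubElement(emp, "Name")
name.text = "山田 太郎"

# Serialized as UTF-8: the characters are written directly as UTF-8 bytes.
utf8_bytes = ET.tostring(emp, encoding="UTF-8", xml_declaration=True)

# Serialized as US-ASCII: characters outside ASCII are written as numeric
# character references of the form &#NNNNN;.
ascii_bytes = ET.tostring(emp, encoding="US-ASCII", xml_declaration=True)

print(utf8_bytes)
print(ascii_bytes)

# Either way, the parser hands back the same Unicode text.
from_utf8 = ET.fromstring(utf8_bytes).find("Name").text
from_ascii = ET.fromstring(ascii_bytes).find("Name").text
print(from_utf8 == from_ascii == name.text)
```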
7. What is the use of Namespaces in XML?
XML does not predefine any set of tag or attribute names. Hence, it is quite possible that two totally different XML documents, defined by two totally different people or companies, use the same tag or attribute name. If an application needs to use (merge, process, etc.) these two XML documents together, this would cause confusion, so we need a way to distinguish deterministically between tags with the same name. XML Namespaces are used for this purpose. A namespace is identified by a URI (Uniform Resource Identifier), which is then associated with element and attribute names, usually through a prefix. Namespaces are also used to "group" a logically related set of XML vocabulary.
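The name clash can be illustrated with a small Python sketch. Two vocabularies both define a title element, one meaning a job title and the other a book title; the namespace URIs, prefixes, and element names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Two vocabularies, both with a "title" element, merged into one document.
MERGED = """
<order xmlns:hr="urn:example:hr" xmlns:inv="urn:example:inventory">
  <hr:title>Sales Manager</hr:title>
  <inv:title>XML Pocket Reference</inv:title>
</order>
"""

# Prefix-to-URI map used for lookups; the prefixes here need not match
# the ones in the document, only the URIs matter.
NS = {"hr": "urn:example:hr", "inv": "urn:example:inventory"}

root = ET.fromstring(MERGED)
job_title = root.find("hr:title", NS).text
book_title = root.find("inv:title", NS).text

# The parser exposes each name qualified by its namespace URI.
tags = [child.tag for child in root]
print(job_title, "|", book_title)
print(tags)
```

Because each element name is qualified by its URI, the two title elements are distinct names to the application even though their local names are identical.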
8. Was XML created to replace HTML? Or will XML ever replace HTML?
It is true that one problem with HTML is that it combines "data" with "presentation" details, but XML was not created to replace HTML. Rather, it was created with a much broader vision: a universal format for structured data. HTML is, and will remain, the primary backbone of the Web. However, the benefits of XML will soon be seen in HTML as well, which is evolving into XHTML 1.0 (HTML 4.01 in XML syntax). So, to answer the question: No, XML was not created to replace HTML, and it will not completely replace HTML; however, it is being used to update HTML for a better Web.
9. What is XHTML?
In simple words, XHTML, or Extensible HTML, is HTML 4 with XML rules applied to it (each start tag must have an end tag, attribute values must be in single or double quotes, etc.). However, the overall vision of XHTML is much more than that. In addition to using XML syntax for HTML, the XHTML effort also encompasses specifications such as XHTML Basic (a minimal set of modules for devices such as PDAs), XForms (the next generation of forms for the Web, separating presentation, logic, and data), XML Events (which gives XML languages the ability to uniformly integrate event listeners and associated event handlers), etc.
10. Was XML created to replace databases, specifically RDBMS? Or will XML ever replace databases?
XML and "data" are often treated as synonymous, but XML was not created to replace traditional RDBMSs (Relational Database Management Systems), such as SQL Server or Oracle. XML is quite popular for "data on the move". XML can be used to "contain", "define", and "transform" data, but it is not designed to offer the benefits that a traditional RDBMS offers, such as security, transaction support, etc. However, for certain applications it makes more sense to store XML "natively". As an alternative to traditional relational databases, native XML database products are available on the market that store and manage XML in its native form, for applications such as content management, search, etc.
11. Was XML created to replace EDI (Electronic Data Interchange)? Or will XML ever replace EDI?
Considering the complexity, ROI, and various other disadvantages of EDI, we think that XML is here to replace EDI (Electronic Data Interchange). We encourage you to read the article XML Set To Change The Face Of E-Commerce.
12. Is it possible to have characters from multiple encodings in a single XML document? If yes, how?
The XML declaration line (and hence the encoding attribute) is optional in an XML document. Without it, the parser uses the first bytes of the document (such as the Unicode byte-order mark) to detect whether the document is UTF-8 or UTF-16. If the document is saved in an encoding other than UTF-8 or UTF-16, the encoding attribute must be defined with the correct value. Once the encoding is declared for a document, it is not possible to redefine it or to include data in another encoding. However, if it is required to represent data from two different encodings, external entities can be used, because each external parsed entity can carry its own encoding declaration.
13. What is XPath?
XML Path Language (XPath) is a W3C specification that defines a syntax for addressing parts of an XML document. The XML document is treated as a logical tree structure, and a path syntax based on that tree is used to address elements and attributes at any level in the document. For example, for the XML document described in the answer to question 2, the XPath expression /abc:Employees/abc:Emp/@EmpID selects the EmpID attribute of every Emp element under the Employees document element; an API that returns a single node gives the first one. XPath is used in various other specifications, such as XSLT.
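The expression above can be approximated in Python. Note that the standard xml.etree.ElementTree module supports only a limited subset of XPath: paths are relative to the element they are called on, and they return elements rather than attribute nodes. The sketch therefore selects the Emp elements and reads EmpID from each. The namespace URI and attribute values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped to match /abc:Employees/abc:Emp/@EmpID.
DOC = """<abc:Employees xmlns:abc="urn:example:abccorp:employees">
  <abc:Emp EmpID="1001"/>
  <abc:Emp EmpID="1002"/>
</abc:Employees>"""

NS = {"abc": "urn:example:abccorp:employees"}
root = ET.fromstring(DOC)  # root is the abc:Employees document element

# Equivalent of /abc:Employees/abc:Emp/@EmpID: select every Emp child of
# the document element, then read its EmpID attribute.
emp_ids = [emp.get("EmpID") for emp in root.findall("abc:Emp", NS)]

# find() returns only the first match.
first_id = root.find("abc:Emp", NS).get("EmpID")

# A predicate narrows the selection by attribute value.
emp_1002 = root.find("abc:Emp[@EmpID='1002']", NS)

print(emp_ids, first_id, emp_1002 is not None)
```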
14. What is XSLT? And what's its use?
XSL Transformations (XSLT) is yet another popular W3C specification. It defines an XML-based syntax for transforming XML documents into other XML, HTML, or any other text format. An XSLT stylesheet is applied to the source XML document to produce the transformed output.
15. How are DTDs/XML Schemas important while building XML applications?
A Document Type Definition (DTD) or XML Schema is used to define the structure and other constraints for XML documents. If an XML document conforms to its associated DTD/Schema, it is said to be a valid XML document, which ensures that the document structure and data adhere to the predefined rules. Once an XML document has been validated, the application using it no longer needs to assert/check parent-child relationships, the presence or absence of elements/attributes, data-value ranges, etc.; the schema validation has already taken care of such issues. If a valid XML document is used as the medium of data transfer between two parties, both can rest assured that the XML document is as expected. In addition, DTDs/Schemas have many other benefits: they document the resulting XML structure and constraints, and defining them is also a good design step.
16. What is DOM?
Document Object Model (DOM) is a W3C specification that defines a standard (abstract) programming API to build, navigate, and update XML documents. It is a "tree-structure-based" interface. DOM parsers (such as MSXML or Xerces) typically load the entire XML document into memory before it can be processed. DOM methods, and in many implementations XPath, are used to navigate freely within the document, and various DOM methods are used to create and update (add elements, delete elements, add/remove attributes, etc.) XML documents.
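As a small illustration of the tree-based model, the sketch below uses xml.dom.minidom, the lightweight DOM implementation in Python's standard library, rather than MSXML or Xerces. The document content is hypothetical.

```python
from xml.dom import minidom

# The whole document is parsed into an in-memory tree first.
doc = minidom.parseString(
    "<Employees><Emp EmpID='1001'>Jane</Emp></Employees>")
root = doc.documentElement

# Add an element with an attribute and a text child.
new_emp = doc.createElement("Emp")
new_emp.setAttribute("EmpID", "1002")
new_emp.appendChild(doc.createTextNode("John"))
root.appendChild(new_emp)

# Navigate to an existing element and remove one of its attributes.
first = root.getElementsByTagName("Emp")[0]
first.removeAttribute("EmpID")

# Serialize the updated tree back to XML text.
result = root.toxml()
print(result)
```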
17. What is SAX?
Simple API for XML (SAX) is an alternative to DOM that can be used to parse XML documents. SAX is based on a streaming model. The SAX parser reads the input XML stream and generates parsing events that an application can handle. With each parsing event, the parser sends sufficient information about the node being parsed. Unlike DOM, SAX does not build an in-memory representation of the source XML document, and hence it is an excellent alternative when parsing large XML documents, as it needs far less memory (and other resources). Also unlike DOM, SAX is not defined or controlled by the W3C. See http://www.saxproject.org/ for details.
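The event-driven style can be sketched with Python's standard xml.sax module. The handler below reacts to start-of-element events and records each EmpID as the parser streams past it; the class name EmpIdHandler and the sample document are made up for illustration.

```python
import xml.sax


class EmpIdHandler(xml.sax.ContentHandler):
    """Collects the EmpID attribute of every Emp element seen."""

    def __init__(self):
        super().__init__()
        self.emp_ids = []

    def startElement(self, name, attrs):
        # Called by the parser each time a start-tag is read.
        if name == "Emp":
            self.emp_ids.append(attrs.get("EmpID"))


handler = EmpIdHandler()
xml.sax.parseString(
    b"<Employees><Emp EmpID='1001'/><Emp EmpID='1002'/></Employees>",
    handler)
print(handler.emp_ids)
```

The application keeps only what the handler stores (here a list of IDs); no document tree is built.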
18. Which is the best API to parse a huge XML document, to get parts of data from the XML document?
If the requirement is to "look into" a huge XML document for some chunk of data, SAX is the better alternative. Loading a huge XML document into a DOM needs a lot of memory and other resources, whereas SAX reads the XML sequentially as a stream and keeps only what the application chooses to keep, and hence is well suited to this case.
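One way to "look into" a large document with SAX is to stop parsing as soon as the wanted chunk has been read. The Python sketch below does this by raising an exception from the handler, which is a common idiom but only one possible approach. The names Found, FindEmp, and find_employee, and the generated test document, are all hypothetical.

```python
import io
import xml.sax


class Found(Exception):
    """Raised by the handler to stop parsing once the data is found."""


class FindEmp(xml.sax.ContentHandler):
    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.inside = False      # True while inside the wanted Emp element
        self.chunks = []         # text may arrive in several pieces
        self.elements_seen = 0   # counts start-tags, to show the early exit

    def startElement(self, name, attrs):
        self.elements_seen += 1
        if name == "Emp" and attrs.get("EmpID") == self.wanted_id:
            self.inside = True

    def characters(self, content):
        if self.inside:
            self.chunks.append(content)

    def endElement(self, name):
        if self.inside and name == "Emp":
            raise Found("".join(self.chunks))


def find_employee(stream, wanted_id):
    """Return (text, start-tags read); text is None if nothing matched."""
    handler = FindEmp(wanted_id)
    try:
        xml.sax.parse(stream, handler)
    except Found as hit:
        return hit.args[0], handler.elements_seen
    return None, handler.elements_seen


# Generated stand-in for a huge document with 10,000 employees.
big = ("<Employees>"
       + "".join(f"<Emp EmpID='{i}'>Name{i}</Emp>" for i in range(10000))
       + "</Employees>")
name, seen = find_employee(io.BytesIO(big.encode("utf-8")), "42")
print(name, seen)
```

Only a small fraction of the start-tags are processed before the search stops, and no tree of the document is ever built.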
19. Why are XML Web services such a hot topic in the industry right now? Is it just hype, or is there real value behind it?
Google (Search Web service), Amazon (for Associates, store building), Microsoft (MapPoint.NET), and many others already offer and support XML Web services. Using XML for Internet-based messaging to integrate applications is a truly valuable technology that offers a good ROI (considering other integration technologies and their interoperability issues) and lowers integration costs. In the last two years, XML Web services have gained good momentum: more and more vendors support them, and surrounding standards have been defined to close the remaining gaps, making XML Web services quite a mature, low-risk technology. We think this trend will continue in the coming years.
20. What are the core protocols/standards behind XML Web services?
XML (for message format), SOAP (for the message envelope), HTTP and others (for transport), WSDL (Web Services Description Language, to describe the Web services and define the contract), and UDDI (Universal Description, Discovery and Integration, to publish and dynamically discover Web services).
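To make the "XML as message format" point concrete, the Python sketch below builds a request message wrapped in a SOAP 1.1 envelope, the envelope format commonly used for XML Web service messages, and then reads it back as a receiver would. Only the SOAP envelope namespace is standard; the service namespace, the GetQuote operation, and the Symbol parameter are hypothetical, and nothing is actually sent over HTTP.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SVC_NS = "urn:example:stockquote"  # hypothetical service namespace
NS = {"soap": SOAP_NS, "svc": SVC_NS}


def build_request(symbol: str) -> bytes:
    """Build the XML message that would be posted to the service over HTTP."""
    ET.register_namespace("soap", SOAP_NS)  # use a readable prefix
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SVC_NS}}}GetQuote")  # hypothetical
    ET.SubElement(operation, f"{{{SVC_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


request_bytes = build_request("MSFT")
print(request_bytes.decode("utf-8"))

# The receiving side parses the envelope and pulls out the parameter.
parsed = ET.fromstring(request_bytes)
symbol = parsed.find("soap:Body/svc:GetQuote/svc:Symbol", NS).text
print(symbol)
```

In a real service, the operation and parameter names would come from the service's WSDL description rather than being chosen by hand.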
Stay tuned for the next set of 20 questions! Happy New Year from all of us at perfectxml.com!
Related link: Top 10 Interview Questions When Hiring XML Developers