Programming Jabber Extending XML Messaging

ISBN: 0596002025
Author(s): DJ Adams
December 2001

Programming Jabber offers developers a chance to learn and understand the Jabber technology and protocol from an implementer's point of view. Each part of the Jabber protocol is introduced, explained, and discussed in detail in the form of mini-projects, or simple and extended examples in Perl, Python, and Java. This book provides the foundation and framework for developers to hit the ground running, and is the essential book on Jabber.


Copyright O'Reilly & Associates, Inc. Used with permission.

Chapter 5

Jabber Technology Basics

Contents:

Jabber Identifiers
Resources and Priorities
XML Streams
Jabber's Building Blocks

One of Jabber's strengths is its simplicity. Neither the technology employed to build Jabber networks nor the protocol used to facilitate conversations within those networks is complicated.

The aim of this chapter is to give you a good grounding in the technology and the protocol. In the Preface we likened Jabber to chess: a small set of rules but boundless possibilities. And, indeed, that is the case. In this chapter we cover identification within Jabber -- how entities are addressed. Related to identity is the concept of resources; we look at how that relates to addressing, as well as its relationship to presence and priority.

The Jabber protocol is in XML, which is streamed between endpoints. We look at the details of these XML streams and see how they're constructed. Comprised of surprisingly few basic elements, the Jabber protocol is small but perfectly formed. Each element of Jabber's protocol will be reviewed in detail.

With this chapter under your belt, your understanding of Jabber fundamentals should be complete. Everything else is strategy, planning, and endgames.

Jabber Identifiers

An entity is anything that can be addressed in Jabber. A server, a component, a user connected with a client -- these are all addressable entities. Every entity is identifiable by a Jabber ID, or JID. These JIDs give these entities their addressability. This is what a typical JID looks like:

[email protected]/Laptop

This JID represents a user, connected to Jabber on a particular client. We can look at this JID in a more abstract way, by identifying its component parts:

username@hostname/resource

The username is separated from the hostname with an @ symbol, and the resource is separated from the hostname with a slash (/).
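The username@hostname/resource layout can be sketched in a few lines of Python. This is only an illustration: the JID type and the parse_jid function are names invented here, and the sketch assumes that only the first slash acts as the separator, so that any later slash stays part of the resource.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    """The three parts of a JID; username and resource may be absent."""
    username: Optional[str]
    hostname: str
    resource: Optional[str]


def parse_jid(text: str) -> JID:
    """Split a JID string into its username, hostname, and resource parts."""
    # Only the first slash separates the resource; later slashes belong to it.
    rest, slash, resource = text.partition("/")
    # The username, when present, is whatever precedes the @ symbol.
    username, at, hostname = rest.rpartition("@")
    if not hostname:
        raise ValueError("a JID needs a hostname part: %r" % text)
    return JID(username if at else None, hostname, resource if slash else None)
```

Called as parse_jid("dj@yak/Laptop"), this returns the three parts separately, while a bare hostname such as "jabber.org" comes back with the username and resource both set to None.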

It's quite likely that the JIDs you may have encountered so far are those representing users' connections, such as the [email protected]/Laptop example. This is not the only sort of entity that JIDs are used to represent. As a Uniform Resource Locator (URL) is fundamental to the Hypertext Transfer Protocol (HTTP), so a JID is fundamental in Jabber. JIDs are used to represent not only users connected to Jabber via their clients, but also every single entity in the Jabber universe that is to be addressed -- in other words, that is to be the potential recipient of a message. Before looking at the restrictions that govern how a JID might be constructed (these restrictions are described in "Rules and Regulations"), let's first look at some examples in which a JID is employed to give entities their addressability:

A Jabber server

A Jabber server is identified by a JID that doesn't contain a username. For basic addressing, the JID is simply the hostname:

jabber.org

To address specific features of the server, a resource is often specified and reflects the feature being addressed:

jabber.org/admin

The JID jabber.org/admin is used by server administrators at jabber.org to obtain a list of online users.

Administrators can send an announcement to all online users on the Jabber server yak by sending a message to the JID:

yak/announce/online

In this case, the resource is announce/online. The first slash in the JID is interpreted as the separator; the second slash is simply part of the resource.

Unique identification of Jabber software

Jabber clients can make a request for information on new versions of themselves by sending a special packet to an update server that manages a software version database. The packet they send is a presence packet (see "Jabber's Building Blocks" later in this chapter for an explanation of packet types) to a JID that takes this form:

[email protected]/1.6.0.3

In this case, the important part of the JID is the hostname (update.jabber.org), which is the Jabber server to which the presence packet is destined. The username (95996702) is used to represent the unique identification of the client software requesting version information, and the resource (1.6.0.3) is set to be the current version of the client software.

A conference room

Jabber has a Conferencing component that provides group chat facilities akin to IRC. Whereas IRC has channels, the Conferencing component offers rooms. These rooms are addressed with JIDs in this form:

[email protected]

The room name is specified in the username portion of the JID, and the hostname reflects the address of the conferencing component.



Browsing entities

Browsing is a powerful hierarchical navigation and entity discovery feature in Jabber. When a browse request is sent to an entity, that entity may return various pieces of information that reflect its component parts -- how it's made up, what services it offers, what features it has, and so on.

The browse request is addressed to the entity via its JID, and the component parts that are returned in response are all identified with JIDs too. If we address a browse request to the JID yak/admin, we receive a list of online users. This is shown in Example 5-1.

Example 5-1. Querying the server yak for online users

SEND: <iq type='get' to='yak/admin'>
        <query xmlns='jabber:iq:browse'/>
      </iq>

RECV: <iq type='result' to='dj@yak/console' from='yak/admin'>
        <item name='Online Users (seconds, sent, received)'
              xmlns='jabber:iq:browse' jid='yak/admin'>
          <user name='dj (548, 18, 15)' jid='dj@yak'/>
          <user name='john (535, 11, 13)' jid='john@yak'/>
          <user name='jim (488, 15, 17)' jid='jim@yak'/>
        </item>
      </iq>

A further example of browsing is shown in Example 5-2, where a conference service running on the jabber.org server is queried for information.

Example 5-2. Querying a conference service

SEND: <iq to='conference.jabber.org' type='get'>
        <q xmlns='jabber:iq:browse'/>
      </iq>

RECV: <iq type='result' to='[email protected]/telnet'
          from='conference.jabber.org'>
        <conference xmlns='jabber:iq:browse'
          type='public' name='Jabber.org Conferencing Center'>
          <conference jid='[email protected]'
                      type='public' name='Assistance Zone (2)'/>
          <conference jid='[email protected]'
                      type='public' name='Development Room (14)'/>
          <conference jid='[email protected]'
                      type='public' name='Users Area (6)'/>
          <conference jid='[email protected]'
                      type='public' name='General Chat (1)'/>
          ...
        </conference>
      </iq>

The JID yak/admin in Example 5-1 represents an administrative function in the Jabber server yak; it identifies the place -- the service entry point -- by Jabber address, from which this information can be retrieved.

Example 5-1 shows how pervasive the JID is as a mechanism for identification within Jabber. How we might use the information returned to us is not relevant at this point; the key thing to note is that the hooks used in conversations to jump from one point to another, to refer to other entities -- services, users, transports, call-hooks into a server to obtain specific information -- take the form of JIDs. Each to, from, and jid attribute value in the example is a JID.
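To make the point concrete, here is a small Python sketch that walks the response from Example 5-1 and collects every attribute value that carries a JID. It assumes that the to, from, and jid attributes are the ones of interest; the collect_jids function is a name made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The response from Example 5-1, reproduced as a string.
BROWSE_RESULT = """
<iq type='result' to='dj@yak/console' from='yak/admin'>
  <item name='Online Users (seconds, sent, received)'
        xmlns='jabber:iq:browse' jid='yak/admin'>
    <user name='dj (548, 18, 15)' jid='dj@yak'/>
    <user name='john (535, 11, 13)' jid='john@yak'/>
    <user name='jim (488, 15, 17)' jid='jim@yak'/>
  </item>
</iq>
"""


def collect_jids(xml_text):
    """Return every to, from, and jid attribute value, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for element in root.iter():
        for name in ("to", "from", "jid"):
            if name in element.attrib:
                found.append(element.attrib[name])
    return found
```

Running collect_jids(BROWSE_RESULT) yields the addressee, the sender, the service entry point, and one JID per online user.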

NOTE: This administrative information about online users on a Jabber server can be retrieved by sending the IQ-get element shown in the example. However, the information is forthcoming only if the user making the request -- sending the IQ-get element -- is the administrative user. See the section "Administration" in Chapter 4 for details on administrative users.

Taking another example from the conferencing area, JIDs are used to represent those present in a room in an abstracted way. Each room participant has an identity specific to that room, for example:

jdev@conference.jabber.org/bd9505f766f98bd559d4c2d8a9d5ae78e3a7bbf5

As before, the room itself is represented by the username and hostname parts of the JID -- in this case, it's the Jabber developers room (jdev) hosted on conference.jabber.org. The resource is the long hex number that represents an individual room participant. It's a hexadecimal SHA-1 message digest of the participant's JID, designed to be unique and calculated and assigned by the conferencing component as a user enters the room.[1]

[1]This is to shield the participant's real identity, which is the default setting for a conference room.
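As a rough illustration of how such a resource could be derived, the following Python sketch hashes a participant's JID with SHA-1 and appends the hexadecimal digest to the room's JID. Exactly which string the conferencing component feeds into the digest is not specified here, so hashing the bare JID text is an assumption, as is the function name room_participant_jid.

```python
import hashlib


def room_participant_jid(room_jid, participant_jid):
    """Build a room-specific JID whose resource hides the real JID."""
    # Assumption: the digest input is simply the participant's JID as text.
    digest = hashlib.sha1(participant_jid.encode("utf-8")).hexdigest()
    return "%s/%s" % (room_jid, digest)
```

The resulting resource is always 40 hexadecimal characters long, the same shape as the jdev example above.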

In the client software identification example of a JID being used to carry software version information, we have a presence packet addressed to a JID using the following form:

[email protected]/1.6.0.3

But why doesn't the presence packet end up getting sent to a user called 959967024? The short answer is because the Jabber Session Manager (JSM) component isn't running at update.jabber.org.

Instead, the server is running a special component that provides a version information service and has no concept of user sessions as such. This component receives the presence packet -- which doesn't go any further (i.e., it isn't passed on to somewhere else) -- and then inspects the username and resource before performing the database lookup to see if their software needs to be updated.

So we see that just because a JID might have something defined for the username part, it doesn't necessarily mean there's a user at the end of the line. It just serves as a carrier of unique information embedded in the JID to whichever component is listening for packets to the hostname.

Components, Hostnames, and Users

As you can see, JIDs are flexible identifiers used throughout Jabber to give addressability to various entities. In the context of the JSM and user management, the address structure username@hostname has many parallels with email addressing, and indeed not without reason. In the context of individual users, an email address represents a user on a specific email server. This server is the user's "home," the mailbox to which everything addressed to the user's email address is routed. Different email users have different home mailboxes. In the same way, the JIDs of different Jabber users reflect each user's home Jabber server, to which everything addressed to his JID is routed. A message addressed by a user based on one Jabber server to a user based on another Jabber server is automatically routed from the one server to the other.

Rules and Regulations

A JID must contain a hostname part to be valid. The username and resource parts are optional; circumstance and usage dictate when either of these parts is necessary. A username is specific to the hostname that it's paired up with. For example: [email protected] is not the same as [email protected].

There are some restrictions on how each JID part is composed; Table 5-1 details these restrictions. Although you can be particular about the case of letters in a username, any operations (such as comparisons) at the Jabber server are case-insensitive. For example, if a user has registered dj as his username, then another user cannot register with the username DJ. However, the person who registered as dj can connect and send DJ when he authenticates, and for the duration of that session will be known as DJ not dj.

On the other hand, resources are case-sensitive.

Table 5-1. JID restrictions

JID part | Restrictions
username | A username can be up to 255 characters in length and may not contain any ASCII character under 33 (decimal),[2] nor can it contain any of the characters :, @, /, ", or '; also, whitespace (tabs, newlines, and carriage returns) and control characters are forbidden.
hostname | The same restrictions apply here as for normal DNS hostnames.
resource | There are no restrictions for the resource part of a JID.

[2]That is, it may not contain spaces or those considered to be control characters.
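The username rules in Table 5-1 translate into a short check. The Python sketch below is one reading of the table, not the server's actual validation code; the names is_valid_username and same_username are invented for the illustration, and the case-insensitive comparison follows the dj/DJ example given earlier.

```python
# Characters that Table 5-1 forbids in a username, besides ASCII below 33.
FORBIDDEN_CHARACTERS = set(":@/\"'")


def is_valid_username(username):
    """Check a username against the restrictions listed in Table 5-1."""
    if not 0 < len(username) <= 255:
        return False
    for character in username:
        # ASCII values under 33 cover the space and the control characters.
        if ord(character) < 33 or character in FORBIDDEN_CHARACTERS:
            return False
    return True


def same_username(first, second):
    """Usernames are compared without regard to case; resources are not."""
    return first.lower() == second.lower()
```

With these definitions, dj passes the check, "d j" and dj@yak do not, and same_username("dj", "DJ") is true, which is why DJ cannot be registered once dj exists.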

Resources and Priorities

In the previous section, we saw how the resource is used to "qualify" certain queries to a servername, to hold information such as version numbers, and to represent users in a conference room. However, the resource is traditionally seen as a way of making a distinction between simultaneous connections by a user to the same Jabber server. For example, if you connect to a Jabber server using the same username and password on three different machines (or resources), the Jabber server will look at the resource part of the JID to determine which client to route messages to.

For the purpose of this example, let's say that the three resources are a workstation, a laptop, and a PDA. Each client is connected to the same Jabber server, so the resource part of the JID can be used to distinguish between the three connections. Resources could equally be used to distinguish between three connections coming from the same client host.

The classic explanation serves us well here: In a work situation, I might be connected to my Jabber server using a client on my workstation. I might also be connected, with the same username, to my Jabber server on my laptop that's sitting next to my workstation. Furthermore, I might have a handheld device that runs a small Jabber client that I'm connected with, too.

On each client machine, I'm connecting using the same credentials (username and password) to the same Jabber server. So the resource part of a JID can be used to distinguish between my three connections. In this example, the three "resources" are my workstation, laptop, and handheld.

The resource part of a JID allows a user to be connected to Jabber (specifically the JSM, which manages users and sees user sessions as separate entities) multiple times.

Then the question becomes, what happens when someone sends you a message? To which client is the message sent?

This is where the concept of connection priority comes to our aid. Each Jabber client connection can be given a priority. When a user has more than one concurrent connection to a Jabber server, the priority is used to determine to which connection any messages intended for that user should be sent. IQ elements are resource-bound, that is, they are addressed to specific resources. In that sense, they are not affected by priority. (See "Jabber's Building Blocks" later in this chapter for information on the types of packets that are sent and received.) The connection with the highest priority value is the connection to which the messages are sent (a priority value must be a positive integer; it cannot be 0 or less).

Figure 5-1 shows priority in action. In this example, Sabine's message is sent to the Jabber client on the Desktop, as it has a higher priority. Note that with Jabber priority, 1 has a lower priority than, say, 5. The higher the number, the higher the priority.

In the event that there's a priority tie, the most recent connection to the Jabber server wins. For example, if DJ connects to the server first from a client running on his Laptop and then again later with a client running on his Desktop system, and both clients have their priority set to a value of 1, the client running on his Desktop would win and receive the incoming messages.

It is also possible to direct messages to a particular client. Taking the example from Figure 5-1, if sabine@yak were to specify dj@yak/Laptop instead of dj@yak as the recipient for a message, her message would go to Client 1 (Laptop), not Client 2 (Desktop), despite Client 1's lower priority value.
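The delivery rules just described (highest priority wins, the most recent connection breaks a tie, and an explicit resource overrides both) can be sketched in Python as follows. This is a model of the rules as stated, not the JSM's implementation. The Session class, its connected_at counter, and the pick_session function are all hypothetical, and returning None when a named resource isn't connected is a placeholder, since that case isn't covered here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:
    resource: str
    priority: int
    connected_at: int  # hypothetical counter: a larger value means a later connection


def pick_session(sessions: List[Session],
                 resource: Optional[str] = None) -> Optional[Session]:
    """Choose the connection that should receive a message."""
    if resource is not None:
        # A message addressed to a full JID goes to that exact resource.
        for session in sessions:
            if session.resource == resource:
                return session
        return None  # placeholder: this case is not described in the text
    if not sessions:
        return None
    # Highest priority wins; on a tie, the most recent connection wins.
    return max(sessions, key=lambda s: (s.priority, s.connected_at))
```

With a Laptop session at priority 1 and a Desktop session at priority 5, pick_session returns the Desktop session for a message to dj@yak, and the Laptop session when the resource Laptop is given explicitly.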

Figure 5-1. Resources, priority, and message delivery
NOTE: In the upcoming server Version 1.4.2, this resource-based routing facility has been made more flexible. Rather than relying upon exact matches (such as Laptop or Desktop), messages will be routed based on a subset of the resource value. For example, if logged in with the JID sabine@yak/Laptop, sabine would receive messages addressed to sabine@yak/Laptop/foo, sabine@yak/Laptop/bar, and so on. This allows clients to do flexible routing and delivery, based on the resource detail, once they've received the messages.

Priorities are specified when a user sends presence information. We will see this later in "Jabber's Building Blocks." It makes sense for the priority to be associated with a user's presence, rather than a user's client connection. For example, if the priority was specified at connection time, the user would have to disconnect and reconnect if she wanted to change priority. As it stands, she just has to send presence information containing a new priority value to change it. Figure 5-2 shows a WinJab client pop-up window used to change presence information. The value of the current priority can be changed.

Figure 5-2. Changing presence and priority in the WinJab client
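What a client sends when the user changes priority can be sketched in Python. The sketch assumes the presence element carries the priority in a <priority/> child tag, alongside optional <show/> and <status/> children, as the presence element is described later in the chapter. The presence_with_priority function is a name invented here, and the positive-integer check simply mirrors the rule stated above.

```python
import xml.etree.ElementTree as ET


def presence_with_priority(priority, show=None, status=None):
    """Build a presence fragment that sets a new priority value."""
    if priority < 1:
        # Mirrors the rule given in this section: 0 or less is not allowed.
        raise ValueError("priority must be a positive integer")
    presence = ET.Element("presence")
    if show is not None:
        ET.SubElement(presence, "show").text = show
    if status is not None:
        ET.SubElement(presence, "status").text = status
    ET.SubElement(presence, "priority").text = str(priority)
    return ET.tostring(presence, encoding="unicode")
```

Sending the string returned by presence_with_priority(5) over an existing connection would be enough to raise that connection's priority, with no need to disconnect and reconnect.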

XML Streams

By now, you should already know that Jabber relies heavily on XML. XML courses through Jabber's veins; data sent and received between entities, and internally within the server itself, is formatted in XML packets.

However, the XML philosophy goes further than this. A connection between two Jabber endpoints, say, a client and a server, is made via a TCP socket, and XML is transferred between these endpoints. However, it's not just random fragments of XML flowing back and forth. There is a structure, a choreography, imposed upon that flow. The entire conversation that takes place between these two endpoints is embodied in a pair of XML documents.

The Conversation as XML Documents

The conversation is two-way, duplexed across a socket connection. On one side, the client sends an XML document to the server. On the other side, the server responds by sending an XML document to the client. Figure 5-3 shows the pair of XML documents being streamed across the TCP socket connection between client and server, over time.

Figure 5-3. A conversation between client and server as a pair of streamed XML documents

But what do we mean when we say that the conversation is an XML document? To answer this, consider this simple XML document:

<?xml version="1.0"?>
<roottag>
  <fragment1/>
  <fragment2/>
  <fragment3/>
  ...
  <fragmentN/>
</roottag>

The document starts with an XML declaration:

<?xml version="1.0"?>

which is immediately followed by the opening root tag. This root tag is significant because there can be only one (and, of course, its corresponding closing tag) in the whole document. In effect, it wraps and contextualizes the content of the document:

<roottag>
  ...
</roottag>

The real content of the document is made up of the XML fragments that come after the opening root tag:

<fragment1/>
<fragment2/>
<fragment3/>
...
<fragmentN/>

So, taking a connection between a Jabber client and a Jabber server as an example, this is exactly what we have. The server is listening on port 5222 for incoming client-initiated connections. Once a client has successfully connected to the Jabber server, it sends an XML declaration and the opening root tag to announce its intentions to the server, which in turn responds by sending an XML declaration and opening root tag of its own.

From then on, every subsequent piece of data that the client sends to the server over the lifetime of the connection is an XML fragment (<fragmentN/>). The connection can be closed by the client by sending the matching closing root tag. Of course, the connection can be also closed by the server by sending the closing root tag of its XML document.

The fragments sent within the body of the XML document are the XML building blocks on which Jabber solutions are based. These XML building blocks are introduced and examined later in the chapter in "Jabber's Building Blocks."

Suffice it to say here that these fragments can come in any order within the body of the XML document, precisely because they're in the body. As long as an XML document has a root tag, and the fragments themselves are well-formed, then it doesn't matter what the content is. Because of the way the document is parsed -- in chunks, as it appears -- it doesn't matter if the fragments appear over a long period, which is the case in a client/server connection where messages and data are passed back and forth over time.

It should be fairly easy now to guess why this section (and the technique) is called XML Streams. XML is streamed over a connection in the form of a document and is parsed and acted upon by the recipient in fragments, as they appear.
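Parsing a document in fragments, as they appear, can be tried out with Python's standard library. The sketch below is not how any particular Jabber server or client does it; it uses xml.etree.ElementTree.XMLPullParser, and the iter_fragments function and the sample chunks are made up for the illustration. Data is fed in arbitrary pieces, and each child of the root tag is handed over as soon as its closing tag has arrived, without waiting for the root tag to close.

```python
import xml.etree.ElementTree as ET


def iter_fragments(chunks):
    """Yield each complete child of the root tag as soon as it has arrived."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    for chunk in chunks:
        parser.feed(chunk)
        # Recent Expat versions may defer parsing of small chunks;
        # flush(), where this Python version has it, asks for events now.
        flush = getattr(parser, "flush", None)
        if flush is not None:
            flush()
        for event, element in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:
                    # An element directly below the root has just closed.
                    yield element


# Chunk boundaries fall in the middle of tags, as they might on a socket.
SAMPLE_CHUNKS = [
    "<?xml version='1.0'?><roottag><frag",
    "ment1/><fragment2>hi</frag",
    "ment2>",
    "<fragment3/>",
]
```

Iterating over iter_fragments(SAMPLE_CHUNKS) produces the three fragments in order, even though the closing root tag is never sent, which is the situation during a live connection.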

The Opening Tag

Earlier, we said that the opening document tag was used by the client to "announce its intentions." The following is a typical opening document tag from a Jabber client that has made a socket connection to port 5222 on the Jabber server jabber.org:

<stream:stream
    xmlns:stream="http://etherx.jabber.org/streams"
    to="jabber.org"
    xmlns="jabber:client">

There are four parts to this opening tag:

The <stream:stream> tag

Every streaming Jabber XML document must start, and end, with a <stream:stream> tag, qualified with the stream namespace.



The stream namespace declaration

xmlns:stream="http://etherx.jabber.org/streams"

The declaration of the stream namespace also comes in the opening stream tag. It refers to a URL (http://etherx.jabber.org/streams), which is a fixed value and serves to uniquely identify the stream namespace used in the XML document (rooted with <stream:stream>) that is streamed over a Jabber connection.

The namespace qualifies only the tags that are prefixed stream:. Apart from stream, there is one other tag name used in these documents that is qualified by this namespace, and that is error. The <stream:error/> tag is used to convey Jabber XML stream connection errors, such as premature disconnection, invalid namespace specifications, incomplete root tag definitions, a timeout while waiting for authentication to follow the root tag exchange, and so on.

The to attribute

to="jabber.org"

There is a to attribute that specifies to which Jabber server the connection is to be made and where the user session is to be started and maintained.

We've already specified the jabber.org hostname, representing our Jabber server, when defining the socket connection (jabber.org:5222), so why do we need to define it again here? As indicated by the to attribute, you can see that we've made a physical connection to the jabber.org host. However, there may be a choice of logical hosts running within the Jabber server to which our client could connect.

When making the physical connection from our client to the Jabber server, we defined the hostname jabber.org for our socket connection (to jabber.org:5222). Now that we're connected, we're specifying jabber.org again as the logical host to which we want to connect inside Jabber. This is the logical host identity within the Jabber server running on the jabber.org host.

This "repeat specification" is required, because there's a difference between a physical Jabber host and a logical Jabber host. In the section "Server Constellations" in Chapter 4, we saw how a single Jabber server can be set up to service user sessions (with one or more JSMs) that are each identified with different logical hostnames. This is where the physical/logical hostname distinction comes from and why it's necessary to specify a name in the root <stream:stream> tag's to attribute.

It just so happens that in the example of an opening tag we've used, the logical hostname is the same as the physical one -- jabber.org. This will be the most common case. However, an Internet Service Provider (ISP), for example, may wish to offer Jabber services to its customers and dedicate a single host for that purpose. That host has various DNS names, which all resolve to that same host IP address. Only one Jabber server is run on that host. (If a second server were to be installed, it would have to listen on different -- nonstandard -- ports, which would be less than ideal.) To reflect the different names under which it would want to offer Jabber services, it would run multiple JSMs under different logical names (using different values for each <host/> configuration tag, as explained in the section "A Tour of jabber.xml" in Chapter 4). When connecting to that Jabber server, it may well be that the logical name specified in the opening tag's to attribute would be different from the physical name used to reach the host in the first place.

The namespace of the conversation

xmlns="jabber:client"

In addition to the namespace that qualifies the stream and error tag names, which could be seen as representing the "outer shell" of the document, the xmlns attribute specifies a namespace that will qualify the tags in the body of the document, the conversation fragments of XML that will appear over time. This namespace is jabber:client and signifies that the type of conversation that is about to ensue over this document connection is a Client (to Server) conversation.

This namespace specification is required because a client connection is just one type of connection that can be made with a Jabber server, and different connections carry conversations with different content. Table 5-2 lists the conversation namespaces currently defined in the Jabber protocol.

Table 5-2. Conversation namespaces

Namespace | Description
jabber:client | This is the namespace that qualifies a connection between a Jabber client and a Jabber server.
jabber:server | This namespace qualifies a connection between two Jabber servers. Dialback (host verification mechanism) conversations also take place within the jabber:server namespace.
jabber:component:accept | When an external program connects to a Jabber server via a TCP socket connection, this namespace is used to qualify the pair of XML documents exchanged over the connection.
jabber:component:exec | When an external program connects to a Jabber server via a STDIO connection, this namespace is used to qualify the pair of XML documents exchanged over such a connection.[3]

[3]For more details on external program connections to Jabber, see Chapter 4.
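Putting the four parts of the opening tag together, a client might assemble it along the following lines. This Python sketch is an illustration only: the opening_tag function is a made-up name, and it models the client case shown above. Whether the other connection types in Table 5-2 use exactly the same attributes is not covered here, so accepting their namespaces is an assumption.

```python
from xml.sax.saxutils import quoteattr

STREAM_NAMESPACE = "http://etherx.jabber.org/streams"

# The conversation namespaces listed in Table 5-2.
CONVERSATION_NAMESPACES = (
    "jabber:client",
    "jabber:server",
    "jabber:component:accept",
    "jabber:component:exec",
)


def opening_tag(logical_host, namespace="jabber:client"):
    """Build the opening <stream:stream> tag for a given logical host."""
    if namespace not in CONVERSATION_NAMESPACES:
        raise ValueError("unknown conversation namespace: %r" % namespace)
    # quoteattr() escapes the value and wraps it in quotes.
    return "<stream:stream xmlns:stream=%s to=%s xmlns=%s>" % (
        quoteattr(STREAM_NAMESPACE),
        quoteattr(logical_host),
        quoteattr(namespace),
    )
```

Note that the tag is deliberately left unclosed. The matching </stream:stream> is sent only when the connection is to be ended.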

The Response

To complete our initial look at XML streams in a Jabber client/server conversation, let's have a look at what the Jabber server might send in response to the opening tag from the client:

<stream:stream
    xmlns:stream='http://etherx.jabber.org/streams'
    id='3AFD6862'
    xmlns='jabber:client'
    from='jabber.org'>

There are a couple of differences between this opening tag from the server and the opening tag from the client -- that is, above and beyond the fact that this response's opening tag is for a document that is going to be streamed along the socket in the opposite direction from that of the document to which the request's opening tag belongs. The first difference is that there's a from attribute instead of a to attribute. The second difference is that there's an extra attribute -- id. Let's look at these in turn.

The from attribute

The from attribute is fairly straightforward; it normally serves to confirm to the client that the requested logical host is available. For example:

from="jabber.org"

If the host is available, the value of the from attribute from the server will match the value of the to attribute from the client. However, in some circumstances, the value can be different. The value sent in the from attribute is a redirection, or respecification, of the logical host by which the Jabber server (or more specifically the JSM component within the Jabber server) is actually known.

Logical host aliases can be defined in the Jabber server's configuration to "convert" a hostname specified in the incoming to attribute. The <alias/> tag, which is used to define these logical host aliases, is described in the section "Component Instance: c2s" in Chapter 4. But how are these hostname conversions used? Here's an example.

Let's say that you're running a Jabber server on an internal network that doesn't have an available DNS server. The host where the Jabber server runs is called apollo, and its IP address is 192.168.1.4. Some people will connect to the host via the hostname because they have it defined in a local /etc/hosts file; others will connect via the IP address. Normally, the hostname (or IP address) specified in the connection parameters given to a Jabber client will be:

  • Used to build the socket connection to the Jabber server.

  • Specified in the to attribute in the opening XML stream to specify the logical host.

If the JSM section of the Jabber server is defined to have a hostname of apollo:

<host><jabberd:cmdline flag='h'>apollo</jabberd:cmdline></host>

then we need to make sure that the Jabber client uses that name when forming any JIDs for that Jabber server (e.g., the JID apollo used as an addressee for an IQ browse request). Having this:

<alias to='apollo'>192.168.1.4</alias>

in our c2s instance configuration would mean that any incoming XML stream header with a value of 192.168.1.4 in the to attribute:

<stream:stream
    to="192.168.1.4"
    xmlns="jabber:client"
    xmlns:stream="http://etherx.jabber.org/streams">

would elicit the following response:

<stream:stream
    from='apollo'
    id='1830EF6A'
    xmlns='jabber:client'
    xmlns:stream='http://etherx.jabber.org/streams'>

which effectively says: "OK, you requested 192.168.1.4, but please use apollo instead." The client should use the value "confirmed" in the from attribute when referring to that Jabber server in all subsequent stream fragments. That is to say, when wanting to address the server, instead of sending something like this:

SEND: <iq type='get' to='192.168.1.4'>
        <query xmlns='jabber:iq:version'/>
      </iq>

it should address it like this:

SEND: <iq type='get' to='apollo'>
        <query xmlns='jabber:iq:version'/>
      </iq>

Not specifying an <alias/> tag in this example would result in problems for the client. Without any way of checking and converting incoming hostnames, the c2s component will by default simply transfer the value from the to attribute to the from attribute in its stream header reply.

Following this thread to its natural conclusion, it's worth pointing out that if we have an alias specification like this:

<alias to='apollo'/>

then the value of the from attribute in the reply will always be set to apollo regardless of what's specified in the to attribute. This means that the to attribute could be left out of the opening stream tag. Although this serves well to illustrate the point, it is not good practice.
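The three behaviors described in this section (a specific alias, a catch-all alias, and no alias at all) can be summarized in a few lines of Python. This is a model of the described behavior, not the c2s component's code. The reply_from function and its parameters are hypothetical, and letting a specific alias take precedence over a catch-all one is an assumption.

```python
def reply_from(requested_to, host_aliases, default_host=None):
    """Work out the from value for the server's stream header reply.

    host_aliases maps an incoming to value to the logical host name,
    as in <alias to='apollo'>192.168.1.4</alias>.
    default_host models a catch-all alias such as <alias to='apollo'/>.
    """
    if requested_to in host_aliases:
        return host_aliases[requested_to]
    if default_host is not None:
        return default_host
    # No alias applies: the to value is simply echoed back as from.
    return requested_to
```

With host_aliases set to {"192.168.1.4": "apollo"}, a client that asked for 192.168.1.4 is told to use apollo, while a client that asked for any other name gets that name echoed back unchanged.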

The id attribute

The id attribute is the ID of the XML stream and is used in the subsequent authorization steps, which are described in Chapter 7. For example:

id='3AFD6862'

The value is a random hexadecimal string generated by the server and is not important per se. What is important is that it's a value that is random and shared between server and client. The server knows what it is because it generated it, and the client knows what it is because the server sends it in the opening tag of the response.
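A client needs to pick the id and from values out of the server's opening tag, which is awkward with an ordinary XML parser because the tag is never closed at this point. One way to do it in Python, sketched here with the invented function name read_stream_header, is to use a pull parser and stop at the first start event.

```python
import xml.etree.ElementTree as ET


def read_stream_header(header_text):
    """Return the tag name and attributes of an opening stream tag."""
    parser = ET.XMLPullParser(events=("start",))
    parser.feed(header_text)
    for _event, element in parser.read_events():
        # The first start event belongs to the root tag, <stream:stream>.
        return element.tag, dict(element.attrib)
    raise ValueError("no complete opening tag received yet")
```

Fed the server response shown at the start of this section, the function returns the attributes id and from. The two xmlns declarations are consumed by the parser and show up only in the qualified tag name.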

The Simplest Jabber Client

Now that we know how a conversation with a Jabber server is started, let's try it ourselves. At a stretch, one could say that the simplest Jabber client, just like the simplest HTTP client, or the simplest client that has to interact with any server that employs a text-based protocol over a socket connection, is telnet.

Simply point telnet to a Jabber server, specifying port 5222, and send an opening tag. You will receive an opening tag, from the server, in response:

yak:~$ telnet localhost 5222
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
<?xml version='1.0'?>
<stream:stream xmlns:stream='http://etherx.jabber.org/streams' to='yak' 
      xmlns='jabber:client'>
<?xml version='1.0'?><stream:stream xmlns:stream='http://etherx.jabber.org/streams' 
      id='3AFD839E' xmlns='jabber:client' from='yak'>

If you don't have a Jabber server to experiment with, see Chapter 3 on how to set one up.

Using telnet is a great way to find out more about the way the Jabber protocol works. Perhaps the next thing to do is try out the user registration and authentication steps described in Chapter 7. But watch out -- send some invalid XML and the server will close the connection on you!
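The same opening exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names build_stream_header and read_stream_header are invented for this sketch, the hostname is assumed to need no XML escaping, and the reply header is closed artificially so that a standard XML parser will accept it.

```python
import xml.etree.ElementTree as ET

STREAMS_NS = "http://etherx.jabber.org/streams"

def build_stream_header(hostname, namespace="jabber:client"):
    """Return the opening tag a client sends to start a stream."""
    return (
        "<?xml version='1.0'?>"
        f"<stream:stream xmlns:stream='{STREAMS_NS}' "
        f"to='{hostname}' xmlns='{namespace}'>"
    )

def read_stream_header(header):
    """Pull the id and from attributes out of the server's opening tag."""
    # The stream stays open for the whole session, so the header alone is
    # not a complete document. Closing it here is for inspection only.
    root = ET.fromstring(header + "</stream:stream>")
    if root.tag != f"{{{STREAMS_NS}}}stream":
        raise ValueError("not a stream header")
    return {"id": root.get("id"), "from": root.get("from")}

# The reply shown in the telnet session above.
reply = (
    "<?xml version='1.0'?>"
    "<stream:stream xmlns:stream='http://etherx.jabber.org/streams' "
    "id='3AFD839E' xmlns='jabber:client' from='yak'>"
)
print(build_stream_header("yak"))
print(read_stream_header(reply))
```

Over a real connection, the string from build_stream_header would be written to a socket on port 5222 and the server's reply read back from the same socket.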

Jabber's Building Blocks

At this stage we've got a good impression of the structure of Jabber: what different elements make up a Jabber system, how entities in Jabber are addressed, and how communication between these entities is carried.

Now it's time to look at what gets carried -- the fragments that we touched upon in the previous section. These fragments are the heart and soul of Jabber -- the lifeblood that courses through Jabber's veins carrying information back and forth -- these fragments in many ways define what Jabber is, what it stands for.

Surprisingly, when we look closely at these fragments, with Jabber's capabilities as a messaging platform in mind, we see that there are only three basic elements involved -- <message/>, <presence/>, and <iq/>. Three different types of XML fragments, each with a different purpose. But with these three fragment types -- these elements -- all that Jabber promises, and more, can be achieved.

Now let's look at each of these Jabber elements in greater detail. But before we do, let's dive into the XML stream and pull out a handful of XML fragments to get us in the mood. Example 5-3 shows a chunk of conversation between a Jabber client and a Jabber server, which occurred immediately after the connection and authentication stages.

NOTE: Although any conversation between two Jabber entities is contained within two XML documents exchanged in streams, the traditional way to represent both documents at the same time is to use prefixes to show whether a fragment is being sent (SEND:) or received (RECV:), by one of the two entities. When appropriate, the perspective is taken from the viewpoint of the entity that's not the Jabber server; in the case of Example 5-3, the viewpoint is of the Jabber client.

Example 5-3. A chunk of conversation between a Jabber client and a Jabber server

SEND: <iq id='roster_0' type='get'>
        <query xmlns='jabber:iq:roster'/>
      </iq>

RECV: <iq id='roster_0' type='result' from='dj@yak/Work'>
        <query xmlns='jabber:iq:roster'>
          <item jid='sabine@yak' name='sabine' subscription='both'>
            <group>Family</group>
          </item>
        </query>
      </iq>

SEND: <presence><status>Online</status></presence>

      ... time passes ...

RECV: <message id='1' to='dj@yak' from='sabine@yak/winjab' type='chat'>
        <thread>3FE7392DDCA919CB49C73A2FFCE9901D</thread>
        <body>Hello</body>
      </message>

Example 5-3 shows three different elements in action, described as follows:

The <iq> elements

The user's (dj@yak's) contact list is requested and sent back by the server.

The <presence> element

The client broadcasts the user's availability.

The <message> element

The user receives a message from sabine@yak.
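To make the three roles concrete, here is a small Python sketch that sorts fragments like those in Example 5-3 by element name. The describe function and its wording are assumptions made for illustration. The fragments are parsed on their own, so they carry no namespace here; inside a real stream they would inherit the stream's default namespace.

```python
import xml.etree.ElementTree as ET

def describe(fragment):
    """Name the basic element a fragment uses and summarize its purpose."""
    element = ET.fromstring(fragment)
    if element.tag == "iq":
        return f"iq ({element.get('type')}): structured request or response"
    if element.tag == "presence":
        return "presence: availability information"
    if element.tag == "message":
        body = element.findtext("body", default="")
        return f"message from {element.get('from')}: {body}"
    raise ValueError(f"unexpected element: {element.tag}")

fragments = [
    "<iq id='roster_0' type='get'><query xmlns='jabber:iq:roster'/></iq>",
    "<presence><status>Online</status></presence>",
    "<message id='1' to='dj@yak' from='sabine@yak/winjab' type='chat'>"
    "<thread>3FE7392DDCA919CB49C73A2FFCE9901D</thread>"
    "<body>Hello</body></message>",
]
for fragment in fragments:
    print(describe(fragment))
```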

So, let's have a look at each of these elements, starting with arguably the most commonly occurring: <message/>.

The Message Element

It's obvious that in a messaging architecture such as Jabber, sending messages is fundamental. The <message/> element provides us with this facility. Any data, other than availability information or structured requests and responses (which are handled by the other two element types) sent from one Jabber entity to another, is sent in a <message/> element.

All things considered, the key word in that last sentence is "in." It's a good idea to regard Jabber elements as containers; the simile fits well as the elements themselves remain relatively static (save for the attributes) but the content can change to reflect the circumstances.

Message attributes

The <message/> element is a container, or envelope, which requires some form of addressing. The attributes of the <message/> element serve this purpose.

type

<message type='chat'>
name: Optional

The Jabber protocol defines five different message types. The message type gives an indication to the recipient as to what sort of content is expected; the client software is then able, if it wishes, to handle the incoming message appropriately.

Attribute Values

type='normal'

The normal message type is used for simple messages that are often one-time in nature, similar to an email message. If I send you a message and I'm not particularly expecting a response, or a discussion to ensue, then the appropriate message type is normal.

Some clients handle normal message types by placing them in a sort of message inbox, to be viewed by the user when he so chooses. This is in contrast to a chat type message.

Note that the normal message type is the default. So if a message is received without an explicit type attribute, it is interpreted as being normal.

type='chat'

The chat message type differs from the normal message type in that it carries a message that is usually part of a live conversation, which is best handled in real time with immediate responses -- a chat session.

The handling of chat messages in many clients is done with a single window that displays all the chat messages both sent and received between the two parties involved -- all the chat messages that belong to the same thread of conversation, that is. There's a subelement of the <message/> element that allows the identification of conversational threads so that the right messages can be grouped together; see the information on <thread/> later in "Message subelements."

type='groupchat'

The groupchat message type is to alert the receiving client that the message being carried is one from a conference (groupchat) room. The user can participate in many conference rooms and receive messages sent by other participants in those rooms. The groupchat type signifies to the receiving client that the address specified in the from attribute (see later in this section) is not the sending user's real JID but the JID representing the sending user, via her nickname, in the conference room from where the groupchat message originates.[4]

[4]Also, groupchat type messages, such as those announcing entrances or exits of room participants, can be received from the room itself.

type='headline'

This is a special message type designed to carry news style information, often accompanied by a URL and description in an attachment qualified by the jabber:x:oob namespace. Messages with their type set to headline can be handled by clients in such a way that their content is placed in a growing list of entries that can be used as reference by the user.

type='error'

The error message type signifies that the message is conveying error information to the client. Errors can originate in many places and under many circumstances. Refer to the description of the <error/> subelement in the next section for more details.
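A receiving client might work out the message type along the following lines. This is a sketch, not a description of any particular client: the function name message_type is an assumption, and rejecting an unknown type with an exception is a choice made here for clarity rather than behavior the protocol prescribes.

```python
import xml.etree.ElementTree as ET

# The five message types described above.
MESSAGE_TYPES = {"normal", "chat", "groupchat", "headline", "error"}

def message_type(fragment):
    """Return the type of a <message/>, treating a missing attribute as normal."""
    message = ET.fromstring(fragment)
    value = message.get("type", "normal")
    if value not in MESSAGE_TYPES:
        # Assumption: this sketch refuses types it does not know.
        raise ValueError(f"unknown message type: {value}")
    return value

print(message_type("<message to='dj@yak'><body>Hi</body></message>"))
print(message_type("<message to='dj@yak' type='chat'><body>Hi</body></message>"))
```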

from

<message from='dj@yak/Desktop'>
name: Set by server

The from attribute of the <message/> element shows the message originator's JID. In many cases this is the JID of a user, but with the message type groupchat, for example, it can be a JID belonging to the conference room where the message was originally sent.

The from attribute should not be set by the client. It is set by the Jabber server to which the originating client is connected. This is to prevent spoofing of JIDs. If the client does set a from attribute, it will be overridden by the server.
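The effect of this stamping can be pictured with a short Python sketch. It is not the server's actual code: stamp_from is a hypothetical helper, and session_jid stands for the JID the server associates with the authenticated connection.

```python
import xml.etree.ElementTree as ET

def stamp_from(fragment, session_jid):
    """Set the from attribute to the session's JID, replacing any client value."""
    message = ET.fromstring(fragment)
    message.set("from", session_jid)  # whatever the client supplied is discarded
    return ET.tostring(message, encoding="unicode")

# A client tries to pass itself off as someone else.
spoofed = "<message to='sabine@yak' from='someone@else'><body>Hi</body></message>"
print(stamp_from(spoofed, "dj@yak/Desktop"))
```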

to

<message to='[email protected]'>
name: Optional

The to attribute is used to specify the intended recipient of the message and is a JID. The recipient may be another Jabber user, in which case the JID will usually be in the form username@hostname (with an optional /resource if a message should be sent to a specific client connection), or it could be a Jabber server identity, in which case the JID will be in the form hostname, with an optional /resource depending on the situation.

If no to attribute is specified, then the message will be directed back to the sender, or the server, depending on the circumstances. This may or may not be what you want.

This is also the case with the to attribute for the <iq/> element; however, it is not the case with the <presence/> element.[5] See the sidebar titled "Element Handling by the Jabber Server" for an explanation.

[5]Actually, it is, internally, but the effect is that it isn't. The packet is swallowed on its final delivery stage by the presence handler.

Element Handling by the Jabber Server

When elements (packets) make their way over the jabber:client XML stream and arrive at the Jabber server, they're delivered to the JSM that provides many of the services associated with Jabber's IM features, such as roster management, presence subscription, offline storage, and so on. Each packet received runs a gauntlet of handlers before being delivered to its ultimate destination specified by the value of the to attribute.

In some cases, a packet has no "ultimate destination" and is deemed to have been handled without reaching a final delivery point.

For example, in the case of a simple <message/> packet with a JID specified in the to attribute, the packet will not be swallowed by a handler but will be delivered to that JID destination. On the other hand, in the case of a simple <presence/> packet without a to attribute (a normal notification of availability), the packet will reach the mod_presence module in the JSM and be handled by that module, where the availability information will be distributed according to presence subscriptions. The <presence/> packet itself, in its original form, will go no further.
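The gauntlet of handlers described in this sidebar can be pictured with a toy model in Python. Nothing here is taken from the JSM source: presence_handler is a hypothetical stand-in for mod_presence, and deliver is an invented name for the loop that offers a packet to each handler in turn.

```python
import xml.etree.ElementTree as ET

def presence_handler(packet, log):
    """Hypothetical stand-in for mod_presence: swallow plain availability packets."""
    if packet.tag == "presence" and packet.get("to") is None:
        log.append("availability distributed to subscribers")
        return True  # handled; the packet goes no further
    return False

def deliver(fragment, handlers):
    """Offer a packet to each handler; return (destination, log).

    The destination is None when a handler swallows the packet.
    """
    packet = ET.fromstring(fragment)
    log = []
    for handler in handlers:
        if handler(packet, log):
            return None, log
    return packet.get("to"), log

print(deliver("<message to='sabine@yak'><body>Hi</body></message>", [presence_handler]))
print(deliver("<presence><status>Online</status></presence>", [presence_handler]))
```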

id

<message id='JCOM_12'>
name: Optional

When a message is sent, and a reply is expected, it is often useful to give the outbound message an identifier. When the recipient responds, the identifier is included in the response. In this way, the originator of the message can work out which reply corresponds to which original message.

At the Jabber server, this works because a reply is usually built from a copy of the original message, with the from and to attributes switched around. So the id attribute remains untouched and in place.

NOTE: Each id value within a session, represented by one streamed XML document, must be unique within that session, that is, within that one document.
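Building a reply from a copy of the original, as described above, might look like this in Python. The helper name build_reply is an assumption, and the sketch assumes the original carries both a from and a to attribute.

```python
import copy
import xml.etree.ElementTree as ET

def build_reply(original, body_text):
    """Copy a message, swap from and to, and replace the body text.

    The id attribute is carried over untouched by the copy.
    """
    reply = copy.deepcopy(original)
    reply.set("to", original.get("from"))
    reply.set("from", original.get("to"))
    body = reply.find("body")
    if body is None:
        body = ET.SubElement(reply, "body")
    body.text = body_text
    return reply

original = ET.fromstring(
    "<message id='JCOM_12' to='sabine@yak' from='dj@yak/Desktop'>"
    "<body>Lunch?</body></message>"
)
reply = build_reply(original, "Sure")
print(ET.tostring(reply, encoding="unicode"))
```

Because the id survives the copy, the originator can match the reply against the identifier it assigned to the outbound message.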

Message subelements

While the <message/> element itself is a container for the information being carried, the subelements are used to hold and describe the information being carried. Depending on the circumstances and the message type, different subelements can be used.

subject

<message to='[email protected]' from='[email protected]/Home'>
  <subject>Time to meet?</subject>
  <body>What time do you want to meet this afternoon?</body>
</message>
name: Optional

The <subject/> subelement is used to set a message subject. Message subjects are not that common in chat type messages but are more appropriate in normal type messages in which the subject can be displayed in the style of a list of inbox items. This subelement is also used in groupchat type messages to set the subject (or "topic") of a conference room.

body

<message to='qmacro@yak' from='john@yak' type='chat'>
  <body>Hey - got a minute?</body>
</message>
name: Optional

The <body/> subelement carries the body of the message.

error

<message to='[email protected]/Home' from='[email protected]' type='error'>
  <body>Are you there?</body>
  <error code='502'>Unable to resolve hostname.</error>
</message>
name: Optional

The <error/> subelement is for carrying error information in a problem situation. In this example, the original message sent by [email protected] was a simple "Are you there?" to what he thought was qmacro's JID on the Jabber server at jabber.org. However, the to attribute was specified incorrectly (jaber.org), and the Jabber server on pipetree.com wasn't able to resolve the hostname. So Piers receives his message back with an additional <error/> subelement, and the message type has been switched to error (the type='error' attribute).

The <error/> subelement carries two pieces of related information: an error number, specified in the code attribute, and the error text. Table 5-3 lists standard error codes and texts. The entity generating the error can specify a custom error text to go with the error code; if none is specified, the standard text as shown is used.

Table 5-3. Standard error codes and texts

Code  Text
400   Bad Request
401   Unauthorized
402   Payment Required
403   Forbidden
404   Not Found
405   Not Allowed
406   Not Acceptable
407   Registration Required
408   Request Timeout
409   Conflict
500   Internal Server Error
501   Not Implemented
502   Remote Server Error
503   Service Unavailable
504   Remote Server Timeout
510   Disconnected
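The way a message is turned into an error message can be sketched in Python as follows. The names STANDARD_ERRORS and make_error_reply are assumptions for this illustration, the JIDs in the usage lines are made up, and the sketch assumes the original message carries both from and to attributes.

```python
import copy
import xml.etree.ElementTree as ET

# Standard texts from Table 5-3.
STANDARD_ERRORS = {
    400: "Bad Request", 401: "Unauthorized", 402: "Payment Required",
    403: "Forbidden", 404: "Not Found", 405: "Not Allowed",
    406: "Not Acceptable", 407: "Registration Required",
    408: "Request Timeout", 409: "Conflict",
    500: "Internal Server Error", 501: "Not Implemented",
    502: "Remote Server Error", 503: "Service Unavailable",
    504: "Remote Server Timeout", 510: "Disconnected",
}

def make_error_reply(original, code, text=None):
    """Return the original message to its sender with an <error/> attached."""
    reply = copy.deepcopy(original)
    reply.set("to", original.get("from"))
    reply.set("from", original.get("to"))
    reply.set("type", "error")
    error = ET.SubElement(reply, "error", {"code": str(code)})
    # A custom text wins; otherwise the standard text for the code is used.
    error.text = text or STANDARD_ERRORS[code]
    return reply

# Made-up JIDs, for illustration only.
sent = ET.fromstring(
    "<message to='john@nowhere' from='sabine@yak/Home'>"
    "<body>Are you there?</body></message>"
)
bounced = make_error_reply(sent, 502, "Unable to resolve hostname.")
print(ET.tostring(bounced, encoding="unicode"))
```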

html

<message id="3" to="dj@yak" type="chat">
  <html xmlns="http://www.w3.org/1999/xhtml">
    <body>
      <span style="font-family: Arial; font-size: 10pt">
        This is really <em>nice!</em>
      </span>
      <br/>
    </body>
  </html>
  <body>This is really nice!</body>
</message>
name: Optional

The <html/> tag is for support of messages formatted in Extensible HyperText Markup Language (XHTML). The normal <body/> tag carries plain text; text formatted with XHTML markup can be carried in <message/> elements inside the <html/> subelement.

The markup must be qualified by the XHTML namespace http://www.w3.org/1999/xhtml (as shown in the example) and conform to the markup described in the XHTML-Basic specification defined at http://www.w3.org/TR/xhtml-basic. This is despite the name of the tag being html and not xhtml.

Note that the content of the message must also be repeated in a normal <body/> subelement without formatting, to comply with the "lowest common denominator" support for different Jabber clients -- not all of them will be able to interpret the XHTML formatting, so they will need to receive the message content in a way that they can understand.

The <html/> subelement effectively is a wrapper around a second, alternative, <body/> subelement.
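A client can tell the two bodies apart by namespace. The following Python sketch, with the assumed helper names plain_body and xhtml_body, reads the example message shown at the start of this section: the plain <body/> is unqualified once the fragment is parsed on its own, while the formatted one lives in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def plain_body(message):
    """Return the plain-text body that every client can display."""
    # Matches only the unqualified <body/>, not the one inside <html/>.
    return message.findtext("body")

def xhtml_body(message):
    """Return the XHTML <body/> element, or None if the message has none."""
    return message.find(f"{{{XHTML_NS}}}html/{{{XHTML_NS}}}body")

message = ET.fromstring(
    '<message id="3" to="dj@yak" type="chat">'
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<span style="font-family: Arial; font-size: 10pt">'
    "This is really <em>nice!</em></span><br/>"
    "</body></html>"
    "<body>This is really nice!</body>"
    "</message>"
)
print(plain_body(message))
print(xhtml_body(message) is not None)
```

A client that cannot render XHTML would simply use plain_body and ignore the <html/> subelement.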

thread

<message to='[email protected]' type='chat'>
  <thread>B19217AFEEBDC2611971DD1E8B23AAE4</thread>
  <body>Yes, they're at http://docs.jabber.org</body>
</message>
name: Optional

The <thread/> subelement is used by clients to group together snippets of conversations (between users) so that the whole conversation can be visually presented in a meaningful way. Typically a conversation on a particular topic -- a thread -- will be displayed in a single window. Giving each conversation thread an identity enables a distinction to be made when more than one conversation is being held at once and chat type messages, which are component parts of these conversations, are being received (possibly from the same correspondent) in an unpredictable sequence.

Only when a new topic or branch of conversation is initiated must a client generate a thread value. At all other times, the correspondent client must simply include the <thread/> tag in the response. Here the thread value is generated from a hash of the message originator's JID and the current time.
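Generating and echoing a thread value might be sketched like this in Python. The text above does not name the hash function; MD5 is used here purely as an assumption, because it yields 32 hexadecimal digits like the thread values in the examples. The names new_thread_id and thread_of are likewise invented for this sketch.

```python
import hashlib
import time
import xml.etree.ElementTree as ET

def new_thread_id(originator_jid, now=None):
    """Hash the originator's JID and the current time into a thread value."""
    if now is None:
        now = time.time()
    seed = f"{originator_jid}{now}".encode("utf-8")
    # Assumption: MD5, chosen only to match the length of the example values.
    return hashlib.md5(seed).hexdigest().upper()

def thread_of(message):
    """Return the thread value a reply should echo, or None if there is none."""
    return message.findtext("thread")

opening = ET.fromstring(
    "<message to='dj@yak' type='chat'>"
    f"<thread>{new_thread_id('sabine@yak/winjab')}</thread>"
    "<body>Hello</body></message>"
)
print(thread_of(opening))
```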

x

<message to='dj@yak' type='chat' from='sabine@yak/laptop'>
  <body>Hi - let me know when you get back. Thanks.</body>
  <x xmlns='jabber:x:delay' from='dj@yak' stamp='20010514T14:44:09'>
    Offline Storage
  </x>
</message>
name: Optional

The <x/> subelement is special. While the other subelements like <body/> and <thread/> are fixed into the Jabber building blocks design, the <x/> subelement allows <message/> elements to be extended to suit requirements. What the <x/> subelement does is provide an anchor point for further information to be attached to messages in a structured way.

The information attached to a message is often called the payload. Multiple anchor points can be used to convey multiple payloads, and each one must be qualified using a namespace.

Just as the content of XML streams is qualified by a namespace (one from the list in Table 5-2 earlier in this chapter), so the content of the <x/> attachment must be qualified. There are a number of Jabber-standard namespaces that are defined for various purposes. One of these, jabber:x:delay, is used in the example. These standard namespaces are described in Chapter 6. But there's nothing to stop you defining your own namespace to describe (and qualify) the data that you wish to transport in a <message/>. Namespaces beginning jabber: are reserved; anything else is OK.

Briefly, you can see how payloads are attached from the example. For every <x/> subelement, there's an xmlns attribute that qualifies it, and the data contained within the <x/> tag is formatted depending on the namespace.

In the example, the payload is carried in addition to the <body/> subelement. However, as the <body/> is actually optional in a message, it is possible to transmit structured payloads between Jabber entities without the need for "conventional" message content.
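Picking the payloads out of a message comes down to looking at the namespace of each <x/> subelement. In this Python sketch the function name payloads is an assumption, and example:order is a made-up namespace standing in for one you might define yourself. The sketch keeps one payload per namespace, which is a simplification.

```python
import xml.etree.ElementTree as ET

def payloads(message):
    """Map each payload namespace to its <x/> element."""
    found = {}
    for child in message:
        # ElementTree renders a qualified tag as "{namespace}name".
        if child.tag.startswith("{") and child.tag.endswith("}x"):
            namespace = child.tag[1:-2]
            found[namespace] = child
    return found

message = ET.fromstring(
    "<message to='dj@yak' type='chat' from='sabine@yak/laptop'>"
    "<body>Hi - let me know when you get back. Thanks.</body>"
    "<x xmlns='jabber:x:delay' from='dj@yak' stamp='20010514T14:44:09'>"
    "Offline Storage</x>"
    "<x xmlns='example:order'><item>coffee</item></x>"  # hypothetical payload
    "</message>"
)
for namespace, payload in payloads(message).items():
    print(namespace, payload.attrib)
```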

The Presence Element

The <presence/> element is used to convey a Jabber entity's availability. An entity can be available, which means that it's connected and any messages sent to it will be delivered immediately, or it can be unavailable, which means that it's not connected, and any messages sent to it will be stored and delivered the next time a connection is made.

For the most part, it is the entity itself, not the Jabber server to which it connects, that controls the availability information. The Jabber server will communicate an entity's unavailability if that entity disconnects from the server, but will do that only if the entity has communicated its availability beforehand.

Availability information isn't a free-for-all. Presence in Jabber is usually exchanged within a subscription mechanism. See "Presence subscription" for an explanation.

Presence Management

It's worth noting that the entities referred to here are client entities, that is, clients (and therefore the users using those clients) connected to the Jabber server over an XML stream qualified by the jabber:client namespace (see "XML Streams"). Presence is a feature that is used throughout Jabber; the Jabber Session Manager (JSM) manages presence on behalf of clients. External components that connect to the Jabber server backbone are separate from the JSM and therefore don't have any concept of "managed" presence. That's not to say they can't partake in the sending and receiving of presence elements. They just have to manage everything themselves, as they don't have the JSM to do it for them.

Presence attributes

The attributes of the <presence/> element are similar to those of the <message/> element.

type

<presence type='unavailable'>
name: Optional

The type attribute of the <presence/> element is used for many purposes. The basic usage is to convey availability. Two values are used: available and unavailable.[6] Another value is to signify that the <presence/> packet is being used to query the packet recipient's presence (value is probe). The rest of the values (subscribe, unsubscribe, subscribed, unsubscribed) are used in the subscription structure, which is described in "Presence subscription."

[6]Technically speaking, there's no available value. The absence of a type attribute implies availability. However, for the purposes of discussion (it's easier to concentrate on something than to concentrate on a lack of something), we'll refer to type='available'.

Attribute Values

type='available'

The available presence type is used by entities to announce their availability. This announcement is usually made to the Jabber server that manages the presence subscription mechanism (see "Presence subscription" for more details). However, it can also be directed to a particular JID if the entity wants to control presence information itself.

The available presence isn't a simple binary "on/off"; varying degrees of availability are specified using subelements of the <presence/> packet. These include <show/> and <status/> and are described next.

If no type attribute is specified, then this value of available is assumed. It makes sense, as the most common type of <presence/> packet sent by entities is usually the available type, optionally qualified with the <show/> and <status/> subelements, as the user of the connected client changes her circumstances over time (off for a break, back, out to lunch, and so on).

type='unavailable'

The unavailable presence type is the antithesis of the available presence type. It is used to qualify an entity's unavailability. An entity is unavailable when its client has disconnected from the Jabber server. An unavailable presence type should be sent by clients before they disconnect.

How can we make sure that clients actually send such a packet when they disconnect (to keep the presence information equilibrium)? Well, we can't. If a client disconnects without sending an unavailable presence type, the Jabber server will send one out on its behalf when it disconnects. This is part of the presence service of the JSM and closely related to the presence subscription mechanism. See "Availability Tracker" for more details.[7]

[7]Not sending an unavailable presence type before disconnection means that the information held for a user in the jabber:iq:last namespace -- see the section "jabber:iq:last" in Chapter 6 -- will not be stored.

While the <show/> and <status/> subelements qualify the available presence packet, there's no point in any embellishment of the fact that the entity is unavailable, so no subelements are used when the packet is of the unavailable type.

type='probe'

The probe presence type is a query, or probe, on another entity's availability. This probe is used by the Jabber server to determine the presence of entities in its management of the presence subscription mechanism. Under normal circumstances, this presence probe should not be used directly by a client -- availability information is always pushed to the client by the server. Regardless, if a client insists on using a probe, there are two things to bear in mind:

  • Information will be returned only in response to an availability probe if the probing entity already has a subscription to the entity being probed. This means that you can't bypass the subscription model and probe random entities for availability information; you can probe only those who have previously given you permission to be informed of their availability. See "Presence subscription" for more details.

  • The <presence/> packet must be specified with a from attribute specifying the sender's JID in the form username@hostname before it is sent. The Jabber server does not add this attribute. The presence mechanism will use the full JID (including any resource) when working out whether the prober has permission. This will ultimately fail because permission is determined on a username@hostname basis, not a username@hostname/resource basis.

WARNING: Although possible right now, you should really avoid using the probe presence type in clients. Future versions of the Jabber server may block such packets.

type='subscribe'

This presence type is a request to subscribe to an entity's presence. ("Will you allow me to be sent your presence information by the server?") See "Presence subscription" for details.

type='unsubscribe'

This presence type is a request to unsubscribe from an entity's presence. ("I don't want to be sent your presence information anymore; please have the server stop sending it to me.") See "Presence subscription" for details.

type='subscribed'

This presence type is sent in reply to a presence subscription request, used to accept the request. ("OK, I accept your request; the server will send you my presence information.") See "Presence subscription" for details.

type='unsubscribed'

This presence type is sent in reply to a presence unsubscription request, used to accept the request. ("OK, I accept your unsubscription request; the server will stop sending you my presence information.")

It is also used to deny a presence subscription request. ("No, I don't accept your subscription request; I don't want the server to send you my presence information.")

These presence types are described in more detail in "Presence subscription."
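The different uses of the type attribute can be summarized in a short Python sketch. The function name presence_kind and the three category labels are assumptions made for illustration; the type values themselves are the ones listed above.

```python
import xml.etree.ElementTree as ET

AVAILABILITY = {"available", "unavailable"}
SUBSCRIPTION = {"subscribe", "unsubscribe", "subscribed", "unsubscribed"}

def presence_kind(fragment):
    """Return (type, category) for a <presence/> packet."""
    presence = ET.fromstring(fragment)
    # A missing type attribute implies availability.
    value = presence.get("type", "available")
    if value in AVAILABILITY:
        return value, "availability"
    if value == "probe":
        return value, "query"
    if value in SUBSCRIPTION:
        return value, "subscription"
    raise ValueError(f"unknown presence type: {value}")

print(presence_kind("<presence><status>Online</status></presence>"))
print(presence_kind("<presence type='subscribe' to='sabine@yak'/>"))
```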

from

<presence from='dj@yak'/>
name: Set by server

Similar to the attribute of the same name in the <message/> element, here the from attribute is set by the server and represents the JID from which the availability information originates.

If you are sending a presence probe, type='probe', you must set the from attribute yourself, as mentioned earlier.

to

<presence to='sabine@yak'/>
name: Optional

The to attribute is optional; if, as a user, you are just announcing availability (with the intention of having that announcement reflected to the appropriate members of your roster), then specifying a to attribute is not appropriate.[8] If you want to send your availability to a specific entity, then do so using this to attribute, specifying that entity's JID. Why might you want to do this? See "Availability Tracker" for an answer.

[8]In fact, as in the cases for the other two elements, <message/> and <iq/>, not specifying a to attribute will cause the <presence/> packet to be sent to the sender. However, in the case of the presence handler mechanism, the packet is swallowed before it can reach its destination, to prevent reflective presence problems.

id

<presence id='p1'/>
name: Optional

All Jabber elements support an id attribute for tracking purposes. So, the <presence/> packet is no different from the <message/> packet in this respect. As presence notification is usually a one-way thing, it is very uncommon to see <presence/> packets qualified with an id attribute.

Presence subelements

show

<presence>
  <show>xa</show>
  <status>Gone home for the evening</status>
</presence>
name: Optional

When an available presence is sent, it can be qualified with more detail. The detail comes in two parts and is represented by two subelements of the <presence/> element. The first part of the detail is in the form of a <show/> tag, which by convention contains one of five possible values. Table 5-4 lists these values and their meaning.

Table 5-4. Presence <show> values

Value   Meaning
away    The user is available but temporarily away from the client.
chat    This is similar to the normal value but suggests that the user is open to conversation.
dnd     "Do not disturb." Although online and available, the user doesn't want to be disturbed by anyone. Don't forget, unless the user is actually offline (unavailable or disconnected from the Jabber server), messages to that user will still be sent to the user immediately.
normal  This is the normal availability; there's nothing really special about this qualification -- the user is simply available. If no <show/> tag is specified in an available <presence/> element, a value of normal is assumed.
xa      This is an extreme form of the away value -- xa stands for "extended away" and is probably as near to an unavailable presence as you can get.

status

<presence>
  <show>dnd</show>
  <status>working on my book!</status>
</presence>
name: Optional

The other part of the detail that qualifies a user's availability is the <status/> subelement. It allows for a more descriptive remark that embellishes the <show/> data.

The examples for this subelement and the <show/> subelement show how the <status/> value is used as a textual description to explain the <show/> value's "short code," or mnemonic.

priority

<presence>
  <show>chat</show>
  <status>coffee break</status>
  <priority>5</priority>
</presence>
name: Optional

Earlier in this chapter, "Resources and Priorities" described how a user's priority is used to determine the primary session to which messages should be sent.

As we see here, the priority is set using the <presence/> element. In this example, we see that the user has set the priority high to make sure that messages are routed to him on the Jabber client running on this machine.
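Reading the three presence subelements together could look like this in Python. The function name availability is an assumption. The default of normal for a missing <show/> follows Table 5-4, while a missing <priority/> is simply reported as None here, because this section does not state a default.

```python
import xml.etree.ElementTree as ET

def availability(fragment):
    """Collect the show, status, and priority details from a <presence/> packet."""
    presence = ET.fromstring(fragment)
    priority = presence.findtext("priority")
    return {
        "show": presence.findtext("show", default="normal"),
        "status": presence.findtext("status"),
        "priority": int(priority) if priority is not None else None,
    }

print(availability(
    "<presence><show>chat</show><status>coffee break</status>"
    "<priority>5</priority></presence>"
))
print(availability("<presence/>"))
```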

x

<presence from='dj@yak/Work' to='sabine@yak'>
  <status>Online</status>
  <priority>1</priority>
  <x xmlns='jabber:x:delay' from='dj@yak/Work'
     stamp='20011005T10:58:28'/>
</presence>
name: Optional

Just as with the <message/> element, extra information can be attached to the <presence/> element by means of the <x/> tag. In the same way, each <x/> tag must be qualified with a namespace.

While there aren't many external uses for payloads in a <presence/> packet, the Jabber server uses this facility to add information. In this example, we see that dj@yak's notification of availability (remember, type='available' is assumed for <presence/> packets without an explicit type attribute) is being sent to sabine@yak. While dj@yak connected to the Jabber server and sent his availability (which was stamped on receipt by the Jabber server) just before 11 a.m., sabine@yak is just logging on now (say, 30 minutes later). When she receives dj@yak's presence, she knows how long that presence status has been valid for.
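How a receiving client might work out the age of a presence from the delay stamp is sketched below in Python. The name presence_age is an assumption, and the sketch assumes that the stamp and the client's clock refer to the same time zone.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

DELAY_TAG = "{jabber:x:delay}x"
STAMP_FORMAT = "%Y%m%dT%H:%M:%S"

def presence_age(fragment, now):
    """Return how long ago the presence was stamped, or None if it has no stamp."""
    presence = ET.fromstring(fragment)
    delay = presence.find(DELAY_TAG)
    if delay is None:
        return None
    stamped = datetime.strptime(delay.get("stamp"), STAMP_FORMAT)
    return now - stamped

received = (
    "<presence from='dj@yak/Work' to='sabine@yak'>"
    "<status>Online</status><priority>1</priority>"
    "<x xmlns='jabber:x:delay' from='dj@yak/Work' stamp='20011005T10:58:28'/>"
    "</presence>"
)
# sabine@yak logs on 30 minutes after the presence was stamped.
print(presence_age(received, datetime(2001, 10, 5, 11, 28, 28)))
```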

See the section "The X Namespaces" in Chapter 6 to find out what namespaces are available to qualify <x/>-included payloads.

Presence subscription

Presence subscription is the name given to the mechanism that allows control over how entity presence information is made available to other entities. By default, the availability of an entity is unknown to other entities.

Let's put this into more concrete terms. For example, let's assume that you and I are both Jabber users. I'm registered with the Jabber server running at jabber.org, my JID is [email protected], and you are registered with a Jabber server running at your company, and your JID is [email protected].

If you want to know whether I'm available, you have to subscribe to my presence. This is done by sending a <presence/> packet to me with the type attribute set to subscribe. In the example that follows, the XML fragments are sent and received from your perspective:

SEND: <presence type='subscribe' to='[email protected]'/>

I receive the <presence/> packet, and when I receive it, it's been stamped (by your Jabber server) with a from attribute with the value [email protected]. So, based upon who it is, I decide to accept the subscription request and send back a reply, which you receive:

RECV: <presence type='subscribed'
                from='[email protected]/home'
                to='[email protected]/work'/>

This lets you know that I've accepted your subscription request. From now on, every time my availability changes (when I send a <presence/> packet or when I disconnect and the server generates an unavailable <presence/> packet on my behalf), that availability information will be relayed to you.
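From the requesting client's side, the exchange can be sketched in Python as follows. The helper names subscribe_request and subscription_accepted are assumptions, and the JIDs are placeholders rather than the ones used in the exchange above.

```python
import xml.etree.ElementTree as ET

def subscribe_request(jid):
    """Build the <presence/> packet that asks to subscribe to jid's presence."""
    request = ET.Element("presence", {"type": "subscribe", "to": jid})
    return ET.tostring(request, encoding="unicode")

def subscription_accepted(reply_fragment):
    """Return True if the reply accepts the request, False if it denies it."""
    reply = ET.fromstring(reply_fragment)
    kind = reply.get("type")
    if kind == "subscribed":
        return True
    if kind == "unsubscribed":
        return False
    raise ValueError(f"not a subscription reply: {kind}")

print(subscribe_request("sabine@yak"))
print(subscription_accepted(
    "<presence type='subscribed' from='sabine@yak/home' to='dj@yak/work'/>"
))
```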

But how does this work? How does the Jabber server know that you've subscribed to my presence and I've accepted that subscription?

Enter the roster, stage right. The roster is a list of JIDs maintained for each user, stored server-side. A roster is similar to an AOL Buddy List; one could say that it's a sort of personal address book, but it's more than that. The presence subscription and roster mechanisms are tightly intertwined. We'll be examining the roster in more detail in the section "jabber:iq:roster" in Chapter 6. Here, we'll just look at the characteristics of the roster that are relevant for the presence subscription mechanism. The roster is managed using the third basic Jabber element -- <iq/> -- which will be explained in more detail later in this section. Ignore the tags that you aren't yet familiar with; it's just important to get the basic drift of what's going on.

While the roster is stored and maintained server-side, any changes to it made by the server are reflected in (pushed to) the client so it can be synchronized with a local copy.[9]

[9]The local copy would exist only for the duration of the user's session and should always be regarded as a slave copy.

Let's expand the simple exchange of <presence/> packets from earlier and see how the roster is used to record presence subscription information.

If you wish to subscribe to my presence, you will usually add my JID to your roster at the same time; these two actions are linked for obvious and practical reasons. Many Jabber clients use the roster as a basis for displaying availability information, and with the exception of an entity sending presence information directly to another entity regardless of roster membership, presence subscription information is stored in the user's roster. Here's the order in which the subscription would take place:

  1. A request is sent to the server to update your roster, adding my JID to it:

    SEND: <iq id="adduser1" type="set">
            <query xmlns="jabber:iq:roster">
              <item jid="[email protected]" name="DJ Adams"/>
            </query>
          </iq>
    

    You add an id attribute to be able to track the request and match up the response when it comes.

  2. The server responds with a push of the updated (new) roster item:

    RECV: <iq type='set'>
             <query xmlns='jabber:iq:roster'>
               <item jid='[email protected]' name='DJ Adams'
                     subscription='none'/>
             </query>
           </iq>
    

    Note that in the update an additional attribute subscription='none' is sent, reflecting the presence subscription relationship between you and me. At this stage, the relationship is that I don't have a subscription to your presence and you don't have a subscription to my presence, hence the value none.

  3. It also acknowledges the original update request, confirming its success:

    RECV: <iq id='adduser1' type='result'
              from='[email protected]/Work'
              to='[email protected]/Work'/>
    

    Note that the id='adduser1' value is passed back so we can match this response to the original request.

  4. Meanwhile, you send the subscription request:

    SEND: <presence to="[email protected]" type="subscribe"/>
    

  5. The server notes the subscription request going through and once more updates your roster and pushes the item out to you:

    RECV: <iq type='set'>
            <query xmlns='jabber:iq:roster'>
              <item jid='[email protected]' name='DJ Adams'
                    subscription='none' ask='subscribe'/>
            </query>
          </iq>
    

    The current subscription relationship is reflected with the subscription='none' attribute. In addition, we have a subscription request status, with ask='subscribe'. This request status shows that there is an outstanding presence subscription request to the JID in that roster item. If you've ever seen the word "Pending" next to a username in a Jabber roster, this is where that comes from. Don't forget that a subscription request might not get an immediate response, so we need to remember that the request is still outstanding.

  6. Your subscription request is received and accepted, and a subscribed type is sent back to you as part of a <presence/> packet:

    RECV: <presence to='[email protected]'
                    type='subscribed' from='[email protected]'/>
    

  7. The server also notices the subscription request acceptance and yet again updates your roster to keep track of the presence subscription. Again, it pushes the subscription information out to you so your client can keep its copy up-to-date:

    RECV: <iq type='set'>
             <query xmlns='jabber:iq:roster'>
               <item jid='[email protected]' name='DJ Adams'
                     subscription='to'/>
             </query>
           </iq>
    

    This time, the subscription attribute in the roster item has been set to to. This means that the roster owner (you) has a presence subscription to the JID in the roster item (i.e., me).

  8. The server knows you've just subscribed to my presence; it generates a presence probe on your behalf that causes my presence information to be retrieved and sent to you:

    RECV: <presence from='[email protected]/Work'
                    to='[email protected]'>
            <status>Available</status>
            <priority>1</priority>
            <x xmlns='jabber:x:delay'
               from='[email protected]/Work'
               stamp='20010515T11:37:40'/>
          </presence>
    

Of course, at this stage, our relationship is a little unbalanced, in that you have a subscription to my presence, but I don't have a subscription to yours. So you are aware of my availability, but not the other way around. In order to rectify this situation, I can repeat the process in the opposite direction, asking for a subscription to your presence information.

The only difference to the sequence that we've just seen is that you will already exist on my roster because the server will have maintained an item for your JID to record the presence subscription relationship. While the item in your roster that represents my JID has a subscription attribute value of to (the roster owner has a presence subscription to this JID) -- we've seen this in Step 7 -- the item in my roster that represents your JID has a subscription attribute value of from (the roster owner has a presence subscription from this JID).

Once I repeat this sequence to subscribe to your presence (and you accept the request), the value for the subscription attribute in the items in each of our rosters will be set to both.
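
The way the subscription attribute moves between its four values can be sketched as a pair of small lookup functions. This is only an illustration of the transitions described above, not the server's code; the function names are hypothetical, and only the values none, to, from, and both come from the text.

```python
# Hypothetical helper names; only the four subscription values come from the text.

def after_outbound_accepted(subscription):
    """The roster owner's subscription request to the contact was accepted."""
    return {"none": "to", "from": "both"}.get(subscription, subscription)


def after_inbound_accepted(subscription):
    """The roster owner accepted a subscription request from the contact."""
    return {"none": "from", "to": "both"}.get(subscription, subscription)


# "yours" is the item for me in your roster; "mine" is the item for you in my roster.
yours, mine = "none", "none"

# You subscribe to my presence and I accept (Steps 4 to 7).
yours = after_outbound_accepted(yours)
mine = after_inbound_accepted(mine)
print(yours, mine)  # to from

# I subscribe to your presence and you accept.
mine = after_outbound_accepted(mine)
yours = after_inbound_accepted(yours)
print(yours, mine)  # both both
```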

The upshot of all this is that when an entity announces its presence, it does so using a single <presence/> packet, with no to attribute specified. All the members in that entity's roster who have a subscription to that entity's presence will receive a copy of that <presence/> packet and thereby be informed.[10]

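[10]That is, where the item for that member in the announcing entity's roster has a subscription attribute value of from or both (equivalently, a value of to or both in the member's own roster item for the entity).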

Availability Tracker

The Jabber server (specifically, the presence handler within the JSM) has a mechanism called the Availability Tracker. As its name implies, its job is to track the availability of entities that have previously made an availability announcement (in a <presence/> element).

The concept of exchange of availability information via an exchange agreement recorded in the roster was introduced in "Presence subscription." This mechanism covers the automatic distribution of availability notification based upon prearranged presence subscriptions.

However, Jabber services (which are connected to the jabberd backbone; see the section "An Overview of the Server Architecture" in Chapter 4) may need to know an entity's availability or, more importantly, when that entity suddenly becomes unavailable. These Jabber services usually won't have a prior presence subscription agreement recorded in anyone's roster.

The Conferencing service, which provides group chat facilities, allowing users to join discussion "rooms" and chat, is one of these services. The service maintains data for each room's participants, and, so that it can manage its memory usage effectively, needs to know when a user ends his connection with the Jabber server -- in other words, when he becomes unavailable -- so it can free that user's data. Normally, a user leaving a room is information enough for the service to know that data can be freed. But what if the user disconnects (or is disconnected) from his Jabber server without first leaving the room?

The availability tracker mechanism comes to the rescue. It maintains a list of JIDs to which an entity has sent his availability in a <presence/> packet containing a to attribute (i.e., a directed <presence/> packet). When the JSM notices that a user has ended his session by disconnecting, the presence handler invokes the availability tracker to send an unavailable <presence/> packet (with the type='unavailable' attribute) to all the JIDs to which the entity had sent directed availability information during the lifetime of that session.

How does this help in the Conferencing service case? Well, one of the requirements to enter a room is that presence must be sent to that room. Each room has its own JID, so a typical presence packet in room entry negotiation might look like this:

SEND: <presence to='[email protected]'/>

which would be for the jdev room running at the conferencing service at conference.jabber.org.[11]

[11]The example here contains a room JID with no resource specified; this is taken from the 0.4 version of the Conferencing protocol. An earlier version of the protocol (Groupchat 1.0) required that the nickname for the person entering the room be specified as a resource to the room's JID, for example, [email protected]/dj.

So, the availability tracker would have recorded this directed presence and will send an unavailable presence to the same JID if the user's session ends.
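
The bookkeeping involved can be sketched in a few lines of Python. This is a minimal model of the idea, not the JSM's implementation: the class and method names are hypothetical, the JIDs dj@yak/Work and cellar@conference.yak are illustrative, and treating a <presence/> packet without a type attribute as "available" is an assumption of the sketch.

```python
from collections import defaultdict


class AvailabilityTracker:
    """Hypothetical model: remembers where each session sent directed availability."""

    def __init__(self):
        # sender JID -> set of JIDs that received directed availability
        self._directed = defaultdict(set)

    def note_presence(self, sender, to=None, type_=None):
        # Only directed availability (a 'to' attribute, no 'type') is recorded.
        if to is not None and type_ is None:
            self._directed[sender].add(to)

    def session_ended(self, sender):
        # Build one unavailable packet per recorded JID, then forget the session.
        recipients = sorted(self._directed.pop(sender, set()))
        return [
            f"<presence type='unavailable' from='{sender}' to='{jid}'/>"
            for jid in recipients
        ]


tracker = AvailabilityTracker()
tracker.note_presence("dj@yak/Work")                              # broadcast: not tracked
tracker.note_presence("dj@yak/Work", to="cellar@conference.yak")  # directed: tracked
packets = tracker.session_ended("dj@yak/Work")
print(packets)
```

Because the recorded JIDs are dropped when the session ends, a second call for the same session produces nothing.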

The IQ Element

The third and final element in the Jabber building block set is the <iq/> element ("iq" stands for "info/query"), which represents a mechanism for sending and receiving information. What the <iq/> element has over the <message/> element for this purpose is structure and inherent meaning. It is useful to liken the info/query mechanism to the request/response model of HTTP using GET and POST.

The <iq/> element allows a structured conversation between two Jabber entities. The conversation exists to exchange data, to retrieve or set it, and to notify the other party as to the success (or not) of that retrieve or set action. There are four states that an <iq/> element can be in, each reflecting one of the activities in this conversation:

get

Get information.

set

Set information.

result

Show the result when the get or set was successful.

error

Specify an error if the get or set was not successful.

These states are reflected in the type attribute of <iq/> elements. The relationship between two entities in such a structured conversation that convey these states is shown in Figure 5-4.

Figure 5-4. Entities in an <iq/>-based conversation
NOTE: As you can see, the combination of the <iq/> element specification and the type attribute is written like this:

IQ-type

For example, "IQ-get" refers to an <iq/> element with type='get', and so on.

Earlier in this chapter, we saw various elements in action in Example 5-3. The first two were <iq/> elements and showed a retrieval request and response for roster information.

First comes the request:

SEND: <iq id='roster_0' type='get'>
        <query xmlns='jabber:iq:roster'/>
      </iq>

Then the response:

RECV: <iq id='roster_0' type='result' from='dj@yak/Work'>
        <query xmlns='jabber:iq:roster'>
          <item jid='sabine@yak' name='sabine' subscription='both'>
            <group>Family</group>
          </item>
        </query>
      </iq>

This snippet shows a number of things:

  • The type of each info/query activity is identified by the type attribute.

  • Each info/query activity contains a subelement (here, <query/>), which is qualified by a namespace.

  • The subelement is used to carry the information being retrieved.

  • The response (type='result') can be matched up to the request (type='get') via the id tracking attribute.

So, if we look at the first <iq/> element:

<iq id='roster_0' type='get'>

we can see that this "request" <iq/> doesn't contain a to attribute. This is because the request is being made of the Jabber server (specifically the JSM), instead of a particular user. Next we see the response from the server:

<iq id='roster_0' type='result' from='dj@yak/Work'>

This "response" <iq/> contains a from attribute stating that the result is coming back from the original requester! This is simply because the from attribute is a hangover from the original request to the Jabber server, which is stamped with its origin (dj@yak/Work) in the form of the from attribute. Here, as in many other places in the Jabber server, the response is simply built by turning the incoming request packet around and adding whatever was required to it before sending it back.

OK, let's examine the details of the <iq/> element.

IQ attributes

The attributes of the <iq/> element are the same as those of the <presence/> and <message/> elements and are used in pretty much the same way.

type

Example: <iq type='get'/>
name: Required

As mentioned already, the type attribute is used to specify the activity.

Attribute Values

type='get'

This is used to specify that the <iq/> element is being used in request mode, to retrieve information. The actual subject of the request is specified using the namespace qualification of the <query/> subelement; see later in this section for details.

Using the HTTP parallel, this is the equivalent of the GET verb.

type='set'

While IQ-get is used to retrieve data, the corresponding set type is used to send data and is the equivalent of the POST verb in the HTTP parallel.

Very often, an IQ-get request will be made of an entity, to discover fields that are to be completed to interact with that entity. The Jabber User Directory (JUD) is a component that plugs into the jabberd backbone and provides simple directory services; users can register an entry in the JUD address book, on which searches can be performed.

Let's look at how IQ elements are used to interact with the JUD.

The registration conversation with the JUD starts with an IQ-get to discover the fields that can be used for registration, followed by an IQ-set filling those fields in the act of registration. Note how, each time, the JUD responds with an IQ-result to confirm each action's success.

Here we are requesting registration information from the JUD. Note the namespace that qualifies the <query/> subelement (and hence the <iq/>):

SEND: <iq type='get' to='jud.yak'
                   id='judreg_ask'>
        <query xmlns='jabber:iq:register'/>
      </iq>

The JUD responds with the fields to fill in. The response is basically a copy of the request, with new attributes and tags:

RECV: <iq type='result' to='dj@yak/Work'
          from='jud.yak' id='judreg_ask'>
        <query xmlns='jabber:iq:register'>
          <instructions>
            Complete the form to submit your details
            to the User Directory
          </instructions>
          <name/>
          <first/>
          <last/>
          <nick/>
          <email/>
        </query>
      </iq>

Now we know what to send:

SEND: <iq type='set' to='jud.yak' id='judreg_do'>
        <query xmlns='jabber:iq:register'>
          <name>DJ Adams</name>
          <first>DJ</first>
          <last>Adams</last>
          <nick>qmacro</nick>
          <email>[email protected]</email>
        </query>
      </iq>

And the JUD responds, saying the IQ-set request was successful:

RECV: <iq type='result' to='dj@yak/Work'
                   from='jud.yak' id='judreg_do'/>
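
The "discover the fields, then fill them in" pattern can be sketched with Python's standard xml.etree.ElementTree module. The function name build_register_set is hypothetical, and the sketch assumes that every child of the returned <query/> other than <instructions/> is a field to complete; fields with no supplied answer (here, email) are simply sent back empty.

```python
import xml.etree.ElementTree as ET

NS = "jabber:iq:register"

# The IQ-result from the JUD, as shown above (instructions text joined onto one line).
RESULT = """<iq type='result' to='dj@yak/Work' from='jud.yak' id='judreg_ask'>
  <query xmlns='jabber:iq:register'>
    <instructions>Complete the form to submit your details to the User Directory</instructions>
    <name/><first/><last/><nick/><email/>
  </query>
</iq>"""


def build_register_set(result_xml, answers, request_id):
    """Hypothetical helper: turn the discovered fields into an IQ-set."""
    result = ET.fromstring(result_xml)
    form = result.find(f"{{{NS}}}query")
    # The IQ-set goes back to whoever described the form.
    iq = ET.Element("iq", {"type": "set", "to": result.get("from"), "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": NS})
    for field in form:
        name = field.tag.split("}", 1)[1]  # strip the "{namespace}" part of the tag
        if name != "instructions":
            ET.SubElement(query, name).text = answers.get(name, "")
    return ET.tostring(iq, encoding="unicode")


answers = {"name": "DJ Adams", "first": "DJ", "last": "Adams", "nick": "qmacro"}
iq_set = build_register_set(RESULT, answers, "judreg_do")
print(iq_set)
```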

type='result'

As shown in the JUD conversation, the result type <iq/> packet is used to convey a result. Whether that result is Boolean (it worked, as opposed to it didn't work) or conveys information (such as the registration fields that were requested), each IQ-get or IQ-set request is followed by an IQ-result response, if successful.

type='error'

If not successful, the IQ-get or IQ-set request is followed not by an IQ-result response, but by an error type response. In the same way that a subelement <error/> carries information about what went wrong in a <message type='error'/> element, so it also provides the same service for IQ-error elements.[12]

[12]Table 5-3 lists the standard Jabber error codes and their default descriptions.

Let's look at an IQ-error in action. A user, who is trying to join a conference room, is notified that his entrance is barred because he hasn't supplied a required password.

First, the user requests information on the room he wishes to join:

SEND: <iq type="get" id="conf1" to="[email protected]">
        <query xmlns="jabber:iq:conference"/>
      </iq>

The conference component instance, to which the IQ-get was addressed (with the to='cellar@conference.yak' attribute), responds with information about the cellar room, including the fact that a nickname and password must be specified to gain entrance:

RECV: <iq type='result' id='conf1' to='dj@yak/winjab'
                            from='[email protected]'>
        <query xmlns='jabber:iq:conference'>
          <name>Dingy Cellar</name>
          <nick/>
          <secret/>
        </query>
      </iq>

After sending availability to the room, to have the availability tracker kick in for that room's JID (see "Availability Tracker"):

SEND: <presence to="[email protected]"/>

Entrance to the room is attempted with a nickname but without specifying a password:

SEND: <iq to="[email protected]" type="set" id="conf2">
        <query xmlns="jabber:iq:conference">
          <nick>dj</nick>
        </query>
      </iq>

The entrance attempt was unsuccessful. An IQ-error response is given with an <error/> subelement explaining what the problem was:

RECV: <iq to='dj@yak/winjab' type='error' id='conf2'
           from='[email protected]'>
        <query xmlns='jabber:iq:conference'>
          <nick>dj</nick>
        </query>
        <error code='401'>Unauthorized</error>
      </iq>

Again, the response is simply the request with modified attributes and data (the <error/> tag) added.
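
This "turn the packet around" approach can be sketched as follows. It is an illustration only, not the conference component's code: the function name is hypothetical, and the request is shown as it would look after the server has stamped it with a from attribute (dj@yak/winjab, matching the to attribute of the response above).

```python
import xml.etree.ElementTree as ET

# The IQ-set as the component would see it, already stamped with 'from' (assumed JIDs).
REQUEST = """<iq to='cellar@conference.yak' from='dj@yak/winjab' type='set' id='conf2'>
  <query xmlns='jabber:iq:conference'>
    <nick>dj</nick>
  </query>
</iq>"""


def turn_around_as_error(iq, code, text):
    """Hypothetical helper: reuse the request element as the error response."""
    sender, recipient = iq.get("from"), iq.get("to")
    iq.set("type", "error")
    iq.set("to", sender)       # the response travels back to the requester
    iq.set("from", recipient)
    error = ET.SubElement(iq, "error", {"code": str(code)})
    error.text = text
    return iq                  # the id and the <query/> payload are left untouched


reply = turn_around_as_error(ET.fromstring(REQUEST), 401, "Unauthorized")
print(reply.attrib)
```

The id attribute and the original <query/> payload survive unchanged, which is why the error response carries its own context.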

from

<iq from='dj@yak/Work'/>
name: Set by server

Similar to the from attribute in the <message/> and <presence/> elements, this is set by the server and represents the JID where the <iq/> originated.

to

<iq to='[email protected]'/>
name: Optional

This attribute is used to specify the intended recipient of the info/query action or response. If no to attribute is specified, the delivery of the packet is set to the sender, as is the case for <message/> packets. However, unlike the case for <message/> packets, <iq/> packets are usually dealt with en route and handled by the JSM.

What does that mean? Packets sent from a client travel over a jabber:client XML stream and reach the Jabber server, where they're routed to the JSM.[13]

[13]They're routed with the internal <route/> element; see the section "Component Types" in Chapter 4 for more details.

A large part of the JSM consists of a series of packet handlers, in the form of modules, whose job it is to review packets as they pass through and act upon them as appropriate; some of these actions may cause a packet to be deemed to have been "delivered" to its intended destination (thus causing the packet routing to end for that packet) before it gets there.

So in the case of <iq/> packets without a to attribute, the default destination is the sender's JID, as we've already seen with the <message/> element. But because JSM handlers that receive a packet may perform some action to handle it and cause that packet's delivery to be terminated (marked complete) prematurely, the effect is that something sensible will happen to the <iq/> packet that doesn't have a to attribute and it won't appear to act like a boomerang. Here's an example:

The namespace jabber:iq:browse represents a powerful browsing mechanism that pervades much of the Jabber server's services and components. Sending a simple browse request without specifying a destination (no to attribute):

SEND: <iq type='get'>
        <query xmlns='jabber:iq:browse'/>
      </iq>

will technically be determined to have a destination of the sender's JID. However, a JSM handler called mod_browse that performs browsing services gets a look-in at the packet before it reaches the sender and handles the packet to the extent that the query is deemed to have been answered and thereby the delivery completed. The packet stops traveling in the sender's direction, having been responded to by mod_browse:

RECV: <iq type='result' to='dj@yak/sjabber' from='dj@yak'>
        <user name='DJ Adams' xmlns='jabber:iq:browse' jid='dj@yak'/>
      </iq>

And while we're digressing, here's a meta-digression: we see from this example that a browse to a particular JID is handled at the server. The client doesn't even get a chance to respond. So, as one of browsing's roles is to facilitate resource discovery, how is this going to work if the client doesn't see the request and can't respond? The answer lies in the distinction of specifying the recipient JID with or without a resource. The idea is that you can query someone's client to find out what that client supports; for example, whiteboarding or XHTML text display.[14] As a resource is per client connection and in many ways represents that client, it makes sense to send a browse request to a JID including a specific resource:

[14] Whiteboarding is collaborative sketching, not a form of surfing atop wave crests.

SEND: <iq type='get' to='[email protected]/sjabber'>
        <query xmlns='jabber:iq:browse'/>
      </iq>

This time the destination JID is resource-specific and the packet passes by the mod_browse handler to reach the client (sjabber), where a response can be returned:

RECV: <iq type='result' to='[email protected]/WinJab'
                      from='[email protected]/sjabber'>
        <user type='client' xmlns='jabber:iq:browse'
                         jid='[email protected]/sjabber'>
          <whiteboard/>
          <videochat/>
          <PGP/>
        </user>
      </iq>
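
The distinction between a browse request answered at the server and one answered by the client can be sketched as a simple check on the destination JID. This is a simplification for user JIDs only; the function names are hypothetical, the JIDs are illustrative, and defaulting a missing to attribute to the sender's JID without its resource is an assumption based on the from='dj@yak' seen in the server's response earlier.

```python
def split_jid(jid):
    """Split 'user@host/resource' into (user, host, resource); absent parts are None."""
    bare, _, resource = jid.partition("/")
    user, at, host = bare.rpartition("@")
    return (user if at else None, host, resource or None)


def browse_handled_by(to, sender):
    """Hypothetical: say who answers a jabber:iq:browse IQ-get sent to a user JID."""
    if to is None:
        # No destination: the sender's own JID (without resource) is assumed.
        user, host, _ = split_jid(sender)
        to = f"{user}@{host}"
    _, _, resource = split_jid(to)
    # A resource identifies one client connection, so the packet travels on to it.
    return "client" if resource else "mod_browse"


print(browse_handled_by(None, "dj@yak/sjabber"))                 # mod_browse
print(browse_handled_by("dj@yak/sjabber", "sabine@yak/WinJab"))  # client
```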

id

<iq type='get' id='roster1'/>
name: Optional

If we're going to rank the elements in terms of the importance of their being tracked, <iq/> would arguably come out on top, as it inherently describes a request/response mechanism. So this element also has an id attribute for tracking purposes.

Don't forget that the pair of XML streams that represent the two-way traffic between Jabber client and server are independent, and any related packets such as a request (traveling in one XML stream) and the corresponding response (traveling in the other) are asynchronous. So a tracking mechanism like the id attribute is essential to be able to match packets up.
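
On the client side, matching asynchronous responses to requests usually comes down to a table of outstanding ids. The following is a minimal sketch of that idea; the class IqTracker and its methods are hypothetical and do not belong to any particular Jabber library.

```python
import itertools


class IqTracker:
    """Hypothetical client-side table of outstanding <iq/> requests, keyed by id."""

    def __init__(self, prefix="iq"):
        self._prefix = prefix
        self._counter = itertools.count()
        self._pending = {}

    def send(self, description):
        # Generate a unique id and remember what the request was for.
        iq_id = f"{self._prefix}_{next(self._counter)}"
        self._pending[iq_id] = description
        return iq_id

    def receive(self, iq_id, iq_type):
        # Only IQ-result and IQ-error packets answer an earlier request.
        if iq_type not in ("result", "error"):
            return None
        return self._pending.pop(iq_id, None)


tracker = IqTracker("roster")
request_id = tracker.send("fetch roster")
print(request_id)                             # roster_0
print(tracker.receive(None, "set"))           # None: an unsolicited push, not a response
print(tracker.receive(request_id, "result"))  # fetch roster
```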

IQ subelements

We've seen these two subelements of the <iq/> element already in earlier examples -- <query/> and <error/>. Here's a review of them.

query

<iq type='get' to='yak'>
<query xmlns='jabber:iq:version'/>
</iq>
name: Required

We've already seen the <query/> subelement performing the task of container for the info/query activity.

  • For an IQ-get, the subelement usually just contains a qualifying namespace that in turn defines the essence of the get activity. This is evident in the example here, where the <iq/> element is a retrieval of the server (yak) version information.

  • For an IQ-set, it contains the qualifying namespace and also child tags that hold the data to be set, as in this example, in which a vCard (an electronic "business card") is being updated:

    SEND: <iq type='set'>
            <vCard xmlns='vcard-temp' version='3.0'>
            ... [vCard information] ...
            </vCard>
          </iq>
    

  • When result information is returned, it is enclosed within a <query/> subelement qualified with the appropriate namespace, as in this IQ-result response to the earlier request for server version information:

    RECV: <iq type='result' to='dj@yak/Work' from='yak'>
            <query xmlns='jabber:iq:version'>
              <name>jsm</name>
              <version>1.4.1</version>
              <os>Linux 2.2.12-45SAP</os>
            </query>
          </iq>
    

    Of course, there are some results that don't carry any further information -- the so-called Boolean results. When there's no information to return in a result, the <query/> subelement isn't necessary. A typical case in which a Boolean result is returned is on successfully authenticating to the Jabber server (where the credentials are sent in an IQ-set request in the jabber:iq:auth namespace); the IQ-result element would look like this:

    RECV: <iq type='result' id='auth_0'/>
    

  • And for an error situation, while the actual error information is carried in an <error/> subelement, any context in which the error occurred is returned too in a <query/> subelement. This is usually because the service returning the error just turns around the IQ-set packet -- which already contains the context as the data being set -- and adds the <error/> subelement before returning it.

    Here we see that the authentication step of connecting to the Jabber server failed because Sabine mistyped her password:

    RECV: <iq type='error' id='auth_0'>
            <query xmlns='jabber:iq:auth'>
              <username>sabine</username>
              <password>geheimnix</password>
              <resource>pavilion</resource>
            </query>
            <error code='401'>Unauthorized</error>
          </iq>
    

Whoa! Hold on a minute, what's that <vCard xmlns='vcard-temp' version='3.0'> doing up there in the IQ-set example? Shouldn't it be <query xmlns='vcard-temp' version='3.0'>?

Actually, no. What it should be is defined, in each case, by the namespace specified in the xmlns attribute in the tag. It's important to note that while we specified the <query/> subelement as being required, it's actually the presence of the container itself that is required. Its name, while commonly query, really depends on the namespace qualifying it. So, while all of the containers qualified by the namespaces listed in the section "The IQ Namespaces" and the section "The X Namespaces," both in Chapter 6, have the tag name query, others, qualified by the namespaces in the section "Miscellaneous Namespaces," do not.

The critical part of the subelement is the namespace specification with the xmlns attribute. And we've seen this somewhere before -- in the definition of component instance configuration in the section "Server Configuration" in Chapter 4, we learned that the tag wrapping the component instance's configuration, like that for the c2s service:

<service id="c2s">
  ...
  <pthcsock xmlns='jabber:config:pth-csock'>
    ... [configuration here] ...
  </pthcsock>
</service>

which is pthcsock here, is irrelevant, while the namespace defining that tag (jabber:config:pth-csock) is important, because it's what is used by the component to retrieve the configuration.

We've seen this feature in this chapter too; remember the <iq/> examples in the jabber:iq:browse namespace? The result of a browse request that returned user information looked like this:

RECV: <iq type='result' to='dj@yak/sjabber' from='dj@yak'>
        <user name='DJ Adams' xmlns='jabber:iq:browse' jid='dj@yak'/>
      </iq>

Again, the query tag is actually <user/>. In fact, in browsing, the situation is extreme, as the <iq/> response's subelement tag name will be different, depending on what was being browsed. But what is always consistent is the namespace qualifying the subelement; in this example, it's jabber:iq:browse. See the section "jabber:iq:browse" in Chapter 6 for more details.
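
In code, this means a handler should key on the namespace of the <iq/> element's payload rather than on its tag name. Here is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name payload is hypothetical, and the fragments are parsed standalone, outside of any enclosing jabber:client stream.

```python
import xml.etree.ElementTree as ET


def payload(iq_xml):
    """Return (namespace, tag name) of the first namespace-qualified child, or None."""
    iq = ET.fromstring(iq_xml)
    for child in iq:
        # ElementTree writes qualified tags as "{namespace}name".
        if child.tag.startswith("{"):
            namespace, _, name = child.tag[1:].partition("}")
            return namespace, name
    return None


print(payload("<iq type='get'><query xmlns='jabber:iq:version'/></iq>"))
print(payload("<iq type='set'><vCard xmlns='vcard-temp' version='3.0'/></iq>"))
print(payload("<iq type='result'>"
              "<user name='DJ Adams' xmlns='jabber:iq:browse' jid='dj@yak'/></iq>"))
print(payload("<iq type='result' id='auth_0'/>"))
```

The three payloads have different tag names (query, vCard, and user), but in each case it is the first item of the returned pair, the namespace, that tells a handler what kind of info/query activity it is looking at. The Boolean result has no payload at all.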

error

<iq type='error' from='dj@yak/Work' to='dj@yak/Work'>
  <query xmlns='jabber:iq:browse'/>
  <error code='406'>Not Acceptable</error>
</iq>
name: Optional

The error subelement carries error information back in the response to a request that could not be fulfilled. Table 5-3 showed the standard error codes and default accompanying texts.

The example here shows the response to a browse request, but why might the request have been erroneous? Because the <iq/> type attribute had been specified as set instead of get. Browsing is a read-only mechanism.
