Version identifiers: not quite what you think
When deciding whether to use version numbers or namespace names to identify versions in XML documents, we need to step back and determine what exactly a version identifier identifies.
Suppose version 1.0 of a language defines an extensible name type with a first and last name, and version 1.1 adds an optional middle name. What is the right version identifier for a name that contains only a first and last name? We typically think it should be 1.0 if the format author doesn’t know about version 1.1, and 1.1 if the author does. Yet the instance document is both a version 1.0 and a version 1.1 instance of the language. What is the version identifier actually identifying in this case? If version 1.1 also adds a prefix to the type, should a first, middle, last instance be identified as version 1.1 even though it doesn’t use the prefix? We typically think yes.
It seems that a version identifier in an instance doesn’t really identify the version of the instance; it identifies the latest version that the format author could possibly use.
We don’t expect a version 1.1 component to use v1.0 for first, last names and then v1.1 for first, middle, last names.
HTTP uses the version identifier in a similar way: the version identifies the latest version of HTTP supported by the requesting client. This information is important to the server so that it can know whether the client supports a version 1.1 optional feature. Effectively, the version identifier is on the protocol.
But lest you think “aha, because HTTP uses version identifiers, they must be good”, there is a crucial difference between HTTP and XML vocabularies: XML supports namespaces for elements, attributes and their content.
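To make the name example from earlier concrete with namespace-qualified elements, here is a minimal Python sketch. The namespace URI, the element names, the position of the prefix element and the `matches` helper are all assumptions made for illustration. The check is a deliberately simple content-model walk that leaves out the extension point, not a real schema validator. It shows that a first, last instance fits both the 1.0 and the 1.1 model, so a version stamp on it would tell us about the author rather than about the instance.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for the name language (not from the article).
NS = "http://example.org/name"

# Simplified content models as (local name, required) pairs in document order.
# Placing the optional prefix before the first name is an assumption.
V1_0 = [("first", True), ("last", True)]
V1_1 = [("prefix", False), ("first", True), ("middle", False), ("last", True)]


def matches(instance_xml: str, model) -> bool:
    """Return True if the root element's children fit the given content model."""
    root = ET.fromstring(instance_xml)
    children = [child.tag for child in root]
    pos = 0
    for local_name, required in model:
        expected = f"{{{NS}}}{local_name}"
        if pos < len(children) and children[pos] == expected:
            pos += 1
        elif required:
            return False
    # Every child must have been consumed by the model.
    return pos == len(children)


first_last = f'<name xmlns="{NS}"><first>Jane</first><last>Doe</last></name>'
first_middle_last = (
    f'<name xmlns="{NS}"><first>Jane</first><middle>Q</middle><last>Doe</last></name>'
)

for label, doc in [("first,last", first_last), ("first,middle,last", first_middle_last)]:
    print(label, "fits 1.0:", matches(doc, V1_0), "fits 1.1:", matches(doc, V1_1))
```

The first, last instance fits both models, while the first, middle, last instance fits only the 1.1 model in this strict sketch. Nothing in the first instance itself says which version its author had in mind.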
Now why would a format need a version identifier as well as namespaces? If the version change is an incompatible change, then a namespace name change will clearly indicate an incompatible change in the format. If the version change is a compatible change, then simply keeping the namespace name unchanged preserves compatibility with existing namespace-aware software.
In XML languages, we almost invariably allow for extensibility. We regularly (and rightly!) allow for new content that we don’t know about. This new content uses namespace names to distinguish itself from other content. XML software must be built to handle namespaces.
Which brings us back to the decision: use version numbers or namespace names for identifying versions? Given that XML software is going to support namespaces anyway, version numbers seem a redundant mechanism for indicating that content is either compatible or incompatible with a given namespace.
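As a rough illustration of that redundancy, here is a sketch of a consumer that relies on namespace names alone. The namespace URIs, the `read_name` function and the sample documents are hypothetical, and the "ignore what you don't recognise" rule is one common extensibility policy rather than the only possible one. Unknown content under a known namespace name is skipped, while an unrecognised namespace name on the root element is rejected, all without consulting a version attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for illustration.
NAME_NS = "http://example.org/name"
NAME_NS_V2 = "http://example.org/name/v2"  # stands in for an incompatible revision
EXT_NS = "http://example.org/name-extension"

# The elements this consumer was written to understand.
KNOWN = {f"{{{NAME_NS}}}first", f"{{{NAME_NS}}}last"}


def read_name(instance_xml: str) -> dict:
    """Read a name using namespace names only; no version attribute is consulted."""
    root = ET.fromstring(instance_xml)
    if root.tag != f"{{{NAME_NS}}}name":
        # A different namespace name signals a format this consumer does not understand.
        raise ValueError(f"unsupported format: {root.tag}")
    result = {}
    for child in root:
        if child.tag in KNOWN:
            local_name = child.tag.split("}", 1)[1]
            result[local_name] = child.text
        # Anything else is unknown extension content and is ignored.
    return result


# Compatible change: same namespace name, plus content the consumer has never seen.
compatible = (
    f'<name xmlns="{NAME_NS}" xmlns:x="{EXT_NS}">'
    "<first>Jane</first><middle>Q</middle><x:nickname>JQ</x:nickname><last>Doe</last>"
    "</name>"
)

# Incompatible change: the namespace name itself has changed.
incompatible = f'<name xmlns="{NAME_NS_V2}"><first>Jane</first><last>Doe</last></name>'

print(read_name(compatible))
try:
    read_name(incompatible)
except ValueError as err:
    print("rejected:", err)
```

The consumer reads the compatible document and rejects the incompatible one, using only the information that namespace-aware software already has to process.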
In fact, using version identifiers with namespace-aware software means that namespace-only software can only be deployed in a strict and layered manner.
The software that handles the language must know about namespaces AND the language-specific versions, so generic namespace software can’t be used on its own.
Given that namespaces are already in use and supported in tools, why require an additional, redundant, complexity-increasing version identifier for format identification? There may be reasons for having a version identifier, but it doesn’t seem worth it for format versioning.
This article is reprinted from Dave Orchard's blog entry found at http://www.pacificspirit.com