Assertions in new XSD 1.1 draft

By Rick Jelliffe
July 8, 2008

The new XSD 1.1 draft has some revisions to the assertions system, which seem sensible.

Assertions follow Schematron's naming, with an assert element and a test attribute, but they are in the XSD namespace. I have mentioned before that they differ from Schematron's assertions in paying no attention to natural language: schemas are still seen as computer problems, not human problems.

For XSD Datatypes, assertion elements are just another facet on the datatype. A change is that full XPath 2 paths can be used. However, in the case of simple types the assertion only operates on the value in isolation from its context: you cannot constrain a type to be used in a particular environment, you can only constrain a structure to use a certain simple type. Neatly top-down.

The XSD processor first parses the data value against its type (union, list, int, etc.) to get a sequence of typed values. It then makes that sequence available to the XPath test in a variable named value, so the test will always access that variable, for example <assert test="$value &lt; 1" />
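As a rough illustration of such a facet, here is a minimal sketch of a simple type restricted by an assertion. The type name unitFraction is made up for the example. The draft discussed here spells the element assert; the final XSD 1.1 Recommendation names the simple-type facet assertion, and the sketch uses that spelling so that it can be tried in a current XSD 1.1 processor.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: a decimal that must lie in the range 0 <= value < 1. -->
  <xs:simpleType name="unitFraction">
    <xs:restriction base="xs:decimal">
      <!-- $value holds the typed value, not the element or attribute carrying it,
           so the test cannot look at the surrounding document. -->
      <xs:assertion test="$value ge 0 and $value lt 1"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>
```

Because the test sees only $value, the same type behaves identically wherever a structure chooses to use it.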

For XSD structures, there is a highly unsatisfactory definition in s3.13.1, which makes no sense to me at all. I am sure they have something in mind, but they have utterly failed to capture it.

For a start, there is variability allowed in which subset of XPath 2 can be used: you can implement the full thing, or only a subset which supposedly only allows downward-looking XPaths. (These are the kinds of XPaths that can be checked by a streaming validator that builds the DOM as it goes and runs the checks when the element's end-tag is found. If there are no assertions registered to higher-level elements in the DOM, that branch can be pruned for space efficiency. So you really don't want to have structural assertions on top-level elements when you have large documents.)
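To make the distinction concrete, here is a sketch of a structural assertion that only looks downward. The order, item, total and price names are invented for the example. The test touches only the attributes and children of the element being validated, so in principle it could be evaluated once the end-tag of order has been seen.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="price" type="xs:decimal" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="total" type="xs:decimal" use="required"/>

      <!-- Downward only: reads an attribute of order and attributes of its children. -->
      <xs:assert test="@total = sum(item/@price)"/>

      <!-- By contrast, a test such as ../@currency = 'AUD' would look upward,
           outside the order subtree, which the restricted subset is meant to rule out. -->
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Note that this assertion sits on the element that contains all the items, so a streaming validator would have to keep the whole order subtree until its end-tag arrives.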

The problem with the definition is that it says there is no context item, and then says that this is "." (which it isn't: "." means self::node(), if I recall correctly). The current definition seems to want to ensure that minimal implementations using only downward XPaths are still OK. However, because axes are not restricted, this is not the case. For example, what about xxx/parent::*, where you go down to the child (assuming a child::xxx exists) and then up to the start again? This is the same as ".", surely!
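A small sketch shows the equivalence. The type name boxType and the id attribute are hypothetical; only the xxx child comes from the example above. Whether a given processor accepts the second test depends on how it implements the axis rules, so treat this as an illustration of the argument rather than of any particular implementation.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="boxType">
    <xs:sequence>
      <xs:element name="xxx" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string"/>

    <!-- Starts from the element being validated. -->
    <xs:assert test="./@id"/>

    <!-- Goes down to the required xxx child and back up again:
         it arrives at the same element, so it asks the same question. -->
    <xs:assert test="xxx/parent::*/@id"/>
  </xs:complexType>

</xs:schema>
```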

I am sure the XSD SC will fix this up, and congratulations to them for the hard work. I don't see anything that cannot be implemented by conversion to Schematron with type-aware XSLT 2. However, the current Schematron with XSLT 2 implementations are not type-aware, because they don't run on a typed PSVI; but they could if there were some desire for this.
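As a sketch of what such a conversion might look like, suppose an XSD assertion on a hypothetical order element has the test @total = sum(item/@price), with both attributes declared as xs:decimal. A hand-written ISO Schematron counterpart could read as follows. Since the rule runs on an untyped tree, the sketch casts the values explicitly, which is the work a type-aware implementation would do for you.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

  <!-- Makes the xs prefix available inside XPath expressions. -->
  <sch:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>

  <sch:pattern>
    <!-- The rule context plays the role of the element the XSD assertion is attached to. -->
    <sch:rule context="order">
      <sch:assert test="xs:decimal(@total) = sum(item/xs:decimal(@price))">
        The total of an order should equal the sum of the prices of its items.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

</sch:schema>
```

The natural-language message inside sch:assert has no counterpart in the XSD version, which is the difference in emphasis noted earlier.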

The 1.1 draft is looking like the VISTA of schema languages: the problems of having a monolithic approach are really showing. If you thought it was complex before, it is now a lot bigger.

In ISO DSDL we have implemented various small languages as layers and are now working on combining them, in the light of experience. For example, we have Schematron for constraints, DSRL for renaming, in the future DTTL for user-defined element types, and NVDL for dispatching branches to validation. W3C XSD 1.1 also has responses to the same kinds of issues, but all bundled together: unreliable assertions instead of Schematron, unreliable implementation-dependent types instead of DTTL, a new versioning mechanism instead of DSRL renaming, and a new override mechanism superseding redefine instead of NVDL.

The thing that surprised me about reading this draft, and I am completely open to being corrected, is that in the past XSD's monolithicity was justified with the argument that validity should always be validity: that there should be no supersets or subsets or options. However, the new draft seems to ditch the reliability argument but doesn't take the opportunity to become more layered or modular in the process. If the reliability requirement has indeed been ditched, then surely constraints like the UPA need to be made optional.

The traditional argument against XSD 1.0 was that it had too little bang per buck: arbitrary limitations on its power worked against the expectation that such a complicated technology would be able to meet its pretensions of universality. XSD 1.1 clearly improves a lot of areas, and the new draft is better than the old. But when it requires, for example, that a full XPath 2 library be available and then only lets that library be used in a downward-streaming fashion, I don't see that the bang per buck argument will disappear: 1.1 has more bang, but more buck.

