Implementing Healthcare Messaging with XML

Last Monday I gave a presentation at XML 2007 titled “Implementing Healthcare Messaging with XML” to a very attentive and responsive audience, chaired by Tony Coates. David Orchard and Glen Daniels, both involved in multiple WS-* standards, were there, and I had an interesting chat with them afterwards about the layering problems of WSDL mentioned in my presentation. Jon Bosak inquired about ebXML – which we hadn’t used because it did not seem to get any traction from IBM and Microsoft at the time. With hindsight, looking at what ebXML (ebMS specifically) delivered years ago and the time the WS-* stack has taken, one wonders whether this was such a wise decision… Anyway, it was great to have such a responsive crowd.

Axioms of Versioning 2

I’ve written a new version of ‘Axioms of Versioning’. I extended the formalization to get a grasp of the concept of ‘Semantic Backward Compatibility’ in HL7v3, which I believe is flawed (quote: “Objective of backward model compatibility is that a receiver expecting an ‘old’ version will not misinterpret content sent from a new version”). It seems to be the reverse of the position of the W3C TAG in ‘Extending and Versioning Languages: Terminology’, and of the position I would defend myself. Yet the interaction of new senders with old receivers was not sufficiently explored in my Axioms.

It turns out that exploration of this notion leads to quite natural definitions of ‘may ignore’ and ‘must understand’ semantics. The HL7v3 notion is probably best characterized by the concept of ‘partial semantical forward compatibility’ in my new Axioms. The concept is also close to, if not the same as, the TAG’s ‘Partial Understanding‘.

It really thrilled me to see how helpful my formalisms were in exploring the notions in HL7v3, and uncovering the – I think – hidden meaning in it.

Making All Changes Compatible over Multiple Versions – Part 2

In the previous article I showed how to add an optional element and make this a compatible change over two releases, so existing receivers aren’t broken by unknown elements. It is done by following two simple principles:

  1. have one or more intermediate releases;
  2. use different schemas in intermediate releases for senders and receivers.

This strategy works well in environments where ‘Big Bang’ upgrades are undesirable and there is control over the number of different releases ‘in the field’. This applies to the Dutch national EHR infrastructure, for which I work, and probably to most exchanges between companies and/or governments.

The same can be done for any kind of change. Consider making an optional element required: we have an existing

Order V1:

  • customer-id 1..1
  • name 0..1
  • order-line 1..n

with an optional ‘name’ element, and we want:

Order V2:

  • customer-id 1..1
  • name 1..1
  • order-line 1..n

Upgrading some receiving nodes to V2 would break them once they receive orders without names. If we use an intermediate release, there are no problems: in the intermediate V2 release, senders must always include the name, while receivers still accept orders without one; only in V3 do receivers require the name as well.

Sender 2 cannot break receiver 1. Once all parties are at V2 at least, receivers can start to upgrade to V3. If the level of control is such that no more than two versions are allowed simultaneously, this is sufficient – otherwise we’ll just have to have some more intermediate releases.
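As a rough sketch of how the intermediate release can be expressed in XML Schema – the element declarations are my own and all types are simplified to xs:string – the V2 schema pair could look like this:

  <!-- V2 receiver schema: name stays optional, so V1 senders (which omit it) still validate -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer-id" type="xs:string"/>
        <xs:element name="name" type="xs:string" minOccurs="0"/>
        <xs:element name="order-line" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- V2 sender schema: name is now mandatory, so no V2 sender ever omits it -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer-id" type="xs:string"/>
        <xs:element name="name" type="xs:string" minOccurs="1"/>
        <xs:element name="order-line" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>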

There are more use cases. Consider code lists: in V1, we allow the Netherlands and the USA as countries. In the next release, we want to trade with Ireland as well.

If we upgrade in two steps – receivers first learn to accept the new code, and senders only start sending it in the release after that – every subsequent pair of releases is fully compatible. No sender can break a receiver. The same holds if we remove old codes, except that there senders stop sending the code first, and receivers drop it in the following release.
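A sketch of what the country code list could look like in each schema – the simple type and its values are hypothetical:

  <!-- V1, senders and receivers: only NL and US are allowed -->
  <xs:simpleType name="CountryCode">
    <xs:restriction base="xs:string">
      <xs:enumeration value="NL"/>
      <xs:enumeration value="US"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Intermediate release, receiver schema: IE is accepted as well;
       the sender schema keeps the V1 list, so old receivers see no surprises -->
  <xs:simpleType name="CountryCode">
    <xs:restriction base="xs:string">
      <xs:enumeration value="NL"/>
      <xs:enumeration value="US"/>
      <xs:enumeration value="IE"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Final release: senders may now send IE too, so both schemas use the extended list -->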

It’s as easy to increase maximum cardinality: again, receivers accept the higher maximum first, and senders follow in the next release.

Some changes take more than one intermediate release, such as adding a required element where there was none: receivers must first accept the element as optional, then senders must start sending it, and only after that can receivers require it.
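A sketch of that sequence for a hypothetical ‘priority’ element, showing only its declaration inside the order’s content model:

  <!-- V1, all schemas: no priority element exists -->

  <!-- V2, receiver schema only: priority is accepted, but optional;
       V2 senders still do not produce it -->
  <xs:element name="priority" type="xs:string" minOccurs="0" maxOccurs="1"/>

  <!-- V3, sender schema: senders must now include priority;
       V3 receivers still accept it as optional -->
  <xs:element name="priority" type="xs:string" minOccurs="1" maxOccurs="1"/>

  <!-- V4, receiver schema: receivers can finally require it as well -->
  <xs:element name="priority" type="xs:string" minOccurs="1" maxOccurs="1"/>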

Sometimes we change senders first, other times receivers. It turns out there is a simple rule behind this: if the change from V1 to the final version is – by itself – backward compatible, we change receivers first. If the change is forward compatible, we change senders first.

Backward compatible changes (where new receivers can handle content from old senders) are:

  • making required elements optional;
  • introducing new optional elements;
  • increasing maximum cardinality (from 1 to n, or 0 to n, or 1 to 2);
  • adding code values;

and in general all changes which allow more document instances.
Forward compatible changes (where old receivers can handle content from new senders) are:

  • making optional elements required;
  • removing optional elements;
  • decreasing maximum cardinality (from n to 1, or n to 0, or 2 to 1);
  • removing code values;

and in general all changes which allow fewer document instances.

All patterns are reversible – if one way around they describe a backward compatible change over multiple versions, then the other way around they describe a forward compatible change, and vice versa. In a next post I’ll look at some advantages and disadvantages of this approach and of other approaches, such as using <xs:any> schema elements.

Making All Changes Compatible over Multiple Versions – Part 1

This article will show a way to make any change in an XML-based exchange vocabulary compatible over several versions of the exchange language. In environments where there is some control over the number of versions ‘in the field’ this is highly useful. In the Dutch Electronic Health Record exchange, for instance, we target a maximum of two subsequent versions in the field at any time, and so does the British NHS for their EHR. I guess – for different reasons – we will have to be a bit more lenient than two versions, but the environment is still under control. It’s not uncommon in messaging scenarios to have some control over the number of versions ‘out there’ (as opposed to publishing, where one often has little idea of what versions still exist).

Suppose we have a message version V1, a basic order with a customer id and one or more order lines. In V2 we want to add a name, since the customer may have different contacts for different orders, and needs a way to indicate this. So the V2 message looks like this:

Order V2:

  • customer-id 1..1
  • name 0..1
  • order-line 1..n

As you see, I will indicate minimum and maximum cardinality on all elements. For instance, name occurs zero times minimum and once maximum; in other words, name is optional. The elements themselves may of course be complex – that does not matter for this example. I will add one convenient shorthand for this article, zero maximum cardinality:

Order V1:

  • customer-id 1..1
  • name 0..0
  • order-line 1..n

A V1 order contains a minimum of zero names and a maximum of zero names. In other words, V1 orders cannot contain names. This is not a common construct in schema languages, but it will help this small exposition.
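For concreteness, a sketch of both orders in XML Schema – names follow the lists above, all types simplified to xs:string:

  <!-- Order V1: no name element at all -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer-id" type="xs:string"/>
        <xs:element name="order-line" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Order V2: an optional name element is added -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer-id" type="xs:string"/>
        <xs:element name="name" type="xs:string" minOccurs="0"/>
        <xs:element name="order-line" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>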

If we do nothing beyond this, V2 senders will break V1 receivers: their schemas cannot handle the unknown ‘name’ element, and either validation will fail or their receiving software might break. There is no way to know without knowing all implementations, which is not doable in a field with thousands of participants. So the existing implementations will break: new senders will send unexpected content to old receivers.

One way of solving the problem is “ignoring unknown content” like HTML does – but here we’ll look at another approach. Instead of moving from version 1 to version 2, we’ll define an intermediate version with different schemas for senders and receivers:

I’ve simplified the content, focusing on the important ‘name’ element. In version 2, we move to the intermediate situation: senders are unchanged, but receivers are upgraded to accept names. Only in version 3 can senders send names. And all of a sudden, all subsequent versions can communicate with each other. If we combine this with the rule that no more than two versions co-exist at the same time, compatibility remains intact. If we allow co-existence of three or more versions at the same time, the principle remains; we’ll just need some more intermediate versions.
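A compact sketch of the ‘name’ declaration in each schema, as it would appear inside the order’s content model (everything else omitted):

  <!-- V1, sender and receiver schemas: no name element -->

  <!-- V2, sender schema: still no name element -->
  <!-- V2, receiver schema: name is accepted, but optional -->
  <xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1"/>

  <!-- V3, sender and receiver schemas: name may now be sent, and remains optional -->
  <xs:element name="name" type="xs:string" minOccurs="0" maxOccurs="1"/>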

In the next part we’ll look at how to use this approach for all changes, not just adding optional content.

Axioms of Versioning

Obsolete, please see the latest version

Version 1

An attempt to define syntactical and semantical compatibility of versions in a formal way. Much derives from the writings of David Orchard, especially the parts on syntactical forward and backward compatibility (though my terminology differs).

  1. Let U be a set of (specifications of) software processable languages {L1, L2, … Ln}
    1. This axiomatization does not concern itself with natural language
  2. The extension of a language Lx is a set of conforming document instances ELx = {Dx1, Dx2, … Dxn}
    1. Iff ELx = ELy then Lx and Ly are extensionally equivalent
      • Lx and Ly may still be different: they may define different semantics for the same syntactical construct
    2. Iff ELx ⊂ ELy then Lx is an extensional sublanguage of Ly
    3. Iff ELx ⊃ ELy then Lx is an extensional superlanguage of Ly
    4. D is the set of all possible documents; or the union of all ELx where Lx ∈ U
  3. For each Lx ∈ U there is a set of processors Px = {Px1, Px2, … Pxn} which implement Lx
    1. Each Pxy exhibits behaviour as defined in Lx
    2. Processors can produce and consume documents
    3. Each Pxy produces only documents it can consume itself
    4. At consumption, Pxy may accept or reject documents
  4. The behaviour of a processor Pxy of language Lx is a function Bxy
    1. The function Bxy takes as argument a document, and its value is a processor state
      • We assume a single processor state before each function execution, alternatively we could assume a <state, document> tuple as function argument
    2. If, for two processors Pxy and Pxz of language Lx and a document d, Bxy(d) = Bxz(d), then the two processors behave similarly for d
      • Two processor states for language Lx are deemed equivalent if a human with thorough knowledge of language specification Lx considers the states equivalent. Details may vary insofar as the language specification allows it.
      • Processor equivalence is not intended to be formally or computably decidable; though in some cases it could be.
    3. If ∀d ( d ∈ ELx → Bxy(d) = Bxz(d) ) then Pxy and Pxz are behaviourally equivalent for Lx
      • If two processors behave the same for every document which belongs to a language Lx, the processors are behaviourally equivalent for Lx.
  5. An ideal language specifies all aspects of desired processor behaviour completely and unambiguously; assume all languages in U are ideal
  6. A processor Px is an exemplary processor of a language Lx if it fully implements language specification Lx; assume all processors for all languages in U are exemplary
    1. Because they are (defined to be) exemplary, every two processors for a language Lx are behaviourally equivalent
    2. ELx = { d is a document | Px accepts d }
    3. The complement of ELx is the set of everything (normally, every document) which is rejected by Px
    4. The make set MLx = { d is a document | Px can produce d }
  7. A language Lx is syntactically compatible with Ly iff MLx ⊆ ELy and MLy ⊆ ELx
    • Two languages are syntactically compatible if they accept the documents produced by each other.
    1. A language Ln+1 is syntactically backward compatible with Ln iff MLn ⊆ ELn+1 and Ln+1 is a successor of Ln
      • A language change is syntactically backward compatible if a new receiver accepts all documents produced by an older sender.
    2. A language Ln is syntactically forward compatible with Ln+1 iff MLn+1 ⊆ ELn and Ln+1 is a successor of Ln
      • A language change is syntactically forward compatible if an old receiver accepts all documents produced by a new sender.
  8. A document d can be a member of the extension of any number of languages
    1. Px is an (exemplary) processor of Lx, Py is an (exemplary) processor of language Ly
    2. Two languages Lx and Ly are semantically equivalent iff ELx = ELy ∧ ∀d ( d ∈ ELx → Bx(d) = By(d) )
      • If two languages Lx and Ly take the same documents as input, and their exemplary processors behave the same for every document, the languages are semantically equivalent.
      • Two languages can only be compared if their exemplary processors are similar enough to be compared.
      • Not every two languages can be compared.
      • “Semantic” should not be interpreted in the sense of “formal semantics”.
  9. The semantical equivalence set of a document d for Lx = { y ∈ ELx | Bx(d) = Bx(y) }
    1. Or: SLx,d = { y ∈ ELx | Bx(d) = Bx(y) }
      • The semantical equivalence set of a document d is the set of documents which make a processor behave the same as d
      • Semantical equivalence occurs for expressions which are semantically equivalent, such as i = i + 1 and i += 1 for C, or different attribute order in XML etc.
    2. d ∈ SLx,d
    3. Any two distinct semantical equivalence sets of Lx are necessarily disjoint
      • If z ∈ SLx,e were also z ∈ SLx,d, then every member of SLx,e would be in SLx,d and vice versa, and thus SLx,d = SLx,e
  10. A language Ly is a semantical superlanguage of Lx iff ∀d ( d ∈ MLx → By(d) = Bx(d) )
    1. For all documents produced by Px, Py behaves the same as Px
      • Equivalence in this case should be decided based on Lx; if Ly makes behavioural distinctions which are not mentioned in Lx, behaviour is still the same as far as Lx is concerned
    2. It follows: ∀d ( d ∈ MLx → ∃SLy,d ( SLy,d ⊆ ELy ∧ ( SLx,d ∩ MLx ) ⊆ SLy,d ∧ By(d) = Bx(d) ) )
      • For any document produced by Px, the part of its semantical equivalence set which Px can actually produce, is a subset of the semantical equivalence set of Py for this document
    3. For all d ∈ ELx ∧ d ∉ MLx there may be many equivalence sets in Ly for which By(d) ≠ Bx(d)
      • In other words: for documents accepted but not produced by Px, Ly may define additional behaviours
    4. Lx is a semantical sublanguage of Ly iff Ly is a semantical superlanguage of Lx
  11. A language Ln+1 is semantically backward compatible with Ln iff Ln+1 is a semantical superlanguage of Ln and Ln+1 is a successor of Ln
    1. An old sender may expect a newer, but semantically backward compatible, receiver to behave as the sender intended
    2. A language Ln is semantically forward compatible with Ln+1 iff Ln+1 is a semantical sublanguage of Ln and Ln+1 is a successor of Ln
    3. Semantic forward compatibility is only possible if a language loses semantics; i.e. its processors exhibit less functionality, and produce less diverse documents
    4. A processor cannot understand what it does not know about yet

SOAP over REST

Suppose I want to order 100,000 pieces of your newest, ultra-sleek geek gadgets. We negotiate the price etc., and you send me a proposed contract. I agree, and return the contract. Blessed with a healthy skepticism towards all new technologies, we decide to transfer all documents on paper, and since the contract is very important to both of us, I return it using the most trusted courier service available, with parcel-tracking and armored trucks and all. Yet I do not sign the contract. Will you honor it and send me the goods? I doubt it. Yet this is the level of protection HTTPS offers.

With REST, based on the workings of the Web, HTTPS is the standard choice for safe transport. Yet HTTPS only secures the transport, the pipe. Once a message is delivered at the other end, it is once again simple text, or XML, or whatever format we chose. Of the signatures used to establish the secure session, nothing remains with the message. We can use client certificates, so both server and client authenticate themselves, but that is still only for the pipe, not for the messages. What you want for real contracts are message signatures.

There are several options in REST to solve the problem. One of them is to simply hijack the WS-Security spec of the WS-* stack. Add a soap:Envelope element with the appropriate WS-Security headers to the contract message, and send the resulting XML in a RESTful way to the other party. Maybe this is not 100% WS-Security compliant and there are some dependencies on SOAP or WSDL or other WS-* specs which we do not honor (or maybe not; I haven’t combed the spec for it), but hey, if we squint enough that shouldn’t be much of a practical problem.
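A rough sketch of what such a message could look like – a hypothetical ‘contract’ payload wrapped in a SOAP envelope whose WS-Security header carries an XML Signature (SignedInfo, keys and such omitted); the whole thing is then simply POSTed to the other party:

  <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
                 xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd"
                 xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
    <soap:Header>
      <wsse:Security>
        <ds:Signature>
          <!-- SignedInfo, SignatureValue and KeyInfo omitted in this sketch -->
        </ds:Signature>
      </wsse:Security>
    </soap:Header>
    <soap:Body>
      <contract><!-- the agreed contract, signed as part of the message itself --></contract>
    </soap:Body>
  </soap:Envelope>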

Such a coupling of REST and appropriate WS-* specs does seem promising – unless one is firmly in the WS-*-is-evil-by-default camp. It has an immediate consequence: there is almost nothing WS-* can do that REST cannot – safe travel over non-HTTP connections and a few other things aside. Bill de hÓra wrote: “And do not be surprised to see specific WS-* technologies and ideas with technical merit, such as SAML and payload encryption, make an appearance while the process that generated them is discarded.”

There is a general lesson to be extracted as well: if something belongs with the payload, store it in the payload. HTTP headers are fine for transport, eh, transfer headers but not for anything which inherently belongs with the message payload. HTTP headers should be discardable after the HTTP method completes. Rule of thumb: if you want to keep it after reception, payload header. If not, HTTP header.

When REST advantages weigh less…

There are two interesting posts – Stefan Tilkov’s ‘WS-* Advantages’ and Stuart Charlton’s ‘What are the benefits of WS-* or proprietary services?’ – on when to use WS-* instead of REST.

Stu writes: “When you want a vendor independent MOM for stateful in-order, reliable, non-idempotent messages, and don’t have time or inclination to make your data easily reused… ”

We could reverse this argument: when do the advantages of REST (caching, linking and bookmarking, to name some) matter less? For one of my customers I design part of the Dutch national healthcare exchange, which is used to exchange patient data between care providers. Nearly all messages involved include the patient id: therefore most messages are pretty unique, and tied to a particular care context – say a patient visits his GP, or collects medication from his pharmacy. In such exchanges, caching doesn’t matter at all. It is possible some data (a patient’s medication history) is retrieved twice when the patient visits two doctors one after another, but in general in such an infrastructure it’s better to simply turn off caching, GET the data twice in the outlier cases, and not be bothered by the overhead involved in caching.

It seems to me a lot of business exchanges (say order/invoice exchanges such as UBL defines) share this property of mostly unique messages, whereas cases such as the Google or Amazon APIs will clearly benefit a lot from caching. The distinction is between messaging (sending letters) and publishing (newspapers).

I’m not advocating REST or WS-* here for any particular application, but thinking about where the benefits of REST matter most is another way of thinking about choosing technologies. For publishing, REST with all the optimizations of GET is the option to look at first. For messaging it’s less obvious where to start.

GET and POST aren’t verbs

Calling HTTP GET and POST ‘verbs’ is a gross misnomer; they really are URI metadata in disguise.

REST is centered around the idea that we should use the way the web works when we do things on the Web – fair enough – and that REST is the architectural style of the Web. RESTful applications – like HTTP – use POST, GET, PUT and DELETE to CREATE, READ, UPDATE and DELETE resources.

The problem is, this is not how the current Web works.

Real verbs can be applied to many nouns, and a single noun can take many verbs: the cat walks, the cat talks, the cat whines, the cat shines. Some combinations make no sense, but a lot do. This is not the case for GET and POST and their friends. In principle, it is possible to apply both GET and POST to a single resource; GET an EMPLOYEE record, POST to the same record.

In practice GET and POST do not often apply to the same resource. POST on the Web is used in HTML forms. A form has a method (GET or POST) and a URI (which points to a resource). Usually, POST forms have unique URIs; they don’t share them with GETs. Amazon uses artificial keys to make the POST URIs unique. More surprisingly, they even do the same thing when I GET the URI instead of POSTing it (which the Firefox Web Developer toolbar supports with a single menu choice). Amazon doesn’t care whether I GET or POST a book to my shopping cart; it (understandably) lands in my shopping cart either way. The same applies to del.icio.us and numerous other – I suspect most – sites.
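As a small illustration – a made-up add-to-cart form; the method is just a flag the browser sends along when it dereferences the action URI:

  <form method="post" action="http://www.example.com/cart/add/item-12345">
    <input type="submit" value="Add to cart"/>
  </form>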

The Web does not work through applying a small set of verbs to a large number of resources: the resources do all the work. GET and POST aren’t real verbs; they just signal what the URI owner intends to do when the URI is dereferenced; as such, they are URI metadata. The ‘GET’ metadata in an HTML form’s method just informs your browser and all intermediaries that this URI is supposed to have no side effects on the server. Of course no browser can know what actually happens when a URI is dereferenced: maybe a document is returned, maybe the missiles are fired. GET and POST are just assertions made by the URI owner about the side effects of the URI.

The ‘old’ Web wouldn’t be one bit different in appearance if both GET and POST weren’t even allowed on the same URI. It’s possible to use them as real verbs, as REST advocates; but the merits of this approach do not derive from the way the current Web works.

Don’t get me wrong: GET and POST are brilliant – especially GET. Returning the GET and POST metadata back to the server with the URI is what makes the Web tick: it allows intermediaries to do smart caching stuff, based on the GET and POST metadata. But however good – GET and POST are not verbs, they are metadata.

The #referent Convention

Update: I learned from the TAG list that Dan Connolly already proposed using #it or #this for the same purpose, and  Tim Berners-Lee proposed using #i to refer to oneself in a similar way. My idea therefore was not very original, and since I regularly read the TAG list and similar sources, it’s even possible I read the idea somewhere and (much) later thought of it as one of my own – though if this happened it was certainly unintentional.

There is a very simple solution to the entire hash-versus-slash debate: whenever you would want to identify anything with a hashless URI, suffix it with #referent. The meaning of x#referent is: I identify whatever x is about. And x is simply an information resource (about x#referent).

The httpRange-14 debate is about what hashless URIs (without a #) refer to: can they refer only to documents (information resources), or to anything, i.e. persons or cars or concepts? Is it meaningful to say https://www.marcdegraauw.com/marcdegraauw/ refers to ‘Marc de Graauw’? Or does it now identify both a web page and a person, and is this meaningful and/or desirable?

Hash URIs aren’t thought to be much of a problem in this respect. They have some drawbacks however. It may be desirable to retrieve an entire information resource which describes what the referent of the URI is. And putting all identifiers in one large file makes the file large. Norman Walsh did this: http://norman.walsh.name/knows/who#norman-walsh identifies Norman Walsh, and the ‘who’ file got big. So Norm switched to hashless URIs: http://norman.walsh.name/knows/who/norman-walsh identifies Norman Walsh. The httpRange-14 solution requires Norm to answer a GET on this URI with a 303 redirect, in this case to http://norman.walsh.name/knows/who/norman-walsh.html, which does not identify Norman, but simply is an information resource.

If we use the #referent convention, I can say: https://www.marcdegraauw.com/marcdegraauw.html#referent identifies me. And https://www.marcdegraauw.com/marcdegraauw.html is simply an information resource, which is about me. Problem solved.
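A minimal sketch in RDF/XML of what Semantic Web software could then state – the foaf:Person typing is my own illustrative choice, not part of the convention:

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:foaf="http://xmlns.com/foaf/0.1/">
    <!-- The #referent URI identifies the person... -->
    <foaf:Person rdf:about="https://www.marcdegraauw.com/marcdegraauw.html#referent">
      <foaf:name>Marc de Graauw</foaf:name>
      <!-- ...while the hashless URI identifies the page describing him -->
      <foaf:isPrimaryTopicOf rdf:resource="https://www.marcdegraauw.com/marcdegraauw.html"/>
    </foaf:Person>
  </rdf:RDF>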

If I put https://www.marcdegraauw.com/marcdegraauw.html#referent in a browser, I will simply get the entire https://www.marcdegraauw.com/marcdegraauw.html resource, which is a human-readable resource about https://www.marcdegraauw.com/marcdegraauw.html#referent. Semantic Web software which understands the #referent convention will know https://www.marcdegraauw.com/marcdegraauw.html#referent refers to a non-information resource (except when web pages are about other web pages) and https://www.marcdegraauw.com/marcdegraauw.html is simply an information resource. Chances of collision of the #referent fragment identifier are very small (Semantic Web jokers who do this intentionally aside), and even in the case of collision with existing #referent fragment identifiers the collision seems pretty harmless. The only thing the #referent convention does not solve is all the existing hashless URIs out there which (are purported to) identify non-information resources.

In Semantic Web architecture, there is no need ever for hashless URIs. The #referent convention is easier, more explicit about what is meant, and retrieves a nice descriptive human-readable information resource in a browser, along with all necessary RDF metadata for Semantic Web applications.

Validation Considered Essential

I just ran into a disaster scenario which Mark Baker recently described as the way things should be: a new message exchange without schema validation. He writes: “If the message can be understood, then it should be processed” and in a comment “I say we just junk the practice of only processing ‘valid’ documents … and let the determination of obvious constraints … be done by the software responsible for processing that value.” I’ll show this is unworkable, undesirable and impossible (in that order).

I’ve got an application out there which reads XML sent to my customer. The XML format is terrible, and old – it predates XML Schema. So there is no schema, just an Excel file with “AN…10” style descriptions and value lists. Those constraints are built into my software, and that works pretty well – my code does the validation, and the incoming files are always processed fine.

Now a second party is going to send XML in the same format. Since there is no schema, we started testing in the obvious way – entering data on their website, exporting to XML, sending it over, importing it into my app, seeing what goes wrong, fixing, starting over. We have had an awful lot of those cycles so far, and no error-proof XML yet. Given a common schema, we could have had a decent start in the first cycle. Check unworkable.

So I wrote a RelaxNG schema for the XML. It turned out there were hidden errors which my software did not notice. For instance, there is a code field, and if it has some specific value, such as ‘D7’, my customer must be alerted immediately. My code checks for ‘D7’ and alerts the users if it comes in. The new sender sent ‘d7’ instead. My software did not see the ‘D7’ code and gave no signal. I wouldn’t have caught this so easily without the schema – I would have caught it in the final test rounds, but it is so much easier to catch those errors early, which schemas can do. Check undesirable.
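A sketch of the relevant part of such a RelaxNG schema – the element name and the second code value are made up for illustration; the point is that the value list is case-sensitive, so ‘d7’ is rejected while ‘D7’ passes:

  <element name="code" xmlns="http://relaxng.org/ns/structure/1.0">
    <choice>
      <value>D7</value>
      <value>A1</value>
    </choice>
  </element>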

Next, look at an element with value ‘01022007’. According to Mark, if it can be understood, it should be processed. And indeed I can enter ‘Feb 1, 2007’ in the database. Or did the programmer serialize the American MM/DD/YYYY format as MMDDYYYY, and is it ‘Jan 2, 2007’? Look at the value ‘100,001’ – perfectly understandable, one hundred thousand and one – or is this one hundred and one thousandth, with a decimal comma? Questions like that may not be common in an American context, but in Europe they arise all the time – on the continent we use DD-MM-YYYY dates and decimal commas, but given the amount of American software, MM/DD/YYYY dates and decimal points occur everywhere. The point is the values can apparently be understood, but aren’t in fact. One cannot catch those errors with processing logic because the values are perfectly acceptable to the software. Check impossible.
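This is exactly where agreed, typed schema declarations help. A sketch with hypothetical element names, using XML Schema’s built-in types: an xs:date must be written as 2007-02-01, and an xs:decimal uses a decimal point, so ambiguous values like ‘01022007’ or ‘100,001’ are rejected outright:

  <xs:element name="order-date" type="xs:date"/>   <!-- only 2007-02-01 is valid here -->
  <xs:element name="quantity" type="xs:decimal"/>  <!-- 100001 or 100.001, never '100,001' -->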

In exchanges, making agreements and checking whether those agreements are met is essential. Schemas wouldn’t always catch the third kind of error either, but they provide a way to avoid the misinterpretations. The schema is the common agreement – unless one prefers to fall back to prose – and once we have it, not using it for validation seems pointless. Mark Baker makes some good points on over-restrictive validation rules, but throws out the baby with the bathwater.