Semantics and behavior

Larry Masinter believes there is a risk in making ‘the “meaning” of [a] language depend on operational behavior’.

In discussions like this there should be a sharp distinction between natural and computer languages. Larry says meaning ‘depends on what the speaker intends and how the listener will interpret the utterance’. That’s a valid viewpoint for natural languages, but for markup (HTML, XML etc.) or programming languages it really stretches the definitions. There is no speaker, and there are no direct intentions, unless software has intentions. For markup document instances, there are usually three indirect intentions: those of the language designer, the software builder and the end-user producing the instance. I’m not sure an approach based on intentions is workable for computer languages at all: whose intention does the document reflect? How do we know the intentions of end-user, software builder and language designer do not conflict? How do we know their intentions at all?

Besides, semantics of this kind can be described in prose or in first-order logic. I don’t think first-order logic is appropriate for most language design (it’s not human-readable, only logician-readable), and prose may be fine but is often inexact. I spoke at length at Balisage 2008 on this issue, and discussed it with David Orchard and others, and I believe an operational approach is often appropriate for computer languages. Why? For natural language semantics it certainly is problematic: there may be an utterance, and a meaning, without any behavior. I can say ‘North Korea tested an A-bomb’, and that has a meaning, even if no-one exhibits any behavior whatsoever after my words. For computer languages, it’s different. It’s possible to describe operational behavior as testable conditions: if I send a system test message A, it should do X. And that makes operational semantics a nearly perfect fit for computer languages, because the semantics can be tested with a test suite. That’s enough. No need for real-time behavior after receiving each message.
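To make the "test message A, expected behavior X" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the message names, the behaviors and the two toy systems are invented for illustration and do not come from any real specification.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Case:
    message: str   # test message sent to the system ("A")
    expected: str  # behavior the specification requires ("X")


# Hypothetical conformance suite: the meaning of each message is pinned
# down by the behavior a conforming system must show in response.
SUITE = [
    Case("ORDER", "send_invoice"),
    Case("CANCEL", "void_order"),
]


def conforms(system: Callable[[str], str], suite: List[Case] = SUITE) -> List[Case]:
    """Return the cases the system fails; an empty list means it passes."""
    return [case for case in suite if system(case.message) != case.expected]


def good_system(message: str) -> str:
    """Toy implementation that follows the (invented) specification."""
    return {"ORDER": "send_invoice", "CANCEL": "void_order"}.get(message, "reject")


def bad_system(message: str) -> str:
    """Toy implementation that sends an invoice whatever it receives."""
    return "send_invoice"
```

Calling `conforms(good_system)` yields an empty list, while `conforms(bad_system)` reports the `CANCEL` case. Note that the suite says nothing about intentions; it only checks what the system does.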

Larry writes: ‘…standards organizations are in the business of defining languages … and not in … telling organizations and participants … how they are supposed to behave’. That’s far too lenient. If I render text marked <b> as italic, and <i> as bold, I’m not following your spec. Period. If I implement a language specification, and claim conformance, I’m not free to do as I choose. This is even more true in other contexts: business, healthcare, insurance. There, organizations exchanging documents sign contracts, and are legally bound to do certain things upon receiving documents, such as paying after ordering. Larry’s free-for-all approach to language definitions does not apply to the real world. It is true that behavior should not be constrained any further than necessary, but without behavioral consequences, document exchange is meaningless.

Is RIMBAA a Mistake?

HL7v3 is a framework for developing messages in healthcare. Unlike its predecessor, v2, HL7v3 has at its core an information model, the RIM (Reference Information Model). The RIM contains classes and relations. From the RIM actual healthcare messages are derived, usually as XML instances. This is what HL7v3 does in a nutshell, and its focus is clearly on messaging, nothing else.

Recently a new school of thought has been gaining ground: RIMBAA (RIM-Based Application Architectures). RIMBAA seeks to use the RIM not solely as the basis for messages, but as the basis for an entire EHR (Electronic Health Record) system. The RIM, after all, is quite rich, and if it is rich enough to describe the messages which care providers exchange among themselves, why not use it to describe the data contained in an EHR itself? This would also make deriving messages from my EHR a piece of cake.

Here’s why not.

Suppose I build an EHR from a certain HL7v3 version X (there are plenty of versions, called “ballots”, to choose from) and I also exchange messages with my colleagues using version X. If we now decide to upgrade the messages to version Y, I’m forced to do a double update: I have to upgrade not only my messaging framework to version Y, I also have to upgrade my entire EHR to version Y.

RIMBAA thus leads to tightly coupled systems. In a loosely coupled architecture, systems are black boxes: each system just has to know the interface (messages) of another system to communicate. In loosely coupled systems, each system can be upgraded or changed independently of other systems, as long as the interfaces remain unchanged. Loose coupling is a core design principle of Internet-scale messaging, and RIMBAA violates it.
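A rough sketch of what loose coupling looks like in code, assuming a made-up internal record and two made-up message versions X and Y. None of the field names or structures below are real HL7v3 artifacts; they only illustrate that the internal model and the wire format can change separately.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PatientRecord:
    """Internal EHR model (hypothetical); free to evolve on its own."""
    family_name: str
    given_name: str
    birth_year: int


def to_message_x(record: PatientRecord) -> dict:
    """Adapter for the hypothetical message version X."""
    return {
        "version": "X",
        "name": f"{record.given_name} {record.family_name}",
        "born": record.birth_year,
    }


def to_message_y(record: PatientRecord) -> dict:
    """Adapter for the hypothetical message version Y.

    Upgrading from X to Y means adding this function; PatientRecord
    and the rest of the EHR stay untouched.
    """
    return {
        "version": "Y",
        "name": {"family": record.family_name, "given": record.given_name},
        "birthYear": record.birth_year,
    }


# The only place that knows which message versions exist.
ADAPTERS: Dict[str, Callable[[PatientRecord], dict]] = {
    "X": to_message_x,
    "Y": to_message_y,
}


def export(record: PatientRecord, version: str) -> dict:
    """Serialize an internal record for the requested message version."""
    return ADAPTERS[version](record)
```

With this split, `export(record, "X")` and `export(record, "Y")` produce different messages from the same internal record. In a RIM-based EHR, by contrast, `PatientRecord` itself would be tied to version X and would have to change along with the messages.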

Moreover, if RIMBAA gains wide acceptance, a majority of EHRs would become RIM-based, and thus EHRs would be very alike. Good, no? Since they’re so alike, won’t it be easy to communicate? Nope. If all, or many, EHRs follow the RIM, there will be less competition, which will stifle innovation. Having EHRs which are not based on the RIM enables healthcare developers to adopt any wild, new, unthought-of innovation they wish, as long as they keep supporting the common messages. This diversity allows the full creativity of healthcare providers to be expressed in their systems, and is good for innovation and competition, major drivers of human progress.

With RIMBAA, it’s either taking what the RIM offers, or leaving RIMBAA. The latter is probably the best choice: take some inspiration from the RIM where useful, develop your EHR, and forget there ever was a link. I’ve seen a lot of efforts to harmonize data models, and they seldom work on a large scale, not even within a single large company. Different needs are simply too different. For messaging, a common data model is a necessity. For application architectures, it is a commodity at best and a nuisance at worst.

RIMBAA (and similar initiatives), in short, will lead to systems which are hard to upgrade, will stifle innovation and will hinder progress. It is much better to follow HL7v3’s original course and keep standards for messaging separate from standards for EHRs. RIMBAA violates the fundamental design pattern of loose coupling and is a mistake.