Design by Constraint – not as useful as people think (#4)
May 1, 2011
This post is the last part of the Design by Constraint series (first post). In the last 3 posts, I’ve described “Design By Constraint”, and pointed out that one inevitable outcome of design by constraint is that there will be transforms everywhere.
And I said:
The inevitable outcome of Design By Constraint is implementation chaos
Before I go on to explain why I think that, I just want to recap on some of the uses of Design by Constraint:
HL7 v3
- Reference Model is The RIM + Data Types + Structural Vocab (Note that the RIM has some additional patterns that are orthogonal to Design By Constraint, particularly the Act based grammar, and the structural vocabulary layer. I’ll take them up in other posts)
- Static models as constraints. The static models are constraint specifications masquerading as constrained models - but they are syntactically equivalent to ADL. Static models may derive from other static models and in effect stack up
- The actual transform to constrained class model is executed by the schema generator, which produces schema that are in effect constrained class models
- In addition, there is the RIM serialisation, which is the normative XML form for the reference model
openEHR
- The Reference model is “the reference model”. The reference model is both more concrete and more abstract than the RIM - it explicitly represents a logical EHR, and then it has open clusters/elements for actual data that carry no semantics at all
- There is a canonical XML representation for instances of data that conform to the reference model
- ADL is used to describe constraints in the reference model in “archetypes”. The archetypes can derive from each other and stack up.
- There is also the template layer that uses ADL to describe constraints that are applied as transforms to produce constrained class models that are represented as schema. I’m sure they could also be represented as UML PIMs, just like HL7 could choose to do as well
CDA
- The reference model is the CDA schema. No actual UML diagram is widely distributed, though one could be defined (and I think I’ve seen one, though there are some distinct challenges in the data types)
- The constraints are published as implementation guides with English-language statements and schematron constraints.
- The wire format is the reference model - the single CDA format
- There is intense interest in using “greenCDA” - these are in effect constrained class models after applying a transform based on constraints
BRIDG
- In the beginning BRIDG was meant to be a PIM, and constraints on the BRIDG were not expected to happen
- Now BRIDG is starting to be seen more and more as a conceptual model, which is constrained for particular use (at least, that’s what I see in private communications with CDISC/NCI people)
- As soon as someone says, let’s make these actual uses formal constraints on the BRIDG model, then the BRIDG eco-system will fully conform to “Design by Constraint”
That’s enough for now. These systems are all similar in concept. But they differ wildly in:
- variations in the description and presentation of the overall approach
- variations in the details of how things are actually done
- variations in the choices of technologies for the different pieces
- the focus and balance of the communities that adopt them
But in spite of this, in spite of the fact that these variations mean the commonalities are not recognised, they are all variations on the one theme. And they all suffer from the problem of engineering the solution: whatever you do, you have to transform from general to specific, and back.
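To make the general <-> specific round trip concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: `Element` stands in for a generic reference model node, `BloodPressure` for a constrained class model, and `BP_CONSTRAINT` for the constraint specification that maps one onto the other. None of these names come from HL7, openEHR or CDA.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical generic reference-model node: a code plus a value or children."""
    code: str
    value: object = None
    children: list = field(default_factory=list)


@dataclass
class BloodPressure:
    """Hypothetical constrained class that an application would rather work with."""
    systolic: int
    diastolic: int


# Hypothetical constraint: the required root code, and which child code
# feeds which attribute of the specific class.
BP_CONSTRAINT = {
    "root": "blood-pressure",
    "fields": {"systolic": "sys", "diastolic": "dia"},
}


def to_specific(node: Element) -> BloodPressure:
    """General -> specific: check the constraint, then project onto the class."""
    if node.code != BP_CONSTRAINT["root"]:
        raise ValueError(f"expected {BP_CONSTRAINT['root']!r}, got {node.code!r}")
    by_code = {child.code: child.value for child in node.children}
    missing = [c for c in BP_CONSTRAINT["fields"].values() if c not in by_code]
    if missing:
        raise ValueError(f"instance violates constraint, missing: {missing}")
    return BloodPressure(
        **{attr: by_code[code] for attr, code in BP_CONSTRAINT["fields"].items()}
    )


def to_general(bp: BloodPressure) -> Element:
    """Specific -> general: rebuild the generic tree from the class."""
    return Element(
        code=BP_CONSTRAINT["root"],
        children=[
            Element(code=code, value=getattr(bp, attr))
            for attr, code in BP_CONSTRAINT["fields"].items()
        ],
    )


generic = Element("blood-pressure", children=[Element("sys", 120), Element("dia", 80)])
specific = to_specific(generic)
round_tripped = to_general(specific)
```

Even in this toy version, somebody has to write and maintain both `to_specific` and `to_general` for every constraint, and the question of who does that is exactly where the costs get moved around.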
Implementation Disasters
The inevitable outcome of wide scale adoption of this technique is chaos. Different implementors want to live at different points on the general <-> specific curve, and there’s a variety of options to attempt to externalise costs from one implementor to another.
There are various approaches to handling this. You can be like HL7: put your head in the sand, claim that it all works (in other words, externalise the costs), and then be real confused about why your brilliant idea isn’t actually solving all the problems in the world.
And this happens precisely because if you buy into the whole notion - learn the ways of the master model, and build a specific software stack to deal with the products - then it actually works pretty well. Please note this: **If you embrace the model, the outcomes are solid**. And there are many people doing so (HL7 JavaSIG/RIMBAA particularly). (Or, to express it differently, if you invest where the costs have been externalised to, then you’ll eventually benefit from the savings that accrue where they were taken from.)
But when you treat the design by constraint framework as an interoperability specification to be taken up by projects and/or standards that are going to be implemented across a multitude of applications that just want to use the standards - then they’re just going to feel the pain of the externalised costs, and they’ll never really derive the benefits of the outcome. And the politics will eventually overwhelm you.
People go on and on about the structured vocab, and acts, and various ontological features of the RIM. But I reckon that 90+% of all pushback to v3 implementations that I’ve seen is related to the costs of design by constraint, and what it does to the XML (i.e. instance engineering problems). HL7 externalised those costs very effectively. And that’s an own goal. (And note that the discussion around UEL and PCAST vs CDA walks into things like XML, OIDs in XML, and so forth - all things that arise from the way the CDA community does design by constraint).
openEHR walks around this by being explicit about the costs, and not being adopted as a standard. You expect to be writing specific software to make it work, or you only deal with the constrained models that come out the end. Note, again, that this is about externalised costs. Either you pay the cost - learn the reference model, invest in tools and software - or openEHR internalises the costs by handling all the transforms privately (only I think that this approach will be a problem in the long term - implementors are stuck on the wrong side of the power of the reference model).
But if a large project or a national standard turned around and picked openEHR instead of v3 - well, it’s not going to be any different. Design By Constraint will lead to chaos.
OMG, come save us?
I think that this problem really needs OMG. As I pointed out earlier, this is really a case of design by contract - all we want is sophisticated contracts, and OMG has only provided really crude tools to do this.
Actually, what we are trying to do can be explained differently. UML defines Class diagrams and Instance diagrams. Class diagrams define a set of possible instances by defining their classes and the possible value domains they can have. Instance diagrams define a particular instance of data in terms of the class diagram. We want something in between - a diagram that describes a possible set of instances without taking ownership of their types. Easier said than done, though.
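One way to picture that "something in between" is a constraint object that picks out a set of instances of an existing class, but never defines a class of its own. The sketch below is only an illustration under that assumption; `Quantity`, `Constraint`, `admits` and the weight ranges are all made up for the example and are not taken from ADL, UML or any HL7 artefact.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quantity:
    """The one and only type (standing in for a reference-model class)."""
    value: float
    unit: str


@dataclass(frozen=True)
class Constraint:
    """Describes a set of Quantity instances without introducing a new type.

    A constraint may derive from a parent, so constraints can stack up:
    an instance is admitted only if every ancestor admits it too.
    """
    name: str
    units: frozenset
    low: float
    high: float
    parent: Optional["Constraint"] = None

    def admits(self, q: Quantity) -> bool:
        if self.parent is not None and not self.parent.admits(q):
            return False
        return q.unit in self.units and self.low <= q.value <= self.high


# Illustrative ranges only, not clinical guidance.
body_weight = Constraint("body weight", frozenset({"kg"}), 0.0, 500.0)
neonatal_weight = Constraint(
    "neonatal weight", frozenset({"kg"}), 0.3, 8.0, parent=body_weight
)

baby = Quantity(3.2, "kg")
adult = Quantity(70.0, "kg")

baby_ok = neonatal_weight.admits(baby)
adult_ok = neonatal_weight.admits(adult)
```

Note that `baby` is still just a `Quantity`: the constraint says which instances are acceptable, but the type stays with the reference model. That is the part that sits awkwardly between a class diagram and an instance diagram.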
When I presented this content (the whole series of 4 blog posts) to some of the UML maintainers at an OMG meeting, their eyes glazed over, and they looked at me like I was an idiot. That’s crazy, they said: you’ll never get engineering continuity and coherence doing this.
Yeah, I agree. But semantic continuity and coherence - how are you going to get that? (That’s where we started, and what we’re trying to achieve.) Well, I came away with the conclusion that engineers aren’t overly concerned about it. Sure, they get the notion of more sophisticated leverageable contracts in design by contract (and they had a good look at ADL too, with interest). But that’s only half my story.
I don’t know: how much is semantic consistency worth? That’s a subject I’ll take up in a later post.
In the meantime, I wish that the downstream price of Design By Constraint was better understood by the people doing it, and by the large scale projects that adopt it. It’s not that it’s a wrong thing to do - but you have to know where the costs have gone.