CIMI at the Crossroads

Mar 15, 2012

The Clinical Information Modelling Initiative (CIMI, see here, and here) is

“an international collaboration that is dedicated to providing a common format for detailed specifications for the representation of health information content so that semantically interoperable information may be created and shared in health records, messages and documents”

CIMI is one of a number of efforts that have been started to try and define a common format for such specifications; all the previous efforts (mostly going by the name of DCM, “detailed clinical models”) have gotten bogged down in methodology questions and political games of various sorts, and they’ve failed to produce something that people might actually use.

CIMI shows every sign of following the same trail to the same dead end.

From the beginning, the CIMI initiative sought to produce a different outcome from previous efforts by trying to be agnostic on the tribal and political issues that have bedeviled the previous efforts. In particular:

The membership of CIMI included all the significant players in the space, not only some of them
The charter always included CIMI providing the capability to express the clinical models in a series of different formalisms (i.e. XML, Java, HL7 v2, EN13606, CDA, openEHR etc) by the provision of some “compiler”

The membership point was really new – and for the first time there was real hope that something might come from this. The first task for CIMI was to choose an internal methodology that would be used as the primary expression of the models. The initiative held a meeting in London in Nov 2011 to choose between the following candidate approaches:

UML/OCL and associated OMG standards
13606-2/ADL 1.4
ADL 1.5 (http://www.openEHR.org)
Semantic Web technology (OWL, RDF, Protégé, and associated tools and standards)
HL7 v3 approach (MIF, HL7 RIM, static models and associated artifacts and tools)

In spite of the fact that these things are not at all alike, a comparison was performed, and the group decided… well, let’s quote from the press release:

ADL 1.5 will be the initial formalism for representing clinical models in the repository.
CIMI will use the openEHR constraint model (Archetype Object Model:AOM).
A set of UML stereotypes, XMI specifications and transformations will be concurrently developed using UML 2.0 and OCL as the constraint language.
A Work Plan for how the AOM and target reference models will be maintained and updated will be developed and approved by the end of January 2012.

In other words, the group chose AOM/ADL, but it seems to me it was unable to get full consensus, hence the mention of UML/OCL. Note that the exact relationship between ADL 1.5 and UML is not spelled out.

Well, January 2012 has passed, and there is no work plan – because there still doesn’t seem to be any consensus about the methodology, let alone the reference model. As far as I can tell, the participants who favour UML/OCL have continued on as if ADL/AOM wasn’t the initial formalism. The follow-up meeting in San Antonio in Jan 2012 was characterised by continued argument about UML vs ADL. CIMI still doesn’t have consensus about the stuff already decided, let alone the hard stuff to come.

I’ve been an interested observer to CIMI from the beginning – it’s a great goal that we really need to see solved, the best group of people that we’ve got together on this subject, and there was real hope. Due to resource constraints, I’ve never been a formal member of the initiative, but I have attended the CIMI meetings and teleconferences whenever possible. But it’s never seemed to me that the participants are being realistic.

The core problem is reaching a compromise. This was obviously going to be difficult here – many of the participants at CIMI have many millions invested in their systems, and I never could see how CIMI would avoid the outcome I described:

…build a complicated framework that allows both solutions to be described within the single paradigm, as if there isn’t actually contention that needs to be resolved, or that this will somehow resolve it. This is expensive – but not valuable; it’s just substituting real progress with the appearance thereof.

As you can see, CIMI is well on the way to building a complicated framework, and providing only the appearance of progress.

For me, this was underscored by the decision to choose ADL/AOM as the methodology, while deferring the choice of reference model. While I understood the political reality of this decision, choosing an existing methodology (ADL/AOM) but not the openEHR reference model committed CIMI to building at least a new tooling chain, a new community, and possibly a new reference model.

Each of these is spectacularly hard and expensive. At a minimum, using semi-volunteer labour of loving experts who are building their own empire, you have to estimate the cost of tooling at greater than $2M (and doing it on a straight commercial basis, upwards of $6M). Reference models take years – as in, a decade – to build, and the blood, sweat and tears of many people. This also equates to millions of dollars one way or another. Building a community around a methodology and tool-chain is the same. So CIMI committed itself to these kinds of expenditures of $$, energy and ego, but I can’t think that any of the participants really thought that CIMI can actually call on those resources before it produces anything of value.

As for UML, the plan called for “a set of UML stereotypes, XMI specifications and transformations” – this is the same error. The point of UML is that the average implementer knows how to make it work, and has tools that can leverage the models. Each stereotype you define erodes that advantage, and as soon as you define a really important stereotype – and why bother if it’s not? – then off-the-shelf tools can no longer be used. As for developing XMI specifications… who’s going to support that? This is known as “snatching defeat from the jaws of victory”.

I can’t see that CIMI is on a path to producing anything, let alone a methodology that people will be happy to use, offered the choice.

So what should CIMI do? As I see it, there are two pragmatic choices. CIMI needs to pick one, or accept that it’s never going to reach consensus with the resources available:

OpenEHR

That’s right. Just bite the bullet and pick the whole openEHR stack. They’ve got a reference model. They’ve got tooling – the archetype designer (open source!) and the CKM. They’ve got a community (using the CKM). They’ve got runs on the board with published models. It’s there, waiting to go.

I recognise that simply picking openEHR holus-bolus like this is extremely distasteful to many people. OpenEHR is still missing a few things across the stack, and the reference model is too EHR-specific rather than being a general clinical model – and it seems unlikely that CIMI has the resources to change these things, so we’d just have to live with the way it is and work with them. And of course, there’s a series of personal and political factors.

This is the first choice: pick the least worst established clinical modelling paradigm.

UML

The second option is to abandon any hope of a clinical-friendly modelling tool, and bite the bullet by adopting UML. This is the IT-centric solution. But if you’re going to do this, do it as simply as possible. No fancy stereotypes. In fact, no enforcing of a reference model (it’s the reference model that complicates everything). Given the fragility of the UML tools (i.e. total lack of interoperability between tools), CIMI should ban anything other than classes, attributes, and associations. No stereotypes, no properties, no profiles. That’d mean a lot of missing functionality – but we’d just have to live with that and work around it.

The real price of this isn’t that UML isn’t clinical-friendly, it’s the reference model. Given the cost of creating a reference model, and the fact the existing reference models aren’t created to be used in such a brutally simplistic way, this approach involves abandoning a serious reference model – and that’s exactly what some of the participants want, not understanding what it is that a reference model achieves.
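To make the “classes, attributes, and associations only” rule concrete, here’s a rough sketch of the kind of check it implies. Everything in it is made up for illustration: the `Element` structure, the `disallowed` function and the `BloodPressure` example are my assumptions, not anything CIMI or any UML tool actually defines.

```python
from dataclasses import dataclass, field

# Hypothetical whitelist: the only UML constructs the simple option would permit.
ALLOWED_KINDS = {"class", "attribute", "association"}


@dataclass
class Element:
    """A toy stand-in for a model element (not a real UML or XMI structure)."""
    kind: str   # e.g. "class", "attribute", "association", "stereotype"
    name: str
    children: list = field(default_factory=list)


def disallowed(element):
    """Return the names of all elements whose kind is outside the whitelist."""
    found = [] if element.kind in ALLOWED_KINDS else [element.name]
    for child in element.children:
        found.extend(disallowed(child))
    return found


# Illustrative model only; the content is invented for this example.
model = Element("class", "BloodPressure", [
    Element("attribute", "systolic"),
    Element("attribute", "diastolic"),
    Element("stereotype", "Observation"),
    Element("association", "device"),
])

print(disallowed(model))  # ['Observation']
```

A model that passes a check like this uses only constructs that ordinary UML tools already handle, which is the whole point of the option. It also shows what you give up: anything a stereotype would have carried has to live somewhere else.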

Hybrid Models

From the beginning, CIMI has wanted to explore the auto-generation of multiple formats – CDA templates, v2, openEHR, java, xml, etc. Java, various forms of XML - that makes perfect sense. But the others? I spend quite a bit of time converting models and/or instances between the CDA, v2, and openEHR worlds, and they’re not just alternative syntaxes – they have completely different ways of understanding the world (or not, for v2). Real human input is required to effect these transforms. In the end, any auto-generation facility would become a transparent syntax conversion layer, and the CIMI models would have to contain the expression of the model in each of the target formalisms. It’s hard enough to model against one paradigm, let alone all 3 (or more). Whatever CIMI produced from this path would be a methodology that very few uber-experts could make use of. This isn’t an option.

Neither of the two choices is really palatable. But what other practical choices exist? CIMI is at a crossroads – it needs to pick something that will work.

p.s. Actually, several people have pointed out to me that FHIR might be a logical choice for CIMI – but FHIR’s got the slight problem that it doesn’t actually exist yet, so I’m not going to push that forward for CIMI. Yet.