Interoperability Requirements #3: Identification Policies

May 4, 2011

As well as agreed meaning for concepts, if we are to exchange information about things, we have to be able to identify things. You can’t have any useful interoperability talking about “some patient” being admitted to “some hospital” by “some doctor” (well, politicians try to do it). You have to identify them in some systematic way. First, theory.

It’s necessary to be able to identify entities when we refer to them, and also to consistently identify the same entity over the course of a conversation. And it’s a lot harder than it sounds to do this well. The poster child for how hard identification can be is the problem of identifying people. Identifying a person correctly is a huge challenge. It’s not the only identification challenge in healthcare, but it dwarfs the others, it’s pervasive and it comes up in many different contexts. Other than people, ‘instances’ that need to be identified include episodes of care, institutions, care providers, specimens, accounts, order numbers, etc.

Generally, identifying one of these things is based on documenting one or more unique characteristics that can be used to differentiate between different instances. For instance, with a person, you can use a combination of biometric properties (appearance, fingerprint, etc.) and social characteristics (name, address, date of birth). All of the possibilities have some drawbacks, but are routinely used to identify people one way or another.

For other entities – such as containers, for instance – there’s just no intrinsic characteristic to differentiate between them. In lots of cases, there’s no need to – they’re all the same. But sometimes they are used in a particular way that has workflow ramifications later, and we need to be sure that it’s the same instance as earlier in the workflow. In these cases, some central controller issues each of the instances with a unique number which never changes, which is known as the “identifier”, and the unique number is imprinted on the instance one way or another. Database record primary keys are one such example.

The central controller must have procedures to ensure that the same number is not handed out more than once in a given scope of uniqueness, since two instances with the same number cannot be differentiated. Since there is no real way to control the scope in advance, it’s best that such a controlling authority issues globally unique numbers that will never be re-used by anybody else ever.
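To make the scope problem concrete, here is a minimal sketch of such a controller. Everything in it is an assumption for illustration: the class name IdentifierAuthority, the namespace strings, and the idea of prefixing a local sequence number with a namespace that the authority alone controls (in practice this might be a registered OID root or a domain name). A real authority would also persist its counter so that numbers are never re-issued after a restart.

```python
import itertools
import uuid


class IdentifierAuthority:
    """Hypothetical central controller issuing identifiers within one namespace.

    Global uniqueness rests on the assumption that no other authority
    ever uses the same namespace string.
    """

    def __init__(self, namespace: str):
        self.namespace = namespace
        self._counter = itertools.count(1)  # in-memory only; a real system would persist this
        self._issued = set()

    def issue(self) -> str:
        identifier = f"{self.namespace}.{next(self._counter)}"
        if identifier in self._issued:
            # Should be unreachable; guards the "never handed out twice" rule.
            raise RuntimeError(f"identifier {identifier} already issued")
        self._issued.add(identifier)
        return identifier


def issue_random_identifier() -> str:
    """Alternative with no coordination needed: a random UUID."""
    return str(uuid.uuid4())


lab = IdentifierAuthority("acme-lab")
pas = IdentifierAuthority("acme-pas")
specimen_id = lab.issue()  # "acme-lab.1"
episode_id = pas.issue()   # "acme-pas.1": same local number, different scope
```

Note that the two local numbers collide (both are 1), and it is only the namespace prefix that keeps the identifiers apart once they leave their original scope.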

Since examining instances to test their differentiating characteristics is both time consuming and error prone, it is normal to use the second identification approach once the initial identification is done: the instance is issued a unique number once enrolled which is then used to identify it from then on. Many systems will only encounter the instance in the context of a single conversation (just echo it back) and are only concerned with the secondary identifier.

In almost all cases where identifiers are assigned, a registry is kept with details of how each identifier was assigned. This registry becomes a valuable resource for any business process that involves these identified entities – it makes information about them available, may provide assistance in cases of doubt over the identification, and allows multiple different but related business processes and information systems to share the identification amongst themselves. For this reason, registry interoperability is always one of the first problems to be resolved.

Even with the presence of identification registries, it is still necessary for systems that interoperate to share a common policy about how things are identified, when registries are consulted, and how they themselves are identified. When everything is going to plan, such policies are redundant. But when things go wrong, that’s when the trouble starts, and interconnected systems with poorly documented or inconsistent identification policies are going to be in trouble.

Managing Errors - or Not

The classic example here concerns patient duplication and/or episodes created or deleted in error. The following scenario is all too familiar to experienced interoperability practitioners.

Acme Hospital Inc has engaged NewStartUp, LLC to develop a custom application for managing the long-term health of its renal transplant patients. One of the primary interfaces the new application will have is a feed of clinical laboratory reports. The clinical laboratory system vendor is EverJustOk Inc. The hospital project manager has called a meeting between the renal physicians, the laboratory, the hospital information administrators, and the two vendors, with an expert interfacing consultant to assist with the process.

All is going well. The two companies have agreed that they will use HL7 v2 messages, and they have agreed how they will be populating the PID segment: they are going to use just the hospital MRN, since both are tightly coupled with the hospital patient administration system (PAS). In addition, the PID segment will have surname and DOB as a sanity check, though no one is tasked to figure out whose sanity this might save, and how. Then the interfacing consultant asks the two vendors to discuss the effect of patient merges on the interface.
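For what it’s worth, here is one thing a receiver could do with that sanity check. This is only a sketch under assumptions: the function names, the LocalPatient record and the result strings are all made up for illustration, and the parsing is deliberately naive (it assumes a single identifier in PID-3 with the MRN as its first component, the family name as the first component of PID-5, and the date of birth in PID-7, with no repetitions or escape sequences).

```python
from dataclasses import dataclass


@dataclass
class LocalPatient:
    """Hypothetical record held by the receiving system."""
    mrn: str
    surname: str
    dob: str  # YYYYMMDD, as carried in PID-7


def parse_pid(segment: str) -> dict:
    """Pull MRN, surname and DOB out of a PID segment (naive split on | and ^)."""
    fields = segment.split("|") + [""] * 8  # pad so short segments don't raise
    return {
        "mrn": fields[3].split("^")[0],      # PID-3, first component
        "surname": fields[5].split("^")[0],  # PID-5, family name
        "dob": fields[7][:8],                # PID-7, date part only
    }


def check_identity(segment: str, known: dict) -> str:
    """Return 'ok', 'unknown-mrn' or 'mismatch' for an incoming PID segment."""
    pid = parse_pid(segment)
    patient = known.get(pid["mrn"])
    if patient is None:
        return "unknown-mrn"
    same_name = patient.surname.upper() == pid["surname"].upper()
    same_dob = patient.dob == pid["dob"]
    return "ok" if same_name and same_dob else "mismatch"


known_patients = {"12345": LocalPatient("12345", "Smith", "19600101")}
result = check_identity("PID|1||12345^^^ACME^MR||JONES^MARY||19600101|F", known_patients)
# result is "mismatch": same MRN, different surname
```

The code only detects the disagreement. What the receiver should do about a "mismatch" (reject, hold for human review, trust the MRN anyway) is exactly the policy question the meeting never assigned to anyone.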

While NewStartUp is trying to cover their ignorance (patient merges? what patient merges?), everyone is focused on the rather sheepish look of the laboratory system administrator, who explains that they’ve been ignoring PAS patient merges for a couple of years since they’re so often wrong, and they’ve been doing their own merges based on their own interaction with the patients. At this point the meeting blows up, since the hospital information managers are incensed that the laboratory has been trashing their databases like that – and insulting their professionalism as well.

What happens next varies. In some institutions, the project is over, and will never be discussed again. In others, everyone collectively draws breath, and moves on to the next subject. Patient merges will never be discussed again. In others, a separate committee is created that will give the subject long and thoughtful consideration, and tinker with the procedures, but probably nothing will change, because usually the price of changing things is worse than the cost of the current problems. A very small percentage of times, the problem is fixed up. But usually the vendors will need to build in workarounds for these problems in their interfaces.

Here’s another example for the kind of problems that can occur with regard to identifying objects.

Acme Hospital has just replaced one patient administration system (PAS) with another. Because Acme Hospital has many different systems receiving HL7 v2 admission and discharge notifications from the PAS, and it cannot afford to pay each of the vendors to make changes to their systems (not that many of them would be able to do so in time anyway), it asks the vendor of the new system to ensure that the HL7 message feed is identical to the old system. The vendor examines the specification of the messages (a document that is now ten years old) and a sample of the current messages, and modifies its message generation configuration to make the HL7 fields of its outgoing HL7 messages exactly the same.

The project manager would like to test the new messages, but that’s just not possible. So everyone crosses their fingers and goes with the upgrade. For the first few days, everything is good, and everyone signs off on the upgrade. However by the end of the first week, it’s clear that there’s a major and growing problem: many of the systems in the hospital are confused about the location of a few patients in most of the wards in the hospital.

Eventually, after much finger pointing and many furious arguments, one of the vendors pin-points the problem. While the new and old messages are field for field identical, the way the new PAS handles episode identification is different to the way the old one did. The old PAS gave episodes a sequential number starting at 1 for each patient. So patient 34 would have episode 34-1, then 34-2, etc. When merging patients, the old PAS would issue an A18 (merge patient) message, followed by a series of A08 (update episode) to advise of the new renumbered episode from the two merged records. Receiving systems would use the patient id and the existing episode date details to update the identifier of the episode.

The new system issues each episode a unique identifier. The episodes are not renumbered when patient records are merged. But when an episode is added to the wrong patient, instead of canceling the episode and creating a new one, it issues an A08 (update episode) to advise that the episode now belongs to a new patient. Receiving systems have to process this differently – instead of using the patient and updating the episode id, they should use the episode id and update the patient id. But of course, they are still processing them the old way.
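The difference between the two ways of processing the same update can be sketched as follows. This is an illustration only: the dictionary layout, the field names and the two function names are assumptions, not anything taken from either PAS or from the receiving systems.

```python
def apply_update_old_style(episodes, patient_id, admit_date, new_episode_id):
    """Old PAS semantics: find the episode by patient + admit date, update its id."""
    for episode in episodes:
        if episode["patient_id"] == patient_id and episode["admit_date"] == admit_date:
            episode["episode_id"] = new_episode_id
            return True
    return False  # nothing matched; the update is silently lost


def apply_update_new_style(episodes, episode_id, new_patient_id):
    """New PAS semantics: find the episode by its id, update the patient it belongs to."""
    for episode in episodes:
        if episode["episode_id"] == episode_id:
            episode["patient_id"] = new_patient_id
            return True
    return False


# Episode E100 was added to patient 34 in error; the new PAS now says it belongs to patient 35.
receiver_a = [{"episode_id": "E100", "patient_id": "34", "admit_date": "20110501"}]
receiver_b = [{"episode_id": "E100", "patient_id": "34", "admit_date": "20110501"}]

# A receiver still using the old logic looks for an episode of patient 35 and finds none.
old_result = apply_update_old_style(receiver_a, "35", "20110501", "E100")

# A receiver using the new logic keys on the episode id and moves the episode.
new_result = apply_update_new_style(receiver_b, "E100", "35")
```

In this sketch the old-style receiver simply fails to match and leaves the episode on patient 34. If patient 35 happened to have an episode with the same dates, it would do something worse and relabel the wrong episode. Both messages are field for field valid, which is why nobody saw it coming.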

It’s one thing to realize the problem, but what now? A number of vendors must fix their interfaces, something that generally doesn’t happen too quickly, and someone must somehow try to repair the databases without doing more damage. Ouch.

It’s easy to laugh at this case, but it’s not obvious that dealing with all the problems afterwards would cost more than testing for this case in advance.

Both of these examples are based on real stories. Of all the interoperability problems I have experienced, identification-based ones are the hardest to resolve. I estimate that about a third of my effort in getting interfaces to work is spent on getting identifiers sorted out. There’s a lot more to say about identifiers and identification (and merging), but that will have to be deferred to later posts.

Of my 6 requirements for interoperability, I think HL7 does worst at identifiers. We generally let each object have a series of identifiers, and assume that the implementers are going to figure it out. To a degree that’s all that we can do, given such wide business variation out there. A variety of HL7 specifications take a variety of approaches. Specifications dealing with identification per se (EIS, registry profiles etc.) do a lot better than more general specifications. I think we need an ontology of identifiers, and policies, procedures and patterns to follow when writing specifications. This way our specifications will be more consistent, and it will be easier for implementers to understand each other’s policies. Note: DICOM is more prescriptive about identification, which is good for interoperability within an enterprise, but rather limiting when expanding across enterprises. And IHE has done quite a lot of work on cross-enterprise identification.