Design by Constraint – not as useful as people think (#3)
May 1, 2011
This post is part of the Design by Constraint series (first post)
Instances, Designed by Constraint
So let’s take a look at an instance. Here’s an example UML instance of data that conforms to the constrained model:
The first thing to say about the instance is that an instance of data is always described in terms of some definitional model. You don’t simply put the data down:
Ravi's Tandoori Fantasy, 3,
42 Main St Hicksville, Victoria, 3874
Greedy & Mean, Inc, 123-34-35
Ravi, M
Shivra, F
Bits of this might be recognizable, but it needs structure and information about the data items. And this is always given by reference to something elsewhere. Many people don’t understand this about XML, because it’s called “self-describing” - but it never is. It achieves its description by reference to some other model in the element and attribute names - the only question is whether the other model is implicit or explicit. And we know from long experience how well implicit agreements work.
Anyway, the point of interest here is that the data conforms to two different descriptions at once, and can be described either way - by the general reference model, or by the specific constrained model. The easiest way to illustrate this is using XML to represent the data. This first XML describes the data in terms of the general reference model:
<Restaurant>
<name>Ravi's Tandoori Fantasy</name>
<seats>3</seats>
<address>42 Main Street</address>
<address>Hicksville</address>
<address>Victoria 3874</address>
<person>
<role>ACC</role>
<name>Greedy &amp; Mean, Inc</name>
<taxId>123-34-35</taxId>
</person>
<person>
<role>EMP</role>
<name>Ravi</name>
<sex>M</sex>
</person>
<person>
<role>EMP</role>
<name>Shivra</name>
<sex>F</sex>
</person>
</Restaurant>
This second XML describes the data in terms of the specific constrained model:
<FamilyBusinessRestaurant>
<name>Ravi's Tandoori Fantasy</name>
<seats>3</seats>
<address>42 Main Street</address>
<address>Hicksville</address>
<address>Victoria 3874</address>
<accountant>
<name>Greedy &amp; Mean, Inc</name>
<businessId>123-34-35</businessId>
</accountant>
<family>
<name>Ravi</name>
<sex>M</sex>
</family>
<family>
<name>Shivra</name>
<sex>F</sex>
</family>
</FamilyBusinessRestaurant>
The data is the same in either case; the XML names are different, and there’s more information in the general one, whereas in the specific case, that information has been moved out of the instance into the context.
The models may be transformed from one to the other. Assuming full knowledge of the reference and constrained models, and the constraint transform, and that the instance conforms to the constraints, it’s always possible and straightforward to transform the specific instance to the general instance. On the other hand, it isn’t always possible to transform from the general model to a specific model. Firstly, it can take considerable computing power - you have to speculatively (and often recursively) determine which of a set of constraints an instance conforms to. Actually this is a regular feature of parsing instances, but the ones that arise from design by constraint tend to have deeper resolution points.
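As a rough illustration of the easy direction (specific to general), here is a minimal Python sketch. The `ELEMENT_MAP` table is a hypothetical stand-in for the constraint transform, inferred from the two XML samples above; it is not a real schema or tool, and it covers only the names used in this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraint transform: specific element name ->
# (general element name, role code to put back, or None).
ELEMENT_MAP = {
    "FamilyBusinessRestaurant": ("Restaurant", None),
    "accountant": ("person", "ACC"),
    "family": ("person", "EMP"),
    "businessId": ("taxId", None),
}


def to_general(specific: ET.Element) -> ET.Element:
    """Rebuild a general-model element from a specific-model element."""
    name, role = ELEMENT_MAP.get(specific.tag, (specific.tag, None))
    general = ET.Element(name)
    general.text = specific.text
    if role is not None:
        # The role was moved out of the instance into the context;
        # the transform puts it back into the instance.
        ET.SubElement(general, "role").text = role
    for child in specific:
        general.append(to_general(child))
    return general


SPECIFIC = """
<FamilyBusinessRestaurant>
  <name>Ravi's Tandoori Fantasy</name>
  <accountant>
    <name>Greedy &amp; Mean, Inc</name>
    <businessId>123-34-35</businessId>
  </accountant>
  <family>
    <name>Ravi</name>
    <sex>M</sex>
  </family>
</FamilyBusinessRestaurant>
"""

general = to_general(ET.fromstring(SPECIFIC))
print(ET.tostring(general, encoding="unicode"))
```

Each specific name maps to exactly one general name, so the transform is a simple table lookup. The reverse direction has no such table: a `person` could be an `accountant` or a `family` member, and something has to inspect the content to decide.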
In addition, it’s not always possible to unambiguously resolve which constraint an instance conforms to. A typical example of this is found in CDA:
In this model, a manufactured product (on the right) has an entity (on the left) that is either a “LabeledDrug” - a Manufactured material (MMAT), with optionally a code and a name, or a “Material” which is a Manufactured material (MMAT), with optionally a code, a name, and a lot number. This is not only an ambiguous model (how do you tell which one an instance is?), it is an example of semantics in the model that are not in the instance. So it’s a bad model - ambiguity usually means that - but it’s really difficult to eliminate/prevent these things (there’s some research work around on how to define such ambiguity and write tooling to prevent it).
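The ambiguity can be made concrete with a small Python sketch. The profile table below is a deliberately simplified, hypothetical rendering of the two constraints as described in the paragraph above; the field names (`code`, `name`, `lotNumber`) are illustrative and not taken from the CDA schema.

```python
# Hypothetical, simplified constraints: each profile lists the optional
# fields it allows on a manufactured material (class code MMAT).
PROFILES = {
    "LabeledDrug": {"code", "name"},
    "Material": {"code", "name", "lotNumber"},
}


def matching_profiles(class_code: str, fields: set[str]) -> list[str]:
    """Return every profile whose constraints the instance satisfies."""
    if class_code != "MMAT":
        return []
    return sorted(
        name for name, allowed in PROFILES.items() if fields <= allowed
    )


# An instance with a lot number can only be a Material.
print(matching_profiles("MMAT", {"code", "name", "lotNumber"}))  # ['Material']

# An instance with just a code and a name satisfies both constraints.
print(matching_profiles("MMAT", {"code", "name"}))  # ['LabeledDrug', 'Material']
```

When more than one profile comes back, nothing in the instance itself says which was meant.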
General vs Specific - which to use?
Which XML should you choose?
General Model:
- A single form is shared between all systems - even if they have narrow scopes
- This imposes greater costs across all implementations
- Narrow implementations will pay higher costs due to the increased capability of the general model - particularly when reading the instance
It’s worth emphasizing at this point - the general XML (see above) includes a whole lot of things that aren’t really pertinent to the particular case. They relate the specific case to the general case. The increased cost comes because everybody has to explicitly represent those mappings to the general case. For systems that deal with the general case - that’s a good thing. For systems that don’t - it’s a pure overhead, a tax. And the important question is, what do I get for my tax? (Except for some countries, where the mere fact that it’s a tax makes it bad ;-))
Specific Model:
- Different implementations will have different scopes / different use-cases, and therefore they’ll use different forms
- Unless the use-case is limited to a single scope (in which case the general model is spurious anyway), some transform engine will be needed somewhere in the mix
It’s worth noting about the specific model - an inevitable outcome of using the specific models will be that systems end up handling the same data again and again, but each time with a completely different engineering basis - even if they all look nearly the same. In the end, almost every system that encounters multiple use cases will start abstracting its internals away from the specific engineering towards the general case.
I think that if you handle 1-3 use cases, then specific is better for you. But around 4-5 cases, it becomes worth investing in the general approach. Mileage varies widely depending on complexity and team culture, of course.
But whichever model you choose, it will lead to transforms everywhere - as long as scopes differ, and analyses of requirements differ, there are going to be transforms. And transforms are expensive. Expensive to specify, develop, and execute. In addition, many system architectures don’t fit transforms easily.
(I don’t actually understand that bit. Do architects really think they won’t be doing transforms? How stupid would that be? But it’s true, I see it all the time. It’s actually an attempt to violate my second law, and externalise complexity away. I know it will fail eventually)
(Oh, and also, an alternative approach to solving this general problem is to indulge in the fantasy that there’s some general model that can be imposed on all applications so that this model mismatch doesn’t occur. This one violates my second law multiple ways, and will always fail in the end, but that doesn’t stop people from trying it out.)
There’s yet another option, which is to try to do both at once. Here’s an XML example:
<FamilyBusinessRestaurant type="Restaurant">
<name>Ravi's Tandoori Fantasy</name>
<seats>3</seats>
<address>42 Main Street</address>
<address>Hicksville</address>
<address>Victoria 3874</address>
<accountant type="Person" association="person">
<role>ACC</role>
<name>Greedy &amp; Mean, Inc</name>
<businessId source="taxId">123-34-35</businessId>
</accountant>
<family type="Person" association="person">
<role>EMP</role>
<name>Ravi</name>
<sex>M</sex>
</family>
<family type="Person" association="person">
<role>EMP</role>
<name>Shivra</name>
<sex>F</sex>
</family>
</FamilyBusinessRestaurant>
This example uses the specific model names, and adds the general case as markup. The same arguments would apply if we reversed that. And this is not just an XML thing - you could do the same using a UML instance diagram (or metaattributes in C# etc):
This approach is the worst of all worlds - the system that builds the instance must know the general model and all the specific models that may apply (because more than one may apply). And there’s no support for this kind of thing in CASE/MDA tooling (well, maybe a little). The only way to make it happen is custom tooling - which includes internal transforms, with external transforms here and there.
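To show what that custom tooling might involve on the reading side, here is a minimal Python sketch that recovers the general instance from the annotated form. It assumes a particular reading of the markup in the example above: `type` on the root gives the general class, `association` gives the general name of a child object, and `source` gives the general name of a renamed attribute. That interpretation is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

ANNOTATED = """
<FamilyBusinessRestaurant type="Restaurant">
  <name>Ravi's Tandoori Fantasy</name>
  <accountant type="Person" association="person">
    <role>ACC</role>
    <name>Greedy &amp; Mean, Inc</name>
    <businessId source="taxId">123-34-35</businessId>
  </accountant>
</FamilyBusinessRestaurant>
"""


def general_name(el: ET.Element, is_root: bool) -> str:
    """Pick the general-model name from the annotations, if any."""
    if is_root:
        return el.get("type", el.tag)
    # Assumed meaning: 'association' names a child object,
    # 'source' names a renamed attribute.
    return el.get("association") or el.get("source") or el.tag


def strip_to_general(el: ET.Element, is_root: bool = True) -> ET.Element:
    """Copy the tree using general names and drop the annotations."""
    out = ET.Element(general_name(el, is_root))
    out.text = el.text
    out.tail = el.tail
    for child in el:
        out.append(strip_to_general(child, is_root=False))
    return out


general = strip_to_general(ET.fromstring(ANNOTATED))
print(ET.tostring(general, encoding="unicode"))
```

This only covers a reader that wants the general form. The system that writes the annotated instance still has to know both models in order to produce the markup in the first place.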
Summary
Which brings you back to the conclusion:
Design by Constraint => Transforms everywhere
Next: what does this mean?