Transforming Pathology PIT reports into NEHTA Clinical Documents
Jun 11, 2012
This is the second in a series of posts dealing with representing pathology reports in NEHTA Discharge summaries. The first post dealt with transforming the TX format into CDA. This post deals with the PIT format. Follow-up posts will deal with the FT format, and then with the general context of representing pathology reports in clinical documents.
The PIT format is an old format for sending pathology reports to the requesting doctor. It’s thoroughly outdated, everybody dislikes it, and no one is supposed to send it anymore. So, of course, you run into it all over the place. The documentation can be found here.
Here’s a sample of the core of a report (just the bits I’m interested in for this post):
301 ~SBLK~FINAL REPORT~EBLK~
301 ------------
301 Prothrombin Time ~SBLD~ 17~EBLD~ seconds
301 I.N.R. ~SBLD~1.6~EBLD~ (International Normalised Ratio)
301 __________________________________________________________________
301 Cumulative INR Report Date INR Req.No.
301 08/06/2010 ~FG05~1.4~FG15~ 254084
301 14/06/2010 2.0 259680
301 28/06/2010 2.5 272727
301 05/07/2010 ~FG04~4.8~FG99~ 279365
301 12/07/2010 3.0 286035
301 19/07/2010 1.6 292225~DFLT~
301
301 ÚÄÄÄINRÄÄÄÂÄÄÄÄÄÄÄÄÄCONDITIONÄÄÄÄÄÄÄÄÄÄÄÄÄÄÂÄÄLENGTH OF THERAPYÄÄ¿
301 ³ 2.0-3.0 ³ Atrial Fibrillation ³ Long Term ³
301 ³ 2.0-3.0 ³ Bioprosthetic Valve ³ 3 months ³
301 ³ 2.0-3.0 ³ Acute Myocardial Infarction ³ 3 months ( > if AF) ³
301 ³ 2.0-3.0 ³ Cardioembolic CVA, Rec. DVT/PE ³ Long Term ³
301 ³ ³ Dilated Cardiomyopathy ³ ³
301 ³ 2.0-3.0 ³ Venous Thrombosis and PE ³ 3-6 months~EUND~ ³
301 ³ 2.5-3.5 ³ Mechanical Heart Valve ³ ~SUND~Long Term~DFLT~ ³
301 ÀÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
309
The full PIT source is available here. The first part of the line is a number which tells you what the line means. The rest is text with formatting commands identified by tilde characters. Here’s a summary of the formatting commands:
- BGnn Specify a background colour (see colour table below). The default colour is BG99, and the colour should always be reset at the end of the line
- FGnn Specify a text colour (see colour table below). The default colour is FG99, and the colour should always be reset at the end of the line
- SBLD/EBLD Start and end bolding
- SUND/EUND Start and end underlining
- SBLK, EBLK Blinking – most targets do not support blinking
- PIpp Pitch control – meaning unclear
- FOff Meaning unknown
- DFLT set everything to default
For further details, consult the PIT doco link above.
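As a rough illustration of how the tilde commands can be picked out of a line, here is a minimal Python sketch. It assumes every command is exactly four characters between tildes (two letters, then two letters or digits), which matches the list above but is not taken from the PIT documentation. The names `COMMAND` and `tokenize` are hypothetical.

```python
import re

# Assumption: commands look like ~SBLD~, ~FG05~ or ~DFLT~, i.e. two letters
# followed by two letters or digits, wrapped in tildes.
COMMAND = re.compile(r"~([A-Z]{2}[A-Z0-9]{2})~")


def tokenize(line):
    """Split one PIT line into ("text", ...) and ("cmd", ...) tokens."""
    tokens = []
    pos = 0
    for match in COMMAND.finditer(line):
        if match.start() > pos:
            tokens.append(("text", line[pos:match.start()]))
        tokens.append(("cmd", match.group(1)))
        pos = match.end()
    if pos < len(line):
        tokens.append(("text", line[pos:]))
    return tokens


for token in tokenize("Prothrombin Time ~SBLD~ 17~EBLD~ seconds"):
    print(token)
```

A tilde that is not part of a recognisable command is left in the text untouched, which seems the safest default for data this old.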
Converting PIT to CDA Narrative
In principle, the conversion process is relatively straightforward:
- Strip the leading number and spaces
- replace the old table border characters if they appear
- replace the formatting commands with CDA equivalents (see below)
- wrap the content with a CDA paragraph with style xPre
In practice, several things complicate the conversion process:
- PIT formatting commands are not always paired. (They’re supposed to be, but most systems will overlook such errors, so they’re often not detected by the sender)
- the commands do not need to be properly nested (e.g. start underlining, then start bold, then stop underlining, then stop bold)
- the DFLT command resets everything
- Limitations around the CDA handling of narrative and whitespace
Step #1: leading spaces
The first task is to strip the line prefix (a three-digit line code followed by a space):
~SBLK~FINAL REPORT~EBLK~
------------
Prothrombin Time ~SBLD~ 17~EBLD~ seconds
I.N.R. ~SBLD~1.6~EBLD~ (International Normalised Ratio)
__________________________________________________________________
Cumulative INR Report Date INR Req.No.
08/06/2010 ~FG05~1.4~FG15~ 254084
14/06/2010 2.0 259680
28/06/2010 2.5 272727
05/07/2010 ~FG04~4.8~FG99~ 279365
12/07/2010 3.0 286035
19/07/2010 1.6 292225~DFLT~
ÚÄÄÄINRÄÄÄÂÄÄÄÄÄÄÄÄÄCONDITIONÄÄÄÄÄÄÄÄÄÄÄÄÄÄÂÄÄLENGTH OF THERAPYÄÄ¿
³ 2.0-3.0 ³ Atrial Fibrillation ³ Long Term ³
³ 2.0-3.0 ³ Bioprosthetic Valve ³ 3 months ³
³ 2.0-3.0 ³ Acute Myocardial Infarction ³ 3 months ( > if AF) ³
³ 2.0-3.0 ³ Cardioembolic CVA, Rec. DVT/PE ³ Long Term ³
³ ³ Dilated Cardiomyopathy ³ ³
³ 2.0-3.0 ³ Venous Thrombosis and PE ³ 3-6 months~EUND~ ³
³ 2.5-3.5 ³ Mechanical Heart Valve ³ ~SUND~Long Term~DFLT~ ³
ÀÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
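Stripping the prefix can be sketched in a few lines of Python. This is only a sketch under two assumptions: that the prefix is always three digits followed by at most one separating space (so any further leading spaces are kept for alignment), and that the report text lives on the lines coded 301, as in the sample. `split_prefix` and `report_text` are hypothetical names.

```python
import re

# Assumption: three digits, then at most one separating space.
PREFIX = re.compile(r"^(\d{3}) ?(.*)$")


def split_prefix(line):
    """Return (line_code, content) for a single PIT line."""
    match = PREFIX.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a PIT line: {line!r}")
    return match.group(1), match.group(2)


def report_text(lines, code="301"):
    """Collect the content of every line carrying the given line code."""
    return [content for found, content in map(split_prefix, lines) if found == code]


sample = ["301 I.N.R. ~SBLD~1.6~EBLD~", "301", "309"]
print(report_text(sample))
```

Only one space is removed after the code so that columns in the report body still line up once the text is displayed in a fixed-width font.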
Step #2: table border characters
Some PIT reports (not all) include old ASCII-art tables drawn with characters from old character sets, such as Windows code pages (or even DOS!). Old, but then the systems sending PIT are old.
Here’s a lookup table for the old DOS OEM code page:
DOS char (hex byte) | Unicode char (hex code point) |
DA | 250C |
C4 | 2500 |
C2 | 252C |
BF | 2510 |
B3 | 2502 |
C0 | 2514 |
C1 | 2534 |
D9 | 2518 |
This table is missing some characters (for inner horizontal lines) - I’ll add them when I find out what they are (if I can).
After that, it looks like this in UTF-8. (I have no idea whether this will render in the blog for your browser, and I don’t know how to control how WordPress renders it):
~SBLK~FINAL REPORT~EBLK~
------------
Prothrombin Time ~SBLD~ 17~EBLD~ seconds
I.N.R. ~SBLD~1.6~EBLD~ (International Normalised Ratio)
__________________________________________________________________
Cumulative INR Report Date INR Req.No.
08/06/2010 ~FG05~1.4~FG15~ 254084
14/06/2010 2.0 259680
28/06/2010 2.5 272727
05/07/2010 ~FG04~4.8~FG99~ 279365
12/07/2010 3.0 286035
19/07/2010 1.6 292225~DFLT~
┌───INR───┬─────────CONDITION──────────────┬──LENGTH OF THERAPY──┐
│ 2.0-3.0 │ Atrial Fibrillation │ Long Term │
│ 2.0-3.0 │ Bioprosthetic Valve │ 3 months │
│ 2.0-3.0 │ Acute Myocardial Infarction │ 3 months ( > if AF) │
│ 2.0-3.0 │ Cardioembolic CVA, Rec. DVT/PE │ Long Term │
│ │ Dilated Cardiomyopathy │ │
│ 2.0-3.0 │ Venous Thrombosis and PE │ 3-6 months~EUND~ │
│ 2.5-3.5 │ Mechanical Heart Valve │ ~SUND~Long Term~DFLT~ │
└─────────┴────────────────────────────────┴─────────────────────┘
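The lookup table from this step can be applied with `str.translate`. The sketch below assumes the PIT bytes were decoded as Latin-1, so that each DOS byte arrived as the character with the same code point (which is why the borders show up as Ú, Ä and ³ in the sample). `DOS_TO_UNICODE` and `fix_borders` are hypothetical names, and the mapping has the same gaps as the table.

```python
# DOS OEM byte value -> Unicode box-drawing code point, copied from the
# lookup table in this step.
DOS_TO_UNICODE = {
    0xDA: 0x250C,  # top-left corner
    0xC4: 0x2500,  # horizontal line
    0xC2: 0x252C,  # top tee
    0xBF: 0x2510,  # top-right corner
    0xB3: 0x2502,  # vertical line
    0xC0: 0x2514,  # bottom-left corner
    0xC1: 0x2534,  # bottom tee
    0xD9: 0x2518,  # bottom-right corner
}


def fix_borders(text):
    """Swap DOS border characters for Unicode ones.

    Assumption: the text was decoded as Latin-1, so byte 0xDA is now U+00DA.
    """
    return text.translate(DOS_TO_UNICODE)


print(fix_borders("\u00da\u00c4\u00c4\u00c2\u00c4\u00c4\u00bf"))
```

Note that a blanket replacement like this would also rewrite a genuine Ä or Á elsewhere in the report, so a real converter may want to restrict it to lines that look like table borders.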
Step #3: replace the format tags
In principle, the mapping table goes like this:
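SBLD | <content styleCode="Bold"> |
EBLD | </content> |
SUND | <content styleCode="Underline"> |
EUND | </content> |
SBLK | <content styleCode="Bold Underline Italics xFgColour800000"> |
EBLK | </content> |
FGnn | <content styleCode="xFgColourRRGGBB"> (RRGGBB is the code from the colour table below) |
FG99 | </content> |
BGnn | |
BG99 | </content> |
PIpp | ignore |
FOff | ignore |
DFLT | </content> |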
Comments:
- There’s no support for blink in CDA/XHTML. Since the purpose of blink is to draw attention, we’ll have to settle for bold, underline, and italic, and Maroon (see note on use of colour below). That should make it stand out.
- Because the control codes can be interlaced, the easiest way to manage the conversion process is to keep a set of flags for the various format options, start the paragraph with a <content> tag, lay down a </content> followed by a new <content> tag each time the styling changes, and then finish with a </content>. Not pure, but this is only for presentation, after all.
Colour table:
n | name | html name | code |
00 | Black | Black | 000000 |
01 | Blue | Blue | 0000FF |
02 | Green | Green | 008000 |
03 | Cyan | Cyan | 00FFFF |
04 | Red | Red | FF0000 |
05 | Magenta | Magenta | FF00FF |
06 | Brown | Brown | A52A2A |
07 | Light Grey | Light Grey | D3D3D3 |
08 | Dark Grey | Dark Grey | A9A9A9 |
09 | Light Blue | Light Blue | ADD8E6 |
10 | Light Green | Light Green | 90EE90 |
11 | Light Cyan | Light Cyan | E0FFFF |
12 | Light Red | Salmon | FA8072 |
13 | Light Magenta | Violet | EE82EE |
14 | Yellow | Yellow | FFFF00 |
15 | White | White | FFFFFF |
99 is the default colour, which means white for background, and black for text. However, nothing in the PIT specification stops the colours being entirely reversed, as would suit an old console display. (I heard of that once, but I don’t know if it’s still ever done, so check.) Just to make the point, in this example FG15 means default, not white. Don’t get caught doing white on white or black on black.
That gives the following content:
<content> </content><content styleCode="Bold Underline Italics xFgColour800000">FINAL REPORT</content><content>
------------
Prothrombin Time </content><content styleCode="Bold"> 17</content><content> seconds
I.N.R. </content><content styleCode="Bold">1.6</content><content> (International Normalised Ratio)
__________________________________________________________________
Cumulative INR Report Date INR Req.No.
08/06/2010 </content><content styleCode="xFgColourFF00FF">1.4</content><content> 254084
14/06/2010 2.0 259680
28/06/2010 2.5 272727
05/07/2010 </content><content styleCode="xFgColourFF0000">4.8</content><content> 279365
12/07/2010 3.0 286035
19/07/2010 1.6 292225</content><content>
┌───INR───┬─────────CONDITION──────────────┬──LENGTH OF THERAPY──┐
│ 2.0-3.0 │ Atrial Fibrillation │ Long Term │
│ 2.0-3.0 │ Bioprosthetic Valve │ 3 months │
│ 2.0-3.0 │ Acute Myocardial Infarction │ 3 months ( > if AF) │
│ 2.0-3.0 │ Cardioembolic CVA, Rec. DVT/PE │ Long Term │
│ │ Dilated Cardiomyopathy │ │
│ 2.0-3.0 │ Venous Thrombosis and PE │ 3-6 months</content><content> │
│ 2.5-3.5 │ Mechanical Heart Valve │ </content><content styleCode="Underline">Long Term</content><content> │
└─────────┴────────────────────────────────┴─────────────────────┘
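The flag-based approach described in the comments under Step #3 can be sketched roughly as follows. This is a minimal sketch, not a complete converter: `convert`, `open_tag` and the `default_fg` parameter are hypothetical names, the command pattern assumes four-character commands, and background colours (BGnn) are simply ignored because the matching style code is not shown in this post. Text is XML-escaped along the way, so `( > if AF)` would come out as `( &gt; if AF)`.

```python
import re
from xml.sax.saxutils import escape

COMMAND = re.compile(r"~([A-Z]{2}[A-Z0-9]{2})~")
# Hex codes from the colour table above, indexed by nn (00 to 15).
COLOURS = ["000000", "0000FF", "008000", "00FFFF", "FF0000", "FF00FF",
           "A52A2A", "D3D3D3", "A9A9A9", "ADD8E6", "90EE90", "E0FFFF",
           "FA8072", "EE82EE", "FFFF00", "FFFFFF"]
FLAGS = {"SBLD": ("bold", True), "EBLD": ("bold", False),
         "SUND": ("underline", True), "EUND": ("underline", False),
         "SBLK": ("blink", True), "EBLK": ("blink", False)}


def fresh_state():
    return {"bold": False, "underline": False, "blink": False, "fg": None}


def open_tag(state):
    """Build the <content> start tag for the current set of flags."""
    blink = state["blink"]
    codes = []
    if state["bold"] or blink:
        codes.append("Bold")
    if state["underline"] or blink:
        codes.append("Underline")
    if blink:
        codes.append("Italics")
    colour = "800000" if blink else state["fg"]  # Maroon stands in for blink
    if colour:
        codes.append("xFgColour" + colour)
    if not codes:
        return "<content>"
    joined = " ".join(codes)
    return f'<content styleCode="{joined}">'


def convert(text, default_fg=("99",)):
    """Turn PIT formatting commands into a run of CDA <content> spans."""
    state = fresh_state()
    out = ["<content>"]
    pos = 0
    for match in COMMAND.finditer(text):
        out.append(escape(text[pos:match.start()]))
        pos = match.end()
        cmd = match.group(1)
        if cmd in FLAGS:
            name, value = FLAGS[cmd]
            state[name] = value  # unpaired end commands are harmless
        elif cmd.startswith("FG"):
            nn = cmd[2:]
            known = nn.isdigit() and int(nn) < len(COLOURS)
            state["fg"] = COLOURS[int(nn)] if known and nn not in default_fg else None
        elif cmd == "DFLT":
            state = fresh_state()
        else:
            continue  # BGnn, PIpp, FOff: dropped without starting a new span
        out.append("</content>" + open_tag(state))
    out.append(escape(text[pos:]))
    out.append("</content>")
    return "".join(out)


print(convert("Prothrombin Time ~SBLD~ 17~EBLD~ seconds"))
```

Passing `default_fg=("99", "15")` covers senders that use FG15 to mean "back to default", as in the sample above. Because a new span is opened at every command, an unpaired EUND or a DFLT with nothing active just produces an empty-styled `</content><content>`, which matches the output shown in this step.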
Step #4: wrap into CDA Narrative
This bit is easy: just a paragraph with style xPre:
<text>
<paragraph styleCode="xPre">
<content> </content><content styleCode="Bold Underline Italics xFgColour800000">FINAL REPORT</content><content>
------------
Prothrombin Time </content><content styleCode="Bold"> 17</content><content> seconds
I.N.R. </content><content styleCode="Bold">1.6</content><content> (International Normalised Ratio)
__________________________________________________________________
Cumulative INR Report Date INR Req.No.
08/06/2010 </content><content styleCode="xFgColourFF00FF">1.4</content><content> 254084
14/06/2010 2.0 259680
28/06/2010 2.5 272727
05/07/2010 </content><content styleCode="xFgColourFF0000">4.8</content><content> 279365
12/07/2010 3.0 286035
19/07/2010 1.6 292225</content><content>
┌───INR───┬─────────CONDITION──────────────┬──LENGTH OF THERAPY──┐
│ 2.0-3.0 │ Atrial Fibrillation │ Long Term │
│ 2.0-3.0 │ Bioprosthetic Valve │ 3 months │
│ 2.0-3.0 │ Acute Myocardial Infarction │ 3 months ( > if AF) │
│ 2.0-3.0 │ Cardioembolic CVA, Rec. DVT/PE │ Long Term │
│ │ Dilated Cardiomyopathy │ │
│ 2.0-3.0 │ Venous Thrombosis and PE │ 3-6 months</content><content> │
│ 2.5-3.5 │ Mechanical Heart Valve │ </content><content styleCode="Underline">Long Term</content><content> │
└─────────┴────────────────────────────────┴─────────────────────┘
</content></paragraph>
</text>
So we claim that this is all one paragraph. Strictly, it’s wrong to claim that all this is one paragraph with line breaks, but in practice, it doesn’t matter. You could spend days refining an algorithm for recognising the end of a paragraph (it’s never as simple as a double end of line), and who’s going to care? This is only ever going to be for display.
Note that we have used xPre, for text containing whitespace and carriage returns which may not be ignored. It would be possible to use xFixed, and add <br/> tags for line breaks. But the problem with xFixed and <br/> is that leading spaces will be lost, and you can’t use &nbsp; to preserve them, since &nbsp; is not defined for CDA. So xPre it is, and the conversion process has to be careful with whitespace. (Of course, in this example, the whitespace probably isn’t that big a deal, but it serves to make the point).
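To make the wrapping step and the whitespace point concrete, here is a small Python sketch that wraps already-converted `<content>` spans in an xPre paragraph and parses the result as a sanity check. `wrap_narrative` is a hypothetical name, and the fragment is built without the CDA namespace purely to keep the example short.

```python
import xml.etree.ElementTree as ET


def wrap_narrative(spans):
    """Wrap converted <content> spans in a CDA narrative paragraph."""
    xml = '<text><paragraph styleCode="xPre">' + spans + "</paragraph></text>"
    # Parsing raises ET.ParseError if interlaced commands left the markup
    # unbalanced, which is cheaper to catch here than in a renderer.
    ET.fromstring(xml)
    return xml


narrative = wrap_narrative('<content>  08/06/2010\n  14/06/2010</content>')
# Leading spaces and line breaks survive the round trip untouched.
print(repr(ET.fromstring(narrative).find("paragraph/content").text))
```

No whitespace is added between the paragraph tag and the first span, since under xPre any extra line break or indentation introduced by the converter would show up in the rendered report.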
Conclusion
So there you go - that’s how to convert the PIT formatted report into a NEHTA CDA document without losing any formatting. Two things to note:
- I’ve not dealt with the issue of line wrapping here - that’s just hard whether you’re displaying the PIT directly or whether it’s being displayed through CDA
- these extra style codes starting with x are all defined in the NEHTA stylesheet, and aren’t in the normal HL7 stylesheet. Ensure that the documents will only be sent to NEHTA-conformant rendering systems for safety here.