Further Analysis of CDA Vulnerabilities

Apr 8, 2014

This is a follow-up to my previous post about the CDA-associated vulnerabilities, based on what's been learnt and what questions have been asked.

Vulnerability #1: Unsanitized nonXMLBody/text/reference/@value can execute JavaScript

PCEHR status: currently approved software doesn't do this. Current schematrons don't allow this. CDA documents that systems receive from the PCEHR will not include nonXMLBody.

General status: CDA documents that systems get from anywhere else might. In fact, you might get HTML to display from multiple sources, and how sure are you that the source and the channel can't be hacked? This doesn't have to involve CDA either - AS 4700.1 includes a way to put XHTML in a v2 segment, and there are any number of other trading arrangements I've seen that exchange HTML.

So what can you do?

  • Scan incoming HTML to block active content (using schematrons, or a modified schema) - see the sketch after this list
  • don't view incoming HTML in the clinical system - use the user's standard sandbox (e.g. the browser)
  • change the protocol so that raw HTML isn't exchanged directly
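
To make the first point concrete, here's a minimal sketch of such a scan in Python, using only the standard library. The forbidden-tag list is illustrative, not exhaustive; a production scanner should whitelist what is allowed rather than blacklist what isn't:

```python
# A minimal sketch of scanning incoming HTML for active content.
# The FORBIDDEN_TAGS list is illustrative only - a real implementation
# should work from a whitelist of permitted elements and attributes.
from html.parser import HTMLParser

FORBIDDEN_TAGS = {"script", "object", "embed", "iframe", "form"}

class ActiveContentScanner(HTMLParser):
    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag in FORBIDDEN_TAGS:
            self.violations.append(f"forbidden element <{tag}>")
        for name, value in attrs:
            if name.startswith("on"):  # onmouseover, onclick, onload, ...
                self.violations.append(f"event handler '{name}' on <{tag}>")
            if value and value.strip().lower().startswith("javascript:"):
                self.violations.append(f"javascript: URL in '{name}' on <{tag}>")

def reject_active_content(html: str) -> None:
    scanner = ActiveContentScanner()
    scanner.feed(html)
    if scanner.violations:
        raise ValueError("active content detected: " + "; ".join(scanner.violations))
```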

Yeah, I know that this advice is wildly impractical. The privilege of the system architect is to balance between what has to be done, and what can’t be done ;-)

FHIR note: FHIR exchanges raw HTML like this. We said - for exactly this reason - that no active content is allowed. We're going to release tightened-up schemas and schematrons, and the public test servers are tightening up on this.

Vulnerability #2: Unsanitized table/@onmouseover can execute JavaScript

PCEHR status: these documents cannot be uploaded to the PCEHR, are not contained in the PCEHR, and the usual PCEHR stylesheet is unaffected anyway.

General status: CDA documents that you get from anywhere else might include these attributes. If the system isn't using the PCEHR stylesheet, then it might be susceptible. Note: this also need not involve CDA. Any time you get an XML format that will be transformed to HTML for display, there might be ways to craft the input XML to produce active content - though I don't know of any other Australian standard in the healthcare space that works this way.

So what can you do?
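
Much the same as for vulnerability #1: make sure the transform you use doesn't copy arbitrary attributes from the source XML through to the display HTML, and/or sanitize the transform output before handing it to a browser control. A minimal sketch of the sanitizing approach, assuming the transform output is well-formed XHTML (a lenient HTML parser would be needed otherwise):

```python
# A minimal sketch: strip event-handler attributes and javascript: URLs
# from transformed display HTML before rendering it. Assumes well-formed
# XHTML input.
import xml.etree.ElementTree as ET

def strip_active_attributes(xhtml: str) -> str:
    root = ET.fromstring(xhtml)
    for element in root.iter():
        for name in list(element.attrib):
            value = element.attrib[name] or ""
            if name.lower().startswith("on") or \
               value.strip().lower().startswith("javascript:"):
                del element.attrib[name]
    return ET.tostring(root, encoding="unicode")

# e.g. strip_active_attributes('<table onmouseover="alert(1)"><tr/></table>')
# returns '<table><tr /></table>'
```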

Vulnerability #3: External references, including images

PCEHR status: There’s no approved usage of external references in linkHtml or referenceMultimedia, but use in hand written narrative can’t be ruled out. Displaying systems must ensure that this is safe. There will be further policy decisions with regard to the traceability that external references provide.

General status: any content that you get from other systems may include images or references to content held on external servers, whether CDA, HTML, HL7 v2, or even PDF. If you are careless with the way you present the view, significant information might leak to the external server, up to and including the user's password, a system-level authorization token, or the user's identity. And no matter how careful you are, you cannot prevent some information leaking to the server: the user's network address, and the URL of the image/reference. A malicious system could use this to track the use of the document it authored - but on the other hand, this may also be appropriate behaviour.

So what can you do?

  • Never put authentication information in the URL that is used to initiate display of clinical information (e.g. internally, as an application library call)
  • Never let the display library make the network call automatically - make the request in the application directly, and control the information that goes onto the wire explicitly
  • If the user clicks an external link, make it come up in their standard browser (using ShellExecute on Windows, or the equivalent), so whatever happens doesn't happen where the code has access to the clinical systems - see the sketch after this list
  • The user should be warned about the difference between known-safe and unknown content - but be careful not to nag them (as the legal people will want, every time); otherwise the users will soon click the warning by reflex, and won't even know what it says
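
A minimal sketch of the second and third points, assuming a desktop client written in Python: webbrowser.open hands the URL off to the user's default browser (on Windows this goes via the shell), and the explicit fetch builds the request from scratch, so no session cookies or Authorization headers ride along:

```python
# A minimal sketch of controlled handling of external references.
import urllib.request
import webbrowser

def open_external_link(url: str) -> None:
    # Handled outside the clinical application's process and privileges:
    # the user's browser is the sandbox.
    webbrowser.open(url)

def fetch_external_image(url: str) -> bytes:
    # Build the request explicitly: no cookies, no Authorization header,
    # nothing on the wire except what we choose to send. The URL and our
    # network address still leak - that can't be prevented.
    request = urllib.request.Request(url, headers={"User-Agent": "display-client"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```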

Final note: this is being marketed as a CDA exploit. But it’s an exploit related to the ubiquity of HTML and controls, and it’s going to be more and more common…

Update - General mitigation / risk minimization approaches

John Moehrke points out that there's a series of general foundations that all applications should be using, which reduce the likelihood of problems, and/or the damage that can be caused.

Authentication

Always know who the other party (or parties) your code is communicating with, establish their identity well, and ensure communications with them are secure. If the party is a user using the application directly, then securing communications to them isn't hard - the focus is on login. But my experience is that systems overlook authenticating other systems that they communicate with, even if they encrypt the communications - which makes the encryption largely wasted (see "man in the middle"). Authenticating your trading partners properly makes it much harder to introduce malicious content (and is the foundation on which the PCEHR responses above rest). Note, though, the steady drumbeat of media complaints about the administrative consequences of properly authenticating the systems the PCEHR interacts with - properly authenticating systems is an administrative burden, which is why it's often not done.
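
In transport terms, authenticating a trading partner usually means mutual TLS: both sides present certificates, and both sides verify them. A minimal sketch of the client side in Python - the host name and certificate paths are placeholders:

```python
# A minimal sketch of authenticating a trading partner with mutual TLS.
# Host name and certificate file paths are placeholders for illustration.
import socket
import ssl

context = ssl.create_default_context(cafile="trusted-partners-ca.pem")
# Present our own certificate so the partner can authenticate us too.
context.load_cert_chain(certfile="our-system.pem", keyfile="our-system.key")

with socket.create_connection(("partner.example.org", 443)) as sock:
    with context.wrap_socket(sock, server_hostname="partner.example.org") as tls:
        # The handshake has now verified the partner's certificate against
        # our CA list; encryption without this check is largely wasted.
        print(tls.getpeercert()["subject"])
```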

Authorization

A fundamental part of application design is properly managed authorization, enforced throughout the application. For instance, don't assume that you can enforce all proper access control by managing what widgets are visible or enabled in the UI; eventually additional paths to execute functionality will need to be provided, in order to support some kind of workflow integration/management. Making the privileges explicit in operational code is much safer, and means that rogue code running in the UI context doesn't have unrestricted access to the system (though a hard break, like the one between client and server, is required to really make that work).
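
For example, privileges can be checked at the service layer rather than in the UI. A minimal sketch, where the session object and the privilege names are hypothetical:

```python
# A minimal sketch of enforcing privileges in operational code rather than
# in the UI. The session object and privilege names are hypothetical.
import functools

class AuthorizationError(Exception):
    pass

def requires_privilege(privilege: str):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(session, *args, **kwargs):
            # Checked on every call path - UI, workflow engine, script -
            # not just when a button happens to be enabled.
            if privilege not in session.privileges:
                raise AuthorizationError(f"{privilege} required")
            return func(session, *args, **kwargs)
        return wrapper
    return decorator

@requires_privilege("document.read")
def get_document(session, document_id):
    ...
```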

Audit logging

Log everything. With good metadata. Then, when there's reason to believe that the system has been penetrated, you can actually know whether it has or not. Make sure that the security on the system logs is particularly strong (there's no point keeping them if it's easy for the malicious code to delete them). If nothing else, this will help trace an attacker, and stop them repeating the same attack - which they otherwise could, because no one would be able to figure out what they did.
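
A minimal sketch of what that can look like: structured entries with metadata, each chained to the hash of the previous entry so that deleting or editing a line breaks the chain. The field names are illustrative:

```python
# A minimal sketch of a tamper-evident audit log: structured entries with
# metadata, hash-chained so deletions or edits are detectable.
import hashlib
import json
from datetime import datetime, timezone

def append_audit_entry(log_path, user, action, resource, previous_hash=""):
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "previous": previous_hash,
    }
    line = json.dumps(entry, sort_keys=True)
    entry_hash = hashlib.sha256(line.encode()).hexdigest()
    with open(log_path, "a") as log:
        log.write(line + "\n")
    return entry_hash  # feed this into the next entry
```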

Note: none of this is healthcare-specific. It's all just standard application design, but my experience is that a lot of systems in healthcare aren't particularly armored against assault, because it just doesn't happen much. But it's still a real concern.