Database schema for my FHIR server

Previous: Question: Special Characters in HL7 v2 »

Database schema for my FHIR server

Dec 29, 2012

Several implementers have asked me what database I use for my FHIR server, and how I structure the database. I use an SQL database – either MSSQL or MySQL (the cloud server uses MSSQL because Microsoft gives me free licenses via BizSpark). Most of the other implementations use something like MongoDB.

This diagram summarizes the schema that I use:

FhirTypes

This table has a row for each defined table type – a name and an internal primary key, along with a set of status fields & flags for how the system supports the type:

Supported – is the resource even accepted for this server
LastId – tracks the highest integer id value (if integer ids are being used)
IdGuids – whether to accept UUIDs logical ids proposed by clients
IdClient – whether to allow the client to put to a non-existent resource
IdServer – whether to allow the client to post to the resource type and ask the server to create a resource
cmd[X] – whether the logical operationis supported or not (note, some operations cannot be supported for some resources, such as search on a binary resource)
Versionupdates – whether to insist on version aware updates for this resource type

Note that most systems would fix these things globally for all resources, but I’ve found it useful to be able to vary the policy for different resources – it helps during the connectathon testing phase.

FhirIds

A list of known resources with logical ids:

ResourceKey – primary internal key for resource (not visible externally)
ResourceTypeKey – type key (reference to FhirTypes)
Id – logical Id (char(36) as specified in FHIR spec)
originalId – an extra field my implementation tracks, which is the id a client first provided. This is echoed back to clients so they can link resource by their original id. This is not required by the FHIR spec
MostRecent – a reference to FhirVersions.ResourceVersionKey, the current version of the resource

FhirVersions

An actual instance of a resource (this table is insert only):

ResourceVersionKey – internal primary key, not visible externally
resourceKey – reference to FhirIds.ResourceKey
StatedDate – the date the client claimed that this resource was created (UTC)
TransactionDate –the actual date the event occurred (UTC)
VersionId –the version is assigned when the resource was created
Deleted – true if this is actually a deletion of the resource
Format –flag for Xml or Json – the resource is stored as it was received
Author – from the http layer, who the author was, if known, or just an ip address
[Content] –the actual resource as a binary (gzipped)

FhirIndexes

List of the indexes that for each resource (derived from the search parameters defined in FHIR spec):

IndexKey – private internal key
ResourceTypeKey – reference to FhirTypes.ResourceTypeKey
Name – Search Parameter name
Type – type for search parameter from spec. Note that the code that uses the indexes needs to know this value implicitly, so I’ve never actually used this field for anything

FhirIndexEntries

An actual index value. These are populated whenever a resource is inserted into FhirVersions. All existing entries for the resource are deleted, and a new set are inserted by the code processing the resource (I write this code by hand):

EntryKey – internal primary key, never used anywhere yet
IndexKey – reference to FhirIndexes.IndexKey – which index this is an entry in
ResourceKey – reference to FhirIds.ResourceKey
SpaceKey – reference to FhirSpaces.SpaceKey – see below
Value – the actual value. Char(128) – search won’t work for values longer than 128 chars

FhirSpaces

For a QToken, the search is actually on a pair of strings – typically a URL with a leaf value. FhirSpaces just normalises the first string – entries are added to this table for each new URI seen. This table exists purely to conserve space, though it may be useful for optimising searches that involve partial matches on QToken spaces, if we decide this is necessary (not supported for now)

This is the schema I use to provide a fully functional and fully flexible FHIR server. It’s not necessary to have a schema like this - you can implement a server that provides direct access to an existing database schema that supports an actual application. That’d be fine and makes sense for many/most implementations, but the functionality provided by such an implementation is going to be limited to that allowed by the application - in particular, providing version history and fully functional searching (especially chained parameters) is going to be challenging with that approach (though it makes sense - you provide support for what makes sense to do as driven by your schema). But since I’m doing a reference implementation, I need to do those things, and hence this schema that provides full support.