Thursday, January 12, 2012

SDTM in XML - the data themselves

Now that we have made the define.xml more logical (and much more end-to-end-friendly), we can do the same for the data themselves.
We do not need VSTEST anymore (it is a "synonym" or "display" variable, and is listed in the metadata), so I commented it out. We can also move the units of measurement to where they belong, i.e. as an attribute of the data point rather than as an attribute of the record.

This leads e.g. to the following SDTM-XML:


Note that there is no explicit VSORRESU or VSSTRESU anymore: the units have been attached directly to VSORRES and VSSTRESN.
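To make the idea concrete, here is a minimal Python sketch of that move. It is only an illustration under assumptions: the short ItemOID values, the sample values and the "Unit" attribute name are hypothetical and are not taken from the snippet above or from any ODM version.

```python
import xml.etree.ElementTree as ET

# Hypothetical "flat" record: OIDs, values and layout are illustrative
# assumptions, not the actual SDTM-XML from this post.
FLAT = """
<ItemGroupData ItemGroupOID="MyStudy:VS">
  <ItemData ItemOID="VSORRES" Value="120"/>
  <ItemData ItemOID="VSORRESU" Value="mmHg"/>
  <ItemData ItemOID="VSSTRESN" Value="120"/>
  <ItemData ItemOID="VSSTRESU" Value="mmHg"/>
</ItemGroupData>
"""

# Which unit variable belongs to which result variable.
UNIT_OF = {"VSORRESU": "VSORRES", "VSSTRESU": "VSSTRESN"}


def attach_units(record):
    """Move each unit value onto its data point and drop the unit item."""
    items = {item.get("ItemOID"): item for item in record.findall("ItemData")}
    for unit_oid, value_oid in UNIT_OF.items():
        unit_item = items.get(unit_oid)
        value_item = items.get(value_oid)
        if unit_item is None or value_item is None:
            continue  # nothing to move for this pair
        # "Unit" is an assumed attribute name, chosen for this sketch only.
        value_item.set("Unit", unit_item.get("Value"))
        record.remove(unit_item)
    return record


record = attach_units(ET.fromstring(FLAT))
print(ET.tostring(record, encoding="unicode"))
```

After the call, only the two result items remain in the record, each carrying its own unit.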

Coming from the "flat" SDTM-XML representation (see post of xxxx-xx-xx), I would call this "minimal-invasive multidimensional SDTM in XML" (the world is not round!).

There is some similarity with how HL7-v3, HL7-CDA/CCD and ISO-21090 handle such information, e.g.:


We see that the unit of measure (the "unit" attribute of the "value" element) is directly attached to the datapoint itself.
There are however also some main differences with ODM:
  • HL7-v3 (as far as I know) does not have a construct for isolating the metadata. It does not know about planning a visit, planning which forms are used in a visit, which questions to ask, etc.
    In HL7-v3 every data point is an "observation", and one cannot see whether that "observation" was planned or "ad hoc", i.e. the physician spontaneously decided to do a specific test or to make a specific observation.
    Therefore, there is also no referencing to a specific metadata section.
Also note, in the HL7-v3 snippet above (it comes from a CCD document), the use of "xsi:type", which is currently disputed within HL7 because it is hard to validate and essentially has nothing to do with XML-Schema types.

Note also the bad usage of date formats (in the element "effectiveTime"), which do not follow the ISO-8601 rules for XML dates. The disastrous effect is that e.g. a date "2011-02-29" (which does not exist) is a valid date in HL7-v3.
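The difference between a pattern-only check and a real calendar check can be sketched in a few lines of Python. This is an illustration of the general problem, not a reproduction of how any HL7 validator works; the function names and the regular expression are my own assumptions.

```python
import re
from datetime import date


def matches_date_pattern(text):
    """Pattern-only check: four digits, two digits, two digits."""
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", text) is not None


def is_calendar_date(text):
    """Calendar-aware check: the day must really exist."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


for candidate in ("2011-02-29", "2012-02-29"):
    print(candidate, matches_date_pattern(candidate), is_calendar_date(candidate))
```

Both strings pass the pattern check, but only the one in the leap year 2012 passes the calendar check.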

On the other hand, HL7-V3 uses a lot of "code" and "codesystem" with OIDs (unique object identifiers). In the snippet the codesystem "2.16.840.1.113883.6.96" stands for SNOMED-CT, and the code "271649006" stands for "systolic blood pressure" (NEVER trust what is in "DisplayName"!).
CDISC, however, decided to create its own controlled terminology (unfortunately without OIDs), which means that we urgently need mappings between CDISC-CT and the coding systems used in healthcare, such as SNOMED-CT, ICD-10 and LOINC, if we really want to enable integration between healthcare and clinical research.
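What such a mapping could look like in its simplest form is sketched below in Python. The table has a single entry, built from the SNOMED-CT OID and code quoted above and the CDISC test code SYSBP; the dictionary layout and the function name are assumptions for illustration, not an existing CDISC or HL7 resource.

```python
# OID of the SNOMED-CT code system, as quoted in this post.
SNOMED_CT_OID = "2.16.840.1.113883.6.96"

# Illustrative one-entry mapping from a CDISC-CT test code to a SNOMED-CT code.
# A real mapping would have to be curated and maintained by the standards bodies.
CDISC_TO_SNOMED = {
    "SYSBP": "271649006",  # systolic blood pressure
}


def to_coded_value(vstestcd):
    """Return a code/codeSystem pair for a CDISC test code, or None if unmapped."""
    snomed_code = CDISC_TO_SNOMED.get(vstestcd)
    if snomed_code is None:
        return None
    return {"code": snomed_code, "codeSystem": SNOMED_CT_OID}


print(to_coded_value("SYSBP"))
```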

Another very nice thing in CDA is the use of UCUM units of measurement (http://unitsofmeasure.org/), which I can highly recommend. There is currently no good placeholder in ODM for adding UCUM units of measure (except maybe the "Alias" element), so I think we should have an additional attribute in the next version of ODM that allows giving the UCUM code for each unit of measure used in the study. The great advantage is that UCUM codes make it easy to transform one unit into another (e.g. from pounds to kilograms).
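As a small taste of why unambiguous unit codes help, here is a hedged Python sketch of a mass conversion keyed by UCUM codes. It is not a UCUM library: the lookup table covers only three codes that I picked for the example, and the function name is hypothetical.

```python
# Conversion factors to kilogram, keyed by UCUM code.
# "[lb_av]" is the UCUM code for the avoirdupois pound.
TO_KG = {
    "kg": 1.0,
    "g": 0.001,
    "[lb_av]": 0.45359237,
}


def convert_mass(value, from_ucum, to_ucum):
    """Convert a mass between two UCUM-coded units via kilogram."""
    return value * TO_KG[from_ucum] / TO_KG[to_ucum]


weight_kg = convert_mass(150, "[lb_av]", "kg")
print(round(weight_kg, 2))
```

Because the unit is a code rather than free text such as "lbs" or "pounds", the lookup cannot be confused by spelling variants.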
But that's another topic for one of the next blog entries.

Back to our SDTM-XML data snippet. I call it minimal-invasive because it deviates only slightly from a two-dimensional representation of the data.

But if we look more carefully, we can see a lot of things that we can improve further:
  • Do we need the datapoint (SDTM variable) "DOMAIN"? The fact that we have ItemGroupOID="MyStudy:VS", which is a reference to the "ItemGroupDef" with OID "MyStudy:VS" and which has the attribute "Domain" with value "VS", already gives us that information.
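
The lookup that makes "DOMAIN" redundant can be sketched as follows in Python. The document below is a heavily simplified assumption: namespaces and the usual ODM nesting are left out, and only the attributes mentioned above (OID, Domain, ItemGroupOID) are used.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical document: metadata and data side by side,
# without namespaces or the real ODM element hierarchy.
DOC = """
<ODM>
  <ItemGroupDef OID="MyStudy:VS" Name="VS" Domain="VS" Repeating="Yes"/>
  <ItemGroupData ItemGroupOID="MyStudy:VS"/>
</ODM>
"""


def domain_of(record, root):
    """Resolve the domain of a data record through its ItemGroupDef."""
    oid = record.get("ItemGroupOID")
    for definition in root.iter("ItemGroupDef"):
        if definition.get("OID") == oid:
            return definition.get("Domain")
    return None  # no matching definition found


root = ET.fromstring(DOC)
record = root.find("ItemGroupData")
print(domain_of(record, root))
```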
