ODM2CDA and CDA2ODM: Tools to convert documentation forms between EDC and EHR systems

Currently, the documentation processes for routine patient care and clinical research
are disconnected. High regulatory requirements result in costly documentation processes
[16]. Different technical standards are used in medical IT systems: predominantly CDISC
standards for clinical research and HL7 standards for EHR systems, in particular CDA.
CDISC ODM is widely adopted by EDC vendors. ODM is used for data and metadata export
and, more recently, increasingly for metadata import. Regulatory authorities like FDA support
CDISC standards, which is an important driver for this trend.

From a medical point of view, an EHR system and an EDC system collect information
for the same patient. From a data analysis perspective, many data points in both systems
(such as body weight) should be very similar or the same, only stored in different representations.
CDISC and HL7 standards serve different purposes, so there will be differences between
ODM and CDA.

Regarding data exchange between EHR and EDC systems, it would be very useful to extract
data points from one representation and transform them into another. A prerequisite
for such data exchange is a transformation of ODM data structures into CDA and vice
versa. This enables clinicians and researchers to identify similarities and differences
between EHR and EDC documents. Comparison between EHR and EDC data structures can
be used at the trial design stage to optimize EHR and EDC documentation. In the trial
execution phase it can be used to identify data elements for re-use.

Transformation of ODM to CDA has already been described in the literature [17]; however,
so far no implementation has been available to the scientific community. In this
work we present an open-source transformation program between ODM and CDA data structures.
It was tested on two sets of files: ten ODM files from different documentation settings
and ten public CDA files.

It was demonstrated that an automated transformation between ODM and CDA is technically
feasible in principle. However, this transformation is “lossy”, i.e., has several
limitations due to specific properties of ODM and CDA: ODM is much more generic, because
it assigns data items to item groups, and these item groups to forms. In contrast,
CDA consists mainly of predefined sections of XML nodes related to diagnosis, allergies,
medication, findings etc. Many CDA documents contain a lengthy header with very detailed
descriptive metadata regarding administrative patient data (name, address), physician
and hospital-related data. From a data analysis perspective, most of these CDA header
elements are not useful for clinical research questions. In contrast, the body section
of many CDA files, which contains the interesting clinical data, is very short. From
a technical perspective, processing CDA files is more complicated than ODM files:
CDA combines XML node values with XML attributes and has a variable hierarchical structure.
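
The generic ODM hierarchy described above (forms contain item groups, item groups
contain items) can be illustrated with a minimal sketch. It is not part of the
presented tool: the sample document, the constant SAMPLE_ODM and the function
list_odm_items are hypothetical and only assume the ODM 1.3 element names FormDef,
ItemGroupRef, ItemGroupDef, ItemRef and ItemDef.

```python
import xml.etree.ElementTree as ET

# Namespace of CDISC ODM 1.3 documents.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Tiny hand-written metadata fragment (illustrative only, not from the study).
SAMPLE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S1">
    <MetaDataVersion OID="MDV1" Name="Demo">
      <FormDef OID="F1" Name="Vital signs" Repeating="No">
        <ItemGroupRef ItemGroupOID="IG1" Mandatory="Yes"/>
      </FormDef>
      <ItemGroupDef OID="IG1" Name="Body measurements" Repeating="No">
        <ItemRef ItemOID="I1" Mandatory="Yes"/>
        <ItemRef ItemOID="I2" Mandatory="No"/>
      </ItemGroupDef>
      <ItemDef OID="I1" Name="Body weight" DataType="float"/>
      <ItemDef OID="I2" Name="Body height" DataType="float"/>
    </MetaDataVersion>
  </Study>
</ODM>"""


def list_odm_items(odm_xml):
    """Return (form, item group, item, mandatory) tuples in document order."""
    root = ET.fromstring(odm_xml)
    mdv = root.find("odm:Study/odm:MetaDataVersion", ODM_NS)
    # Definitions are stored flat and linked by OID, so index them first.
    groups = {g.get("OID"): g for g in mdv.findall("odm:ItemGroupDef", ODM_NS)}
    items = {i.get("OID"): i for i in mdv.findall("odm:ItemDef", ODM_NS)}
    rows = []
    for form in mdv.findall("odm:FormDef", ODM_NS):
        for group_ref in form.findall("odm:ItemGroupRef", ODM_NS):
            group = groups[group_ref.get("ItemGroupOID")]
            for item_ref in group.findall("odm:ItemRef", ODM_NS):
                item = items[item_ref.get("ItemOID")]
                rows.append((form.get("Name"), group.get("Name"),
                             item.get("Name"), item_ref.get("Mandatory")))
    return rows


if __name__ == "__main__":
    for row in list_odm_items(SAMPLE_ODM):
        print(row)
```

Because every item is reached through the same three-level path, a flat item list
can be produced with a fixed set of nested loops; CDA documents do not offer such
a uniform path.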

There are several limitations of the proposed transformation between ODM and CDA.

In general, CDA is generated from data instances and it is not clear what data elements
are optional or repeatable (by default, the conversion tool assigns attributes Mandatory = “Yes”
and Repeating = “No”). CDA also provides narrative parts, i.e., non-structured data.
In contrast, ODM defines a full schema with optional, mandatory and repeatable data
elements. ODM items are represented as CDA assessment sections in the current implementation
of the conversion tool. The hierarchical structure of CDA is approximated by concatenated
item names in ODM format. ODM does not provide information about item classes, therefore
act classes are generated in CDA. Narrative text from CDA is ignored when it cannot
be assigned to structured data elements. The transformation is designed for metadata:
CDA files contain data for one patient while ODM files can contain data for large
patient cohorts.
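
The idea of approximating the CDA hierarchy by concatenated item names, combined
with the default attributes mentioned above, can be sketched as follows. This is
an illustration under stated assumptions and not the algorithm of the conversion
tool: the sample document SAMPLE_CDA, the functions local_name and flatten, and
the underscore as separator are all hypothetical choices.

```python
import xml.etree.ElementTree as ET

# Minimal hand-written CDA-like instance (illustrative only, not a valid
# complete CDA document and not taken from the study).
SAMPLE_CDA = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component>
    <structuredBody>
      <component>
        <section>
          <title>Vital signs</title>
          <entry>
            <observation classCode="OBS" moodCode="EVN">
              <value unit="kg" value="72.5"/>
            </observation>
          </entry>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def flatten(element, prefix=""):
    """Turn node values and attributes into flat items with path-like names."""
    name = local_name(element.tag)
    path = f"{prefix}_{name}" if prefix else name
    found = []
    # CDA stores data in XML attributes as well as in node values.
    for attr in element.attrib:
        found.append({"Name": f"{path}_{attr}"})
    if element.text and element.text.strip():
        found.append({"Name": path})
    for child in element:
        found.extend(flatten(child, path))
    # An instance document cannot tell us whether an element is optional or
    # repeatable, so fixed defaults are assigned (assumption of this sketch).
    for item in found:
        item.setdefault("Mandatory", "Yes")
        item.setdefault("Repeating", "No")
    return found


if __name__ == "__main__":
    for item in flatten(ET.fromstring(SAMPLE_CDA)):
        print(item)
```

The concatenated name keeps the position of a value in the hierarchy readable
in a flat item list, at the price of long item names for deeply nested documents.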

Many differences between EHR and EDC standards have been reported [18], and there is a long-standing scientific debate about standardised medical data models.

Transformation and mapping between EHR and EDC standards are a first but important
step to enable comparison and discussion of data items. From a data analysis perspective,
a list of data items is a prerequisite for statistical analysis.

This requirement is addressed very well by the ODM standard. CDA was designed to represent
the current heterogeneity of clinical data structures. From a methodological point
of view, the large diversity of clinical documentation indicates that there is room
for improvement by standardisation: It is highly unlikely that all the diverse documentation
approaches are optimal. Transparency of data models and transformation between different
standards like ODM and CDA are first steps to trigger a discussion about best practice
in clinical and research documentation.

The proposed transformation approach can take into account semantic codes for data
items. However, most publicly available medical forms are not (yet?) semantically
annotated. The high number of data elements per documentation unit (up to 3000) in
this study indicates a need for automated metadata processing (for instance [19]), because manual mapping of many data elements is resource-intensive and error-prone.
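
As a small illustration of automated metadata processing, the following sketch
lists which ODM items carry a semantic annotation. It assumes that codes are
stored in ODM Alias elements below ItemDef; the sample document, the function
semantic_coverage, and the terminology name and code value (deliberate
placeholders) are hypothetical and not taken from the study.

```python
import xml.etree.ElementTree as ET

ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hand-written fragment; Context and Name of the Alias are placeholders.
SAMPLE_ANNOTATED_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="S1">
    <MetaDataVersion OID="MDV1" Name="Demo">
      <ItemDef OID="I1" Name="Body weight" DataType="float">
        <Alias Context="ExampleTerminology" Name="EXAMPLE-CODE-1"/>
      </ItemDef>
      <ItemDef OID="I2" Name="Free-text remark" DataType="text"/>
    </MetaDataVersion>
  </Study>
</ODM>"""


def semantic_coverage(odm_xml):
    """Split item names into those with and without an Alias annotation."""
    root = ET.fromstring(odm_xml)
    annotated, unannotated = [], []
    for item in root.findall(".//odm:ItemDef", ODM_NS):
        has_code = item.find("odm:Alias", ODM_NS) is not None
        (annotated if has_code else unannotated).append(item.get("Name"))
    return annotated, unannotated


if __name__ == "__main__":
    with_code, without_code = semantic_coverage(SAMPLE_ANNOTATED_ODM)
    print("annotated:", with_code)
    print("not annotated:", without_code)
```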