Most engineers have never seen a real HL7 message.
If you’ve worked in fintech, you’ve probably parsed ISO 8583. If you’ve worked in logistics, you’ve dealt with EDI. If you’ve worked in healthcare data — specifically if you’ve built a live integration between two hospital systems — you’ve seen HL7 v2, and you have opinions about it that you didn’t have before.
I wrote the HL7 integration between King Hussein Cancer Centre (KHCC) and HAKEEM, Jordan’s national EHR, while working at Electronic Health Solutions between 2010 and 2013. Here’s what that actually looked like.
What an HL7 v2 message looks like
Let’s start with the wire format, because it’s the first thing that makes engineers make a face.
MSH|^~\&|KHCC|KHCC|HAKEEM|MOH|20120315120000||ADT^A01|MSG00001|P|2.3
EVN|A01|20120315120000
PID|1||12345^^^KHCC^MR||AL-MANSOUR^AHMAD^MOHAMMED||19750304|M|||AMMAN^JO||||AR|M||12345
PV1|1|I|ONCOLOGY^101^1^KHCC||||SMITH^JOHN^A^^^DR||||||||ONCOLOGY|||V|||||||||||||||||||||||20120315
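That’s an ADT^A01: an admit/discharge/transfer message, with event A01 meaning patient admission. Each segment sits on its own line (on the wire, segments are separated by carriage returns). Fields within a segment are delimited by pipes. Components within a field are delimited by ^, subcomponents by &, and repetitions of a field by ~. The message declares these delimiters itself: they are the |^~\& at the start of the MSH segment.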
The MSH segment is the message header: sending application, sending facility, receiving application, receiving facility, timestamp, message type, message control ID, processing ID, version. The PID segment carries patient identification: patient ID list (there can be many, from many systems), name, date of birth, sex, address, language, marital status, religion, account number.
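To make the delimiter hierarchy concrete, here is a minimal parsing sketch in Python. It is not the code we ran (that was MUMPS inside VistA), and the function names `parse_message` and `component` are mine, made up for illustration. It assumes the default delimiters and ignores escape sequences, repetitions, and subcomponents, so treat it as a reading aid rather than a parser you would deploy.

```python
from typing import Dict, List

# The first three segments of the sample message, joined with carriage
# returns the way they travel on the wire.
SAMPLE = "\r".join([
    r"MSH|^~\&|KHCC|KHCC|HAKEEM|MOH|20120315120000||ADT^A01|MSG00001|P|2.3",
    "EVN|A01|20120315120000",
    "PID|1||12345^^^KHCC^MR||AL-MANSOUR^AHMAD^MOHAMMED||19750304|M|||AMMAN^JO||||AR|M||12345",
])


def parse_message(raw: str) -> Dict[str, List[List[str]]]:
    """Split a message into {segment_id: [fields, ...]}, one list per occurrence."""
    lines = [line for line in raw.replace("\n", "\r").split("\r") if line]
    if not lines or not lines[0].startswith("MSH"):
        raise ValueError("message must start with an MSH segment")
    field_sep = lines[0][3]  # the character right after "MSH", normally "|"
    segments: Dict[str, List[List[str]]] = {}
    for line in lines:
        fields = line.split(field_sep)
        if fields[0] == "MSH":
            # MSH-1 is the field separator itself. Re-insert it so that
            # fields[n] lines up with MSH-n, as it already does for PID-n.
            fields.insert(1, field_sep)
        segments.setdefault(fields[0], []).append(fields)
    return segments


def component(field: str, n: int, sep: str = "^") -> str:
    """Return the n-th (1-based) component of a field, or '' if it is absent."""
    parts = field.split(sep)
    return parts[n - 1] if n <= len(parts) else ""


msg = parse_message(SAMPLE)
msh, pid = msg["MSH"][0], msg["PID"][0]
print(msh[9])                # MSH-9, message type: ADT^A01
print(component(pid[3], 1))  # PID-3.1, the ID itself: 12345
print(component(pid[3], 4))  # PID-3.4, assigning authority: KHCC
print(component(pid[5], 1))  # PID-5.1, family name: AL-MANSOUR
```

The one non-obvious step is the MSH fix-up: because the field separator is itself counted as MSH-1, a naive split leaves every MSH field off by one.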
HL7 v2.3, the version we were working with, was finalized in 1997. The pipe-delimited format dates to a time when bandwidth was expensive and XML hadn’t yet conquered the enterprise world. It’s not beautiful, but it maps directly to how hospital systems actually store and exchange patient data, which is why it’s still the dominant wire format in healthcare decades later, despite the existence of FHIR, CDA, and every other standard that was supposed to replace it.
Why KHCC was a specific integration challenge
King Hussein Cancer Centre is not a general hospital. It’s a specialist oncology facility. Patients arrive at KHCC via referral from other hospitals — many of which, in Jordan’s national system, were already on HAKEEM.
This means a patient’s record doesn’t start at KHCC. It starts at the referring facility — say, Prince Hamza Hospital — where they were first diagnosed with a condition that required oncology expertise. That patient record includes their demographics, their initial diagnosis, their medication history, their lab results. All of that lives in HAKEEM.
When that patient arrives at KHCC for cancer treatment, KHCC has its own EHR. Its own patient ID space. Its own data model for clinical concepts like chemotherapy protocols, radiation treatment plans, and oncology-specific lab panels. The challenge is not just “get KHCC’s data into HAKEEM.” The challenge is patient identity resolution across two systems that assigned different IDs to the same human being, followed by bidirectional record synchronization that keeps both systems current without creating duplicate records or losing clinical context.
HL7 v2 is the transport layer for that problem. It is not the solution to that problem. The solution to that problem is a patient matching algorithm, a master patient index, and a message routing layer that understands which events in one system should trigger messages to the other.
The parts that are actually hard
Here’s what “just parse HL7” leaves out.
Patient identity is not a solved problem. KHCC assigned its own Medical Record Numbers (MRNs). HAKEEM maintained its own national patient identifier scheme. The PID segment has a field for the patient ID list, PID-3, specifically because a patient has multiple identifiers across multiple systems. But knowing that the patient has ID 12345 at KHCC and ID 98765 in HAKEEM requires a mapping table, and building that table requires an automated probabilistic matching algorithm (match on name + DOB + gender + address, score the confidence), a manual reconciliation process, or, in practice, both, with different rules for different confidence thresholds.
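A toy version of "score the confidence, then apply different rules per threshold" might look like the sketch below. Everything specific in it is an assumption for illustration: the weights, the two thresholds, the `Demographics` fields, and the use of `difflib.SequenceMatcher` as a name-similarity measure. It is a simple weighted score, which is a stand-in for a real probabilistic matcher and not the algorithm the integration used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Demographics:
    family_name: str
    given_name: str
    dob: str   # YYYYMMDD, as in PID-7
    sex: str   # as in PID-8
    city: str


# Hypothetical weights and thresholds; real values come from tuning on real data.
WEIGHTS = {"family_name": 0.30, "given_name": 0.20, "dob": 0.30, "sex": 0.05, "city": 0.15}
AUTO_LINK = 0.90      # at or above this, link the two IDs automatically
MANUAL_REVIEW = 0.70  # between the thresholds, queue for a human


def _similar(a: str, b: str) -> float:
    """Fuzzy string similarity in [0, 1], tolerant of transliteration variants."""
    return SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()


def match_score(a: Demographics, b: Demographics) -> float:
    """Weighted agreement score in [0, 1] between two demographic records."""
    return (
        WEIGHTS["family_name"] * _similar(a.family_name, b.family_name)
        + WEIGHTS["given_name"] * _similar(a.given_name, b.given_name)
        + WEIGHTS["dob"] * (1.0 if a.dob == b.dob else 0.0)
        + WEIGHTS["sex"] * (1.0 if a.sex == b.sex else 0.0)
        + WEIGHTS["city"] * _similar(a.city, b.city)
    )


def decide(score: float) -> str:
    """Map a score onto one of three outcomes: link, review, or no-match."""
    if score >= AUTO_LINK:
        return "link"
    if score >= MANUAL_REVIEW:
        return "review"
    return "no-match"


khcc = Demographics("AL-MANSOUR", "AHMAD", "19750304", "M", "AMMAN")
hakeem = Demographics("ALMANSOUR", "AHMED", "19750304", "M", "AMMAN")
print(decide(match_score(khcc, hakeem)))
```

The middle band is the important design choice: a wrong automatic link merges two people’s clinical histories, so anything short of high confidence goes to a person.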
Event sequencing matters clinically. An HL7 ADT message carries an event type — A01 is admission, A02 is transfer, A03 is discharge, A08 is patient information update. These events have to arrive in order and be processed in order, or your clinical picture is wrong. A patient who was admitted (A01), transferred (A02), and then discharged (A03) looks very different from one where those messages arrive out of order and get processed as admitted, discharged, then transferred to a department they already left.
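One way to catch the "transferred after discharge" case is to treat the encounter as a small state machine and refuse events that make no sense in the current state. The sketch below is a deliberately reduced assumption: it covers only the four events named above, uses two made-up states (`"out"` and `"in"`), and ignores cancellations and the many other ADT events a real feed carries.

```python
# (current state, event) -> next state. Anything not listed is rejected.
TRANSITIONS = {
    ("out", "A01"): "in",   # admission
    ("in", "A02"): "in",    # transfer between departments
    ("in", "A03"): "out",   # discharge
    ("in", "A08"): "in",    # patient information update
    ("out", "A08"): "out",  # updates are fine whether or not the patient is admitted
}


def apply_events(events, state="out"):
    """Apply ADT events in the order given; raise ValueError on an impossible one."""
    for event in events:
        try:
            state = TRANSITIONS[(state, event)]
        except KeyError:
            raise ValueError(f"{event} is not valid while the patient is {state!r}") from None
    return state


print(apply_events(["A01", "A02", "A03"]))  # out

try:
    apply_events(["A01", "A03", "A02"])     # discharge processed before the transfer
except ValueError as exc:
    print("rejected:", exc)
```

A rejected event is not necessarily a bad message. More often it is a good message that arrived early, which is why it should be parked and retried and not discarded.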
In a high-volume hospital integration, messages don’t always arrive in order. TCP gives you reliable delivery; it doesn’t give you application-level ordering guarantees when messages are produced asynchronously by different systems. You need sequence numbers, you need message acknowledgment (HL7 ACK messages, where an acknowledgment code of AE or AR in MSA-1 is the negative acknowledgment usually called a NAK), and you need a queue with retry logic that handles NAKs without losing the message or flooding the destination with duplicates.
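On the receiving side, the ordering and duplicate problems can be handled by a small resequencing buffer. This is a sketch under assumptions: the class name `Resequencer` is hypothetical, the sender is assumed to stamp each message with a gap-free integer sequence number, and there is no timeout for a gap that never fills, which a production queue would need.

```python
from typing import Callable, Dict


class Resequencer:
    """Hold out-of-order messages and release them strictly by sequence number."""

    def __init__(self, deliver: Callable[[str], None], first_seq: int = 1) -> None:
        self._deliver = deliver             # called once per message, in order
        self._next = first_seq              # next sequence number we may release
        self._pending: Dict[int, str] = {}  # early arrivals, keyed by sequence number

    def receive(self, seq: int, message: str) -> bool:
        """Accept a message. Returns False if it is a duplicate.

        A duplicate should still be acknowledged so the sender stops
        retrying, but it must not be delivered a second time.
        """
        if seq < self._next or seq in self._pending:
            return False
        self._pending[seq] = message
        # Release every message that is now contiguous with what was delivered.
        while self._next in self._pending:
            self._deliver(self._pending.pop(self._next))
            self._next += 1
        return True


delivered = []
buffer = Resequencer(delivered.append)
buffer.receive(2, "ADT^A02")         # arrives early, held back
buffer.receive(1, "ADT^A01")         # fills the gap, releases 1 then 2
buffer.receive(3, "ADT^A03")
print(buffer.receive(2, "ADT^A02"))  # a retry of message 2: False
print(delivered)                     # ['ADT^A01', 'ADT^A02', 'ADT^A03']
```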
The sending system’s implementation is the real specification.
HL7 v2.3 is a standard. But the standard has optional fields, optional
segments, and significant room for implementation variation. What KHCC’s
EHR actually sent in PID-3 was what mattered — not what the HL7 spec
said should be there. I spent meaningful time with KHCC’s technical team
analyzing their actual outbound messages to understand what we’d receive,
because the implementation documentation was incomplete. This is normal.
Every HL7 integration I’ve heard about has a version of this story.
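That kind of analysis is easy to mechanize once you have a pile of captured messages. The sketch below is an illustration of the idea and not the tooling we used; `profile_segment` is a hypothetical helper that reports how often each field of a given segment is actually populated. It assumes the default `|` separator and is meant for segments other than MSH, whose field numbering is offset by one.

```python
from collections import Counter
from typing import Dict, Iterable


def profile_segment(messages: Iterable[str], segment_id: str) -> Dict[int, float]:
    """Return {field number: fraction of segments in which the field is non-empty}."""
    populated: Counter = Counter()
    total = 0
    for raw in messages:
        for line in raw.replace("\n", "\r").split("\r"):
            fields = line.split("|")
            if fields[0] != segment_id:
                continue
            total += 1
            for number, value in enumerate(fields[1:], start=1):
                if value:
                    populated[number] += 1
    if total == 0:
        return {}
    return {number: populated[number] / total for number in sorted(populated)}


captured = [
    "PID|1||12345^^^KHCC^MR||AL-MANSOUR^AHMAD",
    "PID|1||||HADDAD^LINA",  # made-up record with no PID-3 at all
]
print(profile_segment(captured, "PID"))  # {1: 1.0, 3: 0.5, 5: 1.0}
```

A field that the spec marks as required but that shows up in only half of the real traffic is exactly the sort of finding this surfaces before it becomes a production incident.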
What the message routing layer looked like
The HAKEEM-KHCC integration used the HL7 Messaging package in VistA — the MUMPS routines in that package handled inbound message parsing and routing on the HAKEEM side. I wrote the routines that handled specific event types — primarily ADT events for patient movement and ORM/ORU pairs for lab orders and results.
The flow for a referral:
- Referring hospital (on HAKEEM) sends the patient to KHCC.
- KHCC registers the patient in their system, assigns their MRN.
- KHCC sends an ADT^A01 to HAKEEM announcing the admission.
- The HAKEEM HL7 routing layer receives the A01, resolves the patient identity via the MPI mapping, and updates the patient record in HAKEEM with the KHCC encounter information.
- Lab results from KHCC’s oncology panels flow back via ORU^R01 messages and land in the patient’s HAKEEM record, visible to the referring physicians who still hold the patient’s primary care relationship.
That last point matters clinically. An oncologist at KHCC is treating the cancer. The general physician at the referring hospital is still managing the patient’s hypertension, diabetes, and everything else. They need to see the oncology results. The HL7 integration is what makes that possible without requiring the patient to carry paper records between facilities.
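The receiving half of that flow (read the message type, resolve the identity, hand off to the right handler) can be sketched in a few lines. To be clear about what is assumed: the real routing lived in MUMPS routines inside VistA’s HL7 package, and this Python version is only an illustration of the shape. The `MPI` dictionary, the `handles` decorator, and both handler functions are hypothetical, and the IDs are the example values used earlier.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical MPI mapping: (assigning authority, local ID) -> national ID.
MPI: Dict[Tuple[str, str], str] = {("KHCC", "12345"): "98765"}

HANDLERS: Dict[str, Callable[[str, Dict[str, List[str]]], str]] = {}


def handles(message_type: str):
    """Register a function as the handler for one message type."""
    def register(fn):
        HANDLERS[message_type] = fn
        return fn
    return register


@handles("ADT^A01")
def on_admission(national_id: str, segments: Dict[str, List[str]]) -> str:
    return f"encounter opened for {national_id}"


@handles("ORU^R01")
def on_lab_result(national_id: str, segments: Dict[str, List[str]]) -> str:
    return f"lab result filed for {national_id}"


def route(raw: str) -> str:
    """Parse just enough of a message to resolve the patient and dispatch it."""
    segments: Dict[str, List[str]] = {}
    for line in raw.replace("\n", "\r").split("\r"):
        if line:
            fields = line.split("|")
            segments.setdefault(fields[0], fields)  # keep the first occurrence
    # After a plain split, MSH-9 sits at index 8 because MSH-1 is the "|" itself.
    message_type = "^".join(segments["MSH"][8].split("^")[:2])
    pid3 = segments["PID"][3].split("^")
    local_id, authority = pid3[0], pid3[3]
    national_id = MPI.get((authority, local_id))
    if national_id is None:
        raise LookupError(f"no MPI mapping for {authority}:{local_id}")
    handler = HANDLERS.get(message_type)
    if handler is None:
        raise ValueError(f"no handler registered for {message_type}")
    return handler(national_id, segments)


admit = "\r".join([
    r"MSH|^~\&|KHCC|KHCC|HAKEEM|MOH|20120315120000||ADT^A01|MSG00001|P|2.3",
    "PID|1||12345^^^KHCC^MR||AL-MANSOUR^AHMAD^MOHAMMED||19750304|M",
])
print(route(admit))  # encounter opened for 98765
```

An unmapped patient raises an error instead of creating a new record, because silently creating one is how duplicate charts get started.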
What I think about HL7 now, twelve years later
HL7 v2 is not beautiful software. But it’s the kind of ugly that earns your respect: ugly because the problem domain is ugly, not because the designers didn’t try.
Healthcare data is messy because healthcare is messy. Patients have multiple identifiers because they interact with multiple systems. Events have ordering requirements because clinical timelines matter for diagnosis and treatment. The field optionality that makes the spec look loose is there because different clinical specialties have different data requirements and the standard has to serve them all.
FHIR is genuinely better in most respects: RESTful, JSON or XML, resource-oriented, modern tooling. It’s also much newer. HL7 v2.3 dates to 1997; in 2012, FHIR was still a draft specification. HL7 v2 was what hospital systems spoke. You integrate with what’s there.
The engineers who build healthcare integrations are doing work that most of the industry doesn’t see and doesn’t think about. The HL7 message that carries your lab result from the lab system to your chart to your doctor’s screen is plumbing. Nobody notices the plumbing until it breaks.
I noticed the plumbing. I built some of it. The pipes still run.