Jayant Singh Website - Healthcare IT Blog

HL7 Analysis with NoSQL MongoDB - 1

Last week I had a discussion with Terry Montgomery. He was interested in HL7 message analysis and wanted my opinion. We discussed different tools that both of us have tried.

From discussion I found that he was looking for a tool which can analyze raw HL7 messages. This analysis could be used in designing HL7 interfaces.

I gave him few basic analysis points that we should do to start any interface development like:


  1. Total number of messages in a sample
  2. Different Trigger Events & their count in Feed
  3. Different Segments that are present in feed
  4. Total Number of PID_3, PID_18, PV1_19 with leading zeros
  5. Total Number of PID_3, PID_18, PV1_19 are componentized

He said he wants all these with many other points. In discussion he presented couple of his ideas to analyze raw HL7 messages. One of them was using NoSQL approach with MongoDB.

I am not MongoDB expert but I tried it few days earlier with one of my FHIR researches.

I was somehow sure that MongoDB can be used but was hesitated with approach. I have already seen couple of discussions criticizing HL7 to Json conversion.

But as a developer I decided to do something around this & used some free time over weekend.

I extended my HL7 Parsing Library with couple of methods to generate BsonDocument from HL7 message.

Let's take an example HL7 message

MSH|^~\&|ADT|ADI|ADT-1|ADI-1|20050215||ADT^A01|MSGADT003|T|2.4
EVN|A01|20031016000000
PID|1|111222333|H123123^^^^MR^ADT~111-222-333^^^^SS^ADT||John^Smith|GARSEN^^Melissa|19380818|M||2028-9|241 AVE^^Lake City^WA^98125^^^^100|100|(425)111-2222|(425)111-2222||S|CHR|1234567|111-222-333
NK1|2|GARSEN^Melissa
PV1|1|E|||||D123^Jeff^Carron|||MED||||7|||D123^Jeff^Taylor|E|3454|R^20050215|||||||||||||||||||EM|||||20050215
IN1|1|I123|ICOMP1|INS COMP 1|PO BOX 1^^Lake City^WA^98125||||||||||1|John^Smith|01|19380818
IN2|1||RETIRED
IN1|2|I456|ICOMP2|INS COMP 1|PO BOX 2^^Lake City^WA^98125||||||||||8|John^Smith|01|19380818
IN2|2||RETIRED

 I have generated a Json from BsonDocument which looks like following.

{
 "MSH" : 
{
"MSH_1" : "|",
"MSH_2" : "^~\\&",
"MSH_3" : "ADT",
"MSH_4" : "ADI",
"MSH_5" : "ADT-1",
"MSH_6" : "ADI-1",
"MSH_7" : "20050215",
"MSH_8" : "",
"MSH_9" : "ADT^A01",
"MSH_9_1" : "ADT",
"MSH_9_2" : "A01",
"MSH_10" : "MSGADT003",
"MSH_11" : "T",
"MSH_12" : "2.4" 
},
 "EVN" : 
{
"EVN_1" : "A01",
"EVN_2" : "20031016000000" 
},
 "PID" : 
{
"PID_1" : "1",
"PID_2" : "111222333",
"PID_3" : "H123123^^^^MR^ADT~111-222-333^^^^SS^ADT",
"PID_4" : "",
"PID_5" : "John^Smith",
"PID_5_1" : "John",
"PID_5_2" : "Smith",
"PID_6" : "GARSEN^^Melissa",
"PID_6_1" : "GARSEN",
"PID_6_2" : "",
"PID_6_3" : "Melissa",
"PID_7" : "19380818",
"PID_8" : "M",
"PID_9" : "",
"PID_10" : "2028-9",
"PID_11" : "241 AVE^^Lake City^WA^98125^^^^100",
"PID_11_1" : "241 AVE",
"PID_11_2" : "",
"PID_11_3" : "Lake City",
"PID_11_4" : "WA",
"PID_11_5" : "98125",
"PID_11_6" : "",
"PID_11_7" : "",
"PID_11_8" : "",
"PID_11_9" : "100",
"PID_12" : "100",
"PID_13" : "(425)111-2222",
"PID_14" : "(425)111-2222",
"PID_15" : "",
"PID_16" : "S",
"PID_17" : "CHR",
"PID_18" : "1234567",
"PID_19" : "111-222-333" 
},
 "NK1" : 
{
"NK1_1" : "2",
"NK1_2" : "GARSEN^Melissa",
"NK1_2_1" : "GARSEN",
"NK1_2_2" : "Melissa" 
},
 "PV1" : 
{
"PV1_1" : "1",
"PV1_2" : "E",
"PV1_3" : "",
"PV1_4" : "",
"PV1_5" : "",
"PV1_6" : "",
"PV1_7" : "D123^Jeff^Taylor",
"PV1_7_1" : "D123",
"PV1_7_2" : "Jeff",
"PV1_7_3" : "Taylor",
"PV1_8" : "",
"PV1_9" : "",
"PV1_10" : "MED",
"PV1_11" : "",
"PV1_12" : "",
"PV1_13" : "",
"PV1_14" : "7",
"PV1_15" : "",
"PV1_16" : "",
"PV1_17" : "D123^Jeff^Taylor",
"PV1_17_1" : "D123",
"PV1_17_2" : "Jeff",
"PV1_17_3" : "Taylor",
"PV1_18" : "E",
"PV1_19" : "3454",
"PV1_20" : "R^20050215",
"PV1_20_1" : "R",
"PV1_20_2" : "20050215",
"PV1_21" : "",
"PV1_22" : "",
"PV1_23" : "",
"PV1_24" : "",
"PV1_25" : "",
"PV1_26" : "",
"PV1_27" : "",
"PV1_28" : "",
"PV1_29" : "",
"PV1_30" : "",
"PV1_31" : "",
"PV1_32" : "",
"PV1_33" : "",
"PV1_34" : "",
"PV1_35" : "",
"PV1_36" : "",
"PV1_37" : "",
"PV1_38" : "",
"PV1_39" : "EM",
"PV1_40" : "",
"PV1_41" : "",
"PV1_42" : "",
"PV1_43" : "",
"PV1_44" : "20050215" 
},
 "IN1" : 
{
"IN1_1" : "1",
"IN1_2" : "I123",
"IN1_3" : "ICOMP1",
"IN1_4" : "INS COMP 1",
"IN1_5" : "PO BOX 1^^Lake City^WA^98125",
"IN1_5_1" : "PO BOX 1",
"IN1_5_2" : "",
"IN1_5_3" : "Lake City",
"IN1_5_4" : "WA",
"IN1_5_5" : "98125",
"IN1_6" : "",
"IN1_7" : "",
"IN1_8" : "",
"IN1_9" : "",
"IN1_10" : "",
"IN1_11" : "",
"IN1_12" : "",
"IN1_13" : "",
"IN1_14" : "",
"IN1_15" : "1",
"IN1_16" : "John^Smith",
"IN1_16_1" : "John",
"IN1_16_2" : "Smith",
"IN1_17" : "01",
"IN1_18" : "19380818" 
},
 "IN1" : 
{
"IN1_1" : "2",
"IN1_2" : "I456",
"IN1_3" : "ICOMP2",
"IN1_4" : "INS COMP 1",
"IN1_5" : "PO BOX 2^^Lake City^WA^98125",
"IN1_5_1" : "PO BOX 2",
"IN1_5_2" : "",
"IN1_5_3" : "Lake City",
"IN1_5_4" : "WA",
"IN1_5_5" : "98125",
"IN1_6" : "",
"IN1_7" : "",
"IN1_8" : "",
"IN1_9" : "",
"IN1_10" : "",
"IN1_11" : "",
"IN1_12" : "",
"IN1_13" : "",
"IN1_14" : "",
"IN1_15" : "8",
"IN1_16" : "John^Smith",
"IN1_16_1" : "John",
"IN1_16_2" : "Smith",
"IN1_17" : "01",
"IN1_18" : "19380818" 
},
 "IN2" : 
{
"IN2_1" : "1",
"IN2_2" : "",
"IN2_3" : "RETIRED"
},
 "IN2" : 
{
"IN2_1" : "2",
"IN2_2" : "",
"IN2_3" : "RETIRED"
}
}
 

And this is how this BsonDocument looks in MongoVUE


HL7 Bson in MongoVUE


As I said I am not an expert in MongoDB and taking this as an opportunity to learn NoSQL, I have successfully written few basic queries. I am not sure how efficient these queries are but I shall improve them as I learn more about MongoDB & C# driver.

  1. Total number of messages in feed
  2. Different Event Types in feed & their count in Feed
  3. Different Segments received in Feed
  4. Find PID_3 with leading zeros
  5. Find PID_3 with components

I would request you to please comment the approach if this is a feasible way to analyze raw HL7 messages.

I welcome any comment/criticism to help improve this.