Jayant Singh's HealthTech Blog: HL7 Analysis with NoSQL MongoDB

Last week I had a discussion with Terry Montgomery. He was interested in HL7 message analysis and wanted my opinion. We discussed different tools that both of us have tried.
From discussion I found that he was looking for a tool which can analyze raw HL7 messages. This analysis could be used in designing HL7 interfaces.

I gave him few basic analysis points that we should do to start any interface development like:

Total number of messages in a sample
Different Trigger Events & their count in Feed
Different Segments that are present in feed
Total Number of PID_3, PID_18, PV1_19 with leading zeros
Total Number of PID_3, PID_18, PV1_19 are componentized

He said he wants all these with many other points. In discussion he presented couple of his ideas to analyze raw HL7 messages. One of them was using NoSQL approach with MongoDB.
I am not MongoDB expert but I tried it few days earlier with one of my FHIR researches.
I was somehow sure that MongoDB can be used but was hesitated with approach. I have already seen couple of discussions criticizing HL7 to Json conversion.
But as a developer I decided to do something around this & used some free time over weekend.
I extended my HL7 Parsing Library with couple of methods to generate BsonDocument from HL7 message.
Let's take an example HL7 message

 MSH|^~\&|ADT|ADI|ADT-1|ADI-1|20050215||ADT^A01|MSGADT003|T|2.4  
 EVN|A01|20031016000000  
 PID|1|111222333|H123123^^^^MR^ADT~111-222-333^^^^SS^ADT||John^Smith|GARSEN^^Melissa|19380818|M||2028-9|241 AVE^^Lake City^WA^98125^^^^100|100|(425)111-2222|(425)111-2222||S|CHR|1234567|111-222-333  
 NK1|2|GARSEN^Melissa  
 PV1|1|E|||||D123^Jeff^Carron|||MED||||7|||D123^Jeff^Taylor|E|3454|R^20050215|||||||||||||||||||EM|||||20050215  
 IN1|1|I123|ICOMP1|INS COMP 1|PO BOX 1^^Lake City^WA^98125||||||||||1|John^Smith|01|19380818  
 IN2|1||RETIRED  
 IN1|2|I456|ICOMP2|INS COMP 1|PO BOX 2^^Lake City^WA^98125||||||||||8|John^Smith|01|19380818  
 IN2|2||RETIRED

I have generated a Json from BsonDocument which looks like following.

 {  
      "MSH":  
      {  
           "MSH_1": "|",  
           "MSH_2": "^~\\&",  
           "MSH_3": "ADT",  
           "MSH_4": "ADI",  
           "MSH_5": "ADT-1",  
           "MSH_6": "ADI-1",  
           "MSH_7": "20050215",  
           "MSH_8": "",  
           "MSH_9": "ADT^A01",  
           "MSH_9_1": "ADT",  
           "MSH_9_2": "A01",  
           "MSH_10": "MSGADT003",  
           "MSH_11": "T",  
           "MSH_12": "2.4"  
      },  
      "EVN":  
      {  
           "EVN_1": "A01",  
           "EVN_2": "20031016000000"  
      },  
      "PID":  
      {  
           "PID_1": "1",  
           "PID_2": "111222333",  
           "PID_3": "H123123^^^^MR^ADT~111-222-333^^^^SS^ADT",  
           "PID_4": "",  
           "PID_5": "John^Smith",  
           "PID_5_1": "John",  
           "PID_5_2": "Smith",  
           "PID_6": "GARSEN^^Melissa",  
           "PID_6_1": "GARSEN",  
           "PID_6_2": "",  
           "PID_6_3": "Melissa",  
           "PID_7": "19380818",  
           "PID_8": "M",  
           "PID_9": "",  
           "PID_10": "2028-9",  
           "PID_11": "241 AVE^^Lake City^WA^98125^^^^100",  
           "PID_11_1": "241 AVE",  
           "PID_11_2": "",  
           "PID_11_3": "Lake City",  
           "PID_11_4": "WA",  
           "PID_11_5": "98125",  
           "PID_11_6": "",  
           "PID_11_7": "",  
           "PID_11_8": "",  
           "PID_11_9": "100",  
           "PID_12": "100",  
           "PID_13": "(425)111-2222",  
           "PID_14": "(425)111-2222",  
           "PID_15": "",  
           "PID_16": "S",  
           "PID_17": "CHR",  
           "PID_18": "1234567",  
           "PID_19": "111-222-333"  
      },  
      "NK1":  
      {  
           "NK1_1": "2",  
           "NK1_2": "GARSEN^Melissa",  
           "NK1_2_1": "GARSEN",  
           "NK1_2_2": "Melissa"  
      },  
      "PV1":  
      {  
           "PV1_1": "1",  
           "PV1_2": "E",  
           "PV1_3": "",  
           "PV1_4": "",  
           "PV1_5": "",  
           "PV1_6": "",  
           "PV1_7": "D123^Jeff^Taylor",  
           "PV1_7_1": "D123",  
           "PV1_7_2": "Jeff",  
           "PV1_7_3": "Taylor",  
           "PV1_8": "",  
           "PV1_9": "",  
           "PV1_10": "MED",  
           "PV1_11": "",  
           "PV1_12": "",  
           "PV1_13": "",  
           "PV1_14": "7",  
           "PV1_15": "",  
           "PV1_16": "",  
           "PV1_17": "D123^Jeff^Taylor",  
           "PV1_17_1": "D123",  
           "PV1_17_2": "Jeff",  
           "PV1_17_3": "Taylor",  
           "PV1_18": "E",  
           "PV1_19": "3454",  
           "PV1_20": "R^20050215",  
           "PV1_20_1": "R",  
           "PV1_20_2": "20050215",  
           "PV1_21": "",  
           "PV1_22": "",  
           "PV1_23": "",  
           "PV1_24": "",  
           "PV1_25": "",  
           "PV1_26": "",  
           "PV1_27": "",  
           "PV1_28": "",  
           "PV1_29": "",  
           "PV1_30": "",  
           "PV1_31": "",  
           "PV1_32": "",  
           "PV1_33": "",  
           "PV1_34": "",  
           "PV1_35": "",  
           "PV1_36": "",  
           "PV1_37": "",  
           "PV1_38": "",  
           "PV1_39": "EM",  
           "PV1_40": "",  
           "PV1_41": "",  
           "PV1_42": "",  
           "PV1_43": "",  
           "PV1_44": "20050215"  
      },  
      "IN1":  
      {  
           "IN1_1": "1",  
           "IN1_2": "I123",  
           "IN1_3": "ICOMP1",  
           "IN1_4": "INS COMP 1",  
           "IN1_5": "PO BOX 1^^Lake City^WA^98125",  
           "IN1_5_1": "PO BOX 1",  
           "IN1_5_2": "",  
           "IN1_5_3": "Lake City",  
           "IN1_5_4": "WA",  
           "IN1_5_5": "98125",  
           "IN1_6": "",  
           "IN1_7": "",  
           "IN1_8": "",  
           "IN1_9": "",  
           "IN1_10": "",  
           "IN1_11": "",  
           "IN1_12": "",  
           "IN1_13": "",  
           "IN1_14": "",  
           "IN1_15": "1",  
           "IN1_16": "John^Smith",  
           "IN1_16_1": "John",  
           "IN1_16_2": "Smith",  
           "IN1_17": "01",  
           "IN1_18": "19380818"  
      },  
      "IN1":  
      {  
           "IN1_1": "2",  
           "IN1_2": "I456",  
           "IN1_3": "ICOMP2",  
           "IN1_4": "INS COMP 1",  
           "IN1_5": "PO BOX 2^^Lake City^WA^98125",  
           "IN1_5_1": "PO BOX 2",  
           "IN1_5_2": "",  
           "IN1_5_3": "Lake City",  
           "IN1_5_4": "WA",  
           "IN1_5_5": "98125",  
           "IN1_6": "",  
           "IN1_7": "",  
           "IN1_8": "",  
           "IN1_9": "",  
           "IN1_10": "",  
           "IN1_11": "",  
           "IN1_12": "",  
           "IN1_13": "",  
           "IN1_14": "",  
           "IN1_15": "8",  
           "IN1_16": "John^Smith",  
           "IN1_16_1": "John",  
           "IN1_16_2": "Smith",  
           "IN1_17": "01",  
           "IN1_18": "19380818"  
      },  
      "IN2":  
      {  
           "IN2_1": "1",  
           "IN2_2": "",  
           "IN2_3": "RETIRED"  
      },  
      "IN2":  
      {  
           "IN2_1": "2",  
           "IN2_2": "",  
           "IN2_3": "RETIRED"  
      }  
 }

And this is how this BsonDocument looks in MongoVUE

As I said I am not an expert in MongoDB and taking this as an opportunity to learn NoSQL, I have successfully written few basic queries. I am not sure how efficient these queries are but I shall improve them as I learn more about MongoDB & C# driver.

Total number of messages in feed
Different Event Types in feed & their count in Feed
Different Segments received in Feed
Find PID_3 with leading zeros
Find PID_3 with components

I would request you to please comment the approach if this is a feasible way to analyze raw HL7 messages.
I welcome any comment/criticism to help improve this.

Jayant Singh's HealthTech Blog

Monday, 3 June 2013

HL7 Analysis with NoSQL MongoDB - 1

Popular Posts

Search This Blog