Created 04-30-2016 01:27 PM
Hi Artem @Artem Ervits,
I'm new to Hadoop/NiFi, just getting started.
We're trying to ingest HL7 data, parse the messages, extract data from them, and store it in HBase tables. Is it possible to do all of this with NiFi? After reviewing the NiFi documentation, it seems it can be done. Please confirm whether this works, or whether we need additional Hadoop tools, like Storm/Kafka, to make it work.
I saw a post by you that said NiFi has HAPI (the Java API for parsing HL7) built in, but I'm not getting any data extracted when I use the ExtractHL7Attributes processor. In my simple NiFi dataflow, 1) I read HL7 messages using a GetSFTP processor, 2) send the FlowFile to an ExtractHL7Attributes processor, and 3) send the FlowFile from ExtractHL7Attributes to a PutFile processor. When I look at the contents of the files that the PutFile processor generated, they look exactly like the source files I read with the GetSFTP processor. I know I must be missing something here.
I appreciate any suggestions/guidance.
Thanks in advance.
Raj
Created 04-30-2016 01:44 PM
Hi Raj,
What you are trying to achieve is possible. No need for other components.
A FlowFile is composed of a content part and an attributes part (a key/value map). When you use the GetSFTP processor, the content of your HL7 messages goes into the content part of the created FlowFiles. When you use the ExtractHL7Attributes processor, it extracts parts of that content and sets them as new key/value pairs in the attributes part. At the end, PutFile only writes out the content of the incoming FlowFile. This is why you don't see any modification. As a tip, after you add a processor to the canvas, you can right-click on it and then click Usage to see the complete documentation for that processor.
One option would be to extract all the information you want from the HL7 message using the ExtractHL7Attributes processor, then, once you have all the attributes you need, use an AttributesToJSON processor to create FlowFile content from the attributes part, and finally use a PutHBaseJSON processor to store the data in HBase.
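To make the content-vs-attributes distinction concrete, here is a minimal Python sketch of what the ExtractHL7Attributes → AttributesToJSON steps do conceptually. This is an illustration only, not NiFi's actual parser: the `SEG.n` attribute naming, the sample message, and the simplified field splitting (segments on carriage returns, fields on pipes, ignoring HL7's special MSH numbering) are all assumptions for demonstration.

```python
import json

# Sample HL7 v2 message (hypothetical, for illustration only).
# Segments are separated by carriage returns, fields by pipes.
sample = "\r".join([
    "MSH|^~\\&|HIS|HOSP|LIS|LAB|202301011200||ADT^A01|MSG0001|P|2.3",
    "PID|1||12345^^^HOSP||DOE^JOHN||19700101|M",
])

def extract_attributes(message: str) -> dict:
    """Map 'SEG.n' keys to field values, e.g. 'PID.5' -> 'DOE^JOHN'.
    Roughly what ExtractHL7Attributes does: content in, attributes out."""
    attrs = {}
    for segment in message.split("\r"):
        fields = segment.split("|")
        name = fields[0]
        for i, value in enumerate(fields[1:], start=1):
            if value:  # skip empty fields
                attrs[f"{name}.{i}"] = value
    return attrs

attrs = extract_attributes(sample)
# Roughly what AttributesToJSON does: turn the attribute map into
# JSON content that a downstream processor (e.g. PutHBaseJSON) can store.
flowfile_json = json.dumps(attrs, indent=2)
print(attrs["PID.5"])  # DOE^JOHN
```

The point of the sketch: PutFile after ExtractHL7Attributes writes the unchanged `sample` content, while the extracted values only exist in `attrs` until something like AttributesToJSON moves them into the FlowFile content.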
A source of inspiration would be this article (even if it is not HBase at the end):
Hope that helps.
Created 04-30-2016 03:46 PM
Thank you, Pierre, for replying so quickly to my question.
Thanks for clarifying how to get the attributes; I'll try what you suggested.
Again, thank you.
Created 05-01-2016 10:03 AM
Hi Raj, if it helped solve your issue (if not, let me know), would you mind accepting the answer on this thread? It will help other users who run into the same situation when searching for relevant information. Thanks a lot.
Created 06-03-2016 04:43 PM
Hi Raj,
I am also trying the same approach to parse HL7 data and ingest it into Hive tables. Please let me know if you were able to successfully parse the HL7 data using the ExtractHL7Attributes processor. Thanks.