Can someone guide me on how to route PutHiveStreaming data into a ConvertAvroToJSON processor? I had a Hadoop engineer on site for a short training. He set up the flow but had to leave before he could show me this step.
I am attaching the NiFi flow XML.
If I am understanding you correctly, the solution is as simple as hovering your cursor over the PutHiveStreaming processor and dragging the circle that appears onto your ConvertAvroToJSON processor.
In the connection configuration window that appears, check the box next to the "success" relationship.
FlowFiles that were successfully processed by PutHiveStreaming will now be routed to the ConvertAvroToJSON processor instead of being auto-terminated.
If you found this answer helpful, please take a moment to click "Accept" below.
What do you plan on doing next after ConvertAvroToJSON? Simply feeding the "success" connection to the next processor in a dataflow does not complete the dataflow. I am sure you intend to do something with the generated JSON files, correct?
If not, FlowFiles are just going to queue up on the success connection between PutHiveStreaming and ConvertAvroToJSON until back pressure kicks in on the connection and causes your PutHiveStreaming processor to stop running.
My suggestion is that you first play around with building the rest of the ConvertAvroToJSON dataflow to become more familiar with the NiFi interface, and then connect your PutHiveStreaming processor to it.
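To picture why an unfinished dataflow eventually stalls the upstream processor, here is a toy model in Python. It is only a sketch, not NiFi code: the Connection class and run_upstream function are hypothetical names, and the threshold of 10,000 objects is assumed to match the default object threshold on a NiFi connection. Real NiFi treats the threshold as a soft limit, which this model ignores.

```python
from collections import deque

# Assumption: default object-count back pressure threshold on a connection.
DEFAULT_OBJECT_THRESHOLD = 10_000


class Connection:
    """Toy stand-in for a NiFi connection (queue) between two processors."""

    def __init__(self, object_threshold=DEFAULT_OBJECT_THRESHOLD):
        self.object_threshold = object_threshold
        self.queue = deque()

    def back_pressure_applied(self):
        # Back pressure engages once the queue reaches the threshold.
        return len(self.queue) >= self.object_threshold


def run_upstream(connection, flowfiles):
    """Enqueue FlowFiles until back pressure stops the upstream processor.

    Returns the number of FlowFiles that were actually transferred.
    """
    transferred = 0
    for flowfile in flowfiles:
        if connection.back_pressure_applied():
            break  # upstream is no longer scheduled; nothing drains the queue
        connection.queue.append(flowfile)
        transferred += 1
    return transferred


if __name__ == "__main__":
    success = Connection(object_threshold=5)  # small threshold for the demo
    sent = run_upstream(success, range(20))
    print(sent, success.back_pressure_applied())
```

With nothing consuming from the queue, only the first five of the twenty FlowFiles are transferred before the upstream side stops, which is the situation described above for PutHiveStreaming.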
If you want to process the Avro data that has been stored by the PutHiveStreaming processor, feed the success relationship of the PutHiveStreaming processor to the ConvertAvroToJSON processor, and then do your processing on the JSON data.
Since I need JSON data to put in Hive (I was told it doesn't like Avro), shouldn't I be adding the ConvertAvroToJSON processor before PutHiveStreaming?
The PutHiveStreaming processor requires the incoming FlowFile to be in Avro format, and the target table must already exist in Hive. The tables used with PutHiveStreaming have further requirements: all field datatypes should be strings, and the table must be stored in ORC format, bucketed, transactional, etc.
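As an illustration of those table requirements, the following Python sketch renders a CREATE TABLE statement for a bucketed, transactional ORC table with string columns. The helper name hive_streaming_ddl, the table name events, the column names, and the bucket count of 4 are all made up for the example. The DDL would still need to be run in Hive (for example through beeline) before starting the processor.

```python
def hive_streaming_ddl(table, columns, bucket_column, num_buckets=4):
    """Render CREATE TABLE DDL for a bucketed, transactional ORC table.

    All columns are declared as STRING, following the requirement
    stated in the answer above.
    """
    if bucket_column not in columns:
        raise ValueError("bucket_column must be one of the table columns")
    cols = ",\n".join(f"  {name} STRING" for name in columns)
    return (
        f"CREATE TABLE {table} (\n{cols}\n)\n"
        f"CLUSTERED BY ({bucket_column}) INTO {num_buckets} BUCKETS\n"
        "STORED AS ORC\n"
        "TBLPROPERTIES ('transactional'='true');"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(hive_streaming_ddl("events", ["id", "name", "city"], "id"))
```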
Refer to the link below on how to stream data into Hive using the PutHiveStreaming processor.
If you want to put JSON data in Hive, then instead of using the PutHiveStreaming processor, use a PutHDFS processor after ConvertAvroToJSON and then create a Hive table with a JSON SerDe pointing to the HDFS location.
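A minimal sketch of that second approach, again in Python: it renders DDL for an external Hive table that reads JSON files from an HDFS directory. The helper name hive_json_table_ddl, the table name, the columns, and the path /data/nifi/json are hypothetical. The sketch assumes the HCatalog JSON SerDe (org.apache.hive.hcatalog.data.JsonSerDe) is available to Hive and that each line of the files holds a single JSON object, so the JSON container option on ConvertAvroToJSON would likely need to be "none" rather than "array".

```python
def hive_json_table_ddl(table, columns, hdfs_location):
    """Render DDL for an external Hive table over JSON files in HDFS.

    Assumes the HCatalog JsonSerDe and one JSON object per line.
    """
    cols = ",\n".join(f"  {name} STRING" for name in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n{cols}\n)\n"
        "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_location}';"
    )


if __name__ == "__main__":
    # Hypothetical names; the location should match the PutHDFS directory.
    print(hive_json_table_ddl("events_json", ["id", "name"], "/data/nifi/json"))
```

The LOCATION should be the same directory that PutHDFS writes to, so that new files become queryable without a separate load step.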
Reference on how to create a Hive table on JSON data