NiFi putHiveStreaming being very slow

Rising Star

Hello,

I am receiving around 300k-400k messages per day in NiFi (version 1.2.0.3.0.0.0-453). The messages arrive as JSON, and the goal is to put them into a Hive table in near real time. I have built the flow shown in the image attached to this post. The flow works and does everything it is supposed to do. The issue is that PutHiveStreaming is not writing fast enough, so messages queue up in the flow. I have been adjusting the Transactions per Batch and Records per Transaction properties of PutHiveStreaming, but I have been unable to make the processor write faster, and messages still arrive much faster than they are written. Is there a way to configure PutHiveStreaming so that it can keep up with a load of 300k-400k messages per day, roughly 3 to 5 messages per second on average? Any insight on this issue will be deeply appreciated.
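As a quick sanity check on the load, the daily volume can be converted into an average per-second rate. This is only a sketch: it assumes messages arrive evenly over 24 hours, which real traffic rarely does, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(messages_per_day: int) -> float:
    """Average message rate, assuming arrivals are spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # The two ends of the daily volume quoted in the question.
    for daily in (300_000, 400_000):
        rate = average_rate_per_second(daily)
        print(f"{daily} messages/day is about {rate:.2f} messages/second")
```

Bursts above the average would need extra headroom, so the figure is a lower bound on what the processor has to sustain.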


puthivestreamingflow.jpg

Re: NiFi putHiveStreaming being very slow

New Contributor

I would check whether a lot of delta files are being created. Use ReplaceText or a similar processor to run a major compaction on the table every now and then. That should speed it up a bit.
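One way to check for a build-up of delta files is to count the delta directories under each table or partition directory. The sketch below assumes you already have a list of directory paths (for example, copied from a recursive listing of the table location) and that delta directories follow the usual Hive ACID naming of `delta_<min>_<max>`, optionally with a numeric suffix. The warehouse paths and the function name are hypothetical.

```python
import re
from collections import Counter

# A delta directory name such as "delta_0000001_0000010",
# optionally followed by a numeric suffix ("delta_0000001_0000001_0000").
DELTA_DIR = re.compile(r"^delta_\d+_\d+(_\d+)?$")


def count_delta_dirs(paths):
    """Count delta directories per parent (table or partition) directory."""
    counts = Counter()
    for path in paths:
        # Split the path into its parent directory and its last component.
        parent, _, name = path.rstrip("/").rpartition("/")
        if DELTA_DIR.match(name):
            counts[parent] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical listing; replace with the real directories of your table.
    listing = [
        "/warehouse/events/dt=2017-06-01/delta_0000001_0000010",
        "/warehouse/events/dt=2017-06-01/delta_0000011_0000020",
        "/warehouse/events/dt=2017-06-01/base_0000020",
    ]
    for parent, n in count_delta_dirs(listing).items():
        print(f"{parent}: {n} delta directories")
```

A partition whose count keeps growing between runs is a likely candidate for compaction.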

Re: NiFi putHiveStreaming being very slow

Expert Contributor

Hi @Adda Fuentes - we are facing the same issue. Were you able to get this to run quickly? In our case we are also seeing constant queuing, because the PutHiveStreaming processor writes records much more slowly than they arrive.

@kerra

How do we use the ReplaceText processor to run the compaction? I am currently using the UpdateAttribute processor to set Hive parameters for compaction, but I am not sure whether this will work.

Re: NiFi putHiveStreaming being very slow

New Contributor

The ReplaceText processor together with PutHiveQL will pass the command for major or minor compaction. Put the compaction command in the replacement text of the ReplaceText processor, then route its output into PutHiveQL.
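For reference, the text that ReplaceText would hand to PutHiveQL is a Hive `ALTER TABLE ... COMPACT` statement. The sketch below only builds that string, so you can see its shape before pasting it into the processor. The table name `events`, the partition column `dt`, and the function name are hypothetical, and whether a partition clause is required depends on your table and Hive version.

```python
def compaction_statement(table, compaction_type="major", partition=None):
    """Build a Hive compaction statement for a table or a single partition."""
    kind = compaction_type.lower()
    if kind not in ("major", "minor"):
        raise ValueError("compaction_type must be 'major' or 'minor'")

    statement = f"ALTER TABLE {table}"
    if partition:
        # Render the partition spec, e.g. PARTITION (dt='2017-06-01').
        spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
        statement += f" PARTITION ({spec})"
    return f"{statement} COMPACT '{kind}'"


if __name__ == "__main__":
    # Hypothetical table and partition names.
    print(compaction_statement("events"))
    print(compaction_statement("events", "minor", {"dt": "2017-06-01"}))
```

The returned string is what would go into the replacement text of ReplaceText. Compaction is queued and runs in the background, so it is usually triggered on a schedule rather than once per incoming message.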
