Speeding up the PutHiveStreaming processor

Expert Contributor

Is there any way to speed up the PutHiveStreaming processor, for example by running it in parallel?

Scheduling --> Concurrent Tasks: only a single task is possible.

1 ACCEPTED SOLUTION

Super Collaborator

PutHiveStreaming relies on the Hive Streaming API, which has two relevant concepts: the number of events per transaction and the number of transactions per batch. Generally, the more events you write per transaction, the faster the ingest. I don't see the first of these properties in the NiFi documentation for the processor; perhaps there is some NiFi-specific property that controls it.
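To make the "more events per transaction" point concrete, here is a rough back-of-the-envelope sketch in Python. It is not NiFi or Hive code: the function names and the per-record and per-commit costs are hypothetical, made up only to show how a fixed cost per commit is spread over more records as the transaction size grows.

```python
import math


def commits_needed(total_records: int, records_per_transaction: int) -> int:
    """Number of transaction commits needed to write total_records."""
    if total_records < 0 or records_per_transaction <= 0:
        raise ValueError("total_records must be >= 0 and records_per_transaction > 0")
    return math.ceil(total_records / records_per_transaction)


def batches_needed(total_records: int, records_per_transaction: int,
                   transactions_per_batch: int) -> int:
    """Number of transaction batches needed to hold all commits."""
    if transactions_per_batch <= 0:
        raise ValueError("transactions_per_batch must be > 0")
    commits = commits_needed(total_records, records_per_transaction)
    return math.ceil(commits / transactions_per_batch)


def estimated_cost_ms(total_records: int, records_per_transaction: int,
                      per_record_ms: int = 1, per_commit_ms: int = 50) -> int:
    """Toy cost model: every record costs per_record_ms, every commit per_commit_ms.

    Both costs are hypothetical placeholders, not measured values.
    """
    commits = commits_needed(total_records, records_per_transaction)
    return total_records * per_record_ms + commits * per_commit_ms


if __name__ == "__main__":
    for size in (1, 100, 1000):
        print(size, commits_needed(10_000, size), estimated_cost_ms(10_000, size))
```

Under these assumed costs the per-commit overhead dominates when each transaction holds a single record and becomes small once a transaction holds a few hundred records. Real numbers depend on the cluster and should be measured.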


5 REPLIES 5

Master Mentor
@Dmitro Vasilenko

I'm going to attempt to answer this, although there are far better experts here. You can improve performance in a couple of ways: parallelize by running the PutHiveStreaming processor on several nodes, or tweak some of the processor's parameters.

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hive.PutHiveStreaming/

The first property I'd tweak is probably Transactions per Batch. Check whether yours is set too low.

Master Guru

As of NIFI-3418, NiFi will allow the user to set both of the aforementioned properties.

Contributor

Even after setting both of the suggested properties, the output rate of the PutHiveStreaming processor still seems slow: we are getting a mere 2 tps. The input queue contains more than 10k messages, and the processor properties are: Transactions per Batch = 1k, Records per Transaction = 100k.

Expert Contributor

I have set my Transactions per Batch to 10. Is that too low? I am looking for ways to process my queues faster; they always seem to pile up slowly over time.