Speeding up the PutHiveStreaming processor
Labels: Apache Hive, Apache NiFi
Created ‎02-22-2017 10:12 AM
Is it possible to speed up the PutHiveStreaming processor, for example by running it in parallel?
Under Scheduling --> Concurrent Tasks, only a single task is possible.
Created ‎02-23-2017 04:46 PM
I'm going to attempt to answer this, although there are far better experts here. You can improve performance in a few ways: parallelize by running the PutHiveStreaming processor on a couple of nodes, or tweak some of the processor's parameters.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hive.PutHiveStreaming/
The first property I'd tweak is probably Transactions per batch. Check whether yours is set too low.
Created ‎02-23-2017 07:33 PM
PutHiveStreaming relies on the Hive Streaming API, which has two relevant concepts: the number of events per transaction and the number of transactions per batch. Generally, the more events you write per transaction, the faster the ingest. I don't see the first of these properties in the NiFi doc referenced above; perhaps some NiFi-specific property controls it.
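To make the two concepts concrete, here is a rough Python sketch of the arithmetic only; it does not call Hive or NiFi. The function name `plan_ingest` and its parameters are made up for illustration, and it assumes every transaction is filled to its limit before it is committed.

```python
import math


def plan_ingest(total_records, records_per_txn, txns_per_batch):
    """Return (transactions, batches) needed to write total_records.

    Illustrative model only: assumes each transaction is filled to
    records_per_txn before committing, and each batch holds up to
    txns_per_batch transactions.
    """
    if records_per_txn < 1 or txns_per_batch < 1:
        raise ValueError("both limits must be at least 1")
    transactions = math.ceil(total_records / records_per_txn)
    batches = math.ceil(transactions / txns_per_batch)
    return transactions, batches


# Same 1,000,000 records under two hypothetical settings:
small = plan_ingest(1_000_000, 100, 10)     # many small transactions
large = plan_ingest(1_000_000, 10_000, 10)  # fewer, larger transactions
print(small, large)
```

In this model the second setting needs 100 commits instead of 10,000 for the same data, which is one way to read the "more events per transaction" advice. Real throughput depends on much more than the commit count.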
Created ‎02-23-2017 07:41 PM
As of NIFI-3418, NiFi will allow the user to set both of the aforementioned properties.
Created ‎09-07-2017 12:56 PM
Even after adding both of the suggested properties, the output rate of the Hive Streaming processor still seems slow: we are getting a mere 2 tps out of the processor. The input queue contains more than 10k messages, and the processor properties are: transactions per batch = 1k, records per transaction = 100k.
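For a sense of scale, a back-of-the-envelope Python sketch of what those numbers imply. It assumes "tps" means items leaving the queue per second, that the queue holds exactly 10,000 items, and that nothing new arrives while it drains. `drain_seconds` is a hypothetical helper, not a NiFi function.

```python
def drain_seconds(queued_items, items_per_second):
    """Seconds needed to empty a queue at a constant output rate.

    Assumes no new items arrive while the queue is draining.
    """
    if items_per_second <= 0:
        raise ValueError("rate must be positive")
    return queued_items / items_per_second


seconds = drain_seconds(10_000, 2)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} min")
```

Under those assumptions, 10,000 queued items at 2 per second take 5,000 seconds, roughly 83 minutes, to clear.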
Created ‎05-15-2018 04:54 AM
I have set my Transactions per batch to 10. Is that too low? I am really looking for ways to process my queues faster. They always seem to pile up slowly over time.
