Scale out NiFi CPU utilization
- Labels: Apache NiFi
Created ‎08-17-2016 07:29 PM
I have a flow that works on my desktop, and I would like to move it to a single server that has 16 cores. Is there anything I need to configure in NiFi to light up all the cores? I see documentation on clustering servers, which doesn't apply to me. Or does NiFi use everything available, which is the behavior I'm looking for?
Created ‎08-17-2016 08:33 PM
Each processor lets you change its number of concurrent tasks. By default, every processor has one concurrent task. Each additional concurrent task gives that processor the opportunity to request an additional thread from the NiFi controller to do work in parallel. (Think of it as two copies of the same processor working on different files or batches of files.) If there aren't enough files in the incoming queue, the additional concurrent tasks go unused. The flip side is that if you allocate too many concurrent tasks to a single processor, that processor may end up using too many threads from the NiFi controller's resource pool, resulting in thread starvation for other processors. So start with the defaults and step up one increment at a time at the points of backlog in your flow.
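To make the starvation point concrete, here is a rough toy model in Python. It is only a sketch and not how NiFi's scheduler actually works: the `allocate_threads` function, the processor names, and the "first in the list gets threads first" rule are all assumptions made for illustration.

```python
def allocate_threads(pool_size, processors):
    """Toy model: hand out threads from one shared pool in list order.

    processors: list of (name, concurrent_tasks, queued_files) tuples.
    Returns a dict mapping each processor name to the threads it was granted.
    NOTE: this ordering rule is an assumption, not NiFi's real scheduling.
    """
    remaining = pool_size
    granted = {}
    for name, concurrent_tasks, queued_files in processors:
        # Extra concurrent tasks are not used when the queue is too short.
        wanted = min(concurrent_tasks, queued_files)
        # A processor cannot get more threads than the pool has left.
        got = min(wanted, remaining)
        granted[name] = got
        remaining -= got
    return granted


# Hypothetical flow: ProcessorA is over-allocated and has plenty of work.
flow = [
    ("ProcessorA", 10, 100),
    ("ProcessorB", 2, 1),
    ("ProcessorC", 2, 50),
]
print(allocate_threads(10, flow))
```

With a pool of 10 threads, the over-allocated processor takes all of them and the other two get none, which is the starvation described above. Dropping ProcessorA back to a few concurrent tasks leaves threads for the rest.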
The NiFi controller also has a setting that limits the maximum number of threads it can use from the underlying hardware. This is the other thing Andrew was mentioning. A restart of NiFi is NOT needed when you make changes to these values. The defaults are low (10 timer driven and 5 event driven). I would set the timer driven value to no more than double the number of cores your hardware has.
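As a quick way to work out that upper bound on a given machine, something like the following could be used. The function name `max_timer_driven_threads` is made up for this example; it only applies the "no more than double the cores" rule of thumb from above and does not talk to NiFi.

```python
import os


def max_timer_driven_threads(cores=None):
    """Upper bound for the timer driven thread count: twice the core count.

    If cores is not given, fall back to what the OS reports (or 1 if unknown).
    """
    if cores is None:
        cores = os.cpu_count() or 1
    return 2 * cores


# For the 16-core server in the question:
print(max_timer_driven_threads(16))  # 32
```

So for the 16-core server, 32 would be the ceiling for the timer driven setting, compared with the default of 10.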
Thanks,
Matt
Created ‎08-17-2016 08:06 PM
Hi, start by simply running it with the defaults. Very often the client won't even generate enough data to warrant any changes from the defaults. Next, if you see connections backlogging, increase the number of concurrent tasks for the specific processor (but bump by 1 only and re-test). Rinse and repeat. If you still need more, in the NiFi Flow UI, in the global settings, you can increase the thread pool size available to the instance (might need to restart in this case, don't remember right now).
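The "bump by 1 and re-test" loop can be sketched roughly like this. Everything here is hypothetical: `measure_backlog` stands in for whatever you do to check the queue (for example, watching the connection in the UI after a test run), and the `limit` cap is an assumption added so the loop always ends.

```python
def tune_concurrent_tasks(measure_backlog, start=1, limit=8):
    """Raise concurrent tasks one at a time until the backlog clears.

    measure_backlog: callable taking a concurrent task count and returning
    how many files are still queued after a test run (hypothetical helper).
    limit: safety cap so one processor cannot take the whole thread pool.
    """
    tasks = start
    while tasks < limit and measure_backlog(tasks) > 0:
        tasks += 1  # bump by 1 only, then re-test
    return tasks


# Pretend test run: each concurrent task clears 100 of 300 incoming files.
def fake_backlog(tasks):
    return max(0, 300 - 100 * tasks)


print(tune_concurrent_tasks(fake_backlog))  # 3
```

In practice each "measurement" is a manual re-test of the flow, so the point is only the shape of the process: start at the default, change one thing, and stop as soon as the backlog goes away.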
