Created on 09-02-2017 06:59 PM - edited 08-17-2019 11:23 AM
This article shows how to plan a NiFi cluster following best practices.
1) Hardware Provisioning
2) Hardware Considerations for HDF
- General Hardware
A key design point of NiFi is to use typical enterprise class application servers.
Hardware failure:
- Machine Class
A NiFi cluster consists of a single class of machine.
Balanced NiFi Node:
- Networking
In-rack backplane/Top-of-rack Switch:
- NiFi: Hardware Driving Factors
NiFi is designed to take advantage of:
- all the cores on a machine
- all the network capacity
- all the disk speed
- many GB of RAM (though usually not all) on a system
Most important hardware factors:
- Top-end disk throughput as configured, which is a combination of seek time and raw performance
- Network speed
- CPU, which is only a concern when there is a lot of compression, encryption, or media analytics
- The flow must be able to take advantage of the contiguous block allocation approach NiFi uses; otherwise it will cause many random seeks, increasing seek times and decreasing effective throughput.
3) HDF Disk Partition Baseline
4) Disk Partitioning – NiFi Nodes (Repositories)
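As a rough illustration of spreading the repositories across separate disks, the sketch below renders the matching `nifi.properties` entries. The property names are the standard NiFi repository settings; the helper `repository_properties` and the mount points are hypothetical and only serve the example, so adjust them to your own partition layout.

```python
def repository_properties(flowfile_dir, provenance_dir, content_dirs):
    """Render nifi.properties lines that place each repository on its own path.

    The paths passed in are assumptions for illustration, not a recommended layout.
    """
    lines = [
        f"nifi.flowfile.repository.directory={flowfile_dir}",
        f"nifi.provenance.repository.directory.default={provenance_dir}",
    ]
    # Several content repositories can be declared by giving each one a
    # unique suffix after "nifi.content.repository.directory.".
    for index, path in enumerate(content_dirs, start=1):
        lines.append(f"nifi.content.repository.directory.content{index}={path}")
    return lines


if __name__ == "__main__":
    # Hypothetical mount points, one per physical disk.
    for line in repository_properties(
        "/flowfile_repo",
        "/provenance_repo",
        ["/content_repo1", "/content_repo2", "/content_repo3"],
    ):
        print(line)
```

Declaring one content repository per disk lets NiFi spread content writes across the disks itself.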
5) NiFi: Default Cluster Recommendation
When no information is available to gauge the rate and complexity of the data flow, start with a default cluster of three nodes. Three nodes are the minimum for HA, because the ZooKeeper quorum needs a majority of its nodes to be available.
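The three-node minimum follows from majority-quorum arithmetic. The sketch below is a minimal illustration, assuming a ZooKeeper server runs on each node; the function names are made up for the example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(
            f"{n} node(s): quorum={quorum_size(n)}, "
            f"tolerated failures={tolerated_failures(n)}"
        )
```

With one or two nodes no failure can be tolerated; three is the smallest ensemble that survives the loss of a single node.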
The SKU is priced per core, but it can be split up. So, a 16-core SKU can be split into 3 machines of 4 cores each. More cores per node will improve throughput (up to a point).
So, a starting cluster for, say, 50 MB/s sustained throughput on an average flow is:
- 3 nodes, each with:
  - CPU: 8+ cores (16 is preferred)
  - Memory: 8+ GB
  - Disk: 6 disks of 1 TB each (spinning or SSD)
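To put the starting cluster in perspective, here is a back-of-the-envelope sketch of the raw numbers. It assumes decimal units (1 TB = 1,000,000 MB), counts all six disks as raw capacity with no RAID or filesystem overhead, and ignores how the disks are divided between repositories, so treat the results as upper bounds rather than a sizing guarantee.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch


def raw_capacity_tb(nodes: int, disks_per_node: int, disk_tb: float) -> float:
    """Total raw disk capacity of the cluster in TB."""
    return nodes * disks_per_node * disk_tb


def daily_volume_tb(throughput_mb_s: float) -> float:
    """Data volume moved through the cluster per day at a sustained rate."""
    return throughput_mb_s * SECONDS_PER_DAY / MB_PER_TB


def per_node_mb_s(throughput_mb_s: float, nodes: int) -> float:
    """Share of the sustained rate each node carries if load is even."""
    return throughput_mb_s / nodes


if __name__ == "__main__":
    capacity = raw_capacity_tb(nodes=3, disks_per_node=6, disk_tb=1.0)
    volume = daily_volume_tb(50.0)
    print(f"raw capacity: {capacity:.1f} TB")
    print(f"daily volume at 50 MB/s: {volume:.2f} TB")
    print(f"per-node rate: {per_node_mb_s(50.0, 3):.1f} MB/s")
```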
6) NiFi Clusters Scale Linearly
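If throughput is taken to scale linearly with node count, a first-pass node estimate can be extrapolated from the default recommendation above (3 nodes for roughly 50 MB/s on an average flow). The sketch below is only that extrapolation; the function name, the baseline defaults, and the three-node floor are assumptions of the example, and real flows should be measured.

```python
import math


def nodes_for_throughput(
    target_mb_s: float,
    baseline_nodes: int = 3,
    baseline_mb_s: float = 50.0,
    min_nodes: int = 3,
) -> int:
    """Estimate node count for a target rate, assuming linear scaling.

    Never returns fewer than min_nodes, the default cluster size.
    """
    # Multiply before dividing to avoid rounding drift in the per-node rate.
    estimate = math.ceil(target_mb_s * baseline_nodes / baseline_mb_s)
    return max(min_nodes, estimate)


if __name__ == "__main__":
    for target in (10, 50, 60, 100, 200):
        print(f"{target} MB/s -> {nodes_for_throughput(target)} nodes")
```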
Created on 09-03-2017 11:17 AM
Have you checked NiFi throughput using the Content Repository in JBOD mode instead of RAID? Basically, let the application decide how to distribute the data.