I want to deploy NiFi and I have some questions about deployment best practices.
I have a lot of data (around 100B records per day), so I need the best possible performance.
1. Should I install NiFi on a VM or on a physical server?
2. Is there another sizing guide besides the following one?
The sizing guide you linked is a great place to start.
Something similar which you can also take a look through is our documentation:
For maximum performance, NiFi should be installed on a physical server. NiFi is constrained by disk and network I/O, so configuring the three repositories (FlowFile, content, and provenance) and mapping each of them to its own independent disks is crucial for getting the highest throughput.
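As a rough sketch of what that mapping can look like, the fragment below shows the relevant repository entries in nifi.properties. The property names are the standard NiFi ones; the /data/... mount points are hypothetical and only illustrate the idea of one independent disk (or RAID volume) per repository, with the content repository spread over two volumes.

```
# nifi.properties (fragment)
# All /data/... paths are hypothetical mount points; substitute your own disks.

# FlowFile repository on its own volume
nifi.flowfile.repository.directory=/data/flowfile/flowfile_repository

# Content repository spread across two volumes
# (these named entries take the place of the single "default" entry)
nifi.content.repository.directory.content1=/data/content1/content_repository
nifi.content.repository.directory.content2=/data/content2/content_repository

# Provenance repository on its own volume
nifi.provenance.repository.directory.default=/data/provenance/provenance_repository
```

Keeping these repositories off the OS disk and off each other's disks means that heavy content writes do not compete with FlowFile or provenance updates for the same spindles.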
I have actually worked with Cloudera NiFi SMEs who helped me size a standard configuration for some of their DoD customers.
Below are the specs of the cluster:
3 Node Configuration, Each Node with the Following:
- 2U Rackmount Chassis with Redundant Power Supply
  - 16 x Total Xeon Scalable Processor Cores / 2.1 GHz
  - 64 GB High Performance 2666 MHz DDR3 ECC Registered Memory
  - Redundant OS Hard Drive Configuration
  - 80 TB Enterprise Storage
    - Content Repo: 32 TB Storage; 2 RAID 1 mount points (total 16 TB usable)
    - Flowfile Repo: 16 TB Storage; 1 RAID 1 mount point (total 8 TB usable)
    - Provenance Repo: 16 TB Storage; 1 RAID 1 mount point (total 8 TB usable)
    - Other HDF components: 16 TB Storage; 1 RAID 1 mount point (total 8 TB usable)
  - 8 TB Enterprise Storage
    - Zookeeper: 8 TB Storage; 1 RAID 1 mount point (total 4 TB usable)
- Dual Port 10/100/1000/10000 10GigE Network Adapter / SFP+ or RJ45 Connection
- IPMI / iKVM Dedicated Port
- CentOS 7.x Installed for Testing
- PSSC Labs HPC Hardware Integration & Testing