We run Flume for 4 apps and are experiencing performance issues and running out of file descriptors. Each app runs 4 instances, spread across 16 data nodes. Their approximate volumes are:
App A - 60 GB per month
App B - 150 KB per month
App C - 54 GB per day
App D - 330 GB per day
We have been advised to move these onto dedicated hosts (4 hosts, each running 1 agent per app, i.e. 4 agents per host). My questions are:
1. Is this a best practice for placement of Flume Agents?
2. Will this cause downsides for the data locality of the HDFS files that are written out?
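
For context, here is a rough sketch of the average per-agent throughput implied by the volumes above. It assumes a 30-day month, decimal units (1 GB = 10^9 bytes), and load spread evenly across the 4 instances of each app; none of these has been measured, and real traffic is probably burstier than a daily average suggests.

```python
# Rough per-agent throughput implied by the stated volumes.
# Assumptions (not measured): 30-day month, decimal units,
# and load split evenly across the 4 instances of each app.
DAYS_PER_MONTH = 30
INSTANCES_PER_APP = 4
SECONDS_PER_DAY = 86_400

# Normalize every app to bytes per day.
daily_bytes = {
    "App A": 60e9 / DAYS_PER_MONTH,   # 60 GB per month
    "App B": 150e3 / DAYS_PER_MONTH,  # 150 KB per month
    "App C": 54e9,                    # 54 GB per day
    "App D": 330e9,                   # 330 GB per day
}


def per_agent_rate(bytes_per_day, instances=INSTANCES_PER_APP):
    """Average bytes per second handled by a single agent instance."""
    return bytes_per_day / instances / SECONDS_PER_DAY


total = sum(daily_bytes.values())
for app, volume in daily_bytes.items():
    share = 100 * volume / total
    rate_kb = per_agent_rate(volume) / 1e3
    print(f"{app}: {volume / 1e9:.6f} GB/day, "
          f"{rate_kb:.3f} KB/s per agent, {share:.2f}% of total")
```

Under these assumptions App D accounts for roughly 85% of the daily volume and App B is negligible, so the four apps are very unlikely to need the same resources.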