Do I need to install Flume agents on dedicated servers or on datanodes?

New Contributor

1. Can I install Flume agents on dedicated servers, or is it OK to install them on datanodes?

2. If dedicated servers are used, how many Flume agents should run on one server?

1 ACCEPTED SOLUTION

Re: Do I need to install Flume agents on dedicated servers or on datanodes?

Expert Contributor

The HDP manual installation guide (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/installing_flume.html) states: "Hortonworks recommends that administrators not install Flume agents on any node in a Hadoop cluster." That is a subtle (and easy to miss!) way of saying that Flume belongs on dedicated servers.

As the other reply notes, in a smaller cluster you can get away with putting the agents on other nodes. A lot depends on the volume of data Flume is processing and on what else is running on the host.

There is also some good information on Flume's resource usage at https://cwiki.apache.org/confluence/display/FLUME/Flume's+Memory+Consumption.
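
To make the "it depends on volume" point a bit more concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes the agent uses a memory channel, where queued events sit on the JVM heap, so the worst case is roughly the channel's `capacity` multiplied by the average event size. The function name, the 1.5 overhead factor, and the sample numbers are all assumptions for illustration; they are not Flume defaults or measured values.

```python
def estimate_channel_heap_mb(capacity_events, avg_event_bytes, overhead_factor=1.5):
    """Rough worst-case heap (in MB) needed by one completely full memory channel.

    overhead_factor is a hypothetical allowance for event headers and JVM
    object overhead. It is an assumption, not a Flume constant.
    """
    if capacity_events < 0 or avg_event_bytes < 0:
        raise ValueError("capacity and event size must be non-negative")
    total_bytes = capacity_events * avg_event_bytes * overhead_factor
    return total_bytes / (1024 * 1024)


# Hypothetical agent: channel capacity of 100,000 events, about 2 KB per event.
heap_mb = estimate_channel_heap_mb(100_000, 2048)
print(f"approx. heap for a full channel: {heap_mb:.0f} MB")
```

If several agents share a host, the sum of these estimates, plus whatever else runs on that host, has to fit in RAM. That is the practical reason a busy datanode tends to be a poor home for Flume.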

Re: Do I need to install Flume agents on dedicated servers or on datanodes?

You can install them on edge/utility nodes. I would be cautious about datanodes, since they can get busy, though it is probably acceptable in a smaller cluster.

The number of agents per server really depends on the data volume. Without more details, it is hard to suggest a configuration (or an alternative).
