When I go to install Druid, the installer wants me to choose a server for each of these:
-overlord (someone was very tickled with themself)
Further, it suggests the same server for each, and it is a data node.
In the past, we've run into problems when assigning anything to a data node besides the core YARN and HBase processing.
We have 3 master nodes on our cluster.
Can someone help me decide how to assign these servers?
Should they all be the same machine? Would it help if they were different? Should they be data nodes?
Thanks in advance!
The Druid project's docs have some suggestions and guidelines about which services can be co-located. See this link in the "select hardware" section:
As always, if you find this post useful please "Accept" the answer.
@Zack Riesland Let me elaborate on this. First, yes, you can co-locate all of those services. Second, to get high availability you need at least 2 different physical nodes, each running all of the services; that gives you HA with a replication factor of 2. Alternatively, you can choose another co-location scheme, as long as each service runs on at least 2 different nodes.
Ideally, though, you want something like this:
Node1: Broker
Node2: Broker
Node3: Router/Overlord/Coordinator/Superset
Node4: Router/Overlord/Coordinator/Superset
The reason to keep the broker on its own is that it usually needs far more memory than all the other services together, so you may want dedicated hardware for it. To keep it simple, though, you can start by co-locating all the services on 2 nodes, making sure the broker does not share a node with another memory-hungry service.
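If it helps to sanity-check a plan before clicking through the installer, here is a rough Python sketch of the two rules above: every service on at least 2 nodes, and the broker not sharing a node. The host names (node1..node4) and the helper functions are made up for illustration; nothing here talks to Druid or Ambari, it only checks a dictionary you fill in yourself.

```python
from collections import defaultdict

# Hypothetical host names; the placement mirrors the 4-node layout above.
LAYOUT = {
    "node1": ["broker"],
    "node2": ["broker"],
    "node3": ["router", "overlord", "coordinator", "superset"],
    "node4": ["router", "overlord", "coordinator", "superset"],
}


def hosts_per_service(layout):
    """Invert host -> services into service -> set of hosts."""
    hosts = defaultdict(set)
    for host, services in layout.items():
        for service in services:
            hosts[service].add(host)
    return dict(hosts)


def under_replicated(layout, min_hosts=2):
    """Return the services that run on fewer than min_hosts nodes."""
    return sorted(
        service
        for service, hosts in hosts_per_service(layout).items()
        if len(hosts) < min_hosts
    )


def broker_shares_host(layout):
    """Return the hosts where the broker is co-located with other services."""
    return sorted(
        host
        for host, services in layout.items()
        if "broker" in services and len(services) > 1
    )


if __name__ == "__main__":
    print("under-replicated services:", under_replicated(LAYOUT))
    print("hosts where broker is not alone:", broker_shares_host(LAYOUT))
```

For the layout above both checks come back empty. If you put everything on a single node instead, every service shows up as under-replicated and that node is flagged for the broker.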