
bsaini · ‎10-23-2015

What precautions & extra configurations, if any, are needed when adding worker nodes with different capacity to a cluster? My understanding is that YARN will be able to just manage the nodes without anything special.

For example, are there any issues with adding the following 3 nodes to an existing POC cluster whose current nodes are all similar (8 cores, 32 GB RAM, 3 TB DAS for data)?

node1 - 8 cores, 64 GB RAM, no storage

node2 - 8 cores, 64 GB RAM, 2 TB

node3 - 8 cores, 64 GB RAM, 2 TB

Also, how do you configure YARN to use different amounts of memory on these heterogeneous boxes?

nsabharwal · ‎10-25-2015

@bsaini@hortonworks.com

Node 1 has no storage, so I guess you won't be using it as a DataNode, but it can still play the role of a worker node (NodeManager and other components, if you like).

You will need to create a config group for node1.

Please see this screenshot. In my case, node4 is a DataNode and I want to customize its data directories.

I created a new config group for node4.

Select that config group and change the parameters that you want to override.
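To make the config group idea concrete, here is a minimal sketch of how a group's overrides layer on top of the cluster-wide defaults. This is only an illustration, not how Ambari is implemented: the group name, the directory paths, and the `effective_config` helper are all made up for the example, and it assumes each host belongs to at most one group. Only the property name `dfs.datanode.data.dir` is a real HDFS setting.

```python
# Illustrative model only: a config group overrides selected properties
# for the hosts assigned to it; every other host keeps the defaults.

# Cluster-wide default (the path is an assumed example value).
DEFAULT_CONFIG = {
    "dfs.datanode.data.dir": "/hadoop/hdfs/data",
}

# Hypothetical group for node4 with customized data directories.
CONFIG_GROUPS = {
    "node4-datadirs": {
        "hosts": {"node4"},
        "overrides": {
            "dfs.datanode.data.dir": "/data1/hdfs/data,/data2/hdfs/data",
        },
    },
}


def effective_config(host):
    """Return the defaults with the host's group overrides applied on top."""
    config = dict(DEFAULT_CONFIG)
    for group in CONFIG_GROUPS.values():
        if host in group["hosts"]:
            config.update(group["overrides"])
    return config


print(effective_config("node4"))  # node4 picks up the group override
print(effective_config("node2"))  # node2 keeps the cluster defaults
```

The same pattern would apply to node1: a separate group whose overrides reflect that it has no data disks.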

There is a doc example.


aervits · ‎10-24-2015

You need to run the configuration utility against these nodes to account for their memory, disk, and CPU sizing. If you don't put node1, which has no storage, in its own group and tell it not to look for the HDFS or YARN directories, you will run out of disk pretty quickly, because HDFS will try to write data to nonexistent directories in the filesystem, and that data ends up on the OS disk. So I'd put node1 in its own Ambari configuration group.
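As a rough sketch of the memory side of that sizing, the snippet below computes a candidate value for `yarn.nodemanager.resource.memory-mb` per node type by subtracting a reserve for the OS and other daemons from total RAM. The reserve sizes (8 GB and 12 GB) are assumptions chosen for illustration, not recommendations, and the helper name is hypothetical; the configuration utility may use different rules. The resulting values would then go into the matching config groups.

```python
def nodemanager_memory_mb(total_ram_gb, reserved_gb):
    """Memory YARN may allocate on a node: total RAM minus a reserve, in MB."""
    if reserved_gb >= total_ram_gb:
        raise ValueError("reserve must be smaller than total RAM")
    return (total_ram_gb - reserved_gb) * 1024


# Node types from the question; reserved_gb values are assumed, not measured.
NODE_TYPES = {
    "existing (32 GB)": {"ram_gb": 32, "reserved_gb": 8},
    "new (64 GB)": {"ram_gb": 64, "reserved_gb": 12},
}

for name, spec in NODE_TYPES.items():
    memory_mb = nodemanager_memory_mb(spec["ram_gb"], spec["reserved_gb"])
    print(f"{name}: yarn.nodemanager.resource.memory-mb = {memory_mb}")
```

With these assumed reserves the 32 GB nodes get 24576 MB and the 64 GB nodes get 53248 MB, which is why the two node types need separate config groups.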
