Created 12-21-2016 02:19 AM
I want to add a few nodes to a running cluster. Does this require any downtime, and will running jobs be affected?
Created 12-21-2016 10:03 PM
Basically, adding nodes (hosts) requires no downtime and will not affect running jobs at all.
You can do this with Ambari's Add Hosts function:
1. Go to the Ambari Hosts page.
2. Click the Actions button at the top left of the page.
3. Select the "Add New Hosts" option.
A wizard will guide you through the whole process. You can add hosts with any slave components (DataNode, NodeManager, clients, etc.) installed on them.
Hope this helps 🙂
Created 12-21-2016 03:17 AM
@ARUN:
No downtime is required when you simply add nodes that run only DN, NM and other slave components. Adding a client node (edge node) doesn't require any downtime either.
Created 12-21-2016 03:42 AM
Created 12-22-2016 03:00 AM
@ARUN
I agree with @Xi Wang's response and suggestion. Please also see the doc link (assuming you use Ambari 2.1.1.0; change to the proper link if you use an earlier version):
Also, keep in mind that only new data will be written to the new DataNodes. Existing blocks stay where they are unless you run the HDFS rebalance command, which redistributes the data across all DataNodes. The default threshold is 10 (percent), but you can change it to your desired threshold. You may want to run it during off-peak hours.
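
For reference, here is a minimal Python sketch of the rebalance step mentioned above. It only builds the `hdfs balancer -threshold <n>` command line and illustrates what the threshold means, as I understand it: a DataNode counts as balanced when its disk utilization is within the threshold (in percentage points) of the cluster-wide average. The helper names `build_balancer_command` and `is_balanced` are made up for illustration, and the 1 to 100 range check is my assumption about the values the balancer accepts.

```python
def build_balancer_command(threshold=10.0):
    """Return the argv list for `hdfs balancer -threshold <threshold>`.

    Hypothetical helper: it only builds the command, it does not run it.
    The 1..100 range check is an assumption about accepted values.
    """
    if not 1.0 <= threshold <= 100.0:
        raise ValueError("threshold should be a percentage between 1 and 100")
    # ':g' drops a trailing '.0', so 10.0 is rendered as '10'
    return ["hdfs", "balancer", "-threshold", f"{threshold:g}"]


def is_balanced(node_used_pct, cluster_used_pct, threshold=10.0):
    """Illustrate the threshold: a node is considered balanced when its
    utilization differs from the cluster average by at most `threshold`
    percentage points."""
    return abs(node_used_pct - cluster_used_pct) <= threshold


if __name__ == "__main__":
    cmd = build_balancer_command(10)
    print(" ".join(cmd))
    # A freshly added, nearly empty DataNode in a half-full cluster
    print(is_balanced(node_used_pct=2, cluster_used_pct=50))
```

To actually start the balancer, you would run the printed command on a host with the HDFS client installed (typically as the hdfs user), for example by passing the list to `subprocess.run(cmd, check=True)`. A lower threshold gives a more even distribution but makes the balancer run longer.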