I have a requirement to change the instance type of the NameNodes and DataNodes in AWS. Currently they run with instance store mount points (6 in total), and we need to move to another instance type with EBS (probably 2 mounts). Kindly let me know the brief steps that will not harm HDFS or the OS. The private IPs on the existing AWS nodes can be moved to the new instance types. I am considering the two options below.
Option 1:
1. Attach the EBS volumes to the existing instances, stop any jobs using the cluster, and then copy the HDFS data over, i.e. from the 6 mount points to the 2 new mount points. Set the ownership accordingly.
2. Stop the cluster
3. Update the HDFS data directory configuration for the NameNodes and DataNodes to point at the new mounts
4. Start the cluster
5. If all looks good, stop the cluster services and the instance (being careful here, because data on the old 6 mount points will be lost, since instance store storage does not survive a stop). Change to the new instance type, start the instance, and attach the 2 EBS mounts.
Option 2:
1. Add new nodes to the cluster with the new instance type and EBS mounts, then rebalance HDFS. The new nodes will have only 2 mounts instead of 6, and I will keep the same names for the two EBS mounts.
2. This way I can add and remove the 4 DataNodes in my cluster.
3. But for the NameNode we have HA, so I can add the node to the cluster and move the NameNode services. However, I have Ranger services installed, which do not have a move option, as well as the JournalNode service and a few more. How can I overcome this, or add these services to the new NameNodes?
4. If this is done, I can decommission and remove the old nodes from the cluster and assign the old private IPs back to the new nodes. Otherwise I have a firewall problem: the new private IPs will not be allowed to reach the on-premise nodes, and I would have to request firewall ports to be opened for them.
I'm thinking of going with Option 1, which looks easy compared to Option 2. Please let me know your suggestions or views on this change, and add any steps I may have missed.
Appreciate your help!!!
You can follow 'Option 1' with the additional steps below.
FYI, these steps are from a KB article.
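For the copy in step 1 of Option 1, a minimal sketch could look like the helper below. This is a hypothetical function (not from any KB), assuming the old mounts are /data0../data5 and the new EBS mounts are /eim_data0 and /eim_data1; `cp -a` preserves ownership, permissions, and timestamps, so the HDFS block directories arrive intact. Run it only with the DataNode stopped.

```shell
#!/bin/sh
# Hypothetical sketch: copy one old mount into a subdirectory of a new mount,
# preserving ownership, permissions, and timestamps (cp -a).
copy_mount() {
  src=$1; dst=$2
  mkdir -p "$dst"
  cp -a "$src/." "$dst/"   # trailing /. copies the contents, including dotfiles
}

# Example usage, with the DataNode stopped:
# copy_mount /data0 /eim_data0/data0
# copy_mount /data1 /eim_data0/data1
# copy_mount /data2 /eim_data0/data2
# copy_mount /data3 /eim_data1/data3
# copy_mount /data4 /eim_data1/data4
# copy_mount /data5 /eim_data1/data5
```

Verify ownership on the targets afterwards (e.g. that the directories still belong to the hdfs user) before updating the HDFS configuration.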
Thank you for your reply. In my case, for production, the data size is around 45 TB, so the copy will take a long time. I am thinking of copying online before stopping the cluster and services; assuming that takes a couple of days to complete, I would then stop the cluster and jobs, find the files/directories updated in the last two days, and copy only those to the new mount points. (I will have to write a small shell script to do this.)
We can't stop and hold the cluster for 2 days to do the copy. Let me know if this approach is fine. I will test it in development first and then move to production.
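The catch-up script mentioned above could be sketched as follows. This is a hypothetical helper, not a tested production script: it copies only the files modified in the last N days from an old mount into the matching subdirectory on a new mount, preserving mode, ownership, and timestamps, so the final pass after stopping the cluster stays short.

```shell
#!/bin/sh
# Hypothetical sketch: incremental catch-up copy after the initial online copy.
# Copies only files modified in the last $3 days from $1 into $2,
# recreating the relative directory structure.
incremental_copy() {
  src=$1; dst=$2; days=$3
  find "$src" -type f -mtime -"$days" | while IFS= read -r f; do
    rel=${f#"$src"/}                      # path relative to the source mount
    mkdir -p "$dst/$(dirname "$rel")"
    cp -p "$f" "$dst/$rel"                # -p preserves mode, ownership, mtime
  done
}

# Example: catch up the last 2 days of changes for one mount
# incremental_copy /data0 /eim_data0/data0 2
```

If rsync is available on the nodes, `rsync -a /data0/ /eim_data0/data0/` run twice (once online, once after stopping the cluster) achieves the same effect, since a re-run transfers only changed files.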
The 6 mounts are /data0, /data1, /data2, /data3, /data4, /data5, and the new mounts will be /eim_data0 and /eim_data1.
Under /data0 to /data5, the folder structures are the same, with many subdirectories; only the final files differ. If I copy all the data from /data0 to /eim_data0 and then copy /data1 to the same place, it will merge into the existing directories, since the names are the same. Can I do it the following way instead?
copy /data0 to /eim_data0/data0/
copy /data1 to /eim_data0/data1/
copy /data2 to /eim_data0/data2/
and in the same way /data3, /data4, /data5 to /eim_data1/data3, /eim_data1/data4, /eim_data1/data5,
and then list these 6 directories in the HDFS configuration.
Will it cause any issues, or is this fine?
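With that layout, the DataNode data directory property would list all six new paths. A minimal sketch for a hand-edited hdfs-site.xml is below; the exact value is an assumption based on the paths in this thread, and if the cluster is managed by Ambari or Cloudera Manager, the property should be changed through the manager UI instead of editing the file directly.

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/eim_data0/data0,/eim_data0/data1,/eim_data0/data2,/eim_data1/data3,/eim_data1/data4,/eim_data1/data5</value>
</property>
```

HDFS treats each listed directory as an independent storage location, so keeping the six block directories separate this way avoids any merge between the old mounts' contents.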