How to delete a data dir from the DataNodes (DNs)

1. Open the NameNode (NN) UI and check the status. It should be healthy, i.e. no missing, corrupt, or under-replicated blocks.
2. Go to Ambari > Services > HDFS > Configs, change the DataNode directories setting (remove the data dir that is no longer required), and save the change.
   Note: Only make the config change at this point. DON'T delete the directories or any files yet.
3. IMPORTANT: Ambari will now flag services as "Restart Required". Do NOT restart all the services at once; otherwise the NN will show missing blocks and the change will have to be reverted.
4. Go to any one of the DNs and restart the DataNode service.
5. Log in to the same DN (for example with PuTTY) and trigger a full block report with the command below:

       hdfs dfsadmin -triggerBlockReport <datanode_host:ipc_port>

   You can find the DataNode IPC port in Ambari > Services > HDFS > Configs by searching for dfs.datanode.ipc.address.
   Here is sample output from the command run on the DataNode:

       [hdfs@piku1 ~]$ hdfs dfsadmin -triggerBlockReport piku1.openstacklocal:8010
       Triggering a full block report on piku1.openstacklocal:8010.

6. Open the NN UI again. It will now show under-replicated blocks, and the count will decrease as the data gets re-replicated.
7. Wait until the under-replicated block count drops to 0.
8. Once the under-replicated block count reaches 0, repeat steps 4-7 (starting with the health check from step 1) for the second DN, and so on for all the DNs.
9. Once all the DNs have been restarted and the NN UI is back to a healthy state (i.e. no missing, corrupt, or under-replicated blocks), it is safe to restart the NN.
10. Check the NN UI again to verify the health status of the NN after the restart.
11. If all is good, it is safe to restart the other services that require a restart.
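
Steps 5 to 7 can also be followed from the command line instead of watching the NN UI. The snippet below is a minimal sketch, not a tested procedure: the function names and the 60-second polling interval are illustrative assumptions, and it assumes the cluster summary printed by `hdfs dfsadmin -report` contains an "Under replicated blocks: N" line and that the commands are run as the hdfs user.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the default polling interval are assumptions.

# Read `hdfs dfsadmin -report` text on stdin and print the first
# "Under replicated blocks" count found (prints nothing if the line is absent).
under_replicated_count() {
  awk -F': *' '/Under replicated blocks/ { print $2 + 0; exit }'
}

# Trigger a full block report on one DataNode (host:ipc_port), then poll the
# NameNode report until the under-replicated block count reaches 0.
wait_for_replication() {
  local datanode="$1" interval="${2:-60}" count
  hdfs dfsadmin -triggerBlockReport "$datanode" || return 1
  while true; do
    count="$(hdfs dfsadmin -report | under_replicated_count)"
    if [ -z "$count" ]; then
      echo "Could not read the under-replicated block count" >&2
      return 1
    fi
    echo "Under replicated blocks: $count"
    [ "$count" -eq 0 ] && return 0
    sleep "$interval"
  done
}
```

After restarting the DataNode service on a DN (step 4), this could be called as `wait_for_replication piku1.openstacklocal:8010` before moving on to the next DN. A count of 0 immediately after the trigger may only mean the NN has not processed the report yet, so it is still worth confirming in the NN UI that there are no missing or corrupt blocks.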