Created 11-20-2017 05:31 PM
We have an Ambari cluster with 24 worker machines.
We want to run the following commands only on the worker23 machine (because of a problem on worker23). Do these commands affect the file system on all of the workers, or only on worker23?
If they affect all of them, how can we clean the HDFS directories only on that specific host?
$ hadoop namenode -format
$ hdfs namenode -format
Created 11-20-2017 11:40 PM
Just to add to what @Xiaoyu Yao said: if you want to clean only one DataNode (worker23), you will need to ssh to that node. Then you can either format the data disks or delete the data. You can see in Ambari which paths are used for DataNode storage. If you want to format, first make sure those paths are actual disks.
A safer approach would be to run "rm -rf /data/disk1" and "rm -rf /data/disk2", assuming the DataNode stores its data under /data/disk1 and /data/disk2.
As @Xiaoyu Yao mentioned, please do NOT format the NameNode; the whole cluster will be lost.
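As a rough sketch of the delete approach above, something like the following could be run on worker23 only, after stopping the DataNode there. It assumes the storage paths come from the dfs.datanode.data.dir property (read here with "hdfs getconf -confKey"); the function names datanode_dirs and clean_datanode_dirs and the CONFIRM variable are made up for this illustration. It defaults to a dry run and, unlike a plain "rm -rf /data/disk1", removes only the contents so that mount points stay in place.

```shell
#!/bin/sh
# Sketch only: run on the affected worker (e.g. worker23), never to format the NameNode.

# Turn a dfs.datanode.data.dir value into one local path per line.
# Strips optional storage-type tags such as [DISK] and a leading file:// scheme.
datanode_dirs() {
  printf '%s\n' "$1" | tr ',' '\n' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
          -e 's/^\[[A-Za-z_]*\]//' -e 's|^file://||' \
    | sed '/^$/d'
}

# Dry run by default: only prints what would be removed.
# Set CONFIRM=yes (hypothetical switch) to actually delete the contents.
clean_datanode_dirs() {
  datanode_dirs "$1" | while IFS= read -r dn_dir; do
    if [ "${CONFIRM:-no}" = "yes" ]; then
      # ${dn_dir:?} aborts instead of expanding to "/*" if the path is empty.
      rm -rf -- "${dn_dir:?}"/*
    else
      echo "would remove contents of: $dn_dir"
    fi
  done
}

# Usage on the worker, after stopping the DataNode in Ambari:
#   clean_datanode_dirs "$(hdfs getconf -confKey dfs.datanode.data.dir)"
```

Check the dry-run output against the paths shown in Ambari before setting CONFIRM=yes.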
Created 11-20-2017 11:34 PM
There is only one NameNode per HDFS cluster, assuming you don't have federation.
The namenode format command you mentioned will clear all of the HDFS namespace information and affect all of the other workers, no matter where the command is executed.
Created 11-20-2017 11:36 PM
So how do we clean the HDFS directories only on the specific host?
Created 11-21-2017 12:09 AM
That's really a very good question. I can see there is still work going on for this, with one JIRA (HDFS-107) in the open state.
So the answer to your first question is clearly NO: the effect is not limited to worker23. Formatting the NameNode will impact your whole cluster, because it is the master node and holds the metadata for all of the DataNodes. So formatting the NameNode is not a good idea in my view.
I have to reproduce some steps before answering your second question, which is the tricky one. I will try to answer it ASAP.