hadoop namenode -format

We have an Ambari cluster with 24 worker machines.

We want to run the following commands only on the worker23 machine (because there is a problem on worker23). Do these commands affect the file system on all of the workers, or only on worker23?

If they affect all of them, how can we clean the HDFS directories only on the specific host?

     $ hadoop namenode -format

     $ hdfs namenode -format



Michael-Bronson
1 ACCEPTED SOLUTION

Expert Contributor

@Michael Bronson

Just to add to what @Xiaoyu Yao said: if you want to clean only a DataNode (worker23), you will need to ssh to that node. Then you can either format the data disks or delete the data. You can see in Ambari which paths map to the data store. But make sure those paths are actual disks if you want to run a format.

A safer approach would be to run "rm -rf /data/disk1" and "rm -rf /data/disk2", assuming the DataNode stores its data in paths called /data/disk1 and /data/disk2.
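As a rough sketch of the approach above: rather than guessing the paths, the DataNode's storage directories can be read from the `dfs.datanode.data.dir` property, for example with `hdfs getconf -confKey dfs.datanode.data.dir` on worker23. The helper name `list_data_dirs` below is hypothetical. It only splits that comma-separated value into local paths and prints the `rm -rf` commands for review instead of running them. It assumes the DataNode process on worker23 has already been stopped and that the value uses plain paths, optionally with a storage-type tag such as `[DISK]` or a `file://` prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a dfs.datanode.data.dir value into local paths.
# Handles optional storage-type tags such as [DISK] and file:// prefixes.
list_data_dirs() {
    local value="$1" entry
    local IFS=','
    for entry in $value; do
        # Trim surrounding whitespace.
        entry="$(printf '%s' "$entry" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')"
        # Drop a leading storage-type tag like [DISK] or [SSD].
        entry="${entry#\[*\]}"
        # Drop a file:// URI scheme if present.
        entry="${entry#file://}"
        if [ -n "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Dry run. On worker23 the value would come from:
#   hdfs getconf -confKey dfs.datanode.data.dir
# Here the example paths from this thread are used instead (an assumption).
value="/data/disk1,/data/disk2"
list_data_dirs "$value" | while IFS= read -r dir; do
    # Print the command for review; nothing is deleted here.
    echo "rm -rf $dir"
done
```

Printing the commands first gives you a chance to confirm that every path belongs to the DataNode data store on worker23 before anything is removed.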

Just as @Xiaoyu Yao mentioned, please do NOT format the NameNode; the whole cluster will be lost.

4 REPLIES

Expert Contributor

There is only one NameNode per HDFS cluster, assuming you don't have federation.

The NameNode format command you mentioned will clear all of the HDFS namespace information and affect all of the other workers, no matter where the command is executed.

So how do I clean the HDFS directories only on the specific host?

Michael-Bronson

Expert Contributor

@Michael Bronson

That's a very good question. I can see that work on it is still ongoing, and one JIRA (HDFS-107) is still open.

So the answer to your first question is clearly no: the effect is not limited to worker23. Formatting the NameNode will impact your whole cluster, because it is the master node and holds the metadata for the data on all of the DataNodes. So formatting the NameNode is not a good idea, in my view.

I have to reproduce some steps before answering your second question, which is the tricky one. I will try to answer it ASAP.