Hopefully I have this post in the correct category!
Essentially, an issue occurred with the namenode that caused it to stop working properly. After a lot of research and investigation, it seemed like our only option was to format the namenode. Before doing so, we made a backup of the namenode data directories (dfs.name.dir, dfs.namenode.name.dir) in the hope that we could simply put them back in place after the format.
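For anyone in a similar situation, here's a rough sketch of how we did the backup step. The paths are placeholders (substitute whatever your dfs.namenode.name.dir actually points at), and the function name is just something I made up for illustration:

```shell
#!/bin/sh
# backup_nn_metadata: tar up a NameNode metadata directory (dfs.namenode.name.dir)
# before formatting. Run it with the NameNode stopped, so the fsimage and edits
# files aren't being written mid-copy. Paths are placeholders for your cluster.
backup_nn_metadata() {
    nn_dir=$1    # e.g. /data/dfs/nn
    backup=$2    # e.g. /root/nn-backup-20140101.tar.gz
    # Archive the whole directory tree, including current/ with fsimage,
    # edits files, and the VERSION file.
    tar -czf "$backup" -C "$(dirname "$nn_dir")" "$(basename "$nn_dir")"
}

# Example (placeholder paths):
# backup_nn_metadata /data/dfs/nn "/root/nn-backup-$(date +%Y%m%d).tar.gz"
```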
Is it possible to do a namenode format and place the namenode data directory back in after the format so that it recognizes all of the data that we had in hdfs before?
I'm running CDH5, HDFS, YARN and nothing more. It's a 6 node cluster (2 head nodes, 4 data nodes).
Yes, it should be fine as long as your new configuration is identical to what it was before the format. To deal with situations like this, you should also consider running three journal nodes; they provide redundancy for exactly this kind of failure.
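To give a rough idea of what the journal-node piece looks like in an HA setup: the namenode's edit log is written to a quorum of JournalNodes instead of (or in addition to) local disk, so losing one node doesn't lose the edits. A sketch of the relevant hdfs-site.xml properties, with hypothetical hostnames jn1/jn2/jn3 and a hypothetical nameservice ID of mycluster:

```xml
<!-- Quorum Journal Manager: three JournalNodes give redundancy, since a
     majority (2 of 3) is enough to keep writing edits. Hostnames, port,
     and nameservice ID below are placeholders for your environment. -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/data/dfs/jn</value>
</property>
```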
So, it seems like it shouldn't be a big deal. However, whenever we attempted to place the metadata back in, we ran into block pool ID issues: the block pool IDs didn't match between the restored namenode metadata and the rest of the cluster. We found some mismatching block pool IDs and attempted to change them to match, but we were still unable to put the metadata back in successfully.
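In case it helps anyone hitting the same mismatch: as I understand it, the block pool ID generated at format time is recorded in the VERSION file under the namenode's current/ directory, and each datanode records the pool it serves in the VERSION file under its BP-* directory. A sketch of a check we could have scripted (paths and function names are hypothetical; point it at your actual dfs.namenode.name.dir and dfs.datanode.data.dir):

```shell
#!/bin/sh
# Compare the blockpoolID recorded in two VERSION files, e.g. the NameNode's
# (/data/dfs/nn/current/VERSION) against a DataNode's per-pool copy
# (/data/dfs/dn/current/BP-*/current/VERSION). Paths are illustrative.

get_bpid() {
    # VERSION is a Java properties file; pull the value after 'blockpoolID='.
    sed -n 's/^blockpoolID=//p' "$1"
}

compare_bpids() {
    nn_bpid=$(get_bpid "$1")
    dn_bpid=$(get_bpid "$2")
    if [ "$nn_bpid" = "$dn_bpid" ]; then
        echo "match: $nn_bpid"
    else
        echo "MISMATCH: namenode=$nn_bpid datanode=$dn_bpid"
    fi
}
```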
We have been exploring and testing HA in hopes that it can at some point be introduced into all of the hadoop clusters we have running. That link on the journal nodes was very helpful, and I believe that is definitely something we should implement! Thank you for that!