
After enabling NameNode HA on a cluster, these are the errors I am trying to solve. Please let me know the configuration tweaks.


New Contributor

[Attached screenshots: 2801-histroywebui.png, namenoderpclatency.png, histroyserverconnectionrefused.png]

The critical errors showed up after enabling NameNode HA and adding a new NameNode instance on node1 (node1:50070), as shown in the picture below. The cluster has 5 hosts and is managed with Ambari.

The problem is that the NameNode UI cannot connect. It started after I enabled NameNode HA from the active NameNode (while trying to add a standby at node1:50070). During the manual steps of the NameNode HA wizard, I did not realize that I was running the CLI on the physical host node1 instead of the physical host namenode, and I ran the dfsadmin safemode, saveNamespace, initialize, and other manual steps there. During the process I got a message asking me to format the NameNode, which I did, and I went through all the steps to complete the wizard. The critical errors now in the alerts section are:

a) The standby NameNode (node1:50070) starts, but the active NameNode (namenode:50070) does not, and the NameNode web UI does not open. If I start the node1:50070 NameNode from the service menu, it becomes the standby and the other NameNode (namenode:50070) stops, and vice versa. The NameNode web UI dropdown shows one as standby and the other as just the hostname.

b) In the MapReduce2 service, the History Server process does not start. I have gone through the YARN log aggregation configuration, but it was not much help.

What can I do about these? All of the above are critical errors that appeared after enabling NameNode HA. Until then the cluster ran fine.

Please let me know the solutions and share your expertise; I am trying too. How can I assign the active and standby NameNodes if something goes wrong in the wizard?

(Will hdfs haadmin do the trick?)
DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]

What values should be used in place of nameserviceId and serviceId in a NameNode HA configuration?

[Attached screenshots: 2798-namenodehighavailabilityhealth.png, 2800-hdfs-node1.png]


Re: After enabling NameNode HA on a cluster, these are the errors I am trying to solve. Please let me know the configuration tweaks.

Mentor

Are you still having this issue?


Re: After enabling NameNode HA on a cluster, these are the errors I am trying to solve. Please let me know the configuration tweaks.

Super Guru

@satish k.t - Can you please make the current standby NameNode active with the command below? At least your HDFS will be up and running, and we can then troubleshoot the issues with the other NN.

hdfs haadmin -transitionToActive <service-id-of-standby-NN> --forcemanual

Once the current standby becomes active, try restarting the problematic NN and let us know how it goes.
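If you are not sure which service ID to pass, a minimal sketch along these lines may help. It reads the nameservice and NameNode IDs from the client configuration with hdfs getconf and then asks each NameNode for its state. It assumes a single nameservice (no federation) and that the hdfs client on the host picks up the cluster's hdfs-site.xml; the function names list_nn_ids and show_nn_states are made up for illustration.

```shell
# Sketch only: assumes one nameservice and a working `hdfs` client on this host.

# Print the NameNode service IDs of the first configured nameservice, one per line.
list_nn_ids() (
  ns="$(hdfs getconf -confKey dfs.nameservices | cut -d, -f1)"
  hdfs getconf -confKey "dfs.ha.namenodes.${ns}" | tr ',' '\n'
)

# Print "<serviceId> <state>" for each NameNode; "unknown" means the state
# query failed, for example because that NameNode is not running.
show_nn_states() (
  for id in $(list_nn_ids); do
    state="$(hdfs haadmin -getServiceState "$id" 2>/dev/null || echo unknown)"
    echo "$id $state"
  done
)
```

Running show_nn_states after sourcing these functions should print one line per NameNode; the ID reported as standby would be the one to use in the transitionToActive command above.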


Re: After enabling NameNode HA on a cluster, these are the errors I am trying to solve. Please let me know the configuration tweaks.

New Contributor

Yes, thanks. Disabling NameNode HA and reverting to the secondary NameNode will do fine. But I believe the manual steps I ran on node1 should instead have been run on the host namenode, which is the current active NameNode. I believe that is the problem that started all the errors.
