Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 612 | 06-04-2025 11:36 PM |
| | 1181 | 03-23-2025 05:23 AM |
| | 585 | 03-17-2025 10:18 AM |
| | 2190 | 03-05-2025 01:34 PM |
| | 1376 | 03-03-2025 01:09 PM |
12-04-2017
11:11 PM
@Michael Bronson Can you validate that the entry was created?

[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha/hdfsha

Check the lock:

[zk: localhost:2181(CONNECTED) 2] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock

Check the status:

$ hdfs haadmin -getServiceState namenode1
$ hdfs haadmin -getServiceState namenode2

Try to fail over:

$ hdfs haadmin -failover <from-serviceId> <to-serviceId>

Use the commands below to force one NameNode to Active or Standby:

$ hdfs haadmin -transitionToActive <serviceId> --forceactive
or
$ hdfs haadmin -transitionToStandby <serviceId>

Hope that helps
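If it helps, here is a minimal sketch that wraps those checks into one script. It assumes the serviceIds namenode1/namenode2 used above and that it is run as the hdfs user; note that with automatic failover enabled, a manual transition may also require the --forcemanual flag.

```bash
#!/usr/bin/env bash
# Sketch: report the HA state of both NameNodes and, if neither is active,
# force one to Active. serviceIds namenode1/namenode2 are taken from the post;
# adjust them to your nameservice. Run as the hdfs user.
set -euo pipefail

active_found=0
for sid in namenode1 namenode2; do
  state=$(hdfs haadmin -getServiceState "$sid" 2>/dev/null || echo "unreachable")
  echo "$sid is $state"
  if [ "$state" = "active" ]; then
    active_found=1
  fi
done

if [ "$active_found" -eq 0 ]; then
  echo "No active NameNode found; forcing namenode1 to Active"
  # With automatic failover (ZKFC) enabled, --forcemanual may also be required.
  hdfs haadmin -transitionToActive namenode1 --forceactive
fi
```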
12-04-2017
10:34 PM
@Michael Bronson Can you delete the entry in ZooKeeper and restart?

[zk: localhost:2181(CONNECTED) 1] rmr /hadoop-ha

Validate that there is no hadoop-ha entry:

[zk: localhost:2181(CONNECTED) 2] ls /

Then restart all components of the HDFS service. This will create a new znode with the correct lock (held by the ZKFailoverController). Also see https://community.hortonworks.com/questions/12942/how-to-clean-up-files-in-zookeeper-directory.html#
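For reference, a minimal sketch of the same clean-up run non-interactively, assuming zkCli.sh is on the PATH and ZooKeeper listens on localhost:2181 (newer ZooKeeper releases use deleteall instead of rmr):

```bash
# Remove the stale HA znode, then confirm it is gone before restarting HDFS.
zkCli.sh -server localhost:2181 rmr /hadoop-ha

# The root listing should no longer contain hadoop-ha.
zkCli.sh -server localhost:2181 ls /
```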
12-04-2017
10:22 PM
@Michael Bronson

[zk: localhost:2181(CONNECTED) 2] ls /hadoop-ha
[hdfsha]

Next:

[zk: localhost:2181(CONNECTED) 2] get /hadoop-ha/hdfsha/ActiveStandbyElectorLock

What output do you get?
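For context, the ActiveStandbyElectorLock is an ephemeral znode held by the ZKFailoverController of the active NameNode, so a missing or empty node suggests nothing currently holds the lock. A small sketch to capture both outputs in one pass, assuming zkCli.sh is on the PATH and ZooKeeper listens on localhost:2181:

```bash
# List the HA root and read the elector lock in one shot.
zkCli.sh -server localhost:2181 ls /hadoop-ha
zkCli.sh -server localhost:2181 get /hadoop-ha/hdfsha/ActiveStandbyElectorLock
```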
12-04-2017
10:00 PM
@Michael Bronson Can you attach your NameNode log? How does your /etc/hosts entry look?

103.114.28.13 master01.sys4.com
103.114.28.12 master03.sys4.com

or IP / hostname / alias:

103.114.28.13 master01.sys4.com master01
103.114.28.12 master03.sys4.com master03

What is the output of:

$ zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha

If this cluster is not critical then you might have to go through these steps
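As a quick sanity check on name resolution, here is a minimal sketch using the hostnames from the post (adjust for your cluster); forward lookups and the node's own FQDN should agree with what the NameNode binds to:

```bash
# Check what each master name resolves to via /etc/hosts (or DNS).
for h in master01.sys4.com master03.sys4.com; do
  echo "== $h =="
  getent hosts "$h"
done

# The fully qualified name this node believes it has.
hostname -f
```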
12-04-2017
08:44 PM
@Lukas Müller Can you copy and paste the contents of your /etc/yum.repos.d? The filename /etc/yum.repos.d/ambari-hdp-2.repo doesn't look correct; you should see something like this:

# ls -al /etc/yum.repos.d/
total 56
drwxr-xr-x. 2 root root 4096 Oct 19 13:13 .
......
-rw-r--r-- 1 root root 306 Oct 19 13:04 ambari.repo
-rw-r--r--. 1 root root 575 Aug 30 21:34 hdp.repo
-rw-r--r-- 1 root root 128 Oct 19 13:13 HDP.repo
-rw-r--r-- 1 root root 151 Oct 19 13:13 HDP-UTILS.repo

Please correct that and retry.
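To narrow it down, a small read-only sketch that lists the repo files and shows which repositories yum actually sees (nothing here modifies the system):

```bash
# Show the repo files present on the node.
ls -al /etc/yum.repos.d/

# Show where each repo file points, and which repos yum has enabled.
grep -H 'baseurl' /etc/yum.repos.d/*.repo
yum repolist enabled
```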
12-04-2017
08:07 PM
@Christian Nunez Can you check in the Ambari database whether the hosts have been registered? The below is from MySQL:

mysql> select host_id, host_name, last_registration_time, public_host_name from hosts;

Please let me know. One piece of advice: always open a new thread, because this is a closed thread and members usually ignore it.
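If you prefer a one-liner, here is a sketch of the same check run non-interactively; the database and user names (ambari/ambari) are common defaults and only assumptions, so adjust them to your Ambari server's configuration:

```bash
# Query the Ambari hosts table without opening an interactive mysql session.
# "ambari" as user and database name are assumptions; -p will prompt for the password.
mysql -u ambari -p ambari -e \
  "select host_id, host_name, last_registration_time, public_host_name from hosts;"
```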
12-04-2017
02:15 PM
@Sedat Kestepe Stop the HDFS service if it is running.

Start only the JournalNodes (as they will need to be made aware of the formatting).

On the NameNode (as user hdfs):
# su - hdfs

Format the NameNode:
$ hadoop namenode -format

Initialize the edits (for the JournalNodes):
$ hdfs namenode -initializeSharedEdits -force

Format ZooKeeper (to force ZooKeeper to reinitialise):
$ hdfs zkfc -formatZK -force

Using Ambari, restart the NameNode.

If you are running an HA NameNode, then on the second NameNode sync (force sync with the first NameNode):
$ hdfs namenode -bootstrapStandby -force

On every DataNode clear the data directory (which is already done in your case).

Restart the HDFS service.

Hope that helps
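For convenience, the first-NameNode part of that sequence as a single sketch. It is destructive (it reformats HDFS metadata), so treat it as an outline for a non-critical cluster only, run as the hdfs user after the JournalNodes are up:

```bash
#!/usr/bin/env bash
# DESTRUCTIVE sketch: wipes HDFS metadata and the HA state in ZooKeeper.
# Only for a non-critical cluster; run as the hdfs user with JournalNodes running.
set -euo pipefail

hadoop namenode -format                      # will prompt to confirm re-formatting
hdfs namenode -initializeSharedEdits -force  # push fresh edits to the JournalNodes
hdfs zkfc -formatZK -force                   # recreate the HA znode in ZooKeeper

# Then restart this NameNode from Ambari, and on the second NameNode run:
#   hdfs namenode -bootstrapStandby -force
```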
12-04-2017
01:25 PM
@Michael Bronson If this is a production environment I would advise you to contact Hortonworks support.

How many nodes are in your cluster? How many JournalNodes do you have in the cluster? Make sure you have an odd number. Could you also confirm whether, at any point after enabling HA, the Active and Standby NameNodes ever functioned?

Your log messages indicate a timeout condition when the NameNode attempted to call the JournalNodes. The NameNode must successfully call a quorum of JournalNodes: at least 2 out of 3. This means that the call timed out to at least 2 out of 3 of them. This is a fatal condition for the NameNode, so by design it aborts. There are multiple potential reasons for this timeout condition; reviewing logs from the NameNodes and JournalNodes would likely reveal more details.

If it is a non-critical cluster, you can follow the steps below.

Stop the HDFS service if it is running.

Start only the JournalNodes (as they will need to be made aware of the formatting).

On the first NameNode (as user hdfs):
# su - hdfs

Format the NameNode:
$ hadoop namenode -format

Initialize the edits (for the JournalNodes):
$ hdfs namenode -initializeSharedEdits -force

Format ZooKeeper (to force ZooKeeper to reinitialise):
$ hdfs zkfc -formatZK -force

Using Ambari, restart that first NameNode.

On the second NameNode, sync (force sync with the first NameNode):
$ hdfs namenode -bootstrapStandby -force

On every DataNode clear the data directory.

Restart the HDFS service.

Hope that helps
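Before reformatting anything, it may be worth confirming the NameNode can even reach the JournalNodes. A minimal sketch, using placeholder hostnames and the default JournalNode RPC port 8485 (substitute the hosts from dfs.namenode.shared.edits.dir):

```bash
# Check TCP reachability of each JournalNode from the NameNode host.
# Hostnames below are placeholders; 8485 is the default JournalNode RPC port.
for jn in journalnode1.example.com journalnode2.example.com journalnode3.example.com; do
  if nc -z -w 5 "$jn" 8485; then
    echo "$jn:8485 reachable"
  else
    echo "$jn:8485 NOT reachable"
  fi
done
```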
12-04-2017
12:02 PM
@Michael Bronson From your screenshot, both NameNodes are down, hence the failure of the failover commands. Since you enabled NameNode HA using Ambari, the ZooKeeper service instances and ZooKeeper FailoverControllers need to be up and running. Just restart the NameNodes, although it is bizarre that neither is marked Active or Standby. Depending on whether the cluster is DEV or Prod, please take the appropriate steps to restart the NameNodes, because your cluster is now unusable anyway. In Ambari, use the HDFS "Restart All" command under Service Actions.
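If the UI is unresponsive, the same restart can be driven through the Ambari REST API; the host, port, cluster name and admin credentials below are placeholders/assumptions for illustration:

```bash
# Sketch: stop then start the HDFS service via the Ambari REST API.
# Replace the host, cluster name and credentials with your own values.
AMBARI=http://ambari.example.com:8080
CLUSTER=MYCLUSTER

# Stop HDFS (target state INSTALLED).
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Stop HDFS"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
  "$AMBARI/api/v1/clusters/$CLUSTER/services/HDFS"

# Start HDFS again (target state STARTED).
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Start HDFS"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
  "$AMBARI/api/v1/clusters/$CLUSTER/services/HDFS"
```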