Can't start Standby NameNode

We are trying to start the Standby NameNode on the master03 machine, but without success.

The error log shows the following, but we can't work out what the problem is. Please advise what could be the reason the NameNode does not start, according to the following log:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 424, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 314, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 100, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 167, in namenode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 271, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'' returned 1. starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-master03.sys57.com.out
Michael-Bronson

Accepted Solution

Re: Can't start Standby NameNode

Super Mentor

@Michael Bronson

If this error is related to the other thread that you posted recently: https://community.hortonworks.com/questions/168750/ambari-cluster-no-valid-image-files-found.html?ch...

then please apply the same solution and close one of the threads.

Pasting the steps here:

Please check whether the dfs.namenode.name.dir directory (default path: /hadoop/hdfs/namenode) is empty by any chance; a disk issue may have left the files missing there.
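
The check described above can be sketched as a small shell helper. This is only an illustration: name_dir_state is a hypothetical function, not part of Hadoop, and it assumes the usual layout where fsimage files live under the current/ subdirectory of the name directory.

```shell
# Sketch: classify a NameNode metadata directory.
# name_dir_state is a hypothetical helper, not a Hadoop command.
# Assumes fsimage files are stored as <name.dir>/current/fsimage_<txid>.
name_dir_state() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing"        # directory does not exist at all
    elif [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "empty"          # directory exists but holds nothing
    elif ls "$dir"/current/fsimage_* >/dev/null 2>&1; then
        echo "has-fsimage"    # at least one image file is present
    else
        echo "no-fsimage"     # some content, but no image files
    fi
}

# Example (default path quoted above; the configured value can be read
# with: hdfs getconf -confKey dfs.namenode.name.dir):
# name_dir_state /hadoop/hdfs/namenode
```

A result of "empty" or "no-fsimage" on the standby host would match the situation this answer describes.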

If this is the case and the Active NameNode is already running (this must be true), then try running the following commands:

# su - hdfs
# hdfs namenode -bootstrapStandby

NOTE: Run this command ONLY on the Standby NameNode. DO NOT run it on the Active NameNode. This command will try to recover all metadata on the Standby NameNode.

- Now try to start Standby NameNode from Ambari
- Also please Restart ZKFailoverController from Ambari
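
Because the bootstrap step only works while the other NameNode is active, a guard like the one below can be run on the standby host first. This is a rough sketch: safe_to_bootstrap is a hypothetical helper, and nn1 is an assumed service ID for the peer NameNode; the real IDs depend on your HA configuration.

```shell
# Sketch: allow the bootstrap only if the peer NameNode reports "active".
# safe_to_bootstrap is a hypothetical helper; $1 is the state string
# printed by `hdfs haadmin -getServiceState <serviceId>`.
safe_to_bootstrap() {
    case "$1" in
        active) return 0 ;;   # peer is serving; bootstrap can copy from it
        *)      return 1 ;;   # standby, empty, or unreachable: do not proceed
    esac
}

# Usage on the standby host (nn1 is an assumed service ID; the real IDs
# are listed under the dfs.ha.namenodes.<nameservice> property):
#
#   peer_state="$(hdfs haadmin -getServiceState nn1 2>/dev/null)"
#   if safe_to_bootstrap "$peer_state"; then
#       hdfs namenode -bootstrapStandby
#   else
#       echo "peer NameNode is not active (state: ${peer_state:-unreachable})" >&2
#   fi
```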



Re: Can't start Standby NameNode

When I run the command on its own, I get:

 su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ;  /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'
starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-master03.sys57.com.out
echo $?
1
Michael-Bronson
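
The output above only shows exit code 1; the startup script itself does not print the cause. The NameNode normally writes the actual exception to a .log file, and the sketch below pulls the latest ERROR/FATAL lines from it. last_errors is a hypothetical helper, and the .log path in the example is an assumption based on the .out path shown above.

```shell
# Sketch: print the most recent ERROR/FATAL lines from a NameNode log.
# last_errors is a hypothetical helper.
#   $1 = log file, $2 = maximum number of lines to show (default 20)
last_errors() {
    grep -E ' (ERROR|FATAL) ' "$1" | tail -n "${2:-20}"
}

# Example, assuming the .log file sits next to the .out file named above:
# last_errors /var/log/hadoop/hdfs/hadoop-hdfs-namenode-master03.sys57.com.log
```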

Re: Can't start Standby NameNode

@Jay I ran hdfs namenode -bootstrapStandby on the standby, but I get:

Retrying connect to server: master01.sys57.com/100.4.3.21:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)

That is because both NameNodes are down; I can't start the NameNode on either machine.

Michael-Bronson

Re: Can't start Standby NameNode

@Jay so how do I continue from this step?

Michael-Bronson

Re: Can't start Standby NameNode

One NameNode must be active to run the bootstrap command. You need to bring the healthy NameNode up and running first.

Re: Can't start Standby NameNode

Can I run something like hdfs namenode -bootstrap.... on the active node? If yes, what is the complete syntax?

Michael-Bronson