Created 09-05-2016 12:40 PM
Hi there,
I have a fresh installation of HDP 2.3.4 on a 5-node cluster. All of my services were running successfully, with statistics displayed in the widgets, and I had not had any NameNode issues until today.
Earlier today I started the "Enable NameNode HA" wizard. It failed at the first step of the installation phase (I think it was the NameNode), and retrying didn't work. Since I couldn't move forward or back in the wizard, I abandoned it and followed https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Ambari_Users_Guide/content/_how_to_roll_....
After completing the entire guide (and I've since gone back and repeated the whole thing in case I missed something), I started HDFS (step 1.2.13) and the operation failed for the NameNode. I have no idea what to do. Does anyone recognize this error?
Here is the output:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 408, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 103, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 212, in namenode
    create_hdfs_directories(is_active_namenode_cmd)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 278, in create_hdfs_directories
    only_if=check
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 463, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 460, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 246, in action_delayed
    main_resource.resource.security_enabled, main_resource.resource.logoutput)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 133, in __init__
    security_enabled, run_user)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/namenode_ha_utils.py", line 167, in get_property_for_active_namenode
    if INADDR_ANY in value and rpc_key in hdfs_site:
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/config_dictionary.py", line 81, in __getattr__
    raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'dfs.namenode.https-address' was not found in configurations dictionary!
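If it helps with diagnosis, this is the kind of check I can run on the NameNode host to see whether the property is present in the client config at all (assuming the default HDP config path; I understand the error refers to Ambari's own configuration dictionary rather than this file):
# Look for the missing key in the hdfs-site.xml on the NameNode host
grep -A1 'dfs.namenode.https-address' /etc/hadoop/conf/hdfs-site.xml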
Created 09-05-2016 08:35 PM
@Savanna Endicott You can use the command below to push the property to the cluster and then try to restart the NameNode:
/var/lib/ambari-server/resources/scripts/configs.sh -u AMBARI_USER -p AMBARI_PASS set AMBARI_HOST_NAME CLUSTER_NAME PROPERTY_FILE PROPERTY_NAME "VALUE"
In your case this would be:
/var/lib/ambari-server/resources/scripts/configs.sh -u AMBARI_USER -p AMBARI_PASS set AMBARI_HOST_NAME CLUSTER_NAME hdfs-site dfs.namenode.https-address "abc.xyz.com:50470"
Replace the values according to your cluster specification, where:
AMBARI_USER - your Ambari UI login user (default: admin)
AMBARI_PASS - the login user's password (default: admin)
AMBARI_HOST_NAME - your Ambari server host
CLUSTER_NAME - your cluster name (case sensitive)
abc.xyz.com - your NameNode hostname
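To confirm the property actually landed in hdfs-site, you can read the live config back with the same script's get action (a rough sketch; the grep filter is only illustrative, and the placeholders are replaced as above):
# Read back the current hdfs-site config and look for the NameNode address properties
/var/lib/ambari-server/resources/scripts/configs.sh -u AMBARI_USER -p AMBARI_PASS get AMBARI_HOST_NAME CLUSTER_NAME hdfs-site | grep 'dfs.namenode.http'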
Created 09-06-2016 07:53 AM
Hi @lraheja, thanks for your response.
I ran the command you suggested, and the result is the same error. I have since restarted all of my instances and stopped all services again. Is there something else I could do in addition to this? The stdout for the operation also shows the following retry attempt printed ten times, so maybe that's the issue? I took the NameNode out of safe mode using the same command, but with "leave" instead of "get".
2016-09-06 07:35:08,376 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://abc.xyz.com -safemode get | grep 'Safe mode is OFF'' returned 1.
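For reference, the safe-mode commands I mean look roughly like this (abc.xyz.com stands in for my NameNode host):
# Check the current safe mode state reported by the NameNode
/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://abc.xyz.com -safemode get
# Explicitly leave safe mode
/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://abc.xyz.com -safemode leave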
And here is the error again:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 408, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 103, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 212, in namenode
    create_hdfs_directories(is_active_namenode_cmd)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 278, in create_hdfs_directories
    only_if=check
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 463, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 460, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 246, in action_delayed
    main_resource.resource.security_enabled, main_resource.resource.logoutput)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 135, in __init__
    security_enabled, run_user)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/namenode_ha_utils.py", line 167, in get_property_for_active_namenode
    if INADDR_ANY in value and rpc_key in hdfs_site:
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/config_dictionary.py", line 81, in __getattr__
    raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'dfs.namenode.http-address' was not found in configurations dictionary!
Created 09-06-2016 09:03 AM
This time you are getting the same error, but for a different property. Previously it was for 'dfs.namenode.https-address'; now it is for 'dfs.namenode.http-address'. Please repeat the same step again, this time using the http property:
/var/lib/ambari-server/resources/scripts/configs.sh -u AMBARI_USER -p AMBARI_PASS set AMBARI_HOST_NAME CLUSTER_NAME hdfs-site dfs.namenode.http-address "abc.xyz.com:50070"
Remember that the port is 50070 this time, since this is the http address.
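Once both properties are set and the NameNode comes back up, a quick sanity check (just a sketch, substitute your own NameNode hostname) is to confirm that the web UI answers on the http port and that HDFS resolves both addresses from hdfs-site:
# The NameNode web UI should respond on port 50070 once the NameNode is up
curl -s -o /dev/null -w "%{http_code}\n" http://abc.xyz.com:50070/
# HDFS should now resolve both address properties
hdfs getconf -confKey dfs.namenode.http-address
hdfs getconf -confKey dfs.namenode.https-address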
Created 09-06-2016 09:24 AM
Oh, silly me! That fixed everything and now my namenode is working!!!! Thank you sooo much for your help.