
NameNode always coming up in safe mode after installation.

Explorer

Hi All,

I have installed HDP 2.6 on RHEL 7.4 in Azure.

The installation completed successfully, but when I start the cluster, the NameNode comes up in safe mode. Because of that, the other services are unable to start.

If I manually exit safe mode and then restart the services, everything starts fine.

Any suggestions?

Thanks in advance,

Muthu
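For reference, this is roughly how safe mode can be inspected and exited manually as the hdfs superuser (a sketch; the NameNode URI is implied by the log below, and leaving safe mode by hand only masks the underlying block-report or corruption problem):

```shell
# Check the current safe mode status of the NameNode
sudo -u hdfs hdfs dfsadmin -safemode get

# Force the NameNode out of safe mode (use with care: this does not
# fix missing or corrupt blocks, it only stops HDFS waiting for them)
sudo -u hdfs hdfs dfsadmin -safemode leave
```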

Error log:

2018-09-05 12:04:03,445 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.

2018-09-05 12:04:15,700 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.

2018-09-05 12:04:27,959 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.

2018-09-05 12:04:40,239 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.

2018-09-05 12:04:52,457 - The NameNode is still in Safemode. Please be careful with commands that need Safemode OFF.

2018-09-05 12:04:52,458 - HdfsResource['/tmp'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://bdm.localdomain:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'owner': 'hdfs', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0777}

2018-09-05 12:04:52,461 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET '"'"'http://bdm.localdomain:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpE8fWXX 2>/tmp/tmptJfEmS''] {'logoutput': None, 'quiet': False}

2018-09-05 12:04:53,922 - call returned (0, '')

2018-09-05 12:04:53,923 - Skipping the operation for not managed DFS directory /tmp since immutable_paths contains it.

2018-09-05 12:04:53,924 - HdfsResource['/user/ambari-qa'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://bdm.localdomain:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'owner': 'ambari-qa', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0770}

2018-09-05 12:04:53,926 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET '"'"'http://bdm.localdomain:50070/webhdfs/v1/user/ambari-qa?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpujRH4r 2>/tmp/tmp4594S5''] {'logoutput': None, 'quiet': False}

2018-09-05 12:04:53,998 - call returned (0, '')

2018-09-05 12:04:53,999 - HdfsResource[None] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://bdm.localdomain:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp']}

2018-09-05 12:04:53,999 - Ranger Hdfs plugin is not enabled

Command completed successfully!

4 REPLIES

Explorer

I did some analysis and found that a few blocks in HDFS were corrupted. I deleted the corrupted files, and after that the cluster worked fine for a few hours.

But the moment I restart the server, the NameNode comes up in safe mode again.

Since the NameNode is in safe mode, the other services are not coming up.

Any suggestions?

Thx,

Muthu
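For anyone in the same situation, corrupted blocks can be located and, if unrecoverable, removed with `hdfs fsck` (a sketch; note that `-delete` permanently removes the affected files, so it should be a last resort):

```shell
# List only the files that have corrupt or missing blocks
sudo -u hdfs hdfs fsck / -list-corruptfileblocks

# Full health report: per-file block lists and their DataNode locations
sudo -u hdfs hdfs fsck / -files -blocks -locations

# Permanently delete the corrupted files (data loss for those files!)
sudo -u hdfs hdfs fsck / -delete
```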

Expert Contributor
@Muthukumar Somasundaram

The NameNode stays in safe mode until the configured percentage of blocks (dfs.namenode.safemode.threshold-pct, default 0.999) satisfies the minimal replication requirement and has been reported to the NameNode.

In your case, the NameNode is still waiting for block reports from the DataNodes. Please ensure that all DataNodes are up and running, and check whether each DataNode is sending its block report.

In addition, check how many blocks have been reported to the NameNode so far. While in safe mode, the NameNode reports this as a message such as:

The reported blocks 71 needs additional 17 blocks to reach the threshold 1.0000 of total blocks 87.
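The DataNode and block-report status can be checked from the command line (a sketch, run as the hdfs user; the NameNode log path shown is the typical HDP default and may differ on your cluster):

```shell
# Summary of live/dead DataNodes, capacity, and blocks each has reported
sudo -u hdfs hdfs dfsadmin -report

# Look for the safe-mode threshold message in the NameNode log
grep 'reported blocks' /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log
```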

Explorer

Hi Karthick,

I resolved this issue by formatting the NameNode, since I found a few corrupted blocks in HDFS. Mine is a single-node cluster.

Formatting resolved the issue.

Thx,

Muthu

Expert Contributor

@Muthukumar Somasundaram

Formatting the NameNode is not an ideal way to solve this issue: it wipes the filesystem metadata, so in this case you lost all of your data. It should only ever be considered on a throwaway cluster.