NameNode always coming up in safe mode after installation.
Labels: Apache Ambari, Apache Hadoop
Created 09-05-2018 08:24 AM
Hi All,
I have installed HDP 2.6 on RHEL 7.4 in Azure. The installation completed, but when I start the cluster the NameNode goes into safe mode, and because of that the other services cannot come up.
I tried manually exiting safe mode and then restarting the services, and everything starts fine after that.
Any suggestions?
Thanks in advance,
Muthu
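For reference, this is roughly what I ran to exit safe mode manually (a sketch, assuming the standard HDP client is on the PATH and the commands run as the hdfs superuser):

# Check the current safe-mode state
sudo -u hdfs hdfs dfsadmin -safemode get
# Force the NameNode out of safe mode (this only masks the symptom)
sudo -u hdfs hdfs dfsadmin -safemode leave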
Error log:
2018-09-05 12:04:03,445 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2018-09-05 12:04:15,700 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2018-09-05 12:04:27,959 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2018-09-05 12:04:40,239 - Retrying after 10 seconds. Reason: Execution of '/usr/hdp/current/hadoop-hdfs-namenode/bin/hdfs dfsadmin -fs hdfs://bdm.localdomain:8020 -safemode get | grep 'Safe mode is OFF'' returned 1.
2018-09-05 12:04:52,457 - The NameNode is still in Safemode. Please be careful with commands that need Safemode OFF.
2018-09-05 12:04:52,458 - HdfsResource['/tmp'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://bdm.localdomain:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'owner': 'hdfs', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0777}
2018-09-05 12:04:52,461 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET '"'"'http://bdm.localdomain:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpE8fWXX 2>/tmp/tmptJfEmS''] {'logoutput': None, 'quiet': False}
2018-09-05 12:04:53,922 - call returned (0, '')
2018-09-05 12:04:53,923 - Skipping the operation for not managed DFS directory /tmp since immutable_paths contains it.
2018-09-05 12:04:53,924 - HdfsResource['/user/ambari-qa'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://bdm.localdomain:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'owner': 'ambari-qa', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0770}
2018-09-05 12:04:53,926 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET '"'"'http://bdm.localdomain:50070/webhdfs/v1/user/ambari-qa?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpujRH4r 2>/tmp/tmp4594S5''] {'logoutput': None, 'quiet': False}
2018-09-05 12:04:53,998 - call returned (0, '')
2018-09-05 12:04:53,999 - HdfsResource[None] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'dfs_type': '', 'default_fs': 'hdfs://bdm.localdomain:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': None, 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp']}
2018-09-05 12:04:53,999 - Ranger Hdfs plugin is not enabled
Command completed successfully!
Created 09-06-2018 05:52 AM
I did some analysis and found that a few blocks are corrupted in HDFS. I deleted the corrupted files, and once that was done it worked fine for a few hours.
But the moment I restart the server, the NameNode comes up in safe mode again, and since the NameNode is in safe mode, the other services do not come up.
Any suggestions?
Thx,
Muthu
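For reference, I found and removed the corrupt files with commands along these lines (a sketch, run as the hdfs superuser):

# List the files that have corrupt or missing blocks
sudo -u hdfs hdfs fsck / -list-corruptfileblocks
# Permanently delete the files whose blocks are corrupt (the data in those files is lost)
sudo -u hdfs hdfs fsck / -delete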
Created 09-06-2018 07:07 AM
The NameNode stays in safe mode until the configured percentage of blocks (dfs.namenode.safemode.threshold-pct = 0.999f) satisfying minimal replication has been reported to it.
In your case, the NameNode is still waiting for block reports from the DataNodes. Please ensure that all DataNodes are up and running, and check whether each DataNode is sending its block report.
In addition, check how many blocks have been reported to the NameNode so far. The NameNode reports this as, for example: "The reported blocks 71 needs additional 17 blocks to reach the threshold 1.0000 of total blocks 87."
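As a quick check, something like the following shows which DataNodes are live and how many blocks each has reported (standard Hadoop 2.x dfsadmin commands; a sketch, not specific to this cluster):

# Summary of live/dead DataNodes and per-node block counts
sudo -u hdfs hdfs dfsadmin -report
# Current safe-mode status, including the block-report progress message
sudo -u hdfs hdfs dfsadmin -safemode get

The threshold itself is configurable in hdfs-site.xml, for example:

<property>
  <name>dfs.namenode.safemode.threshold-pct</name>
  <value>0.999f</value>
</property>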
Created 10-04-2018 06:17 AM
Hi Karthick,
I resolved this issue by formatting the NameNode, since I found a few corrupted blocks in HDFS. Mine is a single-node cluster.
Formatting resolved the issue.
Thx,
Muthu
Created 10-04-2018 07:41 AM
Formatting is not an ideal way to solve this issue; in this case, you lost all of your data.
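A less destructive alternative is to quarantine only the files with corrupt blocks rather than reformatting, for example (a sketch, run as the hdfs superuser):

# Move the affected files into /lost+found, preserving whatever is still readable
sudo -u hdfs hdfs fsck / -move

Once the remaining blocks satisfy the safe-mode threshold, the NameNode should leave safe mode on its own at startup.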
