
Namenode impossible to leave safemode


New Contributor

Hi all,

 

We have a big problem with the NameNode: it will not leave safe mode, and the command fails with the following error:

sudo -u hdfs hdfs dfsadmin -safemode leave

/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: syntax error near unexpected token `export'

/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: `export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS} '

safemode: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

 

We changed the owner of all the files to hdfs and set the permissions to 775 in /var/log/hadoop/hdfs:

-rw-r--r-- 1 hdfs hadoop       948 Oct 30 16:11 hadoop-hdfs-datanode-namenode.out

-rwxrwxrwx 1 hdfs hadoop       948 Oct 30 13:22 hadoop-hdfs-datanode-namenode.out.1

-rwxrwxrwx 1 hdfs hadoop       948 Oct 30 13:16 hadoop-hdfs-datanode-namenode.out.2

-rwxrwxrwx 1 hdfs hadoop       948 Oct 30 12:00 hadoop-hdfs-datanode-namenode.out.3

-rwxrwxrwx 1 hdfs hadoop       948 Aug 29 11:16 hadoop-hdfs-datanode-namenode.out.4

-rwxrwxrwx 1 hdfs hadoop       948 Aug 29 10:59 hadoop-hdfs-datanode-namenode.out.5

-rwxrwxrwx 1 hdfs hadoop  21296280 Oct 30 16:11 hadoop-hdfs-namenode-namenode.log

-rwxrwxrwx 1 hdfs hadoop 268463926 Oct 30 12:01 hadoop-hdfs-namenode-namenode.log.1

-rwxrwxrwx 1 hdfs hadoop 268435509 Jan 24  2019 hadoop-hdfs-namenode-namenode.log.10

-rwxrwxrwx 1 hdfs hadoop 268435681 Aug 11 17:12 hadoop-hdfs-namenode-namenode.log.2

-rwxrwxrwx 1 hdfs hadoop 268435701 Jul 18 12:17 hadoop-hdfs-namenode-namenode.log.3

-rwxrwxrwx 1 hdfs hadoop 268435683 Jul  2 18:53 hadoop-hdfs-namenode-namenode.log.4

-rwxrwxrwx 1 hdfs hadoop 268435504 Jun  2 07:52 hadoop-hdfs-namenode-namenode.log.5

-rwxrwxrwx 1 hdfs hadoop 268435570 May 16 04:15 hadoop-hdfs-namenode-namenode.log.6

-rwxrwxrwx 1 hdfs hadoop 268435521 Apr 14  2019 hadoop-hdfs-namenode-namenode.log.7

-rwxrwxrwx 1 hdfs hadoop 268435618 Mar 16  2019 hadoop-hdfs-namenode-namenode.log.8

 

There is nothing in the logs. iptables and SELinux are both disabled.

 

Any help, please?


Re: Namenode impossible to leave safemode

Mentor

@OmarYa 

 

First, you will need to revert the permissions on the logs; it's not a good idea to make these files world-writable.

 

# chmod 644 /var/log/hadoop/hdfs/*
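As a hedged sketch of the same cleanup with the usual file/directory distinction (644 for files, 755 for the directory), demonstrated on a throwaway directory; on the real host the target would be /var/log/hadoop/hdfs, run as root, and you would also restore ownership with chown -R hdfs:hadoop:

```shell
# Demo on a throwaway directory standing in for /var/log/hadoop/hdfs
mkdir -p /tmp/hdfs-logs-demo
touch /tmp/hdfs-logs-demo/hadoop-hdfs-namenode-namenode.log
chmod 777 /tmp/hdfs-logs-demo/hadoop-hdfs-namenode-namenode.log

# 644 for the log files, 755 for the directory itself
find /tmp/hdfs-logs-demo -type f -exec chmod 644 {} +
chmod 755 /tmp/hdfs-logs-demo

# Show the resulting mode of the log file
stat -c '%a' /tmp/hdfs-logs-demo/hadoop-hdfs-namenode-namenode.log
```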

Then you need to do some housekeeping in /var/log. Just out of curiosity, can you share the output of

 

# df -h

 

You could be running out of space in /var; if it's at 100%, that alone can cause this problem. Judging from your listing, you should delete the rotated files below (the ones with a numeric suffix after the dot). You should also look at log rotation; have a look at your hdfs-log4j settings.

 

rm -f /var/log/hadoop/hdfs/hadoop-hdfs-datanode-namenode.out.*

rm -f /var/log/hadoop/hdfs/hadoop-hdfs-namenode-namenode.log.*
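For the log-rotation side, the relevant knobs in the stock Hadoop log4j.properties are the RFA (RollingFileAppender) caps. The property names below are standard; the values are only illustrative, not a recommendation:

```properties
# RollingFileAppender used for the NameNode/DataNode .log files
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}

# Cap the size and the number of rotated files (illustrative values)
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=10
```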

 

Hadoop enters safe mode to protect itself against any changes that it cannot log! A full file system (/var at 100%) could therefore be the source of your problem. Use the below snippet to locate the large files and clean up:

 

# du -a /var/log | sort -n -r | head -n 20

Usual culprits are kafka, ranger, ambari-server and ambari-agent; see my output:

179688 /var/log
96988 /var/log/kafka
26680 /var/log/hadoop-yarn
26252 /var/log/hadoop-yarn/yarn
21624 /var/log/kafka/server.log.2019-10-28-22
21044 /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-<FQDN>.log
19364 /var/log/kafka/server.log.2019-10-27-21
18156 /var/log/kafka/server.log.2019-10-27-20
10832 /var/log/kafka/server.log.2019-10-30-20
10792 /var/log/kafka/server.log.2019-10-30-19
10096 /var/log/hadoop
8244 /var/log/hadoop/hdfs
7648 /var/log/ambari-agent
7640 /var/log/ambari-agent/ambari-agent.log
6916 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-<FQDN>.log
6472 /var/log/kafka/server.log.2019-10-28-21
6168 /var/log/ambari-server
5380 /var/log/ambari-server/ambari-server.log
4864 /var/log/ambari-infra-solr
4340 /var/log/kafka/server.log.2019-10-28-23


That should resolve your safe mode issue. A good practice is to merge the fsimage and edit logs while in safe mode so that cluster startup time is reduced:

 

$ hdfs dfsadmin -saveNamespace

Then

$ hdfs dfsadmin -safemode leave

 

Please revert with the outcome.

 

Re: Namenode impossible to leave safemode

New Contributor

Thanks for the reply. It's not a space problem:

df -h

Filesystem      Size  Used Avail Use% Mounted on

/dev/sda1       985G  332G  603G  36% /

tmpfs            56G   12K   56G   1% /dev/shm

/dev/sdb1       788G  297G  452G  40% /mnt/resource

 

I deleted all the logs as you mentioned, but I still have the same error when running hdfs dfsadmin -saveNamespace:

 

sudo -u hdfs hdfs dfsadmin -saveNamespace

/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: syntax error near unexpected token `export'

/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: `export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS} '

saveNamespace: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

 

sudo hdfs dfsadmin -saveNamespace

/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: syntax error near unexpected token `export'

/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: `export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS} '

saveNamespace: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Re: Namenode impossible to leave safemode

Super Guru

@OmarYa,

 

The error you are seeing indicates that the client you are running cannot connect to the specified host/port:

safemode: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: 
java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Make sure the namenode host and port 8020 are accessible from your client host.

Also, while it may not be related, hadoop-env.sh appears to have a syntax error in it. Depending on what that hadoop-env.sh does, it is possible that something going wrong there is leading the client to not attempt a connection to the correct host and port.
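As a hedged illustration of how that bash error can arise: the file below is a throwaway demo, not the real hadoop-env.sh. A `for` loop above line 51 that lost its `do` (or `done`) is one classic cause, and `bash -n` pinpoints the line without executing anything:

```shell
# Throwaway demo file reproducing the same class of error; the real file
# to check is /usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh
cat > /tmp/env-demo.sh <<'EOF'
JAVA_JDBC_LIBS=""
for jarFile in /usr/share/java/*mysql*
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS}
EOF

# bash -n parses without executing and reports the offending line
bash -n /tmp/env-demo.sh 2>&1 || true

# cat -A exposes invisible characters (another common cause when a
# config line was pasted from a browser)
sed -n '3p' /tmp/env-demo.sh | cat -A
```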

 

I'd focus, though, on making sure the connection is going to the right host:port and that the host:port are accessible.
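One way to sketch that reachability check (the 127.0.0.1:18020 listener below is a stand-in so the probe can be demonstrated locally; on the real cluster you would point the same probe at namenode and port 8020, or simply use `nc -zv namenode 8020` if netcat is installed):

```shell
# Start a short-lived local listener standing in for the NameNode RPC port
python3 -c 'import socket,time; s=socket.socket(); s.bind(("127.0.0.1",18020)); s.listen(1); time.sleep(10)' &
sleep 1

# Probe the port; a refused connection here means the daemon is not
# listening, which matches the "Connection refused" in the error above
if python3 -c 'import socket; socket.create_connection(("127.0.0.1",18020), timeout=3)' 2>/dev/null; then
  status="port open"
else
  status="port closed"
fi
echo "$status"
```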

Re: Namenode impossible to leave safemode

Mentor

@OmarYa 

Can you describe your cluster: is it single-node or HA, and which OS?

Assuming your cluster is single-node and NOT HA, please execute the below command as the root user to start the NameNode manually:

# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"


And share the output of the above command along with a screenshot of your Ambari --> HDFS UI.