Hi all,
We have a big problem with the NameNode: it refuses to leave safe mode, failing with the following error:
sudo -u hdfs hdfs dfsadmin -safemode leave
/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: syntax error near unexpected token `export'
/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: `export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS} '
safemode: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
We changed the owner of all the files to hdfs and the permissions to 775 in
/var/log/hadoop/hdfs
-rw-r--r-- 1 hdfs hadoop 948 Oct 30 16:11 hadoop-hdfs-datanode-namenode.out
-rwxrwxrwx 1 hdfs hadoop 948 Oct 30 13:22 hadoop-hdfs-datanode-namenode.out.1
-rwxrwxrwx 1 hdfs hadoop 948 Oct 30 13:16 hadoop-hdfs-datanode-namenode.out.2
-rwxrwxrwx 1 hdfs hadoop 948 Oct 30 12:00 hadoop-hdfs-datanode-namenode.out.3
-rwxrwxrwx 1 hdfs hadoop 948 Aug 29 11:16 hadoop-hdfs-datanode-namenode.out.4
-rwxrwxrwx 1 hdfs hadoop 948 Aug 29 10:59 hadoop-hdfs-datanode-namenode.out.5
-rwxrwxrwx 1 hdfs hadoop 21296280 Oct 30 16:11 hadoop-hdfs-namenode-namenode.log
-rwxrwxrwx 1 hdfs hadoop 268463926 Oct 30 12:01 hadoop-hdfs-namenode-namenode.log.1
-rwxrwxrwx 1 hdfs hadoop 268435509 Jan 24 2019 hadoop-hdfs-namenode-namenode.log.10
-rwxrwxrwx 1 hdfs hadoop 268435681 Aug 11 17:12 hadoop-hdfs-namenode-namenode.log.2
-rwxrwxrwx 1 hdfs hadoop 268435701 Jul 18 12:17 hadoop-hdfs-namenode-namenode.log.3
-rwxrwxrwx 1 hdfs hadoop 268435683 Jul 2 18:53 hadoop-hdfs-namenode-namenode.log.4
-rwxrwxrwx 1 hdfs hadoop 268435504 Jun 2 07:52 hadoop-hdfs-namenode-namenode.log.5
-rwxrwxrwx 1 hdfs hadoop 268435570 May 16 04:15 hadoop-hdfs-namenode-namenode.log.6
-rwxrwxrwx 1 hdfs hadoop 268435521 Apr 14 2019 hadoop-hdfs-namenode-namenode.log.7
-rwxrwxrwx 1 hdfs hadoop 268435618 Mar 16 2019 hadoop-hdfs-namenode-namenode.log.8
There is nothing relevant in the logs. iptables and SELinux are both disabled.
Any help please?
Created 10-30-2019 01:17 PM
First, you will need to revert the permissions on the log files; it is not a good idea to make them world-writable:
# chmod 644 /var/log/hadoop/hdfs/*
Then you need to do some housekeeping in /var/log. Out of curiosity, can you share the output of
# df -h
You could be running out of space in /var; if it is at 100%, that alone can cause this problem. Judging from your listing, you should delete the rotated files below (note the dot-number suffix — leave the active log alone). You should also look at log rotation; have a look at your hdfs-log4j settings.
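For the log-rotation point: in stock HDP templates the hdfs-log4j configuration (Ambari --> HDFS --> Configs --> Advanced hdfs-log4j) caps the RollingFileAppender size and backup count. The values below are illustrative only, not your actual settings — verify the appender name against your own template:

```
# Illustrative hdfs-log4j rotation settings (RFA is the stock appender name;
# check your own Advanced hdfs-log4j in Ambari before changing anything)
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=10
```

Lowering MaxBackupIndex limits how many rotated .log.N files can pile up in /var/log/hadoop/hdfs.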
rm -f hadoop-hdfs-datanode-namenode.out.*
rm -f hadoop-hdfs-namenode-namenode.log.*
Hadoop enters safe mode to protect itself against changes it cannot log. A full file system (/var at 100%) is a classic cause of exactly this problem. Use the snippet below to locate the largest files and clean them up:
# du -a /var/log | sort -n -r | head -n 20
Usual culprits are kafka, ranger, ambari-server and ambari-agent; see my output:
179688 /var/log
96988 /var/log/kafka
26680 /var/log/hadoop-yarn
26252 /var/log/hadoop-yarn/yarn
21624 /var/log/kafka/server.log.2019-10-28-22
21044 /var/log/hadoop-yarn/yarn/yarn-yarn-nodemanager-<FQDN>.log
19364 /var/log/kafka/server.log.2019-10-27-21
18156 /var/log/kafka/server.log.2019-10-27-20
10832 /var/log/kafka/server.log.2019-10-30-20
10792 /var/log/kafka/server.log.2019-10-30-19
10096 /var/log/hadoop
8244 /var/log/hadoop/hdfs
7648 /var/log/ambari-agent
7640 /var/log/ambari-agent/ambari-agent.log
6916 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-<FQDN>.log
6472 /var/log/kafka/server.log.2019-10-28-21
6168 /var/log/ambari-server
5380 /var/log/ambari-server/ambari-server.log
4864 /var/log/ambari-infra-solr
4340 /var/log/kafka/server.log.2019-10-28-23
That should resolve your safe mode issue. A good practice is to merge the fsimage and edit logs while still in safe mode, so that cluster startup time is reduced:
$ hdfs dfsadmin -saveNamespace
Then
$ hdfs dfsadmin -safemode leave
Please revert with the results.
Created 10-31-2019 12:50 AM
Thanks for the reply. It's not a space problem:
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 985G 332G 603G 36% /
tmpfs 56G 12K 56G 1% /dev/shm
/dev/sdb1 788G 297G 452G 40% /mnt/resource
I deleted all the logs as you suggested, but I still get the same error when running hdfs dfsadmin -saveNamespace:
sudo -u hdfs hdfs dfsadmin -saveNamespace
/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: syntax error near unexpected token `export'
/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: `export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS} '
saveNamespace: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
sudo hdfs dfsadmin -saveNamespace
/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: syntax error near unexpected token `export'
/usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh: line 51: `export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS} '
saveNamespace: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Created 10-31-2019 10:47 AM
The error you are seeing indicates that the client you are running cannot connect to the specified host/port:
safemode: Call From namenode/10.0.0.4 to namenode:8020 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Make sure the namenode host and port 8020 are accessible from your client host.
Also, while it may not be related, hadoop-env.sh appears to have a syntax error in it. Depending on what that script does, it is possible that the failure there leaves the client attempting to connect to the wrong host/port.
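One way to chase that hadoop-env.sh error: `bash -n` parses a script without executing it and reports the first syntax error with its line number, and `cat -A` exposes invisible characters (CRLF shows as ^M, a non-breaking space as M-BM-) that often cause "syntax error near unexpected token" messages. A minimal sketch, using /tmp/env-demo.sh as a stand-in for your hadoop-env.sh (the stray `fi` is just one example of the kind of typo that breaks the whole file):

```shell
# Create a deliberately broken stand-in for hadoop-env.sh
cat > /tmp/env-demo.sh <<'EOF'
export JAVA_JDBC_LIBS=""
fi
export HADOOP_CLASSPATH=${HADOOP_CLASSPATH}${JAVA_JDBC_LIBS}
EOF

# bash -n parses only; a clean file prints nothing and exits 0
if bash -n /tmp/env-demo.sh 2>/dev/null; then
  result="clean"
else
  result="broken"
fi
echo "$result"

# cat -A makes hidden characters visible around the reported line
cat -A /tmp/env-demo.sh > /dev/null
```

Running the same two checks against /usr/hdp/2.6.4.0-91/hadoop/conf/hadoop-env.sh should pinpoint what is wrong on line 51 or just before it.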
I'd focus, though, on making sure the connection is going to the right host:port and that that host:port is accessible.
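A quick reachability check can be done from the client host with bash's built-in /dev/tcp redirection (no extra tools needed). The host below is a placeholder — substitute your actual NameNode host; 8020 is the RPC port from your error message:

```shell
# Probe TCP connectivity to the NameNode RPC port (host is a placeholder)
host=127.0.0.1
port=8020
if timeout 2 bash -c "cat < /dev/null > /dev/tcp/$host/$port" 2>/dev/null; then
  status="open"
else
  status="closed"   # refused or unreachable: nothing is listening there
fi
echo "NameNode RPC port $port on $host: $status"
```

If the port reports closed even on the NameNode host itself, the NameNode process is simply not running, and the "connection refused" is a symptom rather than the root cause.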
Created 10-31-2019 11:03 AM
Can you describe your cluster: is it single-node or HA, and which OS?
Assuming your cluster is single-node and NOT HA, please execute the below command as root to start the NameNode manually.
# su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"
And share the output from the above command along with a screenshot of the Ambari --> HDFS UI.