Community Articles

sshimpi · 12-25-2016

SYMPTOM: While following the HDP pre-upgrade steps, the active NameNode went down when the "hdfs dfsadmin -saveNamespace" command was issued.

The command returned the following error:

================================================ 
hdfs@namenode1~$ hdfs dfsadmin -saveNamespace 
saveNamespace: Call From namenode1.example.com/10.160.81.30 to namenode1.example.com:8020 
failed on connection exception: java.net.ConnectException: Connection refused; For more details see: 
http://wiki.apache.org/hadoop/ConnectionRefused
================================================

ERROR:

2016-06-14 02:18:49,774 WARN  ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

2016-06-14 02:18:51,774 INFO  ipc.Client (Client.java:handleConnectionFailure(859)) - Retrying connect to server: namenode1.example.com/10.10.20.30:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)

2016-06-14 02:18:51,775 WARN  ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

2016-06-14 02:18:53,776 INFO  ipc.Client (Client.java:handleConnectionFailure(859)) - Retrying connect to server: namenode1.example.com/10.10.20.30:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)

2016-06-14 02:18:53,777 WARN  ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

2016-06-14 02:18:55,778 INFO  ipc.Client (Client.java:handleConnectionFailure(859)) - Retrying connect to server: namenode1.example.com/10.10.20.30:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)

2016-06-14 02:18:55,778 WARN  ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

OR

ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Swallowing exception in NameNodeEditLogRoller:
java.lang.IllegalStateException: Bad state: BETWEEN_LOG_SEGMENTS
        at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.getCurSegmentTxId(FSEditLog.java:493)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$NameNodeEditLogRoller.run(FSNamesystem.java:4358)
        at java.lang.Thread.run(Thread.java:745)

ROOT CAUSE: This is a known bug, HDFS-7871 (https://issues.apache.org/jira/browse/HDFS-7871), which has been fixed in HDP 2.2.9 and HDP 2.4.

RESOLUTION: Upgrading to HDP 2.4.0.0-169 resolved the issue.
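
To check whether a NameNode or ZKFC log shows the same symptoms as the excerpts above, a small scan along these lines may help. This is only a sketch: the function names and the signature labels are made up for illustration, and the patterns are copied from the log lines quoted in this article. A match means the log contains the same messages; it does not by itself confirm that HDFS-7871 is the cause.

```python
import re

# Patterns copied from the log excerpts quoted in this article.
# The dictionary keys are illustrative labels, not Hadoop terminology.
SIGNATURES = {
    "connection_refused": re.compile(
        r"java\.net\.ConnectException: Connection refused"
    ),
    "edit_log_roller": re.compile(
        r"Swallowing exception in NameNodeEditLogRoller"
    ),
    "bad_state": re.compile(
        r"IllegalStateException: Bad state: BETWEEN_LOG_SEGMENTS"
    ),
}


def scan_log_lines(lines):
    """Count how many lines match each signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_log_file(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return scan_log_lines(handle)
```

For example, calling scan_log_file() with the path to the NameNode log (the location depends on the installation) returns a dictionary of per-signature line counts. Non-zero counts for "edit_log_roller" and "bad_state" correspond to the second error shown above.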

Active NameNode goes down when issuing "hdfs dfsadmin -saveNamespace"

Apache Hadoop

HDFS

Hortonworks Data Platform (HDP)
