Created on 12-25-2016 07:52 AM - edited 09-16-2022 01:37 AM
SYMPTOM: While following the HDP pre-upgrade steps, the active NameNode went down when the "hdfs dfsadmin -saveNamespace" command was issued.
The following error was returned:
================================================
hdfs@namenode1~$ hdfs dfsadmin -saveNamespace
saveNamespace: Call From namenode1.example.com/10.160.81.30 to namenode1.example.com:8020
failed on connection exception: java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
================================================
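The "Connection refused" in the output above generally means that nothing was accepting connections on the NameNode RPC port (8020) at that moment, which is consistent with the NameNode process being down. A minimal Python sketch for confirming this from the same host is shown here. The function name rpc_port_open and the 3-second default timeout are assumptions made for illustration and are not part of any Hadoop tooling.

```python
import socket


def rpc_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, rpc_port_open("namenode1.example.com", 8020) would be expected to return False while the NameNode is down. Note that a False result only shows that the port is unreachable; it does not say why the process stopped.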
ERROR:
2016-06-14 02:18:49,774 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2016-06-14 02:18:51,774 INFO ipc.Client (Client.java:handleConnectionFailure(859)) - Retrying connect to server: namenode1.example.com/10.10.20.30:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
2016-06-14 02:18:51,775 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2016-06-14 02:18:53,776 INFO ipc.Client (Client.java:handleConnectionFailure(859)) - Retrying connect to server: namenode1.example.com/10.10.20.30:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
2016-06-14 02:18:53,777 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2016-06-14 02:18:55,778 INFO ipc.Client (Client.java:handleConnectionFailure(859)) - Retrying connect to server: namenode1.example.com/10.10.20.30:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
2016-06-14 02:18:55,778 WARN ha.HealthMonitor (HealthMonitor.java:doHealthChecks(209)) - Transport-level exception trying to monitor health of NameNode at namenode1.example.com/10.10.20.30:8020: Call From namenode1.example.com/10.10.20.30 to namenode1.example.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
OR
ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Swallowing exception in NameNodeEditLogRoller:
java.lang.IllegalStateException: Bad state: BETWEEN_LOG_SEGMENTS
    at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.getCurSegmentTxId(FSEditLog.java:493)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$NameNodeEditLogRoller.run(FSNamesystem.java:4358)
    at java.lang.Thread.run(Thread.java:745)
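
To tell which of the two symptoms above appears in a given log, the messages can be matched by their distinctive text. The following is a rough Python sketch, not a diagnostic tool: the two patterns are copied from the excerpts above, while the label names and the function count_signatures are hypothetical.

```python
import re

# Patterns copied from the log excerpts; the dictionary keys are hypothetical labels.
SIGNATURES = {
    "rpc_connection_refused": re.compile(
        r"java\.net\.ConnectException: Connection refused"
    ),
    "edit_log_between_segments": re.compile(
        r"IllegalStateException: Bad state: BETWEEN_LOG_SEGMENTS"
    ),
}


def count_signatures(lines):
    """Count how many lines match each known error signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

The function accepts any iterable of strings, so an open log file can be passed in directly. The location of the log file depends on the installation and is not assumed here.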