Posts: 10
Registered: 06-19-2018

Namenodes going down by AbstractDelegationTokenSecretManager error


Hi, I'm experiencing the following error every day on the active NameNode. It causes the active NameNode to transition to Standby, but the Standby NameNode does not take over as Active, so HDFS goes down and Impala queries and Spark jobs fail too:

2019-05-13 09:20:55,631 ERROR ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
2019-05-13 09:20:55,632 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Ending log segment 3390145558, 3390146568
2019-05-13 09:20:55,633 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: LazyPersistFileScrubber was interrupted, exiting
2019-05-13 09:20:55,636 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 1012 Total time for transactions(ms): 17 Number of transactions batched in Syncs: 71 Number of syncs: 941 SyncTimes(ms): 1278 1683
2019-05-13 09:20:55,637 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNodeEditLogRoller was interrupted, exiting
2019-05-13 09:20:55,653 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /mnt/data/dfs/nn/current/edits_inprogress_0000000003390145558 -> /mnt/data/dfs/nn/current/edits_0000000003390145558-0000000003390146569
2019-05-13 09:20:55,654 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: FSEditLogAsync was interrupted, exiting
2019-05-13 09:20:55,660 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Shutting down CacheReplicationMonitor
2019-05-13 09:20:55,661 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state
2019-05-13 09:20:55,665 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Will roll logs on active node at every 120 seconds.
2019-05-13 09:20:55,667 INFO org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Starting standby checkpoint thread...

How can I solve this?

Cloudera Employee
Posts: 53
Registered: 09-08-2017

Re: Namenodes going down by AbstractDelegationTokenSecretManager error

Hi rlopez,

Nothing in the log excerpt you've posted suggests any issue with the NameNode itself. Something else must have happened before this point to trigger your Failover Controller to transition this NameNode to Standby.

Usually this happens when the Failover Controller cannot get a response from the NameNode, for example because of excessive GC pauses on the NameNode. I would advise searching the Failover Controller logs for the point at which it decided to transition this NameNode to Standby, then searching the NameNode logs around that same timestamp.
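
If it helps with the second step, here is a minimal Python sketch for pulling every NameNode log line within a time window around a given timestamp. It assumes the timestamp format shown in your excerpt (for example 2019-05-13 09:20:55,631). The function names, the 60 second default window, and the log path you pass in are all illustrative assumptions, not part of Hadoop or Cloudera Manager.

```python
from datetime import datetime, timedelta

# Timestamp layout taken from the log excerpt in this thread.
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23  # length of "2019-05-13 09:20:55,631"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
    except ValueError:
        return None


def lines_near(lines, when, window_seconds=60):
    """Yield lines stamped within window_seconds of `when`.

    Lines without a timestamp (such as stack trace continuations)
    are kept or dropped together with the stamped line before them.
    """
    window = timedelta(seconds=window_seconds)
    in_window = False
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            in_window = abs(ts - when) <= window
        if in_window:
            yield line.rstrip("\n")


def scan_log(path, stamp, window_seconds=60):
    """Print the lines of the log at `path` (hypothetical) around `stamp`."""
    when = parse_ts(stamp)
    if when is None:
        raise ValueError("timestamp must look like 2019-05-13 09:20:55,631")
    with open(path, errors="replace") as handle:
        for hit in lines_near(handle, when, window_seconds):
            print(hit)
```

You would call scan_log with the path to your NameNode log and the timestamp at which the Failover Controller log reports the transition, then look in the output for warnings about long pauses or timeouts just before that moment.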