Namenode not starting on standby node Apache Hadoop HA

Explorer

 

Hello guys,

The problem: I started the cluster using ./start-all.sh from the standby node. That did not work properly, because SSH is not configured from the standby node.

 

Then I stopped the cluster on the standby node and started it from the master node.

 

Master:

20180 QuorumPeerMain
21432 ResourceManager
20756 DataNode
21057 JournalNode
20513 NameNode
21342 DFSZKFailoverController
21675 NodeManager
22554 Jps

 

Standby:

16066 Jps
15776 NodeManager
15554 DFSZKFailoverController
14673 QuorumPeerMain
15122 DataNode
15328 JournalNode
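
The two jps listings above can be compared mechanically. The following is a minimal Python sketch, not part of the original post: the helper name `parse_jps` is hypothetical, and the listings are copied from above. Note that a missing ResourceManager on the second node may be expected if YARN ResourceManager HA is not configured (that is an assumption about this cluster); the missing NameNode is the daemon in question.

```python
def parse_jps(output):
    """Return the set of Java process names in `jps` output, ignoring Jps itself."""
    names = set()
    for line in output.splitlines():
        parts = line.split()
        # Each jps line is "<pid> <main class name>".
        if len(parts) == 2 and parts[0].isdigit() and parts[1] != "Jps":
            names.add(parts[1])
    return names


# Listings copied from the post.
MASTER_JPS = """\
20180 QuorumPeerMain
21432 ResourceManager
20756 DataNode
21057 JournalNode
20513 NameNode
21342 DFSZKFailoverController
21675 NodeManager
22554 Jps
"""

STANDBY_JPS = """\
16066 Jps
15776 NodeManager
15554 DFSZKFailoverController
14673 QuorumPeerMain
15122 DataNode
15328 JournalNode
"""

# Daemons that run on the master but not on the standby.
missing_on_standby = parse_jps(MASTER_JPS) - parse_jps(STANDBY_JPS)
print(sorted(missing_on_standby))  # ['NameNode', 'ResourceManager']
```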

 

The NameNode process is not coming up on the standby node. The log is below.

The standby node's IP address is correct.

 

STARTUP_MSG:   host = node01-standby/192.168.171.151
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.6.0_30
************************************************************/
2015-04-05 18:46:33,902 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2015-04-05 18:46:33,903 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
2015-04-05 18:46:34,276 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-04-05 18:46:34,375 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-04-05 18:46:34,375 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2015-04-05 18:46:34,377 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: fs.defaultFS is hdfs://mycluster
2015-04-05 18:46:34,379 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Clients are to use mycluster to access this namenode/service.
2015-04-05 18:46:35,156 INFO org.apache.hadoop.hdfs.DFSUtil: Starting Web-server for hdfs at: http://node01-standby:50070
2015-04-05 18:46:35,230 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-04-05 18:46:35,232 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.namenode is not defined
2015-04-05 18:46:35,249 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-04-05 18:46:35,258 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2015-04-05 18:46:35,258 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-04-05 18:46:35,258 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-04-05 18:46:35,315 INFO org.apache.hadoop.http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2015-04-05 18:46:35,316 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2015-04-05 18:46:35,354 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 50070
2015-04-05 18:46:35,354 INFO org.mortbay.log: jetty-6.1.26
2015-04-05 18:46:35,863 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@node01-standby:50070
2015-04-05 18:46:35,901 WARN org.apache.hadoop.hdfs.server.common.Util: Path /app/hadoop/tmp/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
2015-04-05 18:46:35,901 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2015-04-05 18:46:35,907 WARN org.apache.hadoop.hdfs.server.common.Util: Path /app/hadoop/tmp/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
2015-04-05 18:46:35,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No KeyProvider found.
2015-04-05 18:46:35,943 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsLock is fair:true
2015-04-05 18:46:35,972 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
2015-04-05 18:46:35,973 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2015-04-05 18:46:35,975 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2015-04-05 18:46:35,976 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: The block deletion will start around 2015 Apr 05 18:46:35
2015-04-05 18:46:35,978 INFO org.apache.hadoop.util.GSet: Computing capacity for map BlocksMap
2015-04-05 18:46:35,978 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:35,980 INFO org.apache.hadoop.util.GSet: 2.0% max memory 966.7 MB = 19.3 MB

2015-04-05 18:46:35,980 INFO org.apache.hadoop.util.GSet: capacity      = 2^21 = 2097152 entries
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication         = 2
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication             = 512
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication             = 1
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams      = 2
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer        = false
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2015-04-05 18:46:35,994 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner             = hduser (auth:SIMPLE)
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup          = supergroup
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Determined nameservice ID: mycluster
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: true
2015-04-05 18:46:35,996 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: capacity      = 2^20 = 1048576 entries
2015-04-05 18:46:36,062 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2015-04-05 18:46:36,068 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks
2015-04-05 18:46:36,068 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:36,069 INFO org.apache.hadoop.util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
2015-04-05 18:46:36,069 INFO org.apache.hadoop.util.GSet: capacity      = 2^18 = 262144 entries
2015-04-05 18:46:36,070 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2015-04-05 18:46:36,070 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2015-04-05 18:46:36,070 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
2015-04-05 18:46:36,071 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled
2015-04-05 18:46:36,071 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2015-04-05 18:46:36,072 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2015-04-05 18:46:36,073 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:36,073 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
2015-04-05 18:46:36,073 INFO org.apache.hadoop.util.GSet: capacity      = 2^15 = 32768 entries
2015-04-05 18:46:36,076 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: ACLs enabled? false
2015-04-05 18:46:36,076 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: XAttrs enabled? true
2015-04-05 18:46:36,076 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
2015-04-05 18:46:36,085 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /app/hadoop/tmp/dfs/name/in_use.lock acquired by nodename 14912@node01-standby
2015-04-05 18:46:38,381 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2015-04-05 18:46:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:39,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:39,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:39,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:40,393 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:40,394 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:40,398 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:41,395 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:41,399 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:41,396 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:42,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:42,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:42,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:43,180 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 6001 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:43,405 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:43,405 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:43,405 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2015-04-05 18:46:44,182 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 7003 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:44,407 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:44,408 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:44,408 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:45,183 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 8004 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:45,409 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:45,410 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:45,410 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:46,184 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 9005 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:46,412 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:46,416 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:46,416 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,186 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 10007 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:47,413 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,424 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,424 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2015-04-05 18:46:47,426 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.171.147:8485, 192.168.171.148:8485, 192.168.171.151:8485]. Skipping.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1463)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1487)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:639)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1020)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:739)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:536)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
2015-04-05 18:46:47,430 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: No edit log streams selected.
2015-04-05 18:46:47,458 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 1 INodes.
2015-04-05 18:46:47,479 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf: Loaded FSImage in 0 seconds.
2015-04-05 18:46:47,479 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded image for txid 0 from /app/hadoop/tmp/dfs/name/current/fsimage_0000000000000000000
2015-04-05 18:46:47,483 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false)
2015-04-05 18:46:47,483 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups
2015-04-05 18:46:47,483 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 11407 msecs
2015-04-05 18:46:47,620 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to node01-standby:8020
2015-04-05 18:46:47,623 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-04-05 18:46:47,632 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2015-04-05 18:46:47,654 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean

2015-04-05 18:46:47,654 WARN org.apache.hadoop.hdfs.server.common.Util: Path /app/hadoop/tmp/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of blocks under construction: 0
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of blocks under construction: 0
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 11 secs
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2015-04-05 18:46:47,706 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-04-05 18:46:47,707 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8020: starting
2015-04-05 18:46:47,708 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode RPC up at: node01-standby/192.168.171.151:8020
2015-04-05 18:46:47,708 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state
2015-04-05 18:46:47,710 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Will roll logs on active node at node01/192.168.171.147:8020 every 120 seconds.
2015-04-05 18:46:47,721 INFO org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Starting standby checkpoint thread...
Checkpointing active NN at http://node01:50070
Serving checkpoints at http://node01-standby:50070
2015-04-05 18:46:48,723 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:48,724 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:48,724 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:49,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:49,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:49,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:50,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:50,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:50,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:51,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:51,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:51,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:52,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2015-04-05 18:46:57,768 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.171.147:8485, 192.168.171.148:8485, 192.168.171.151:8485]. Skipping.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1463)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1487)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:212)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:324)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
2015-04-05 18:46:57,768 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state
2015-04-05 18:46:57,768 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer interrupted
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:337)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
2015-04-05 18:46:57,769 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state
2015-04-05 18:46:57,807 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Starting recovery process for unclosed journal segments...
2015-04-05 18:46:58,844 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:58,845 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

2015-04-05 18:47:07,870 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.171.147:8485, 192.168.171.148:8485, 192.168.171.151:8485], stream=null))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createNewUniqueEpoch(QuorumJournalManager.java:182)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:436)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1394)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1149)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1655)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1533)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1246)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-04-05 18:47:07,872 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-04-05 18:47:07,873 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node01-standby/192.168.171.151
************************************************************/
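
The FATAL entry in the log above says that all three JournalNodes refused connections on port 8485, so the quorum of 2 out of 3 could not be reached when the NameNode was asked to become active. The sketch below is an illustration only, not something from the original post: it extracts the failing JournalNode addresses from the exception lines (copied from the log, trimmed after "Connection refused") and checks the majority arithmetic. The helper names `majority`, `failed_nodes`, and `can_connect` are hypothetical, and `can_connect` is only a plain TCP probe that one might run from the standby node.

```python
import re
import socket

# JournalNode addresses as listed in the "QJM to [...]" part of the log.
JOURNAL_NODES = ["192.168.171.147:8485", "192.168.171.148:8485", "192.168.171.151:8485"]

# Per-node failure lines from the FATAL QuorumException, trimmed after "Connection refused".
EXCEPTION_LINES = """\
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused
"""


def majority(n):
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def failed_nodes(text):
    """Return the set of ip:port addresses that reported a connection exception."""
    pattern = re.compile(
        r"^(\d+\.\d+\.\d+\.\d+:\d+): Call From .* failed on connection exception",
        re.MULTILINE,
    )
    return set(pattern.findall(text))


def can_connect(host, port, timeout=2.0):
    """Plain TCP probe: True if something accepts connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


failed = failed_nodes(EXCEPTION_LINES)
reachable = [jn for jn in JOURNAL_NODES if jn not in failed]
needed = majority(len(JOURNAL_NODES))
quorum_possible = len(reachable) >= needed

print(f"reachable={len(reachable)} needed={needed} quorum_possible={quorum_possible}")
```

On the cluster itself, a call such as `can_connect("node01", 8485)` from the standby node would show whether a JournalNode is actually listening there. The log alone does not say why the connections were refused.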

 

1 ACCEPTED SOLUTION

Explorer

It's resolved now.

 

 

 


4 REPLIES


New Contributor

Hi, could you please let me know what you did to resolve this?

New Contributor

How was this resolved?

New Contributor

Useless if no resolution is specified.