NameNode not starting on standby node (Apache Hadoop HA)

Explorer

Hello guys,

The problem: I started the cluster using ./start-all.sh from the standby node. Somehow it did work, even though passwordless SSH is not configured from the standby node.

I then stopped the cluster on the standby node and started it from the master node.
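
As a sanity check, passwordless SSH from whichever node runs the start scripts can be verified with something like the sketch below (the hostnames are the ones from this cluster; adjust to yours):

# Each ssh call should print the remote hostname without prompting for a password.
for host in node01 node02 node01-standby; do
  ssh -o BatchMode=yes "$host" hostname || echo "passwordless SSH to $host failed"
done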

 

Master (jps output):

20180 QuorumPeerMain
21432 ResourceManager
20756 DataNode
21057 JournalNode
20513 NameNode
21342 DFSZKFailoverController
21675 NodeManager
22554 Jps

 

Standby (jps output):

16066 Jps
15776 NodeManager
15554 DFSZKFailoverController
14673 QuorumPeerMain
15122 DataNode
15328 JournalNode

 

The NameNode process is not coming up on the standby node. The log is below.

The standby node's IP address is correct.
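
If the NameNode has to be started by hand on the standby to reproduce this log, something like the following works on Hadoop 2.x (a sketch; paths assume a tarball install under $HADOOP_HOME):

# Start only the NameNode daemon on this host, then follow its log.
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
tail -f $HADOOP_HOME/logs/hadoop-*-namenode-*.log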

 

STARTUP_MSG:   host = node01-standby/192.168.171.151
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.6.0_30
************************************************************/
2015-04-05 18:46:33,902 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2015-04-05 18:46:33,903 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
2015-04-05 18:46:34,276 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2015-04-05 18:46:34,375 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2015-04-05 18:46:34,375 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2015-04-05 18:46:34,377 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: fs.defaultFS is hdfs://mycluster
2015-04-05 18:46:34,379 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Clients are to use mycluster to access this namenode/service.
2015-04-05 18:46:35,156 INFO org.apache.hadoop.hdfs.DFSUtil: Starting Web-server for hdfs at: http://node01-standby:50070
2015-04-05 18:46:35,230 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-04-05 18:46:35,232 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.namenode is not defined
2015-04-05 18:46:35,249 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2015-04-05 18:46:35,258 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2015-04-05 18:46:35,258 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-04-05 18:46:35,258 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-04-05 18:46:35,315 INFO org.apache.hadoop.http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2015-04-05 18:46:35,316 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2015-04-05 18:46:35,354 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 50070
2015-04-05 18:46:35,354 INFO org.mortbay.log: jetty-6.1.26
2015-04-05 18:46:35,863 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@node01-standby:50070
2015-04-05 18:46:35,901 WARN org.apache.hadoop.hdfs.server.common.Util: Path /app/hadoop/tmp/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
2015-04-05 18:46:35,901 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2015-04-05 18:46:35,907 WARN org.apache.hadoop.hdfs.server.common.Util: Path /app/hadoop/tmp/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
2015-04-05 18:46:35,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No KeyProvider found.
2015-04-05 18:46:35,943 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsLock is fair:true
2015-04-05 18:46:35,972 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
2015-04-05 18:46:35,973 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2015-04-05 18:46:35,975 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2015-04-05 18:46:35,976 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: The block deletion will start around 2015 Apr 05 18:46:35
2015-04-05 18:46:35,978 INFO org.apache.hadoop.util.GSet: Computing capacity for map BlocksMap
2015-04-05 18:46:35,978 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:35,980 INFO org.apache.hadoop.util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
2015-04-05 18:46:35,980 INFO org.apache.hadoop.util.GSet: capacity      = 2^21 = 2097152 entries
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication         = 2
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication             = 512
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication             = 1
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams      = 2
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer        = false
2015-04-05 18:46:35,989 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2015-04-05 18:46:35,994 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner             = hduser (auth:SIMPLE)
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup          = supergroup
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Determined nameservice ID: mycluster
2015-04-05 18:46:35,995 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: true
2015-04-05 18:46:35,996 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
2015-04-05 18:46:36,051 INFO org.apache.hadoop.util.GSet: capacity      = 2^20 = 1048576 entries
2015-04-05 18:46:36,062 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2015-04-05 18:46:36,068 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks
2015-04-05 18:46:36,068 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:36,069 INFO org.apache.hadoop.util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
2015-04-05 18:46:36,069 INFO org.apache.hadoop.util.GSet: capacity      = 2^18 = 262144 entries
2015-04-05 18:46:36,070 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2015-04-05 18:46:36,070 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2015-04-05 18:46:36,070 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
2015-04-05 18:46:36,071 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled
2015-04-05 18:46:36,071 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2015-04-05 18:46:36,072 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2015-04-05 18:46:36,073 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
2015-04-05 18:46:36,073 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
2015-04-05 18:46:36,073 INFO org.apache.hadoop.util.GSet: capacity      = 2^15 = 32768 entries
2015-04-05 18:46:36,076 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: ACLs enabled? false
2015-04-05 18:46:36,076 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: XAttrs enabled? true
2015-04-05 18:46:36,076 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
2015-04-05 18:46:36,085 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /app/hadoop/tmp/dfs/name/in_use.lock acquired by nodename 14912@node01-standby
2015-04-05 18:46:38,381 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:39,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:39,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:39,391 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:40,393 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:40,394 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:40,398 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:41,395 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:41,399 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:41,396 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:42,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:42,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:42,402 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:43,180 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 6001 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:43,405 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:43,405 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:43,405 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:44,182 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 7003 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:44,407 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:44,408 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:44,408 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:45,183 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 8004 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:45,409 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:45,410 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:45,410 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:46,184 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 9005 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:46,412 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:46,416 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:46,416 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,186 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 10007 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2015-04-05 18:46:47,413 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,424 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,424 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:47,426 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.171.147:8485, 192.168.171.148:8485, 192.168.171.151:8485]. Skipping.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1463)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1487)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:639)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1020)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:739)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:536)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:595)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
2015-04-05 18:46:47,430 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: No edit log streams selected.
2015-04-05 18:46:47,458 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 1 INodes.
2015-04-05 18:46:47,479 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf: Loaded FSImage in 0 seconds.
2015-04-05 18:46:47,479 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded image for txid 0 from /app/hadoop/tmp/dfs/name/current/fsimage_0000000000000000000
2015-04-05 18:46:47,483 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false)
2015-04-05 18:46:47,483 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups
2015-04-05 18:46:47,483 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 11407 msecs
2015-04-05 18:46:47,620 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to node01-standby:8020
2015-04-05 18:46:47,623 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue
2015-04-05 18:46:47,632 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020
2015-04-05 18:46:47,654 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean
2015-04-05 18:46:47,654 WARN org.apache.hadoop.hdfs.server.common.Util: Path /app/hadoop/tmp/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of blocks under construction: 0
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of blocks under construction: 0
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 11 secs
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
2015-04-05 18:46:47,665 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
2015-04-05 18:46:47,706 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2015-04-05 18:46:47,707 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8020: starting
2015-04-05 18:46:47,708 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode RPC up at: node01-standby/192.168.171.151:8020
2015-04-05 18:46:47,708 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state
2015-04-05 18:46:47,710 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Will roll logs on active node at node01/192.168.171.147:8020 every 120 seconds.
2015-04-05 18:46:47,721 INFO org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Starting standby checkpoint thread...
Checkpointing active NN at http://node01:50070
Serving checkpoints at http://node01-standby:50070
2015-04-05 18:46:48,723 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:48,724 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:48,724 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:49,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:49,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:49,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:50,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:50,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:50,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:51,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:51,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node02/192.168.171.148:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:51,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:52,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:57,768 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.171.147:8485, 192.168.171.148:8485, 192.168.171.151:8485]. Skipping.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1463)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1487)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:212)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:324)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
2015-04-05 18:46:57,768 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state
2015-04-05 18:46:57,768 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer interrupted
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:337)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)
2015-04-05 18:46:57,769 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state
2015-04-05 18:46:57,807 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Starting recovery process for unclosed journal segments...
2015-04-05 18:46:58,844 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01-standby/192.168.171.151:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:46:58,845 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node01/192.168.171.147:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2015-04-05 18:47:07,870 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.171.147:8485, 192.168.171.148:8485, 192.168.171.151:8485], stream=null))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.171.151:8485: Call From node01-standby/192.168.171.151 to node01-standby:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.148:8485: Call From node01-standby/192.168.171.151 to node02:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.171.147:8485: Call From node01-standby/192.168.171.151 to node01:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
        at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createNewUniqueEpoch(QuorumJournalManager.java:182)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:436)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1394)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1149)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1655)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:63)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1533)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1246)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
2015-04-05 18:47:07,872 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-04-05 18:47:07,873 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node01-standby/192.168.171.151
************************************************************/
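
For what it's worth, the recurring error above is java.net.ConnectException: Connection refused against port 8485 on all three JournalNode hosts, even though jps shows a JournalNode process on each node. One way to narrow that down is to check what each JournalNode is actually listening on, and whether the port is reachable from the standby (a sketch using plain Linux tooling; hostnames are the ones from this thread):

# On each JournalNode host: is anything listening on 8485, and on which address?
# A JournalNode bound only to 127.0.0.1 (e.g. via a bad /etc/hosts entry) will
# refuse connections from other nodes.
sudo netstat -tlnp | grep 8485    # or: ss -tlnp | grep 8485

# From the standby node: is the port reachable over the network?
nc -vz node01 8485
nc -vz node02 8485
nc -vz node01-standby 8485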

 

1 ACCEPTED SOLUTION

Explorer

It's resolved now.


4 REPLIES

Explorer

It's resolved now.

New Contributor

Hi, could you please let me know what you did to resolve this?

New Contributor

How was this resolved?

New Contributor

This is useless if the resolution isn't specified.