Created on 04-09-2018 12:01 AM - edited 09-16-2022 06:04 AM
Hi all,
I have set up a 3-node Cloudera 5.14 cluster on Azure with Ubuntu 14 machines, and there seems to be an issue when I start the cluster.
If I shut down the Azure instances for the day and start them again the next morning, and then try to start the Cloudera cluster, all services come up except the NameNode. The NameNode does not start and shows the error message "the NameNode is not formatted". The workaround I use is to format the NameNode every morning (yes, it is a test cluster) and then carry on with the rest of the work. Any clue what is going wrong?
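For reference, the morning workaround is roughly the following (a sketch, not an exact transcript: HDFS must be stopped in Cloudera Manager first, and this wipes all HDFS metadata, which is only tolerable because it is a test cluster):

# On the NameNode host, re-format the metadata directory as the hdfs user
sudo -u hdfs hdfs namenode -format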
STARTUP_MSG: java = 1.7.0_67
************************************************************/
2018-04-08 23:29:41,562 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2018-04-08 23:29:41,565 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
2018-04-08 23:29:41,945 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-04-08 23:29:42,067 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2018-04-08 23:29:42,067 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2018-04-08 23:29:42,097 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: fs.defaultFS is hdfs://udp-cdh-node1:8020
2018-04-08 23:29:42,097 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Clients are to use udp-cdh-node1:8020 to access this namenode/service.
2018-04-08 23:29:42,404 INFO org.apache.hadoop.util.JvmPauseMonitor: Starting JVM pause monitor
2018-04-08 23:29:42,418 INFO org.apache.hadoop.hdfs.DFSUtil: Starting Web-server for hdfs at: http://udp-cdh-node1:50070
2018-04-08 23:29:42,468 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2018-04-08 23:29:42,476 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2018-04-08 23:29:42,483 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.namenode is not defined
2018-04-08 23:29:42,495 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2018-04-08 23:29:42,499 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2018-04-08 23:29:42,499 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2018-04-08 23:29:42,499 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2018-04-08 23:29:42,716 INFO org.apache.hadoop.http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2018-04-08 23:29:42,718 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2018-04-08 23:29:42,734 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 50070
2018-04-08 23:29:42,734 INFO org.mortbay.log: jetty-6.1.26.cloudera.4
2018-04-08 23:29:43,038 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@udp-cdh-node1:50070
2018-04-08 23:29:43,073 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2018-04-08 23:29:43,073 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
2018-04-08 23:29:43,116 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Edit logging is async:true
2018-04-08 23:29:43,129 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No KeyProvider found.
2018-04-08 23:29:43,139 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsLock is fair: true
2018-04-08 23:29:43,303 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
2018-04-08 23:29:43,303 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2018-04-08 23:29:43,304 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2018-04-08 23:29:43,305 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: The block deletion will start around 2018 Apr 08 23:29:43
2018-04-08 23:29:43,307 INFO org.apache.hadoop.util.GSet: Computing capacity for map BlocksMap
2018-04-08 23:29:43,307 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2018-04-08 23:29:43,310 INFO org.apache.hadoop.util.GSet: 2.0% max memory 1.0 GB = 21.5 MB
2018-04-08 23:29:43,310 INFO org.apache.hadoop.util.GSet: capacity = 2^21 = 2097152 entries
2018-04-08 23:29:43,317 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2018-04-08 23:29:43,319 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication = 3
2018-04-08 23:29:43,319 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication = 512
2018-04-08 23:29:43,319 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication = 1
2018-04-08 23:29:43,320 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams = 20
2018-04-08 23:29:43,320 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2018-04-08 23:29:43,320 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer = false
2018-04-08 23:29:43,320 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2018-04-08 23:29:43,327 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner = hdfs (auth:SIMPLE)
2018-04-08 23:29:43,327 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup = supergroup
2018-04-08 23:29:43,327 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2018-04-08 23:29:43,327 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2018-04-08 23:29:43,329 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2018-04-08 23:29:43,506 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap
2018-04-08 23:29:43,506 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2018-04-08 23:29:43,506 INFO org.apache.hadoop.util.GSet: 1.0% max memory 1.0 GB = 10.7 MB
2018-04-08 23:29:43,507 INFO org.apache.hadoop.util.GSet: capacity = 2^20 = 1048576 entries
2018-04-08 23:29:43,507 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: POSIX ACL inheritance enabled? false
2018-04-08 23:29:43,508 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2018-04-08 23:29:43,514 INFO org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true
2018-04-08 23:29:43,519 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks
2018-04-08 23:29:43,520 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2018-04-08 23:29:43,520 INFO org.apache.hadoop.util.GSet: 0.25% max memory 1.0 GB = 2.7 MB
2018-04-08 23:29:43,520 INFO org.apache.hadoop.util.GSet: capacity = 2^18 = 262144 entries
2018-04-08 23:29:43,523 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2018-04-08 23:29:43,523 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 1
2018-04-08 23:29:43,524 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
2018-04-08 23:29:43,527 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2018-04-08 23:29:43,528 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2018-04-08 23:29:43,528 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2018-04-08 23:29:43,532 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled
2018-04-08 23:29:43,532 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2018-04-08 23:29:43,535 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2018-04-08 23:29:43,535 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2018-04-08 23:29:43,535 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 1.0 GB = 330.2 KB
2018-04-08 23:29:43,535 INFO org.apache.hadoop.util.GSet: capacity = 2^15 = 32768 entries
2018-04-08 23:29:43,538 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ACLs enabled? false
2018-04-08 23:29:43,538 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: XAttrs enabled? true
2018-04-08 23:29:43,538 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Maximum size of an xattr: 16384
2018-04-08 23:29:43,548 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /mnt/dfs/nn/in_use.lock acquired by nodename 5705@udp-cdh-node1
2018-04-08 23:29:43,552 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:232)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1150)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:797)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
2018-04-08 23:29:43,567 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@udp-cdh-node1:50070
2018-04-08 23:29:43,568 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2018-04-08 23:29:43,568 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2018-04-08 23:29:43,569 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2018-04-08 23:29:43,569 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.IOException: NameNode is not formatted.
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:232)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1150)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:797)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
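One detail in the log above may be the root cause, though this is an assumption rather than a confirmed diagnosis: the NameNode metadata directory is /mnt/dfs/nn (see the in_use.lock line), and on Azure Ubuntu images /mnt is normally the temporary (ephemeral) resource disk, which is wiped whenever the VM is deallocated. That would explain why the fsimage disappears overnight. It is easy to check with the commands below (a hedged sketch; the paths come from the log, and the DATALOSS marker file is the one the Azure Linux agent places on the resource disk):

# Where does the NameNode keep its metadata?
hdfs getconf -confKey dfs.namenode.name.dir

# Is /mnt the Azure ephemeral resource disk? If this warning file
# exists there, anything on that disk is lost on deallocation.
ls -l /mnt/DATALOSS_WARNING_README.txt

If /mnt turns out to be the ephemeral disk, pointing dfs.namenode.name.dir (and the DataNode data directories) at a persistent data disk should stop the daily metadata loss.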
Created 04-09-2018 03:48 AM
Created 04-09-2018 09:03 AM
Glad your problem is solved. Thanks for letting the community know the solution. Hopefully if someone else ends up in your situation they will find your answer helpful.
Created 06-04-2019 01:39 AM
Hey arunvpy,
Can you explain briefly what you did to get rid of this problem?
Created 06-28-2019 11:38 AM
I had the same issue with Cloudera on Azure.
To reproduce it quickly:
1) Shut down all the services via Cloudera Manager
2) Shut down the VMs via the Altus portal
On startup, HDFS (the NameNode) and all dependent services began reporting errors.
The problem went away after I switched to an HA configuration via bootstrap-remote, as sketched below.
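For anyone trying to reproduce that, the Director client call looks roughly like this (a sketch: the config file name, credentials, and host are placeholders for your own environment; bootstrap-remote and the --lp.remote.* flags are the Cloudera Altus Director CLI options):

# Deploy the cluster definition (with HDFS HA enabled in the conf file)
# against a running Cloudera Altus Director server
cloudera-director bootstrap-remote azure.cluster.conf \
    --lp.remote.username=admin \
    --lp.remote.password=admin \
    --lp.remote.hostAndPort=director-host:7189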