
Datanodes are not able to start in HDP 3.0

I have installed Ambari 2.7 and HDP 3.0. The NameNode and Secondary NameNode are up, but the DataNode goes down a few seconds after starting. I have tried cd /hadoop/hdfs/data and removing the current folder, but it did not work. The logs from /var/log/hadoop/hdfs/hadoop-hdfs-datanode-machine.log are attached below.

==> /var/log/hadoop/hdfs/hadoop-hdfs-datanode-machine.log <==
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:10:34,999 WARN  datanode.DataNode (BPServiceActor.java:run(853)) - Ending block pool service for: Block pool <registering> (Datanode Uuid 3b87ba72-4e41-425f-afb0-17d2b37dde4b) service to machine/xx.xxx.x.xxx.x:8020
2019-02-12 11:10:35,102 INFO  datanode.DataNode (BlockPoolManager.java:remove(102)) - Removed Block pool <registering> (Datanode Uuid 3b87ba72-4e41-425f-afb0-17d2b37dde4b)
2019-02-12 11:10:37,102 WARN  datanode.DataNode (DataNode.java:secureMain(2890)) - Exiting Datanode
2019-02-12 11:10:37,108 INFO  datanode.DataNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at machine/xx.xxx.xxx.x.4
2 REPLIES

Re: Datanodes are not able to start in HDP 3.0

Super Mentor

@Shraddha Singh

Can you please share the complete error stack trace from "/var/log/hadoop/hdfs/hadoop-hdfs-datanode-machine.log"?

It would be better if you could restart the DataNode, collect the whole log it produces during that restart attempt, and then share that log here.
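
A minimal sketch of what I mean, run on the affected DataNode host (the log path is the one from your post, where "machine" stands for the host's short name; the restart can equally be done from the Ambari UI instead of the manual commands):

# Restart the DataNode and capture the log it writes during the failed startup.
LOG=/var/log/hadoop/hdfs/hadoop-hdfs-datanode-$(hostname -s).log

su - hdfs -c "hdfs --daemon stop datanode"     # Hadoop 3 daemon syntax shipped with HDP 3.0
su - hdfs -c "hdfs --daemon start datanode"

# Give it a moment to fail, then capture everything from this startup attempt:
sleep 30
tail -n 500 "$LOG"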

Re: Datanodes are not able to start in HDP 3.0

Hi @Jay kumar SenSharma,

Below is the log that was generated.

************************************************************/
2019-02-12 11:29:49,759 INFO  datanode.DataNode (LogAdapter.java:info(51)) - registered UNIX signal handlers for [TERM, HUP, INT]
2019-02-12 11:29:50,817 INFO  checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(137)) - Scheduling a check for [DISK]file:/hadoop/hdfs/data
2019-02-12 11:29:50,842 INFO  checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(137)) - Scheduling a check for [DISK]file:/data/0/hadoop/hdfs/data
2019-02-12 11:29:50,846 INFO  checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(137)) - Scheduling a check for [DISK]file:/data/1/hadoop/hdfs/data
2019-02-12 11:29:51,087 INFO  impl.MetricsConfig (MetricsConfig.java:loadFirst(118)) - Loaded properties from hadoop-metrics2.properties
2019-02-12 11:29:52,011 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(85)) - Initializing Timeline metrics sink.
2019-02-12 11:29:52,012 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(105)) - Identified hostname = devaz02, serviceName = datanode
2019-02-12 11:29:52,117 INFO  availability.MetricSinkWriteShardHostnameHashingStrategy (MetricSinkWriteShardHostnameHashingStrategy.java:findCollectorShard(42)) - Calculated collector shard devaz03 based on hostname: devaz02
2019-02-12 11:29:52,117 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(130)) - Collector Uri: http://devaz03:6188/ws/v1/timeline/metrics
2019-02-12 11:29:52,122 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(131)) - Container Metrics Uri: http://devaz03:6188/ws/v1/timeline/containermetrics
2019-02-12 11:29:52,136 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(204)) - Sink timeline started
2019-02-12 11:29:52,269 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(374)) - Scheduled Metric snapshot period at 10 second(s).
2019-02-12 11:29:52,269 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:start(191)) - DataNode metrics system started
2019-02-12 11:29:52,733 INFO  common.Util (Util.java:isDiskStatsEnabled(395)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-02-12 11:29:52,736 INFO  datanode.BlockScanner (BlockScanner.java:<init>(184)) - Initialized block scanner with targetBytesPerSec 1048576
2019-02-12 11:29:52,747 INFO  datanode.DataNode (DataNode.java:<init>(486)) - File descriptor passing is enabled.
2019-02-12 11:29:52,748 INFO  datanode.DataNode (DataNode.java:<init>(499)) - Configured hostname is devaz02
2019-02-12 11:29:52,753 INFO  common.Util (Util.java:isDiskStatsEnabled(395)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-02-12 11:29:52,757 INFO  datanode.DataNode (DataNode.java:startDataNode(1399)) - Starting DataNode with maxLockedMemory = 0
2019-02-12 11:29:52,805 INFO  datanode.DataNode (DataNode.java:initDataXceiver(1147)) - Opened streaming server at /0.0.0.0:50010
2019-02-12 11:29:52,807 INFO  datanode.DataNode (DataXceiverServer.java:<init>(78)) - Balancing bandwidth is 6250000 bytes/s
2019-02-12 11:29:52,808 INFO  datanode.DataNode (DataXceiverServer.java:<init>(79)) - Number threads for balancing is 50
2019-02-12 11:29:52,815 INFO  datanode.DataNode (DataXceiverServer.java:<init>(78)) - Balancing bandwidth is 6250000 bytes/s
2019-02-12 11:29:52,816 INFO  datanode.DataNode (DataXceiverServer.java:<init>(79)) - Number threads for balancing is 50
2019-02-12 11:29:52,816 INFO  datanode.DataNode (DataNode.java:initDataXceiver(1165)) - Listening on UNIX domain socket: /var/lib/hadoop-hdfs/dn_socket
2019-02-12 11:29:52,905 INFO  util.log (Log.java:initialized(192)) - Logging initialized @4548ms
2019-02-12 11:29:53,107 INFO  server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(240)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-02-12 11:29:53,115 INFO  http.HttpRequestLog (HttpRequestLog.java:getRequestLog(81)) - Http request log for http.requests.datanode is not defined
2019-02-12 11:29:53,127 INFO  http.HttpServer2 (HttpServer2.java:addGlobalFilter(968)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-02-12 11:29:53,135 INFO  http.HttpServer2 (HttpServer2.java:addFilter(941)) - Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context datanode
2019-02-12 11:29:53,136 INFO  http.HttpServer2 (HttpServer2.java:addFilter(951)) - Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context logs
2019-02-12 11:29:53,136 INFO  http.HttpServer2 (HttpServer2.java:addFilter(951)) - Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context static
2019-02-12 11:29:53,136 INFO  security.HttpCrossOriginFilterInitializer (HttpCrossOriginFilterInitializer.java:initFilter(49)) - CORS filter not enabled. Please set hadoop.http.cross-origin.enabled to 'true' to enable it
2019-02-12 11:29:53,185 INFO  http.HttpServer2 (HttpServer2.java:bindListener(1185)) - Jetty bound to port 44838
2019-02-12 11:29:53,186 INFO  server.Server (Server.java:doStart(346)) - jetty-9.3.19.v20170502
2019-02-12 11:29:53,253 INFO  server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(240)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-02-12 11:29:53,260 INFO  handler.ContextHandler (ContextHandler.java:doStart(781)) - Started o.e.j.s.ServletContextHandler@2a79d4b1{/logs,file:///var/log/hadoop/hdfs/,AVAILABLE}
2019-02-12 11:29:53,261 INFO  handler.ContextHandler (ContextHandler.java:doStart(781)) - Started o.e.j.s.ServletContextHandler@17cdf2d0{/static,file:///usr/hdp/3.0.1.0-187/hadoop-hdfs/webapps/static/,AVAILABLE}
2019-02-12 11:29:53,399 INFO  handler.ContextHandler (ContextHandler.java:doStart(781)) - Started o.e.j.w.WebAppContext@662f5666{/,file:///usr/hdp/3.0.1.0-187/hadoop-hdfs/webapps/datanode/,AVAILABLE}{/datanode}
2019-02-12 11:29:53,416 INFO  server.AbstractConnector (AbstractConnector.java:doStart(278)) - Started ServerConnector@4879f0f2{HTTP/1.1,[http/1.1]}{localhost:44838}
2019-02-12 11:29:53,417 INFO  server.Server (Server.java:doStart(414)) - Started @5060ms
2019-02-12 11:29:53,770 INFO  web.DatanodeHttpServer (DatanodeHttpServer.java:start(255)) - Listening HTTP traffic on /0.0.0.0:50075
2019-02-12 11:29:53,781 INFO  util.JvmPauseMonitor (JvmPauseMonitor.java:run(188)) - Starting JVM pause monitor
2019-02-12 11:29:53,860 INFO  datanode.DataNode (DataNode.java:startDataNode(1427)) - dnUserName = hdfs
2019-02-12 11:29:53,860 INFO  datanode.DataNode (DataNode.java:startDataNode(1428)) - supergroup = hdfs
2019-02-12 11:29:53,956 INFO  ipc.CallQueueManager (CallQueueManager.java:<init>(84)) - Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 1000 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2019-02-12 11:29:53,986 INFO  ipc.Server (Server.java:run(1074)) - Starting Socket Reader #1 for port 8010
2019-02-12 11:29:54,518 INFO  datanode.DataNode (DataNode.java:initIpcServer(1033)) - Opened IPC server at /0.0.0.0:8010
2019-02-12 11:29:54,548 INFO  datanode.DataNode (BlockPoolManager.java:refreshNamenodes(149)) - Refresh request received for nameservices: null
2019-02-12 11:29:54,569 INFO  datanode.DataNode (BlockPoolManager.java:doRefreshNamenodes(210)) - Starting BPOfferServices for nameservices: <default>
2019-02-12 11:29:54,600 INFO  datanode.DataNode (BPServiceActor.java:run(810)) - Block pool <registering> (Datanode Uuid unassigned) service to devaz01/10.161.137.4:8020 starting to offer service
2019-02-12 11:29:54,612 INFO  ipc.Server (Server.java:run(1314)) - IPC Server Responder: starting
2019-02-12 11:29:54,613 INFO  ipc.Server (Server.java:run(1153)) - IPC Server listener on 8010: starting
2019-02-12 11:29:54,911 INFO  datanode.DataNode (BPOfferService.java:verifyAndSetNamespaceInfo(378)) - Acknowledging ACTIVE Namenode during handshakeBlock pool <registering> (Datanode Uuid unassigned) service to devaz01/10.161.137.4:8020
2019-02-12 11:29:54,915 INFO  common.Storage (DataStorage.java:getParallelVolumeLoadThreadsNum(354)) - Using 3 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=3, dataDirs=3)
2019-02-12 11:29:54,963 INFO  common.Storage (Storage.java:tryLock(905)) - Lock on /hadoop/hdfs/data/in_use.lock acquired by nodename 32007@devaz02
2019-02-12 11:29:55,014 INFO  common.Storage (Storage.java:tryLock(905)) - Lock on /data/0/hadoop/hdfs/data/in_use.lock acquired by nodename 32007@devaz02
2019-02-12 11:29:55,016 WARN  common.Storage (DataStorage.java:loadDataStorage(418)) - Failed to add storage directory [DISK]file:/data/0/hadoop/hdfs/data
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /data/0/hadoop/hdfs/data is in an inconsistent state: cluster Id is incompatible with others.
        at org.apache.hadoop.hdfs.server.common.StorageInfo.setClusterId(StorageInfo.java:193)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:620)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:605)
        at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:134)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:714)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:551)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1718)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1678)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:29:55,038 INFO  common.Storage (Storage.java:tryLock(905)) - Lock on /data/1/hadoop/hdfs/data/in_use.lock acquired by nodename 32007@devaz02
2019-02-12 11:29:55,039 WARN  common.Storage (DataStorage.java:loadDataStorage(418)) - Failed to add storage directory [DISK]file:/data/1/hadoop/hdfs/data
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /data/1/hadoop/hdfs/data is in an inconsistent state: cluster Id is incompatible with others.
        at org.apache.hadoop.hdfs.server.common.StorageInfo.setClusterId(StorageInfo.java:193)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:620)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:605)
        at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:134)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:714)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:551)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1718)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1678)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:29:55,072 INFO  common.Storage (BlockPoolSliceStorage.java:recoverTransitionRead(251)) - Analyzing storage directories for bpid BP-49067060-10.161.137.4-1549610122990
2019-02-12 11:29:55,072 INFO  common.Storage (Storage.java:lock(864)) - Locking is disabled for /hadoop/hdfs/data/current/BP-49067060-10.161.137.4-1549610122990
2019-02-12 11:29:55,087 INFO  datanode.DataNode (DataNode.java:initStorage(1721)) - Setting up storage: nsid=1887919669;bpid=BP-49067060-10.161.137.4-1549610122990;lv=-57;nsInfo=lv=-64;cid=CID-20f8f873-2247-415f-ab5d-580d50c81e0a;nsid=1887919669;c=1549610122990;bpid=BP-49067060-10.161.137.4-1549610122990;dnuuid=2d5b2949-fa6a-405a-b56d-e8a0ef860e37
2019-02-12 11:29:55,115 ERROR datanode.DataNode (BPServiceActor.java:run(829)) - Initialization failed for Block pool <registering> (Datanode Uuid 2d5b2949-fa6a-405a-b56d-e8a0ef860e37) service to devaz01/10.161.137.4:8020. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 1, volumes configured: 3, volumes failed: 2, volume failures tolerated: 0
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:311)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1732)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1678)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:29:55,115 WARN  datanode.DataNode (BPServiceActor.java:run(853)) - Ending block pool service for: Block pool <registering> (Datanode Uuid 2d5b2949-fa6a-405a-b56d-e8a0ef860e37) service to devaz01/10.161.137.4:8020
2019-02-12 11:29:55,218 INFO  datanode.DataNode (BlockPoolManager.java:remove(102)) - Removed Block pool <registering> (Datanode Uuid 2d5b2949-fa6a-405a-b56d-e8a0ef860e37)
2019-02-12 11:29:57,218 WARN  datanode.DataNode (DataNode.java:secureMain(2890)) - Exiting Datanode
2019-02-12 11:29:57,223 INFO  datanode.DataNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at devaz02/10.161.137.5
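
For anyone tracing the same failure: the stack traces above show two of the three configured data directories being rejected with InconsistentFSStateException because the clusterID recorded in their VERSION files no longer matches the cluster the NameNode presents (the CID value in the "Setting up storage" line above), and with zero volume failures tolerated the DataNode exits. A minimal check, assuming the three data directories listed in the startup log, would be:

# Run as the hdfs user on devaz02. Each initialized data directory keeps a
# VERSION file recording the clusterID it was formatted against.
for d in /hadoop/hdfs/data /data/0/hadoop/hdfs/data /data/1/hadoop/hdfs/data; do
  echo "== $d"
  grep clusterID "$d/current/VERSION" 2>/dev/null || echo "(no VERSION file)"
done

# Compare with the clusterID the NameNode holds (on the NameNode host, devaz01):
hdfs getconf -confKey dfs.namenode.name.dir    # locate the NameNode metadata directory
# then inspect the clusterID line in <that directory>/current/VERSION

If the two /data/* directories carry a clusterID left over from an earlier format, the usual options are to clear those directories (only if the blocks on them are disposable) or to correct the clusterID line to match the NameNode's, and then restart the DataNode.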