Created 02-12-2019 11:17 AM
I have installed Ambari 2.7 and HDP 3.0. The NameNode and Secondary NameNode are up, but the DataNode goes down a few seconds after starting. I tried going into /hadoop/hdfs/data and removing the current folder, but it did not help. The relevant excerpt from /var/log/hadoop/hdfs/hadoop-hdfs-datanode-machine.log is below:
==> /var/log/hadoop/hdfs/hadoop-hdfs-datanode-machine.log <==
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:10:34,999 WARN datanode.DataNode (BPServiceActor.java:run(853)) - Ending block pool service for: Block pool <registering> (Datanode Uuid 3b87ba72-4e41-425f-afb0-17d2b37dde4b) service to machine/xx.xxx.x.xxx.x:8020
2019-02-12 11:10:35,102 INFO datanode.DataNode (BlockPoolManager.java:remove(102)) - Removed Block pool <registering> (Datanode Uuid 3b87ba72-4e41-425f-afb0-17d2b37dde4b)
2019-02-12 11:10:37,102 WARN datanode.DataNode (DataNode.java:secureMain(2890)) - Exiting Datanode
2019-02-12 11:10:37,108 INFO datanode.DataNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at machine/xx.xxx.xxx.x.4
Created 02-12-2019 11:21 AM
Can you please share the complete error stack trace from /var/log/hadoop/hdfs/hadoop-hdfs-datanode-machine.log?
Better still, restart the DataNode, collect the whole log it produces during that restart attempt, and share it here.
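If it helps, here is a rough sketch of that collection step: mark where the log currently ends, restart, then share only the lines written afterwards. The demo below uses a throwaway file in /tmp; on the cluster, substitute the real DataNode log path and restart through your usual mechanism (e.g. the Ambari UI), which is why the restart command is left commented out.

```shell
# Stand-in for /var/log/hadoop/hdfs/hadoop-hdfs-datanode-<host>.log so the
# sketch is runnable anywhere; point LOG at the real file on the DataNode.
LOG=/tmp/demo-datanode.log
printf 'old line 1\nold line 2\n' > "$LOG"

before=$(wc -l < "$LOG")   # remember where the log ended before the restart
# Restart the DataNode here, e.g. via the Ambari UI; the fresh startup
# attempt appends its output to $LOG (simulated by the line below).
printf 'new line after restart\n' >> "$LOG"

# Capture only the lines produced by the restart attempt
tail -n +"$((before + 1))" "$LOG"
```

Only the post-restart lines matter here, since they will contain the full initialization failure rather than a truncated tail.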
Created 02-12-2019 03:52 PM
Hi @Jay kumar SenSharma, below is the log generated after restarting the DataNode:

************************************************************/
2019-02-12 11:29:49,759 INFO datanode.DataNode (LogAdapter.java:info(51)) - registered UNIX signal handlers for [TERM, HUP, INT]
2019-02-12 11:29:50,817 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(137)) - Scheduling a check for [DISK]file:/hadoop/hdfs/data
2019-02-12 11:29:50,842 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(137)) - Scheduling a check for [DISK]file:/data/0/hadoop/hdfs/data
2019-02-12 11:29:50,846 INFO checker.ThrottledAsyncChecker (ThrottledAsyncChecker.java:schedule(137)) - Scheduling a check for [DISK]file:/data/1/hadoop/hdfs/data
2019-02-12 11:29:51,087 INFO impl.MetricsConfig (MetricsConfig.java:loadFirst(118)) - Loaded properties from hadoop-metrics2.properties
2019-02-12 11:29:52,011 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(85)) - Initializing Timeline metrics sink.
2019-02-12 11:29:52,012 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(105)) - Identified hostname = devaz02, serviceName = datanode
2019-02-12 11:29:52,117 INFO availability.MetricSinkWriteShardHostnameHashingStrategy (MetricSinkWriteShardHostnameHashingStrategy.java:findCollectorShard(42)) - Calculated collector shard devaz03 based on hostname: devaz02
2019-02-12 11:29:52,117 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(130)) - Collector Uri: http://devaz03:6188/ws/v1/timeline/metrics
2019-02-12 11:29:52,122 INFO timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(131)) - Container Metrics Uri: http://devaz03:6188/ws/v1/timeline/containermetrics
2019-02-12 11:29:52,136 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(204)) - Sink timeline started
2019-02-12 11:29:52,269 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(374)) - Scheduled Metric snapshot period at 10 second(s).
2019-02-12 11:29:52,269 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:start(191)) - DataNode metrics system started
2019-02-12 11:29:52,733 INFO common.Util (Util.java:isDiskStatsEnabled(395)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-02-12 11:29:52,736 INFO datanode.BlockScanner (BlockScanner.java:<init>(184)) - Initialized block scanner with targetBytesPerSec 1048576
2019-02-12 11:29:52,747 INFO datanode.DataNode (DataNode.java:<init>(486)) - File descriptor passing is enabled.
2019-02-12 11:29:52,748 INFO datanode.DataNode (DataNode.java:<init>(499)) - Configured hostname is devaz02
2019-02-12 11:29:52,753 INFO common.Util (Util.java:isDiskStatsEnabled(395)) - dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2019-02-12 11:29:52,757 INFO datanode.DataNode (DataNode.java:startDataNode(1399)) - Starting DataNode with maxLockedMemory = 0
2019-02-12 11:29:52,805 INFO datanode.DataNode (DataNode.java:initDataXceiver(1147)) - Opened streaming server at /0.0.0.0:50010
2019-02-12 11:29:52,807 INFO datanode.DataNode (DataXceiverServer.java:<init>(78)) - Balancing bandwidth is 6250000 bytes/s
2019-02-12 11:29:52,808 INFO datanode.DataNode (DataXceiverServer.java:<init>(79)) - Number threads for balancing is 50
2019-02-12 11:29:52,815 INFO datanode.DataNode (DataXceiverServer.java:<init>(78)) - Balancing bandwidth is 6250000 bytes/s
2019-02-12 11:29:52,816 INFO datanode.DataNode (DataXceiverServer.java:<init>(79)) - Number threads for balancing is 50
2019-02-12 11:29:52,816 INFO datanode.DataNode (DataNode.java:initDataXceiver(1165)) - Listening on UNIX domain socket: /var/lib/hadoop-hdfs/dn_socket
2019-02-12 11:29:52,905 INFO util.log (Log.java:initialized(192)) - Logging initialized @4548ms
2019-02-12 11:29:53,107 INFO server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(240)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-02-12 11:29:53,115 INFO http.HttpRequestLog (HttpRequestLog.java:getRequestLog(81)) - Http request log for http.requests.datanode is not defined
2019-02-12 11:29:53,127 INFO http.HttpServer2 (HttpServer2.java:addGlobalFilter(968)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2019-02-12 11:29:53,135 INFO http.HttpServer2 (HttpServer2.java:addFilter(941)) - Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context datanode
2019-02-12 11:29:53,136 INFO http.HttpServer2 (HttpServer2.java:addFilter(951)) - Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context logs
2019-02-12 11:29:53,136 INFO http.HttpServer2 (HttpServer2.java:addFilter(951)) - Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context static
2019-02-12 11:29:53,136 INFO security.HttpCrossOriginFilterInitializer (HttpCrossOriginFilterInitializer.java:initFilter(49)) - CORS filter not enabled. Please set hadoop.http.cross-origin.enabled to 'true' to enable it
2019-02-12 11:29:53,185 INFO http.HttpServer2 (HttpServer2.java:bindListener(1185)) - Jetty bound to port 44838
2019-02-12 11:29:53,186 INFO server.Server (Server.java:doStart(346)) - jetty-9.3.19.v20170502
2019-02-12 11:29:53,253 INFO server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(240)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2019-02-12 11:29:53,260 INFO handler.ContextHandler (ContextHandler.java:doStart(781)) - Started o.e.j.s.ServletContextHandler@2a79d4b1{/logs,file:///var/log/hadoop/hdfs/,AVAILABLE}
2019-02-12 11:29:53,261 INFO handler.ContextHandler (ContextHandler.java:doStart(781)) - Started o.e.j.s.ServletContextHandler@17cdf2d0{/static,file:///usr/hdp/3.0.1.0-187/hadoop-hdfs/webapps/static/,AVAILABLE}
2019-02-12 11:29:53,399 INFO handler.ContextHandler (ContextHandler.java:doStart(781)) - Started o.e.j.w.WebAppContext@662f5666{/,file:///usr/hdp/3.0.1.0-187/hadoop-hdfs/webapps/datanode/,AVAILABLE}{/datanode}
2019-02-12 11:29:53,416 INFO server.AbstractConnector (AbstractConnector.java:doStart(278)) - Started ServerConnector@4879f0f2{HTTP/1.1,[http/1.1]}{localhost:44838}
2019-02-12 11:29:53,417 INFO server.Server (Server.java:doStart(414)) - Started @5060ms
2019-02-12 11:29:53,770 INFO web.DatanodeHttpServer (DatanodeHttpServer.java:start(255)) - Listening HTTP traffic on /0.0.0.0:50075
2019-02-12 11:29:53,781 INFO util.JvmPauseMonitor (JvmPauseMonitor.java:run(188)) - Starting JVM pause monitor
2019-02-12 11:29:53,860 INFO datanode.DataNode (DataNode.java:startDataNode(1427)) - dnUserName = hdfs
2019-02-12 11:29:53,860 INFO datanode.DataNode (DataNode.java:startDataNode(1428)) - supergroup = hdfs
2019-02-12 11:29:53,956 INFO ipc.CallQueueManager (CallQueueManager.java:<init>(84)) - Using callQueue: class java.util.concurrent.LinkedBlockingQueue queueCapacity: 1000 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2019-02-12 11:29:53,986 INFO ipc.Server (Server.java:run(1074)) - Starting Socket Reader #1 for port 8010
2019-02-12 11:29:54,518 INFO datanode.DataNode (DataNode.java:initIpcServer(1033)) - Opened IPC server at /0.0.0.0:8010
2019-02-12 11:29:54,548 INFO datanode.DataNode (BlockPoolManager.java:refreshNamenodes(149)) - Refresh request received for nameservices: null
2019-02-12 11:29:54,569 INFO datanode.DataNode (BlockPoolManager.java:doRefreshNamenodes(210)) - Starting BPOfferServices for nameservices: <default>
2019-02-12 11:29:54,600 INFO datanode.DataNode (BPServiceActor.java:run(810)) - Block pool <registering> (Datanode Uuid unassigned) service to devaz01/10.161.137.4:8020 starting to offer service
2019-02-12 11:29:54,612 INFO ipc.Server (Server.java:run(1314)) - IPC Server Responder: starting
2019-02-12 11:29:54,613 INFO ipc.Server (Server.java:run(1153)) - IPC Server listener on 8010: starting
2019-02-12 11:29:54,911 INFO datanode.DataNode (BPOfferService.java:verifyAndSetNamespaceInfo(378)) - Acknowledging ACTIVE Namenode during handshakeBlock pool <registering> (Datanode Uuid unassigned) service to devaz01/10.161.137.4:8020
2019-02-12 11:29:54,915 INFO common.Storage (DataStorage.java:getParallelVolumeLoadThreadsNum(354)) - Using 3 threads to upgrade data directories (dfs.datanode.parallel.volumes.load.threads.num=3, dataDirs=3)
2019-02-12 11:29:54,963 INFO common.Storage (Storage.java:tryLock(905)) - Lock on /hadoop/hdfs/data/in_use.lock acquired by nodename 32007@devaz02
2019-02-12 11:29:55,014 INFO common.Storage (Storage.java:tryLock(905)) - Lock on /data/0/hadoop/hdfs/data/in_use.lock acquired by nodename 32007@devaz02
2019-02-12 11:29:55,016 WARN common.Storage (DataStorage.java:loadDataStorage(418)) - Failed to add storage directory [DISK]file:/data/0/hadoop/hdfs/data
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /data/0/hadoop/hdfs/data is in an inconsistent state: cluster Id is incompatible with others.
        at org.apache.hadoop.hdfs.server.common.StorageInfo.setClusterId(StorageInfo.java:193)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:620)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:605)
        at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:134)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:714)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:551)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1718)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1678)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:29:55,038 INFO common.Storage (Storage.java:tryLock(905)) - Lock on /data/1/hadoop/hdfs/data/in_use.lock acquired by nodename 32007@devaz02
2019-02-12 11:29:55,039 WARN common.Storage (DataStorage.java:loadDataStorage(418)) - Failed to add storage directory [DISK]file:/data/1/hadoop/hdfs/data
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /data/1/hadoop/hdfs/data is in an inconsistent state: cluster Id is incompatible with others.
        at org.apache.hadoop.hdfs.server.common.StorageInfo.setClusterId(StorageInfo.java:193)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:620)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.setFieldsFromProperties(DataStorage.java:605)
        at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:134)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:714)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:294)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:551)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1718)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1678)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:29:55,072 INFO common.Storage (BlockPoolSliceStorage.java:recoverTransitionRead(251)) - Analyzing storage directories for bpid BP-49067060-10.161.137.4-1549610122990
2019-02-12 11:29:55,072 INFO common.Storage (Storage.java:lock(864)) - Locking is disabled for /hadoop/hdfs/data/current/BP-49067060-10.161.137.4-1549610122990
2019-02-12 11:29:55,087 INFO datanode.DataNode (DataNode.java:initStorage(1721)) - Setting up storage: nsid=1887919669;bpid=BP-49067060-10.161.137.4-1549610122990;lv=-57;nsInfo=lv=-64;cid=CID-20f8f873-2247-415f-ab5d-580d50c81e0a;nsid=1887919669;c=1549610122990;bpid=BP-49067060-10.161.137.4-1549610122990;dnuuid=2d5b2949-fa6a-405a-b56d-e8a0ef860e37
2019-02-12 11:29:55,115 ERROR datanode.DataNode (BPServiceActor.java:run(829)) - Initialization failed for Block pool <registering> (Datanode Uuid 2d5b2949-fa6a-405a-b56d-e8a0ef860e37) service to devaz01/10.161.137.4:8020. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 1, volumes configured: 3, volumes failed: 2, volume failures tolerated: 0
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:311)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1732)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1678)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:817)
        at java.lang.Thread.run(Thread.java:745)
2019-02-12 11:29:55,115 WARN datanode.DataNode (BPServiceActor.java:run(853)) - Ending block pool service for: Block pool <registering> (Datanode Uuid 2d5b2949-fa6a-405a-b56d-e8a0ef860e37) service to devaz01/10.161.137.4:8020
2019-02-12 11:29:55,218 INFO datanode.DataNode (BlockPoolManager.java:remove(102)) - Removed Block pool <registering> (Datanode Uuid 2d5b2949-fa6a-405a-b56d-e8a0ef860e37)
2019-02-12 11:29:57,218 WARN datanode.DataNode (DataNode.java:secureMain(2890)) - Exiting Datanode
2019-02-12 11:29:57,223 INFO datanode.DataNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at devaz02/10.161.137.5
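For anyone reading along: the WARN lines in this log say that /data/0/hadoop/hdfs/data and /data/1/hadoop/hdfs/data are "in an inconsistent state: cluster Id is incompatible with others", and the final ERROR shows why that is fatal here: 2 of 3 volumes failed with "volume failures tolerated: 0" (the dfs.datanode.failed.volumes.tolerated setting). Each DataNode volume records its cluster membership in a current/VERSION file, and every volume must carry the same clusterID. Below is a minimal sketch of how one might compare them; it uses throwaway demo directories in /tmp so it runs anywhere, and the demo clusterID values are illustrative (only the first matches the cid seen in the log above). On the real node, point it at the three configured data dirs instead.

```shell
# Print the clusterID recorded in a DataNode volume's current/VERSION file
get_cluster_id() {
    grep '^clusterID=' "$1/current/VERSION" | cut -d= -f2
}

# Throwaway directories standing in for /hadoop/hdfs/data, /data/0/... etc.
mkdir -p /tmp/demo_vol0/current /tmp/demo_vol1/current
# One volume matches the NameNode's clusterID (the cid from the log above);
# the other holds a stale ID, e.g. left over from an earlier NameNode format.
echo 'clusterID=CID-20f8f873-2247-415f-ab5d-580d50c81e0a' > /tmp/demo_vol0/current/VERSION
echo 'clusterID=CID-stale-from-earlier-format' > /tmp/demo_vol1/current/VERSION

# Any volume whose ID differs from the others is one the DataNode will reject
for d in /tmp/demo_vol0 /tmp/demo_vol1; do
    echo "$d -> $(get_cluster_id "$d")"
done
```

With failed volumes tolerated set to 0, a single mismatched volume is enough to make the whole DataNode exit, which matches the "current valid volumes: 1, volumes configured: 3, volumes failed: 2" line in the trace.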