Created 06-23-2016 07:37 AM
Hi,
The DataNode is going down, and its log shows the following. Can you please tell me the root cause and the resolution?
2016-05-31 06:38:45,807 INFO impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(295)) - Deleted BP-838165258-10.1.1.94-1459246457024 blk_1073790458_49647 file /var/log/hadoop/hdfs/data/current/BP-838165258-10.1.1.94-1459246457024/current/finalized/subdir0/subdir189/blk_1073790458
2016-05-31 06:38:45,808 INFO impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(295)) - Deleted BP-838165258-10.1.1.94-1459246457024 blk_1073790460_49649 file /var/log/hadoop/hdfs/data/current/BP-838165258-10.1.1.94-1459246457024/current/finalized/subdir0/subdir189/blk_1073790460
2016-05-31 06:38:50,917 INFO datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150 src: /10.1.1.30:56265 dest: /10.1.1.29:50010
2016-05-31 06:38:50,987 INFO DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56265, dest: /10.1.1.29:50010, bytes: 4688706, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150, duration: 61792605
2016-05-31 06:38:50,988 INFO datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:39:17,899 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,900 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 1, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,901 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 2, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,902 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 3, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,903 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 4, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,904 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 5, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,904 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 6, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,905 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 7, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,905 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 8, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,907 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 9, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 10, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 11, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 12, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 13, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,909 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 14, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,909 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 15, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,909 INFO DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 16, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:23,630 ERROR datanode.DataNode (DataXceiver.java:run(278)) - data1.corp.mirrorplus.com:50010:DataXceiver error processing unknown operation src: /127.0.0.1:43209 dst: /127.0.0.1:50010
java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:315)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:227)
    at java.lang.Thread.run(Thread.java:745)
2016-05-31 06:39:30,882 INFO datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790962_50151 src: /10.1.1.30:56392 dest: /10.1.1.29:50010
2016-05-31 06:39:30,902 INFO DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56169, dest: /10.1.1.29:50010, bytes: 130970563, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790960_50149, duration: 59970347965
2016-05-31 06:39:30,902 INFO datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790960_50149, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790497_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790499_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790505_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790513_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790515_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790523_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790525_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790527_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,499 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790529_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,499 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790531_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:56,902 INFO datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790963_50152 src: /10.1.1.30:56483 dest: /10.1.1.29:50010
2016-05-31 06:39:56,968 INFO DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56483, dest: /10.1.1.29:50010, bytes: 4694274, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790963_50152, duration: 61182280
2016-05-31 06:39:56,968 INFO datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790963_50152, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:39:57,014 INFO datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790964_50153 src: /10.1.1.30:56488 dest: /10.1.1.29:50010
2016-05-31 06:39:57,438 INFO DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56488, dest: /10.1.1.29:50010, bytes: 31717025, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790964_50153, duration: 420854449
2016-05-31 06:39:57,438 INFO datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790964_50153, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:40:23,622 ERROR datanode.DataNode (DataXceiver.java:run(278)) - data1.corp.mirrorplus.com:50010:DataXceiver error processing unknown operation src: /127.0.0.1:43354 dst: /127.0.0.1:50010
java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:315)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:227)
    at java.lang.Thread.run(Thread.java:745)
2016-05-31 06:40:30,505 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790484_0 on volume /var/log/hadoop/hdfs/data
2016-05-31 06:40:30,505 INFO datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790492_0 on volume /var/log/hadoop/hdfs/data
Created 06-24-2016 06:18 AM
Hi,
I found the problem. One of the DataNodes was rebooted.
That's why this kind of log was written.
Thanks.
Created 06-23-2016 08:33 AM
From these logs I can't tell where the DataNode is going down. Can you please provide more logs?
Also, the error may not be relevant, since it appears to be related to https://issues.apache.org/jira/browse/AMBARI-12420. Check whether you are using the same Ambari version.
Created 06-23-2016 11:39 PM
Hi @Jitendra Yadav, you are right - that particular message is harmless and usually a side effect of the Ambari health check. The fix for this issue was actually made in HDFS via HDFS-9572.
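If it helps when going through a longer DataNode log, here is a rough Python sketch that splits a run-together log dump into entries, counts them by level, and flags the "unknown operation" EOFException entries coming from 127.0.0.1. The function names and the matching rule are my own assumptions for illustration, not anything provided by HDFS or Ambari, and the rule is only a heuristic based on the message text quoted above.

```python
import re
from collections import Counter

# An entry starts with a timestamp and level, e.g.
# "2016-05-31 06:39:23,630 ERROR ". The lookahead splits without consuming it.
ENTRY_START = re.compile(r"(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} [A-Z]+ )")
LEVEL = re.compile(r"^\S+ \S+ ([A-Z]+) ")


def split_entries(text):
    """Split a pasted log dump into one string per log entry."""
    return [e.strip() for e in ENTRY_START.split(text) if e.strip()]


def is_probe_noise(entry):
    """Heuristic (an assumption, not an official rule): an EOFException on an
    'unknown operation' from 127.0.0.1 looks like a local port check."""
    return (
        "error processing unknown operation" in entry
        and "java.io.EOFException" in entry
        and "src: /127.0.0.1:" in entry
    )


def summarize(text):
    """Return (Counter of log levels, number of likely health-check entries)."""
    levels = Counter()
    noise = 0
    for entry in split_entries(text):
        match = LEVEL.match(entry)
        if match:
            levels[match.group(1)] += 1
        if is_probe_noise(entry):
            noise += 1
    return levels, noise
```

Calling `summarize(open(path).read())` on the DataNode log (where `path` is wherever your log file lives) gives a quick view of whether any ERROR entries remain once the local-probe ones are set aside. It will not tell you why a node went down; for that you still need the log around the actual shutdown or reboot time.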