DataNode going down

Expert Contributor

Hi,

The DataNode is going down, and the log below is from around that time. Can you please tell me the root cause and the resolution?

2016-05-31 06:38:45,807 INFO  impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(295)) - Deleted BP-838165258-10.1.1.94-1459246457024 blk_1073790458_49647 file /var/log/hadoop/hdfs/data/current/BP-838165258-10.1.1.94-1459246457024/current/finalized/subdir0/subdir189/blk_1073790458
2016-05-31 06:38:45,808 INFO  impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(295)) - Deleted BP-838165258-10.1.1.94-1459246457024 blk_1073790460_49649 file /var/log/hadoop/hdfs/data/current/BP-838165258-10.1.1.94-1459246457024/current/finalized/subdir0/subdir189/blk_1073790460
2016-05-31 06:38:50,917 INFO  datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150 src: /10.1.1.30:56265 dest: /10.1.1.29:50010
2016-05-31 06:38:50,987 INFO  DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56265, dest: /10.1.1.29:50010, bytes: 4688706, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150, duration: 61792605
2016-05-31 06:38:50,988 INFO  datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:39:17,899 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,900 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 1, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,901 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 2, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,902 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 3, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,903 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 4, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,904 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 5, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,904 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 6, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,905 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 7, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,905 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 8, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,907 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 9, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 10, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 11, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 12, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,908 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 13, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,909 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 14, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,909 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 15, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:17,909 INFO  DataNode.clienttrace (DataXceiver.java:releaseShortCircuitFds(407)) - src: 127.0.0.1, dest: 127.0.0.1, op: RELEASE_SHORT_CIRCUIT_FDS, shmId: b51fe9cee4cd76c97452ee0bfcf62919, slotIdx: 16, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, success: true
2016-05-31 06:39:23,630 ERROR datanode.DataNode (DataXceiver.java:run(278)) - data1.corp.mirrorplus.com:50010:DataXceiver error processing unknown operation  src: /127.0.0.1:43209 dst: /127.0.0.1:50010
java.io.EOFException
	at java.io.DataInputStream.readShort(DataInputStream.java:315)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:227)
	at java.lang.Thread.run(Thread.java:745)
2016-05-31 06:39:30,882 INFO  datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790962_50151 src: /10.1.1.30:56392 dest: /10.1.1.29:50010
2016-05-31 06:39:30,902 INFO  DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56169, dest: /10.1.1.29:50010, bytes: 130970563, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790960_50149, duration: 59970347965
2016-05-31 06:39:30,902 INFO  datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790960_50149, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790497_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790499_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790505_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790513_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790515_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790523_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790525_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,498 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790527_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,499 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790529_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:50,499 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790531_0 on volume /hadoop/hdfs1/hadoop/hdfs/data
2016-05-31 06:39:56,902 INFO  datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790963_50152 src: /10.1.1.30:56483 dest: /10.1.1.29:50010
2016-05-31 06:39:56,968 INFO  DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56483, dest: /10.1.1.29:50010, bytes: 4694274, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790963_50152, duration: 61182280
2016-05-31 06:39:56,968 INFO  datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790963_50152, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:39:57,014 INFO  datanode.DataNode (DataXceiver.java:writeBlock(655)) - Receiving BP-838165258-10.1.1.94-1459246457024:blk_1073790964_50153 src: /10.1.1.30:56488 dest: /10.1.1.29:50010
2016-05-31 06:39:57,438 INFO  DataNode.clienttrace (BlockReceiver.java:finalizeBlock(1432)) - src: /10.1.1.30:56488, dest: /10.1.1.29:50010, bytes: 31717025, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_-905108031_1, offset: 0, srvID: 0362bd37-7e9f-4f43-8f6b-af1d42314e63, blockid: BP-838165258-10.1.1.94-1459246457024:blk_1073790964_50153, duration: 420854449
2016-05-31 06:39:57,438 INFO  datanode.DataNode (BlockReceiver.java:run(1405)) - PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790964_50153, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
2016-05-31 06:40:23,622 ERROR datanode.DataNode (DataXceiver.java:run(278)) - data1.corp.mirrorplus.com:50010:DataXceiver error processing unknown operation  src: /127.0.0.1:43354 dst: /127.0.0.1:50010
java.io.EOFException
	at java.io.DataInputStream.readShort(DataInputStream.java:315)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.readOp(Receiver.java:58)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:227)
	at java.lang.Thread.run(Thread.java:745)
2016-05-31 06:40:30,505 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790484_0 on volume /var/log/hadoop/hdfs/data
2016-05-31 06:40:30,505 INFO  datanode.VolumeScanner (VolumeScanner.java:scanBlock(418)) - FileNotFound while finding block BP-838165258-10.1.1.94-1459246457024:blk_1073790492_0 on volume /var/log/hadoop/hdfs/data
1 ACCEPTED SOLUTION

Expert Contributor

Hi,

I found the problem. One of the DataNodes was rebooted.

That's why this kind of log was written.

Thanks.

3 REPLIES

Super Guru
@Raja Ray

From these logs I can't see where the DataNode is going down. Can you please provide more logs?

Also, the error may not be relevant, since it appears to be related to https://issues.apache.org/jira/browse/AMBARI-12420. Check whether you are running an Ambari version affected by that issue.
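
When going through more logs, it can help to separate the handful of WARN/ERROR/FATAL entries from the INFO noise. Below is a minimal sketch, assuming the log keeps the same line layout as the excerpt in the question; `LINE_RE`, `SAMPLE`, and `summarize` are hypothetical names for illustration, not part of Hadoop or Ambari.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt in the question:
# "<date> <time>,<ms> <LEVEL> <logger> (<File.java>:<method>(<line>)) - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+) \((?P<site>\S+)\) - (?P<msg>.*)$"
)

# Three lines copied from the question, including one stack-trace frame.
SAMPLE = [
    "2016-05-31 06:38:50,988 INFO  datanode.DataNode (BlockReceiver.java:run(1405)) - "
    "PacketResponder: BP-838165258-10.1.1.94-1459246457024:blk_1073790961_50150, "
    "type=HAS_DOWNSTREAM_IN_PIPELINE terminating",
    "2016-05-31 06:39:23,630 ERROR datanode.DataNode (DataXceiver.java:run(278)) - "
    "data1.corp.mirrorplus.com:50010:DataXceiver error processing unknown operation  "
    "src: /127.0.0.1:43209 dst: /127.0.0.1:50010",
    "\tat java.io.DataInputStream.readShort(DataInputStream.java:315)",
]


def summarize(lines):
    """Count entries per log level and collect WARN/ERROR/FATAL entries."""
    levels = Counter()
    problems = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # stack-trace frames and other continuation lines
        levels[m["level"]] += 1
        if m["level"] in ("WARN", "ERROR", "FATAL"):
            problems.append((m["ts"], m["logger"], m["msg"]))
    return levels, problems


if __name__ == "__main__":
    levels, problems = summarize(SAMPLE)
    print(dict(levels))
    for ts, logger, msg in problems:
        print(ts, logger, msg)
```

To run it against a real file, pass the open file object to `summarize` in place of `SAMPLE`.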


Hi @Jitendra Yadav, you are right - that particular message is harmless and usually a side effect of the Ambari health check. The fix for this issue was actually made in HDFS via HDFS-9572.
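
To illustrate why such a check can leave an EOFException behind, here is a minimal sketch, assuming the health check is a plain TCP probe that connects to the data transfer port and closes without sending anything. It is written in Python as a stand-in for the Java `readShort()` call in the stack trace; `read_opcode` and `probe_and_observe` are hypothetical names, and this is not the actual Ambari or HDFS code.

```python
import socket
import threading


def read_opcode(conn, result):
    """Server side: try to read a 2-byte field, standing in for readShort()."""
    with conn:
        data = conn.recv(2)
    # Fewer than 2 bytes before the peer closed means end of stream,
    # which is the condition under which Java's readShort() throws EOFException.
    result["eof"] = len(data) < 2


def probe_and_observe():
    """Run a connect-and-close probe against a local listener; report whether it saw EOF."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # any free local port, not the real 50010
    server.listen(1)
    port = server.getsockname()[1]
    result = {}

    def serve():
        conn, _ = server.accept()
        read_opcode(conn, result)

    worker = threading.Thread(target=serve)
    worker.start()

    # Probe: connect, send nothing, close.
    with socket.create_connection(("127.0.0.1", port), timeout=5):
        pass

    worker.join(timeout=5)
    server.close()
    return result.get("eof")


if __name__ == "__main__":
    print("server saw EOF before any opcode:", probe_and_observe())
```

The listener here sees end of stream before any operation code arrives, which matches the "error processing unknown operation" entries from 127.0.0.1 in the log above.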
