Created on 01-28-2015 01:23 AM - edited 09-16-2022 02:20 AM
Hi, I'm using CDH 5.3.
I've got a cluster with 3 hosts: the master host runs a NameNode and a DataNode, and the other 2 hosts each run a DataNode.
Everything ran fine until recently: when I ran a Hive job, the DataNode on the master shut down and I got missing-block and under-replicated-block errors.
Here is the error from the master's DataNode log:
3:35:09.545 PM ERROR org.apache.hadoop.hdfs.server.datanode.DirectoryScanner
Error compiling report
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:545)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:422)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:403)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
3:35:09.553 PM INFO org.apache.hadoop.hdfs.server.datanode.DataNode
opWriteBlock BP-993220972-192.168.0.140-1413974566312:blk_1074414393_678864 received exception java.io.IOException: Premature EOF from inputStream
3:35:09.553 PM ERROR org.apache.hadoop.hdfs.server.datanode.DirectoryScanner
Exception during DirectoryScanner execution - will continue next cycle
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:549)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:422)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:403)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:545)
... 10 more
Caused by: java.lang.OutOfMemoryError: Java heap space
3:35:09.553 PM ERROR org.apache.hadoop.hdfs.server.datanode.DataNode
00master.mabu.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /192.168.6.10:48911 dst: /192.168.6.10:50010
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:468)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:772)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:724)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
at java.lang.Thread.run(Thread.java:745)
Can someone help me fix this? Thanks!
Created 01-28-2015 03:19 AM
Created 01-29-2015 05:45 AM
Thanks for the reply.
I have 3 DataNodes; the one that shut down is on the master host. Here is the information:
00master - blocks: 342,823 - block pool used: 53.95 GB (6.16%)
01slave - blocks: 346,297 - block pool used: 54.38 GB (12.46%)
02slave - blocks: 319,262 - block pool used: 48.39 GB (33.23%)
and these are my heap settings:
DataNode Default Group / Resource Management: 186 MB
DataNode Group 1 / Resource Management: 348 MB
Regards,
Tu Nguyen
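Given the figures above, a rough back-of-envelope sketch suggests why a 186-348 MB heap is tight for roughly 340k block replicas per node: the DataNode keeps in-memory bookkeeping for every replica, and the DirectoryScanner (where the OutOfMemoryError above was thrown) builds an additional in-memory report of every block file on disk. The ~400 bytes per replica used below is an assumption for illustration, not a measured number.

// Back-of-envelope only: the block count comes from the post above, but the
// assumed per-replica overhead is a rough illustrative figure, not a measurement.
public class HeapEstimate {
    public static void main(String[] args) {
        long blocksOnNode = 342_823L;          // 00master's reported block count
        long assumedBytesPerReplica = 400L;    // assumption: replica map entry + DirectoryScanner ScanInfo
        double neededMb = blocksOnNode * assumedBytesPerReplica / (1024.0 * 1024.0);
        System.out.printf("~%.0f MB of heap just for block bookkeeping, against a 186-348 MB limit%n", neededMb);
    }
}

With that little headroom, a DirectoryScanner pass plus active DataXceiver write threads could plausibly exhaust the heap, which would be consistent with the "Premature EOF" write failures logged alongside the OOM.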
Created 01-29-2015 11:05 AM
Created 02-02-2015 07:00 PM
Thanks for the reply.
I've increased the DataNode heap size to 1 GB and my DataNodes have been working well so far, but there is one more thing:
I uploaded data to my cluster (just using the -put command): 2,736 folders with 200 files each (about 15 kB per file). Each node went from about 350k blocks to over 700k blocks, and then the "too many blocks" warning appeared.
I really don't understand why there are so many blocks, because the total size of the data is only about 5 GB.
Regards,
Tu Nguyen
Created 02-03-2015 07:58 AM
Each file uses a minimum of one block entry (though that block will only be the size of the actual data).
So if you are adding 2,736 folders, each with 200 files, that's 2736 * 200 = 547,200 blocks.
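To make that arithmetic concrete, here is a minimal sketch. The replication factor of 3 is an assumption (the CDH default); check dfs.replication on the cluster, since with 3 DataNodes every node ends up tracking a replica of most blocks.

// Minimal sketch of the block arithmetic; replication factor is an assumed default.
public class BlockCount {
    public static void main(String[] args) {
        long folders = 2_736L;
        long filesPerFolder = 200L;
        int replication = 3;        // assumed; dfs.replication on this cluster may differ
        int dataNodes = 3;

        long blocks = folders * filesPerFolder;        // 547,200: one block per small file
        long replicas = blocks * replication;          // replicas across the whole cluster
        long replicasPerNode = replicas / dataNodes;   // extra replicas each DataNode must track

        System.out.printf("blocks=%,d, total replicas=%,d, ~%,d extra per DataNode%n",
                blocks, replicas, replicasPerNode);
    }
}

So the jump from ~350k to well over 700k blocks per node is driven by the file count, not the 5 GB of data.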
Do the folders represent some particular partitioning strategy? Can the files within a particular folder be combined into a single larger file?
Depending on your source data format, you may be better off looking at something like Kite to handle the dataset management for you.
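As an illustration of the "combine into a single larger file" suggestion (a hedged sketch, not something from this thread): one common approach is to pack each folder of ~15 kB files into a single SequenceFile keyed by file name, so a folder costs a few blocks instead of 200. The paths, arguments, and key/value types below are illustrative choices.

// Sketch: pack many small local files into one SequenceFile on HDFS.
// Usage (hypothetical): java PackSmallFiles /local/folder0001 /data/folder0001.seq
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

import java.io.File;
import java.nio.file.Files;

public class PackSmallFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml/hdfs-site.xml on the classpath
        Path out = new Path(args[1]);               // destination SequenceFile on HDFS

        File[] smallFiles = new File(args[0]).listFiles();   // local folder of small files
        if (smallFiles == null) {
            System.err.println("Not a readable directory: " + args[0]);
            return;
        }

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(out),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            for (File f : smallFiles) {
                byte[] body = Files.readAllBytes(f.toPath());
                writer.append(new Text(f.getName()), new BytesWritable(body));
            }
        }
    }
}

Kite (or Hive tables backed by larger files) gets you the same effect at a higher level; the underlying point is that the NameNode and DataNode heaps scale with the number of blocks, not the number of bytes.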