
DataNode shut down when running Hive

Explorer

Hi, I'm using CDH 5.3.

I have a cluster with 3 hosts: one master host runs both the NameNode and a DataNode, and the other two hosts run only DataNodes.

Everything ran fine until recently: when I run a Hive job, the DataNode on the master shuts down and I get missing-block and under-replicated-block errors.

Here is the error from the master's DataNode log:
3:35:09.545 PM ERROR org.apache.hadoop.hdfs.server.datanode.DirectoryScanner
Error compiling report
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:545)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:422)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:403)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
3:35:09.553 PM INFO org.apache.hadoop.hdfs.server.datanode.DataNode
opWriteBlock BP-993220972-192.168.0.140-1413974566312:blk_1074414393_678864 received exception java.io.IOException: Premature EOF from inputStream
3:35:09.553 PM ERROR org.apache.hadoop.hdfs.server.datanode.DirectoryScanner
Exception during DirectoryScanner execution - will continue next cycle
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:549)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:422)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:403)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:359)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:545)
... 10 more
Caused by: java.lang.OutOfMemoryError: Java heap space
3:35:09.553 PM ERROR org.apache.hadoop.hdfs.server.datanode.DataNode
00master.mabu.com:50010:DataXceiver error processing WRITE_BLOCK operation src: /192.168.6.10:48911 dst: /192.168.6.10:50010
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:468)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:772)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:724)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:126)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:72)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:226)
at java.lang.Thread.run(Thread.java:745)

Can someone help me fix this? Thanks!


5 REPLIES

> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space

This means the DataNode ran out of heap. How many DataNodes do you have, and how many blocks does this one hold? Are all your DataNodes evenly filled? What is the heap setting for your DataNodes?
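If it helps, you can check per-node usage with a standard HDFS command (the port below is the stock NameNode web UI default, so adjust if yours differs):

hdfs dfsadmin -report
# shows per-DataNode configured capacity and DFS used
# per-node block counts are also visible in the NameNode web UI,
# typically at http://<namenode>:50070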



Regards,
Gautam Gopalakrishnan

Explorer

Thanks for the reply.

I have 3 DataNodes; the one that shut down is on the master host. Here is the information:

00master - blocks: 342,823 - block pool used: 53.95 GB (6.16%)
01slave  - blocks: 346,297 - block pool used: 54.38 GB (12.46%)
02slave  - blocks: 319,262 - block pool used: 48.39 GB (33.23%)

And these are my heap settings:

DataNode Default Group / Resource Management: 186 MB
DataNode Group 1 / Resource Management: 348 MB

Regards,

Tu Nguyen

ACCEPTED SOLUTION

Try increasing your DataNode heap size. You may need to decrease the heaps of other roles to make room, or move roles around so there isn't so much contention for memory on a single host.
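As a sketch of where to change it (menu labels vary a bit between Cloudera Manager versions, and 1 GB is just a starting point to tune from):

# In Cloudera Manager (which the role groups above suggest you are using):
#   HDFS service > Configuration > DataNode role group >
#   Resource Management > Java Heap Size of DataNode
# On an unmanaged cluster, the equivalent in hadoop-env.sh is:
export HADOOP_DATANODE_OPTS="-Xmx1g ${HADOOP_DATANODE_OPTS}"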

Explorer

Thanks for the reply.

I've increased the DataNode heap size to 1 GB, and my DataNodes have worked well so far, but there is one more thing:

I uploaded data (just using the -put command) to my cluster (2,736 folders with 200 files each, about 15 kB per file), and each node went from about 350k to over 700k blocks, at which point the "too many blocks" warning appeared.

I really don't understand why there are so many blocks, since the total size of the data is only about 5 GB.

Regards,

Tu Nguyen

Expert Contributor

Each file occupies at least one block entry in HDFS (though the block on disk is only as large as the actual data, not the full configured block size).

So if you are adding 2,736 folders, each with 200 files, that's

2736 * 200 = 547,200 blocks.

And if you are using the default replication factor of 3, then with exactly 3 DataNodes each node ends up holding a replica of nearly every one of those blocks, which is why the count jumped on every node.
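If you want to confirm the file count directly, hadoop fs -count will show it (the path below is just a placeholder for wherever the data was uploaded):

hadoop fs -count /user/tu/data
# output columns: DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
# with 2,736 folders of 200 files each, FILE_COUNT should be around 547,200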

 

Do the folders represent some particular partitioning strategy? Can the files within a particular folder be combined into a single larger file? (See the sketch below.)

Depending on your source data format, you may be better off looking at something like Kite to handle dataset management for you.
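For plain-text files, one low-tech way to combine them is getmerge followed by put (a sketch: folder0001 and both paths are hypothetical, and this only makes sense for formats that can simply be concatenated, such as delimited text):

hadoop fs -getmerge /user/tu/data/folder0001 /tmp/folder0001.merged
hadoop fs -put /tmp/folder0001.merged /user/tu/data_merged/folder0001

That reduces each folder's 200 block entries to a single one.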