java.lang.OutOfMemoryError: GC overhead limit exceeded due to CREATE TABLE .. AS SELECT .. FROM

Explorer

When I create a Hive table as a select from another table, which holds roughly 100 GB of data and is stored via the MongoDB storage handler, I get a "GC overhead limit exceeded" error. My query is

CREATE TABLE traffic as SELECT * FROM test2

and the error that I got is shown below.

2018-05-01 05:09:56,153 FATAL [RMCommunicator Allocator] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[RMCommunicator Allocator,5,main] threw an Error.  Shutting down now...
java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.util.Collections.singletonIterator(Collections.java:3300)
	at java.util.Collections$SingletonSet.iterator(Collections.java:3332)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.selectBestAttempt(TaskImpl.java:544)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.getProgress(TaskImpl.java:449)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.computeProgress(JobImpl.java:907)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.getProgress(JobImpl.java:891)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.getApplicationProgress(RMCommunicator.java:142)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:196)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:764)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:261)
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$AllocatorRunnable.run(RMCommunicator.java:282)
	at java.lang.Thread.run(Thread.java:745)

My assumption is that I am inserting too much data into the Hive table, which then triggers the error, but I can't figure out how to solve this issue. I also tried a LIMIT query

CREATE TABLE traffic as SELECT * FROM test2 limit 1000;

but it also returns the same error.
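
For reference, the source table test2 is backed by the MongoDB storage handler. Its definition looks roughly like the sketch below; the column list, column mapping, and connection URI here are placeholders rather than my actual schema.

-- illustrative only: placeholder columns, mapping, and URI
CREATE EXTERNAL TABLE test2 (
  id STRING,
  ts TIMESTAMP,
  payload STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES ('mongo.columns.mapping' = '{"id":"_id"}')
TBLPROPERTIES ('mongo.uri' = 'mongodb://mongo-host:27017/mydb.traffic');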

1 REPLY

Champion

Are you using the Beeline client tool?

Did you try increasing the heap via the property below?

HADOOP_CLIENT_OPTS
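
For example, something along these lines before launching the Hive CLI; the 4 GB heap value is only illustrative and should be sized to the client host:

# raise the client-side JVM heap picked up by the hive launcher script
export HADOOP_CLIENT_OPTS="-Xmx4g $HADOOP_CLIENT_OPTS"
hive -e "CREATE TABLE traffic AS SELECT * FROM test2;"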

Just curious to know the following (a quick way to check these is sketched after the list):

What file format are you using?

Is there any compression?

Are table stats being collected?

Is the table partitioned or bucketed?
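
For instance, these standard Hive commands on the source table will show its storage format, compression-related properties, collected statistics, and any partition or bucket definitions:

-- storage handler, input/output formats, table parameters (including stats), bucketing
DESCRIBE FORMATTED test2;
-- full DDL, including any PARTITIONED BY / CLUSTERED BY clauses and TBLPROPERTIES
SHOW CREATE TABLE test2;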