Hive query OutOfMemoryError: Java heap space
Labels: Apache Hive, Apache Tez
Created 05-25-2016 01:48 AM
I am running a query that executes about 1500 XPath expressions against a single XML file (roughly 10 MB in size).
I am getting the error in the title. I have tried increasing just about every configuration setting I know of related to Hive/Tez Java heap space.
Nothing seems to work, and I restart the server after every configuration change.
I also changed hive-env.sh to -Xmx8g and it still doesn't fix the issue. I ran with -verbose:gc and see that GC stops at ~1000 MB. Why wouldn't it go up to 8 GB if I changed -Xmx to 8g?
Is there any way to tell whether it is the client that is running out of heap, or the map tasks?
Created 05-25-2016 07:25 AM
@Kevin Vasko Hi Kevin, can you export the following in the hive-env.sh file from Ambari, then restart the affected components:
---
export HADOOP_CLIENT_OPTS="-Xmx6144m"
---
Then run the command below on the node where HiveServer2 is running, as the hive user, to check the heap size (MaxHeapSize):
# jmap -heap <PID-of-HS2>
Thanks!
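The export above only affects client JVMs started after it is in place. As a quick sanity check before launching the Hive CLI, you can confirm the option is actually present in the shell environment (a minimal sketch; the 6144m value is just the one suggested above):

```shell
# Set the client JVM heap the way hive-env.sh would (value from the answer above)
export HADOOP_CLIENT_OPTS="-Xmx6144m"
# Any hadoop/hive client launched from this shell inherits this option;
# confirm it is visible before starting the CLI
echo "${HADOOP_CLIENT_OPTS}"
# prints -Xmx6144m
```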
Created 05-25-2016 02:14 AM
Can you give me the output of ps -ef | grep hiveserver2 ?
Created 05-25-2016 03:27 AM
hive 24964 0.2 1.7 2094636 566148 ? Sl 17:03 0:56 /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91.x86_64/bin/java -Xmx1024m -Dhdp.version=2.3.2.0-2950 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Dhadoop.log.dir=/var/log/hadoop/hive -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.3.2.0-2950/hadoop -Dhadoop.id.str=hive -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.3.2.0-2950/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx1024m -XX:MaxPermSize=512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.3.2.0-2950/hive/lib/hive-service-1.2.1.2.3.2.0-2950.jar org.apache.hive.service.server.HiveServer2 --hiveconf hive.aux.jars.path=file:///usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar -hiveconf hive.metastore.uris= -hiveconf hive.log.file=hiveserver2.log -hiveconf hive.log.dir=/var/log/hive
So I can see that it is set to 1024m, even though in the Ambari UI it is set to a much larger value.
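A quick way to pull the effective heap flag out of a long command line like the one above is to grep for the -Xmx option. Note that when -Xmx appears more than once, the HotSpot JVM honours the last occurrence. This is a sketch against a hypothetical, abbreviated copy of the command line; on a live node you would pipe `ps -ef | grep hiveserver2` into the same grep:

```shell
# Hypothetical abbreviated HiveServer2 command line, like the ps output above
CMDLINE='java -Xmx1024m -Dhdp.version=2.3.2.0-2950 -Xmx1024m -XX:MaxPermSize=512m org.apache.hive.service.server.HiveServer2'
# Extract every -Xmx flag; the JVM uses the LAST one, so keep only that
echo "$CMDLINE" | grep -oE '[-]Xmx[0-9]+[mMgGkK]' | tail -n1
# prints -Xmx1024m
```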
Created 05-25-2016 04:06 AM
-Xmx1024m means it's only 1 GB; please increase it to 4, 6, or 8 GB depending on available memory.
Go to Ambari, then Hive, search for "heap" (hive.heapsize), update the value to 8192, and then restart the affected services.
Created 05-25-2016 04:09 AM
@Divakar Annapureddy Correct, but if you look at my comments, I posted a picture showing that it is set to 12 GB in the UI. The services have been restarted (the complete server has been restarted).
Created 05-25-2016 04:21 AM
Your GUI is showing around 12 GB, but it's not showing in your ps -ef output, which means your 12 GB setting is not taking effect.
It may be an Ambari issue, or some hard-coded value may be preventing the update.
I would recommend setting 6 GB or 8 GB instead of 12 GB; 12 GB is a very high number.
Created 05-25-2016 04:49 AM
The hive.heapsize setting does not exist in my hive-site.xml for some reason, and whenever I add it to the file it keeps getting overwritten.
Created 05-25-2016 05:08 AM
Yes, if your cluster is managed by Ambari, we have to make changes through Ambari only; Ambari will overwrite any changes made at the command line.
This has nothing to do with hive-site.xml; we have to check hive-env.sh.
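After restarting through Ambari, it is worth confirming that the regenerated hive-env.sh actually carries the new heap settings. The fragment below is a hypothetical sample of how such a file typically looks (variable names match the ones used earlier in this thread; the exact layout Ambari renders may differ), written to /tmp so the check can be demonstrated end to end:

```shell
# Hypothetical hive-env.sh fragment, roughly as Ambari might render it
cat > /tmp/hive-env-sample.sh <<'EOF'
if [ "$SERVICE" = "cli" ]; then
  export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx6144m"
fi
export HADOOP_HEAPSIZE=6144
EOF
# Confirm the heap settings survived the Ambari restart;
# on a real node, point this at /etc/hive/conf/hive-env.sh instead
grep -E 'HADOOP_HEAPSIZE|Xmx' /tmp/hive-env-sample.sh
```

If the grep comes back empty on the real file, Ambari regenerated it without your change, which matches the overwriting behaviour described above.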
Created 05-25-2016 05:34 AM
What do I need to set in hive-env.sh? Anything I touch gets overwritten. This has to be a bug in Ambari where it won't save the hive.heapsize value. How can I get it to persist?
