Hive tez memory error
Created ‎01-11-2017 08:05 PM
Running in HDP 2.4
hive> SELECT COUNT(*) from tweets;
I get
Error: Failure while running task:java.lang.IllegalArgumentException: tez.runtime.io.sort.mb 859 should be larger than 0 and should be less than the available task memory (MB):789
I have tried setting tez.runtime.io.sort.mb to 789 and also to 600.
I have tried running hive thus:
hive --hiveconf hive.tez.container.size=1024
Always the same error.
My config is:
Using cores=16 memory=64GB disks=4 hbase=True
Profile: cores=16 memory=49152MB reserved=16GB usableMem=48GB disks=4
Num Container=8
Container Ram=6144MB
Used Ram=48GB
Unused Ram=16GB
***** mapred-site.xml *****
mapreduce.map.memory.mb=6144
mapreduce.map.java.opts=-Xmx4096m
mapreduce.reduce.memory.mb=6144
mapreduce.reduce.java.opts=-Xmx4096m
mapreduce.task.io.sort.mb=1792
***** yarn-site.xml *****
yarn.scheduler.minimum-allocation-mb=6144
yarn.scheduler.maximum-allocation-mb=49152
yarn.nodemanager.resource.memory-mb=49152
yarn.app.mapreduce.am.resource.mb=6144
yarn.app.mapreduce.am.command-opts=-Xmx4096m
***** tez-site.xml *****
tez.am.resource.memory.mb=6144
tez.am.java.opts=-Xmx4096m
***** hive-site.xml *****
hive.tez.container.size=6144
hive.tez.java.opts=-Xmx4096m
hive.auto.convert.join.noconditionaltask.size=1342177000
Any help is much appreciated. TIA!!
Created ‎01-12-2017 11:21 AM
Set tez.runtime.io.sort.mb to 1024.
Changed the launch command to: HADOOP_USER_NAME=hdfs hive --hiveconf hive.tez.container.size=2048
Now a different error. So that is good...ish.
Created ‎01-11-2017 08:27 PM
Could you check the value of the property 'tez.task.resource.memory.mb'?
Then try increasing it, for example to tez.task.resource.memory.mb=2048, and check again.
Created ‎01-12-2017 10:11 AM
Thanks Sindhu. I upped it from 1024 to 2048 but still same error.
Created ‎01-11-2017 09:37 PM
@ed day You can also look at tez.task.launch.cmd-opts (Tez config) in Ambari to see how much heap is set with -Xmx, and make sure it matches the value in hive.tez.java.opts (Hive config).
Created ‎01-12-2017 10:27 AM
Thanks srai. There is no -Xmx in either of these. For me, tez.task.launch.cmd-opts is:
-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC
hive.tez.java.opts is:
-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseParallelGC -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps
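One way to compare the two settings is to pull the -Xmx value out of each option string. The following is only a minimal sketch in Python, not part of Hive or Tez; the function name xmx_mb is hypothetical, and the two option strings are the ones posted above.

```python
import re

def xmx_mb(jvm_opts):
    """Return the -Xmx value in MB, or None if the string sets no -Xmx."""
    match = re.search(r"-Xmx(\d+)([kKmMgG]?)", jvm_opts)
    if match is None:
        return None
    value, unit = int(match.group(1)), match.group(2).lower()
    # With no suffix the JVM reads the number as bytes.
    bytes_per_unit = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
    return value * bytes_per_unit[unit] // 1024**2

# Option strings as posted in this thread.
task_opts = ("-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps "
             "-XX:+UseNUMA -XX:+UseParallelGC")
hive_opts = ("-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 "
             "-XX:+UseNUMA -XX:+UseParallelGC -XX:+PrintGCDetails "
             "-verbose:gc -XX:+PrintGCTimeStamps")

print(xmx_mb(task_opts))    # None: no explicit heap size
print(xmx_mb(hive_opts))    # None: no explicit heap size
print(xmx_mb("-Xmx4096m"))  # 4096
```

A result of None means the string leaves the heap size to be chosen elsewhere, which matches what was found here.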
Created ‎01-12-2017 10:58 AM
I've tried the following:
Upped hive.tez.container.size to 2048
tez.am.resource.memory.mb = 2048
hive.tez.container.size=4096
tez.runtime.io.sort.mb = 409
Created ‎05-25-2017 03:31 PM
Thanks, I didn't see this until after the struggle but this worked for me.
Created ‎01-12-2017 04:02 PM
1. You can remove "hive.tez.java.opts=-Xmx4096m". Tez automatically uses 80% of the container size allocated to it for the JVM heap. Per your hive-site.xml and yarn-site.xml, "hive.tez.container.size=6144" and "yarn.scheduler.minimum-allocation-mb=6144", so ~4915MB should be assigned automatically without specifying any Xmx value in hive.tez.java.opts.
2. Remove "--hiveconf hive.tez.container.size=2048" from the hive CLI command. Specifying 2048 would under-utilize the memory.
3. After incorporating #1, and with the current hive-site and yarn-site config you have posted, can you run the "hive" CLI without specifying any option and run the query?
4. A simple query like select count(*) from a table should not launch a Tez job if the metastore has enough information about the rows. Run "analyze table tweets compute statistics" and re-run the select count(*) statement. It should fetch the information from the metastore directly rather than launching a Tez job.
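The heap arithmetic in this answer can be checked with a short sketch. It assumes the 80% figure quoted above and integer rounding down; the function name default_heap_mb is hypothetical and is not a Tez API.

```python
def default_heap_mb(container_mb, percent=80):
    """Heap Tez would pick when no -Xmx is given, assuming the 80% rule."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return container_mb * percent // 100

print(default_heap_mb(6144))  # 4915, the ~4915MB mentioned above
print(default_heap_mb(2048))  # 1638, what a 2048MB container would get
```

Under this assumption, overriding the container size to 2048 on the command line leaves far less heap than the 6144 already configured in hive-site.xml.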
Created ‎05-25-2017 03:25 PM
This ultimately led me to the solution. I started by removing -Xmx200m from hive.tez.java.opts, which raised the available memory in my error message from 192 MB to the 245 MB shown below, but my sort size was still too big.
tez.runtime.io.sort.mb 3244 should be larger than 0 and should be less than the available task memory (MB):245
Removing -Xmx here as well let me leave the sort size as it was, and the task completes as expected.
tez.task.launch.cmd-opts was:
-Xmx256m{{heap_dump_opts}}
Changed to the recommended value:
-XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC{{heap_dump_opts}}
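The constraint stated in both error messages in this thread can be written as a small check. This is only a sketch of the rule as the message words it, not the Tez source; the function name is hypothetical, and the last pair of values is made up for illustration.

```python
def sort_mb_is_valid(sort_mb, available_task_mb):
    """True if sort_mb is larger than 0 and less than the available task memory."""
    return 0 < sort_mb < available_task_mb

# (tez.runtime.io.sort.mb, available task memory in MB)
cases = [
    (859, 789),    # first error in this thread
    (3244, 245),   # second error in this thread
    (1024, 4915),  # hypothetical: heap left to default on a 6144MB container
]
for sort_mb, available_mb in cases:
    print(sort_mb, available_mb, sort_mb_is_valid(sort_mb, available_mb))
```

The first two pairs fail the check, which is why lowering the sort size or raising the memory actually available to the task (for example by dropping a small explicit -Xmx) clears the error.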
