Created on 06-23-2017 03:10 AM - edited 09-16-2022 04:48 AM
Hi,
I'm currently experiencing an issue with some Hive queries being launched as local tasks in HiveServer2.
In the log of the local task (in /tmp/hive/<query_id>.log) I can see that the issue is that the task runs out of memory:
2017-06-23 10:43:39,503 ERROR [main]: mr.MapredLocalTask (MapredLocalTask.java:executeInProcess(377)) - Hive Runtime Error: Map local work exhausted memory
org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2017-06-23 10:43:39 Processing rows: 200000 Hashtable size: 199999 Memory usage: 572176016 percentage: 0.551
    at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:99)
    at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:249)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:120)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:432)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:403)
    at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:369)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:743)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
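For anyone who wants to pull the numbers out of that exception message, here is a minimal Python sketch. The function name and the returned keys are made up for illustration, and the heap estimate assumes that the reported percentage is used memory divided by the maximum heap of the local task JVM.

```python
import re

# Message text copied from the MapJoinMemoryExhaustionException in the log above.
MESSAGE = (
    "2017-06-23 10:43:39 Processing rows: 200000 Hashtable size: 199999 "
    "Memory usage: 572176016 percentage: 0.551"
)

PATTERN = re.compile(
    r"Processing rows:\s*(?P<rows>\d+)\s+"
    r"Hashtable size:\s*(?P<size>\d+)\s+"
    r"Memory usage:\s*(?P<used>\d+)\s+"
    r"percentage:\s*(?P<pct>\d+(?:\.\d+)?)"
)


def parse_exhaustion(message):
    """Extract the counters from a 'Map local work exhausted memory' message.

    Returns None if the message does not match the expected layout.
    """
    m = PATTERN.search(message)
    if m is None:
        return None
    used = int(m.group("used"))
    pct = float(m.group("pct"))
    return {
        "rows": int(m.group("rows")),
        "hashtable_size": int(m.group("size")),
        "used_bytes": used,
        "percentage": pct,
        # Assumption: percentage = used / max heap, so max heap ~= used / percentage.
        # The logged percentage is rounded, so this is only an estimate.
        "implied_max_heap_bytes": used / pct if pct > 0 else None,
    }


info = parse_exhaustion(MESSAGE)
```

Under that assumption the local task had a heap of roughly 1 GB. The reported 0.551 sits just above 0.55, which is the documented default of hive.mapjoin.followby.gby.localtask.max.memory.usage; whether that is the threshold that fired here is a guess, not something the log confirms.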
But I don't understand why HiveServer2 runs these queries in local mode. According to the documentation (https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-Hive,Map-ReduceandLoc...), this feature should be disabled by default, and I haven't changed it.
Is there another way to permanently disable this feature on the server? I don't want to run these operations on the server...
Created 06-23-2017 05:58 AM
That setting, mapreduce.framework.name, is normally found in mapred-site.xml. Check for it under /etc/hadoop/conf/ and look at its value. If it is there with yarn as the value, then it is likely that HS2 is not running with the correct Hadoop environment variables, such as HADOOP_CONF_DIR. If it isn't there or the value is incorrect, try installing the YARN gateway role on the HS2 host.
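If you would rather check the value with a script than by eye, something like this Python sketch should do. It assumes the usual Hadoop `<configuration>/<property>/<name>/<value>` layout; the function name and the sample XML are illustrative only, not taken from a real cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the standard Hadoop *-site.xml layout.
SAMPLE_XML = """
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
"""


def get_hadoop_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


framework = get_hadoop_property(SAMPLE_XML, "mapreduce.framework.name")
```

A result of None means the property is not set in that file; a value other than yarn is the case where the gateway role is worth checking.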
Created 06-23-2017 06:28 AM
Thanks mbigelow for the response.
I will add the YARN gateway and try again, but I don't think this is the issue, because only a subset of the queries is executed in local mode: many other queries are executed via MR (and I have other queries executed via Hive on Spark).
It looks like some queries are "optimized" in local mode.
Created 06-23-2017 06:58 AM
Created 06-23-2017 08:01 AM
I've done a test with:
SET hive.fetch.task.conversion=none;
And some tasks are still getting executed in local mode (I've tried a couple of different queries):
INFO : Starting task [Stage-14:MAPREDLOCAL] in serial mode
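To see which stages of a query end up as local tasks, the "Starting task" lines can be filtered with a short Python sketch like the one below. The first sample line is the one quoted above; the second is a hypothetical non-local stage added for contrast, and the function name is made up.

```python
import re

# Matches lines such as: INFO : Starting task [Stage-14:MAPREDLOCAL] in serial mode
STAGE_START = re.compile(r"Starting task \[(?P<stage>Stage-\d+):(?P<kind>[A-Z_]+)\]")

SAMPLE_LOG = [
    "INFO : Starting task [Stage-14:MAPREDLOCAL] in serial mode",
    # Hypothetical line for a regular MapReduce stage, for comparison only.
    "INFO : Starting task [Stage-1:MAPRED] in serial mode",
]


def local_stages(log_lines):
    """Return the names of stages that were started as MAPREDLOCAL tasks."""
    stages = []
    for line in log_lines:
        m = STAGE_START.search(line)
        if m and m.group("kind") == "MAPREDLOCAL":
            stages.append(m.group("stage"))
    return stages


found = local_stages(SAMPLE_LOG)
```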
Created 06-24-2017 05:44 AM
Created 12-19-2017 06:43 PM
The way things are implemented, the MapJoin optimization always uses a local task. If you would like to remove all instances of local tasks, you will have to disable MapJoins.
Please examine these two explain plans (the first with MapJoin enabled, the second with it disabled):
STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        s07
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        s07
          TableScan
            alias: s07
            filterExpr: code is not null (type: boolean)
            Statistics: Num rows: 225 Data size: 46055 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: code is not null (type: boolean)
              Statistics: Num rows: 113 Data size: 23129 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 code (type: string)
                  1 code (type: string)

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: s07
            filterExpr: code is not null (type: boolean)
            Statistics: Num rows: 225 Data size: 46055 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: code is not null (type: boolean)
              Statistics: Num rows: 113 Data size: 23129 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: code (type: string)
                sort order: +
                Map-reduce partition columns: code (type: string)
                Statistics: Num rows: 113 Data size: 23129 Basic stats: COMPLETE Column stats: NONE
                value expressions: description (type: string), salary (type: int)
          TableScan
            alias: s08
            filterExpr: code is not null (type: boolean)
            Statistics: Num rows: 442 Data size: 46069 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: code is not null (type: boolean)
              Statistics: Num rows: 221 Data size: 23034 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator
                key expressions: code (type: string)
                sort order: +
                Map-reduce partition columns: code (type: string)
                Statistics: Num rows: 221 Data size: 23034 Basic stats: COMPLETE Column stats: NONE
                value expressions: salary (type: int)
We can see that the first one uses "Map Reduce Local Work" and the second one does not.
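The same check can be done on saved EXPLAIN output with a small Python sketch. The two sample plans are abbreviated from the ones above, the function name is made up, and the parsing assumes each stage is introduced by a "Stage: <name>" line, as in these plans.

```python
# Abbreviated from the first plan above (MapJoin enabled).
PLAN_WITH_MAPJOIN = """
STAGE PLANS:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        s07
          Fetch Operator
"""

# Abbreviated from the second plan above (MapJoin disabled).
PLAN_WITHOUT_MAPJOIN = """
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
"""


def stages_with_local_work(plan_text):
    """Return the stages whose plan contains a 'Map Reduce Local Work' section."""
    result = []
    current = None
    for raw in plan_text.splitlines():
        # Also tolerate beeline-style table borders ("| ... |") around each line.
        line = raw.strip().strip("|").strip()
        if line.startswith("Stage: "):
            current = line[len("Stage: "):]
        elif line == "Map Reduce Local Work" and current is not None:
            result.append(current)
    return result


with_mapjoin = stages_with_local_work(PLAN_WITH_MAPJOIN)
without_mapjoin = stages_with_local_work(PLAN_WITHOUT_MAPJOIN)
```

An empty result for a query's plan suggests it will not start a local task for a MapJoin.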
To disable the automatic MapJoin conversion, use:
set hive.auto.convert.join=false;
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
This can be important because I'm seeing a case where the local job runners leak their log output into the /tmp directory on the HS2 host, as files named in the following format:
/tmp/hive_20171219184242_3ecaf468-51c7-4ced-99b3-6bd9eaaa980a.log
With the MapJoin optimization disabled, these log files are no longer generated.
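To find out how many of these files have piled up, a Python sketch along these lines can help. The pattern is inferred from the single file name shown above (a 14-digit timestamp followed by a UUID), so treat it as an assumption; the function names are made up.

```python
import re
from pathlib import Path

# Inferred from: hive_20171219184242_3ecaf468-51c7-4ced-99b3-6bd9eaaa980a.log
LOCAL_TASK_LOG = re.compile(
    r"^hive_\d{14}_[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\.log$"
)


def is_local_task_log(filename):
    """True if the bare file name looks like a leaked local task log."""
    return LOCAL_TASK_LOG.match(filename) is not None


def find_local_task_logs(directory):
    """List matching files directly inside the given directory (not recursive)."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and is_local_task_log(p.name)
    )
```

Calling find_local_task_logs("/tmp") on the HS2 host returns the matching paths, which you can then count or inspect before deciding whether to clean them up.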