Created 08-28-2017 07:49 PM
Hi,
I am continually getting HIVE query failures in TEZ from tableau a dashboard I have created connected to a large analysis table using the Hortonworks Hadoop Hive connection.
This table exists at a very low level of aggregation. I am aware of the difficulties that can arise in using a tableau ODBC connection to HIVE that requires a lot of aggregation, but the level of detail calculations in tableau can handle our needs, and this tableau workbook is intended for use for ad-hoc analysis at any level of detail by less technical analysts. We are fine to trade-off some performance issues (i.e. taking a few minutes to refresh worksheets when making any changes) for convenience and ease of use of tableau, so long as we can actually get the queries to complete successfully.
I mainly run into this failure when I have both a percentile calculation as well as a filter condition.
Thanks in advance for any assistance!
here is the error I receive:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1503617266073_1756_1_10, diagnostics=[Task failed, taskId=task_1503617266073_1756_1_10_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"010774035"},"value":null} at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"010774035"},"value":null} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"010774035"},"value":null} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"059425852","_col1":{0:{"_col0":"1303373-1"}}},"value":{"_col0":65.21}} at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:416) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:379) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:485) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:207) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838) at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695) at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343) ... 17 more Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"059425852","_col1":{0:{"_col0":"1303373-1"}}},"value":{"_col0":65.21}} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404) ... 26 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"059425852","_col1":{0:{"_col0":"1303373-1"}}},"value":{"_col0":65.21}} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274) ... 27 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:708) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343) ... 28 more
here is the query (generated by tableau):
SELECT `t4`.`x_measure__2` AS `temp_attr_calculation_796011276809453592_qk__2189057549__0_`, `t7`.`x_measure__5` AS `temp_attr_calculation_796011276809453592_qk__3958669016__0_`, `t0`.`physicianid` AS `physicianid`, `t1`.`x_measure__6` AS `sum_calculation_796011276799070219_ok`, `t1`.`x_measure__7` AS `sum_calculation_796011276799180812_ok`, (CASE WHEN `t1`.`x_measure__9` = 0 THEN NULL ELSE `t1`.`x_measure__8` / `t1`.`x_measure__9` END) AS `sum_calculation_796011276800053265_ok` FROM ( SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid` FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1` WHERE ((`dbo_flat_analysissource_1`.`triggerclaimruleset` >= 'E01') AND (`dbo_flat_analysissource_1`.`triggerclaimruleset` <= 'E06')) GROUP BY `dbo_flat_analysissource_1`.`physicianid` ) `t0` JOIN ( SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`, SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__6`, COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__7`, SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__8`, COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__9` FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1` GROUP BY `dbo_flat_analysissource_1`.`physicianid` ) `t1` ON (`t0`.`physicianid` = `t1`.`physicianid`) JOIN ( SELECT PERCENTILE_APPROX((CASE WHEN `t3`.`x_measure__1` = 0 THEN NULL ELSE `t3`.`x_measure__0` / `t3`.`x_measure__1` END), 0.5) AS `x_measure__2` FROM ( SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid` FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1` WHERE ((`dbo_flat_analysissource_1`.`triggerclaimruleset` >= 'E01') AND (`dbo_flat_analysissource_1`.`triggerclaimruleset` <= 'E06')) GROUP BY `dbo_flat_analysissource_1`.`physicianid` ) `t2` JOIN ( SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`, SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__0`, COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__1` FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1` GROUP BY `dbo_flat_analysissource_1`.`physicianid` ) `t3` ON (`t2`.`physicianid` = `t3`.`physicianid`) GROUP BY 1 ) `t4` JOIN ( SELECT PERCENTILE_APPROX((CASE WHEN `t6`.`x_measure__4` = 0 THEN NULL ELSE `t6`.`x_measure__3` / `t6`.`x_measure__4` END), 0.5) AS `x_measure__5` FROM ( SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid` FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1` WHERE ((`dbo_flat_analysissource_1`.`triggerclaimruleset` >= 'E01') AND (`dbo_flat_analysissource_1`.`triggerclaimruleset` <= 'E06')) GROUP BY `dbo_flat_analysissource_1`.`physicianid` ) `t5` JOIN ( SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`, SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__3`, COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__4` FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1` GROUP BY `dbo_flat_analysissource_1`.`physicianid` ) `t6` ON (`t5`.`physicianid` = `t6`.`physicianid`) GROUP BY 1 ) `T7`
Created 09-20-2017 09:01 AM
Hi @wlf211,
Have you tried running the query when hive.execution.engine set to MR. This should resolve the problem.
Though, if you want to continue using TEZ as your execution engine, try setting the below property and then test the query:
hive.auto.convert.sortmerge.join=false
Thanks,
Rohit Rai Malhotra