Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

HIVE/TEZ query failures from Tableau

Highlighted

HIVE/TEZ query failures from Tableau

New Contributor

Hi,

I am continually getting HIVE query failures in TEZ from tableau a dashboard I have created connected to a large analysis table using the Hortonworks Hadoop Hive connection.

This table exists at a very low level of aggregation. I am aware of the difficulties that can arise in using a tableau ODBC connection to HIVE that requires a lot of aggregation, but the level of detail calculations in tableau can handle our needs, and this tableau workbook is intended for use for ad-hoc analysis at any level of detail by less technical analysts. We are fine to trade-off some performance issues (i.e. taking a few minutes to refresh worksheets when making any changes) for convenience and ease of use of tableau, so long as we can actually get the queries to complete successfully.

I mainly run into this failure when I have both a percentile calculation as well as a filter condition.

Thanks in advance for any assistance!

here is the error I receive:

Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1503617266073_1756_1_10, diagnostics=[Task failed, taskId=task_1503617266073_1756_1_10_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"010774035"},"value":null}
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"010774035"},"value":null}
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"010774035"},"value":null}
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"059425852","_col1":{0:{"_col0":"1303373-1"}}},"value":{"_col0":65.21}}
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:416)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:379)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:485)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:207)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:821)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
... 17 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"059425852","_col1":{0:{"_col0":"1303373-1"}}},"value":{"_col0":65.21}}
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404)
... 26 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"_col0":"059425852","_col1":{0:{"_col0":"1303373-1"}}},"value":{"_col0":65.21}}
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
... 27 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:708)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
... 28 more

here is the query (generated by tableau):

SELECT `t4`.`x_measure__2` AS `temp_attr_calculation_796011276809453592_qk__2189057549__0_`,
  `t7`.`x_measure__5` AS `temp_attr_calculation_796011276809453592_qk__3958669016__0_`,
  `t0`.`physicianid` AS `physicianid`,
  `t1`.`x_measure__6` AS `sum_calculation_796011276799070219_ok`,
  `t1`.`x_measure__7` AS `sum_calculation_796011276799180812_ok`,
  (CASE WHEN `t1`.`x_measure__9` = 0 THEN NULL ELSE `t1`.`x_measure__8` / `t1`.`x_measure__9` END) AS `sum_calculation_796011276800053265_ok`
FROM (
  SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`
  FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1`
  WHERE ((`dbo_flat_analysissource_1`.`triggerclaimruleset` >= 'E01') AND (`dbo_flat_analysissource_1`.`triggerclaimruleset` <= 'E06'))
  GROUP BY `dbo_flat_analysissource_1`.`physicianid`
) `t0`   JOIN (
  SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`,
    SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__6`,
    COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__7`,
    SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__8`,
    COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__9`
  FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1`
  GROUP BY `dbo_flat_analysissource_1`.`physicianid`
) `t1` ON (`t0`.`physicianid` = `t1`.`physicianid`)   JOIN (
  SELECT PERCENTILE_APPROX((CASE WHEN `t3`.`x_measure__1` = 0 THEN NULL ELSE `t3`.`x_measure__0` / `t3`.`x_measure__1` END), 0.5) AS `x_measure__2`
  FROM (
    SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`
    FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1`
    WHERE ((`dbo_flat_analysissource_1`.`triggerclaimruleset` >= 'E01') AND (`dbo_flat_analysissource_1`.`triggerclaimruleset` <= 'E06'))
    GROUP BY `dbo_flat_analysissource_1`.`physicianid`
  ) `t2`     JOIN (
    SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`,
      SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__0`,
      COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__1`
    FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1`
    GROUP BY `dbo_flat_analysissource_1`.`physicianid`
  ) `t3` ON (`t2`.`physicianid` = `t3`.`physicianid`)
  GROUP BY 1
) `t4`   JOIN (
  SELECT PERCENTILE_APPROX((CASE WHEN `t6`.`x_measure__4` = 0 THEN NULL ELSE `t6`.`x_measure__3` / `t6`.`x_measure__4` END), 0.5) AS `x_measure__5`
  FROM (
    SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`
    FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1`
    WHERE ((`dbo_flat_analysissource_1`.`triggerclaimruleset` >= 'E01') AND (`dbo_flat_analysissource_1`.`triggerclaimruleset` <= 'E06'))
    GROUP BY `dbo_flat_analysissource_1`.`physicianid`
  ) `t5`     JOIN (
    SELECT `dbo_flat_analysissource_1`.`physicianid` AS `physicianid`,
      SUM(`dbo_flat_analysissource_1`.`cps_epispendother1`) AS `x_measure__3`,
      COUNT(DISTINCT `dbo_flat_analysissource_1`.`episodeidd`) AS `x_measure__4`
    FROM `sourcedb`.`dbo_flat_analysissource_1` `dbo_flat_analysissource_1`
    GROUP BY `dbo_flat_analysissource_1`.`physicianid`
  ) `t6` ON (`t5`.`physicianid` = `t6`.`physicianid`)
  GROUP BY 1
) `T7`
1 REPLY 1

Re: HIVE/TEZ query failures from Tableau

Cloudera Employee

Hi @wlf211,

Have you tried running the query when hive.execution.engine set to MR. This should resolve the problem.

Though, if you want to continue using TEZ as your execution engine, try setting the below property and then test the query:

hive.auto.convert.sortmerge.join=false

Thanks,

Rohit Rai Malhotra