NullPointerException (but not always) in GroupBy in Hive with tez

Expert Contributor

This query throws a NullPointerException (NPE). The failing tasks are retried and, most of the time (but not always), end up succeeding.

I could not find a smaller query that reproduces the problem, so here is my full query:

select
  s.ts_utc as sent_dowhour
, o.ts_utc as open_dowhour
, sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on
o.id=s.id
group by 1, 2

My guess is that the construction

sum(count(...)) over(partition by ...)

has issues.
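
One way to test that guess is to split the construction into two steps, so the aggregation and the window function no longer sit in the same select. The following is only a sketch of an equivalent form: the alias g and the column name cnt are made up for illustration, and it has not been run against the cluster described here, so it may or may not avoid the NPE (Hive could well compile both forms to the same plan).

```sql
-- Outer query: window sum over the pre-aggregated counts.
select
  g.sent_dowhour
, g.open_dowhour
, sum(g.cnt) over (partition by g.sent_dowhour) as sent_count
from (
  -- Inner query: plain GROUP BY, no window function involved.
  select
    s.ts_utc as sent_dowhour
  , o.ts_utc as open_dowhour
  , count(s.ts_utc) as cnt          -- hypothetical column name
  from vault.sent s
  left join open o on
  o.id=s.id
  group by s.ts_utc, o.ts_utc
) g                                  -- hypothetical alias
```

If this form fails in the same way, the nesting itself is probably not the cause.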

When it fails, this is the output I get:

Vertex failed, vertexName=Reducer 2, vertexId=vertex_1556016846110_42971_7_03, diagnostics=
» Task failed, taskId=task_1556016846110_42971_7_03_000221, diagnostics=
» TaskAttempt 0 failed, info=
» Error: Error while running task ( failure ) : attempt_1556016846110_42971_7_03_000221_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
  at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
  at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
  at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
  at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
  at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
  ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
  ... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:795)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
  ... 19 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
  at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
  at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200)
  at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155)
  at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538)
  at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349)
  at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123)
  at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:850)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:724)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:790)
  ... 20 more

Semantically, my query is valid (and indeed it sometimes succeeds), so what is going on?

Note:

  • HDP 3.1, Hive 3
  • ORC tables, ORC intermediate results
  • Tez
Expert Contributor

Thanks, you nailed it indeed. Running set hiveconf:tez.am.container.reuse.enabled=false; did the trick.
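
For anyone landing here later, a minimal sketch of how that setting might be applied, assuming it is issued in the same Hive session before the query runs. The set statement is exactly the one quoted above; the rest is the original query unchanged.

```sql
-- Turn off Tez container reuse for this session
-- (statement taken verbatim from the reply above).
set hiveconf:tez.am.container.reuse.enabled=false;

-- Then run the original query in the same session.
select
  s.ts_utc as sent_dowhour
, o.ts_utc as open_dowhour
, sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on
o.id=s.id
group by 1, 2;
```

With reuse disabled, each task gets a fresh container instead of taking over one that already ran another task. That usually costs some startup time, so it is probably best treated as a workaround scoped to the affected query rather than a cluster-wide default.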