NullPointerException (but not always) in GroupBy in Hive with tez

Expert Contributor

This query throws a NullPointerException (NPE). The failing tasks are retried and, most of the time (but not always), end up succeeding.

I could not find a smaller query that reproduces the problem, so here is my full query:

select
  s.ts_utc as sent_dowhour
, o.ts_utc as open_dowhour
, sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on
o.id=s.id
group by 1, 2

My guess is that the construction

sum(count(...)) over(partition by ...)

has issues.
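
For readers unfamiliar with this construction: the inner count(...) is evaluated once per GROUP BY group, and the outer sum(...) over(...) then runs as a window function across the grouped rows. Below is a sketch of an equivalent two-step form, using the same tables and columns as the query above. The alias g and the column name cnt are made up for illustration. It is not claimed to avoid the NPE; it only makes the two stages explicit, which may help when checking whether the windowing step is involved.

```sql
-- Step 1 (subquery): count rows per (sent, open) pair.
-- Step 2 (outer query): sum those counts across each sent_dowhour partition.
select
  g.sent_dowhour
, g.open_dowhour
, sum(g.cnt) over(partition by g.sent_dowhour) as sent_count
from (
  select
    s.ts_utc as sent_dowhour
  , o.ts_utc as open_dowhour
  , count(s.ts_utc) as cnt   -- hypothetical column name
  from vault.sent s
  left join open o on
  o.id=s.id
  group by s.ts_utc, o.ts_utc
) g
```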

When it fails, this is the output I get:

Vertex failed, vertexName=Reducer 2, vertexId=vertex_1556016846110_42971_7_03, diagnostics=
» Task failed, taskId=task_1556016846110_42971_7_03_000221, diagnostics=
» TaskAttempt 0 failed, info=
» Error: Error while running task ( failure ) : attempt_1556016846110_42971_7_03_000221_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
  at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
  at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
  at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
  at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
  at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
  at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
  at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
  at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
  ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
  ... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:795)
  at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
  ... 19 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
  at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
  at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200)
  at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155)
  at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538)
  at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349)
  at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123)
  at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:850)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:724)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:790)
  ... 20 more

Semantically, my query is valid (and indeed it sometimes succeeds), so what is going on?

Note:

  • HDP 3.1, Hive 3
  • ORC tables, ORC intermediate results
  • Tez

Accepted Solution

Re: NullPointerException (but not always) in GroupBy in Hive with tez

New Contributor

This might be related to a container-reuse issue, HIVE-18786. Perhaps disable tez.am.container.reuse.enabled in tez-site.xml to verify?

Re: NullPointerException (but not always) in GroupBy in Hive with tez

Expert Contributor

Thanks, you nailed it indeed. Running set hiveconf:tez.am.container.reuse.enabled=false; did the trick.
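
A minimal sketch of applying this at session level, assuming the statements are run in the same Hive session (for example Beeline) before the failing query. Whether the change takes effect for an already-running Tez session may depend on the setup, so treat it as a verification step; tez-site.xml remains the place for a permanent change, as suggested in the accepted answer.

```sql
-- Disable Tez container reuse for this session (the setting reported to work above).
set hiveconf:tez.am.container.reuse.enabled=false;

-- Print the current value to confirm it was applied.
set tez.am.container.reuse.enabled;

-- Then rerun the failing query in the same session.
```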
