NullPointerException (but not always) in GroupBy in Hive with tez
Created ‎05-03-2019 01:50 PM
This query throws a NullPointerException (NPE). The tasks that hit the NPE are retried and most of the time (but not always) end up succeeding.
I could not find a smaller query that reproduces the problem, so here is my full query:
select
    s.ts_utc as sent_dowhour,
    o.ts_utc as open_dowhour,
    sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on o.id=s.id
group by 1, 2
My guess is that the construction
sum(count(...)) over(partition by ...)
has issues.
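One way to test that guess is to compute the grouped counts in a subquery and apply the window function in the outer query. This is only a sketch: the alias `t` and the column name `cnt` are made up for illustration, and Hive may well compile it to the same GroupBy-then-PTF plan, so treat it as a diagnostic step and not as a confirmed fix.

```sql
-- Same result as the original query, with aggregation and windowing split apart.
-- `t` and `cnt` are hypothetical names introduced for this example.
select
    t.sent_dowhour,
    t.open_dowhour,
    sum(t.cnt) over (partition by t.sent_dowhour) as sent_count
from (
    -- Inner query: one row per (sent, open) timestamp pair with its count.
    select
        s.ts_utc as sent_dowhour,
        o.ts_utc as open_dowhour,
        count(s.ts_utc) as cnt
    from vault.sent s
    left join open o on o.id = s.id
    group by s.ts_utc, o.ts_utc
) t
```

If this form fails with the same intermittent NPE, the nested `sum(count(...))` syntax itself is probably not the cause.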
When it fails, this is the output I get:
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1556016846110_42971_7_03, diagnostics=
» Task failed, taskId=task_1556016846110_42971_7_03_000221, diagnostics=
» TaskAttempt 0 failed, info=
» Error: Error while running task ( failure ) : attempt_1556016846110_42971_7_03_000221_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
    ... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
    ... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:795)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
    ... 19 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
    at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
    at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200)
    at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155)
    at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538)
    at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349)
    at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:850)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:724)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:790)
    ... 20 more
Semantically my query is valid (and indeed sometimes succeeds) so what is going on?
Note:
- hdp 3.1, hive 3
- orc tables, orc intermediate results
- tez
Created ‎08-12-2019 03:22 AM
Might be related to a container-reuse issue: HIVE-18786 -- perhaps disable tez.am.container.reuse.enabled in tez-site.xml to verify?
Created ‎09-04-2019 10:30 PM
Thanks, you nailed it indeed. Running set hiveconf:tez.am.container.reuse.enabled=false; before the query did the trick.
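For reference, the session-level check can be sketched as below, using the property name from the reply above together with the original query. The `set` statement affects only the current Hive session; if it does not appear to take effect, changing the property in tez-site.xml, as suggested in the reply, is the other route.

```sql
-- Turn off Tez container reuse for this session only
-- (workaround for the suspected HIVE-18786 container-reuse issue).
set hiveconf:tez.am.container.reuse.enabled=false;

-- Original query, unchanged.
select
    s.ts_utc as sent_dowhour,
    o.ts_utc as open_dowhour,
    sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on o.id=s.id
group by 1, 2;
```

Disabling container reuse means each task starts in a fresh container, so queries with many short tasks may run noticeably slower while the setting is off.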
