Member since
10-13-2016
68
Posts
10
Kudos Received
3
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| 2883 | 02-15-2019 11:50 AM |
09-04-2019
10:30 PM
Thanks you nailed it indeed. set hiveconf:tez.am.container.reuse.enabled=false; did the trick.
... View more
05-03-2019
01:50 PM
This query outputs NPE. The tasks with NPEs are retried, and most of the times (but not always) end up succeeding. I could not find a smaller query showing my problem so I give here my full query: select
s.ts_utc as sent_dowhour
, o.ts_utc as open_dowhour
, sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on
o.id=s.id
group by 1, 2 My guess is that the construction sum(count(...)) over(partition by ...) has issues. When it fails, this is the output I get: Vertex failed, vertexName=Reducer 2, vertexId=vertex_1556016846110_42971_7_03, diagnostics=
» Task failed, taskId=task_1556016846110_42971_7_03_000221, diagnostics=
» TaskAttempt 0 failed, info=
» Error: Error while running task ( failure ) : attempt_1556016846110_42971_7_03_000221_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:795)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
... 19 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200)
at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155)
at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538)
at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349)
at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:850)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:724)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:790)
... 20 more Semantically my query is valid (and indeed sometimes succeeds) so what is going on? Note: hdp 3.1, hive 3 orc tables, orc intermediate results tez
... View more
Labels:
02-15-2019
11:50 AM
The ODBC driver does not support all syntax niceties (no CTE) and if there is a syntax error, it will output a completely irrelevant message, which adds a lot to the confusion. To actually see the actual error, you need to add ODBC logging and look at the log files.
... View more
01-28-2019
07:07 AM
Context: Hive3, HDP 3.1. Tests done with Python/odbc (official HDP driver) under Windows and Linux. I ran the following queries:
"select ? as lic, ? as cpg" "select * from (select ? as lic, ? as cpg) as t" "with init as (select ? as lic, ? as cpg) select * from init", 1) and 2) work fine, and give me the expected result. 3 gives me a ParseException :
Error while compiling statement: FAILED: ParseException line 1:21 cannot recognize input near '?' 'as' 'lic' in select clause (80) (SQLPrepare)") The exact same statements ran with java/jdbc work fine. Note that 2) looks like is a workaround for 3) but it works for this tiny example, not for bigger queries. Is there something I can do to have ODBC working as expected? Alternatively, where can I find the limits of the ODBC driver? For full context, the full test code is as follow: cnxnstr = 'DSN=HiveProd'
cnxn = pyodbc.connect(cnxnstr, autocommit=True)
cursor = cnxn.cursor()
queries = [
"with init as (select ? as lic, ? as cpg) select * from init",
"select 2 * ? as lic, ? as cpg",
"select * from (select ? as lic, ? as cpg) as t",
]
for q in queries:
print("\nExecuting " + q)
try:
cursor.execute(q, '1', '2')
except pyodbc.ProgrammingError as e:
print(e)
continue
... View more
Labels:
- Labels:
-
Apache Hive