Member since
10-13-2016
68
Posts
10
Kudos Received
3
Solutions
My Accepted Solutions
Views | Posted |
---|---|
2111 | 02-15-2019 11:50 AM |
4644 | 10-12-2017 02:03 PM |
873 | 10-13-2016 11:52 AM |
09-04-2019
10:30 PM
Thanks, you nailed it indeed: set hiveconf:tez.am.container.reuse.enabled=false; did the trick.
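For readers who run the query from a script rather than from an interactive shell, here is a minimal sketch of applying that setting on the same session before the query. It assumes a DB-API style cursor (for example from pyodbc) and that the driver passes SET statements through to HiveServer2 unchanged; the helper name run_with_settings and the DSN in the usage comment are illustrative, not part of any library.

```python
def run_with_settings(cursor, settings, query, params=()):
    """Apply Hive session settings, then run the query on the same cursor."""
    for key, value in settings.items():
        # One SET statement per setting; no trailing semicolon is sent.
        cursor.execute("set {}={}".format(key, value))
    cursor.execute(query, *params)
    return cursor


# Setting name and value are taken verbatim from the post above.
NO_CONTAINER_REUSE = {"hiveconf:tez.am.container.reuse.enabled": "false"}

# Hypothetical usage, assuming pyodbc and a DSN named HiveProd:
#   import pyodbc
#   cnxn = pyodbc.connect("DSN=HiveProd", autocommit=True)
#   run_with_settings(cnxn.cursor(), NO_CONTAINER_REUSE, "select 1")
```

The setting only applies to the session that issued it, so the SET statement and the query have to go through the same connection.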
05-03-2019
01:50 PM
This query fails with a NullPointerException (NPE). The tasks that hit the NPE are retried and, most of the time (but not always), end up succeeding. I could not find a smaller query that shows the problem, so here is my full query:

select
s.ts_utc as sent_dowhour
, o.ts_utc as open_dowhour
, sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count
from vault.sent s
left join open o on
o.id=s.id
group by 1, 2

My guess is that the construction sum(count(...)) over(partition by ...) has issues. When it fails, this is the output I get:

Vertex failed, vertexName=Reducer 2, vertexId=vertex_1556016846110_42971_7_03, diagnostics=
» Task failed, taskId=task_1556016846110_42971_7_03_000221, diagnostics=
» TaskAttempt 0 failed, info=
» Error: Error while running task ( failure ) : attempt_1556016846110_42971_7_03_000221_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294)
... 18 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:795)
at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363)
... 19 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200)
at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155)
at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538)
at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349)
at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:850)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:724)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:790)
... 20 more

Semantically my query is valid (and indeed it sometimes succeeds), so what is going on?

Note: HDP 3.1, Hive 3, ORC tables, ORC intermediate results, Tez.
02-15-2019
11:50 AM
The ODBC driver does not support all syntax niceties (no CTE), and if there is a syntax error it outputs a completely irrelevant message, which adds a lot to the confusion. To see the actual error, you need to enable ODBC logging and look at the log files.
01-28-2019
07:07 AM
Context: Hive 3, HDP 3.1. Tests done with Python/ODBC (official HDP driver) under Windows and Linux. I ran the following queries:

1) select ? as lic, ? as cpg
2) select * from (select ? as lic, ? as cpg) as t
3) with init as (select ? as lic, ? as cpg) select * from init

1) and 2) work fine and give me the expected result. 3) gives me a ParseException:

Error while compiling statement: FAILED: ParseException line 1:21 cannot recognize input near '?' 'as' 'lic' in select clause (80) (SQLPrepare)")

The exact same statements run with Java/JDBC work fine. Note that 2) looks like a workaround for 3), but it only works for this tiny example, not for bigger queries. Is there something I can do to have ODBC working as expected? Alternatively, where can I find the limits of the ODBC driver? For full context, the full test code is as follows:

import pyodbc

cnxnstr = 'DSN=HiveProd'
cnxn = pyodbc.connect(cnxnstr, autocommit=True)
cursor = cnxn.cursor()
queries = [
    "with init as (select ? as lic, ? as cpg) select * from init",
    "select 2 * ? as lic, ? as cpg",
    "select * from (select ? as lic, ? as cpg) as t",
]
for q in queries:
    print("\nExecuting " + q)
    try:
        # Both placeholders are bound as strings
        cursor.execute(q, '1', '2')
    except pyodbc.ProgrammingError as e:
        print(e)
        continue
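
One possible way around the failing prepare step, when the query cannot be restructured, is to substitute the parameters on the client and send the driver a statement without any ? placeholder. The sketch below is only an illustration under assumptions: the helper names hive_literal and inline_params are made up for this example, the escaping assumes Hive's backslash-escaped string literals, and it does not handle comments or backtick-quoted identifiers. Inlining values gives up the protection that bound parameters offer, so it should only be used with trusted input.

```python
def hive_literal(value):
    """Render a Python value as a HiveQL literal (strings are quoted and escaped)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    # Assumption: backslash and single quote are escaped with a backslash.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def inline_params(query, params):
    """Replace each ? placeholder outside quoted strings with a literal."""
    remaining = list(params)
    out = []
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(query):
        ch = query[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < len(query):
                # Copy the escaped character verbatim and skip over it.
                out.append(query[i + 1])
                i += 1
            elif ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            out.append(ch)
        elif ch == "?":
            if not remaining:
                raise ValueError("more placeholders than parameters")
            out.append(hive_literal(remaining.pop(0)))
        else:
            out.append(ch)
        i += 1
    if remaining:
        raise ValueError("more parameters than placeholders")
    return "".join(out)
```

With the test code above, the call would become cursor.execute(inline_params(q, ['1', '2'])), so the driver never sees a placeholder inside the CTE.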
Labels: Apache Hive
01-11-2019
05:26 AM
In addition to these steps: restart the Ambari server (we had one instance where the application looked OK but the alert was cached and kept being displayed), and check your YARN logs. If there is not enough memory for YARN, the service will not be able to start.
01-11-2019
05:24 AM
1 Kudo
You will lose some job history, but nothing else and certainly no data, so it should not be an issue.
01-10-2019
08:50 AM
2 Kudos
It eventually worked for me after cleaning up *everything*:
- destroying the app and cleaning HDFS as explained there: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/remove_ats_hbase_before_switching_between_clusters.html
- cleaning ZooKeeper: zookeeper-client rmr /atsv2-hbase-unsecure
- and finally restarting *all* YARN services from Ambari.

That did the trick.
01-07-2019
05:01 AM
I got it working by:
- cleaning up the HDFS directories of hbase-ats
- cleaning up the ZooKeeper nodes related to hbase-hdfs

I hope there are better ways, but that is the only one I found that worked.
11-22-2018
06:37 PM
@Aditya Sirna, you are right, HBase runs as a service (is_hbase_system_service_launch is true).

In the examples below I use nodeN, which are the names of my data nodes. This is based on what I see right now and makes it easier to understand.

The region server (node5) tries to report for duty but fails. It tries to connect to node1:17020, but port 17020 is only open on node5.

On node1 the HBase master tried to start, but stopped because it apparently cannot find the active namenode:

Failed get of master address: java.io.IOException: Can't get master address from ZooKeeper; znode data == null

I will look into ZooKeeper, it seems to ring a bell. I have 2 questions if you don't mind:
- how do you start a YARN service on a specific node?
- how does the timelinereader know where to connect?

In any case thanks, you gave me some ideas to carry on.
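
To confirm which nodes actually accept connections on port 17020, a small probe can be run from the host where the timelinereader runs. This is a minimal sketch using only the Python standard library; the function names and the host names in the usage comment are illustrative (they follow the nodeN convention used above), and a successful TCP connection only shows that something is listening, not that it is a healthy region server.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def find_listeners(hosts, port, timeout=3.0):
    """Return the subset of hosts accepting connections on the given port."""
    return [h for h in hosts if port_open(h, port, timeout)]


# Hypothetical usage with the node names from this post:
#   find_listeners(["node1", "node2", "node3", "node4", "node5"], 17020)
```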
11-22-2018
03:07 PM
Hello, I have a new HDP 3.0.1 installation with ats-hbase, which runs embedded (with the proper queue configured, as per the documentation). At the end of all tasks (seen with the Hive compactor and Oozie steps), I have hundreds of lines with:

org.apache.hadoop.yarn.event.AsyncDispatcher: Waiting for AsyncDispatcher to drain. Thread state is :WAITING

ending up with:

org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Failed to process Event JOB_FINISHED for the job : job_1542872934100_0068
org.apache.hadoop.yarn.exceptions.YarnException: Failed while publishing entity
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:548)
at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForNewTimelineService(JobHistoryEventHandler.java:1405)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleTimelineEvent(JobHistoryEventHandler.java:742)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.access$1200(JobHistoryEventHandler.java:93)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1795)
at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1791)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Call From null to prod-nl-dpnode3.dmdelivery.local:33602 failed on socket timeout exception

Looking at /var/log/hadoop-yarn/yarn/hadoop-yarn-nodemanager, I have a lot of lines with:

Call exception, tries=7, retries=7, started=8194 ms ago, cancelled=false, msg=Call to xxxxx/192.168.x.x:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: prod-nl-dpnode1.dmdelivery.local/192.168.36.161:17020, details=row 'prod.timelineservice.entity,hive!yarn-cluster!xxxx-34-compactor-vault.contact.license_name=lectiva!^?�����@@!^?����d��^?���!MAPREDUCE_TASK_ATTEMPT!^?�����!attempt_1542205428050_2307_m_000461_0,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx,17020,1542294270073, seqNum=-1

Looking at /var/log/hadoop-yarn/yarn/hadoop-yarn-timelinereader, I see:

Connection refused: dpnode1/192.168.36.161:17020

Indeed, there is no HBase on dpnode1. HBase does run on dpnode5 (or another node, depending on the YARN restart), but in any case the timelinereader does not know which server to reach and always goes to one seemingly hardcoded hostname. How can I tell YARN to use the right node to connect to HBase?

Thanks,