About guillaume_roger

guillaume_roger · ‎09-04-2019

Thanks you nailed it indeed. set hiveconf:tez.am.container.reuse.enabled=false; did the trick.

guillaume_roger · ‎05-03-2019

This query outputs NPE. The tasks with NPEs are retried, and most of the times (but not always) end up succeeding. I could not find a smaller query showing my problem so I give here my full query: select s.ts_utc as sent_dowhour , o.ts_utc as open_dowhour , sum(count(s.ts_utc)) over(partition by s.ts_utc) as sent_count from vault.sent s left join open o on o.id=s.id group by 1, 2 My guess is that the construction sum(count(...)) over(partition by ...) has issues. When it fails, this is the output I get: Vertex failed, vertexName=Reducer 2, vertexId=vertex_1556016846110_42971_7_03, diagnostics= » Task failed, taskId=task_1556016846110_42971_7_03_000221, diagnostics= » TaskAttempt 0 failed, info= » Error: Error while running task ( failure ) : attempt_1556016846110_42971_7_03_000221_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:304) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:318) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:378) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:294) ... 18 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:795) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:363) ... 19 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115) at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114) at org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator.getPartitionAgg(BasePartitionEvaluator.java:200) at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.evaluateFunctionOnPartition(WindowingTableFunction.java:155) at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.iterator(WindowingTableFunction.java:538) at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:349) at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:123) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927) at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:850) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:724) at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:790) ... 20 more Semantically my query is valid (and indeed sometimes succeeds) so what is going on? Note: hdp 3.1, hive 3 orc tables, orc intermediate results tez

guillaume_roger · ‎02-15-2019

The ODBC driver does not support all syntax niceties (no CTE) and if there is a syntax error, it will output a completely irrelevant message, which adds a lot to the confusion. To actually see the actual error, you need to add ODBC logging and look at the log files.

guillaume_roger · ‎01-28-2019

Context: Hive3, HDP 3.1. Tests done with Python/odbc (official HDP driver) under Windows and Linux. I ran the following queries: "select ? as lic, ? as cpg" "select * from (select ? as lic, ? as cpg) as t" "with init as (select ? as lic, ? as cpg) select * from init", 1) and 2) work fine, and give me the expected result. 3 gives me a ParseException : Error while compiling statement: FAILED: ParseException line 1:21 cannot recognize input near '?' 'as' 'lic' in select clause (80) (SQLPrepare)") The exact same statements ran with java/jdbc work fine. Note that 2) looks like is a workaround for 3) but it works for this tiny example, not for bigger queries. Is there something I can do to have ODBC working as expected? Alternatively, where can I find the limits of the ODBC driver? For full context, the full test code is as follow: cnxnstr = 'DSN=HiveProd' cnxn = pyodbc.connect(cnxnstr, autocommit=True) cursor = cnxn.cursor() queries = [ "with init as (select ? as lic, ? as cpg) select * from init", "select 2 * ? as lic, ? as cpg", "select * from (select ? as lic, ? as cpg) as t", ] for q in queries: print("\nExecuting " + q) try: cursor.execute(q, '1', '2') except pyodbc.ProgrammingError as e: print(e) continue

guillaume_roger · ‎01-11-2019

In addition to these steps: restart ambari server (we had one instance where it looked like the application was OK but the alert was cached and keep being displayed), check your yarn logs. If there is not enough memory for yarn, the service will not be able to start.

guillaume_roger · ‎01-11-2019

You will lose some job history, but nothing else and certainly no data, so it should not be an issue.

guillaume_roger · ‎01-10-2019

It worked for me eventually after cleaning up *everything*: - destroying the app and cleaning hdfs as explained there: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/remove_ats_hbase_before_switching_between_clusters.html - cleaning zookeeper: zookeeper-client rmr /atsv2-hbase-unsecure and finally restarting *all* yarn services from ambari should did the trick.

guillaume_roger · ‎01-07-2019

I got it working by - cleaning up the hdfs directories of hbase-ats - cleaning up the zookeeper nodes related to hbase-hdfs I hope there are better ways, but that's the only one I found out and was working.

guillaume_roger · ‎11-22-2018

@Aditya Sirna, you are right, hbase runs as a service (is_hbase_system_service_launch is true). I am giving example with nodeN, which are the names of my data nodes, This is based on what I see right now and makes it easier to understand The region server (node5) tries to report for duty but fails. It tries to connect to node1:17020, but port 17020 is only open on node5. On node1 hbase master tried to start, but stopped because it apparently cannot find the active namenode Failed get of master address: java.io.IOException: Can't get master address from ZooKeeper; znode data == null I will look into zookeeper, it seems to ring a bell. I have 2 questions if you don't mind: - how do you start a yarn service on a specic node? - how does the timelinereader know where to connect? In any case thanks, you gave me some ideas to carry on.

guillaume_roger · ‎11-22-2018

Hello, I have a new hdp3.0.1 installation with ats-hbase which runs embedded (with proper queue configured, as per the documentation). At the end of all tasks (seen with the hive compactor, oozie steps), I have hundreds of lines with org.apache.hadoop.yarn.event.AsyncDispatcher: Waiting for AsyncDispatcher to drain. Thread state is :WAITING ending up with : org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Failed to process Event JOB_FINISHED for the job : job_1542872934100_0068 org.apache.hadoop.yarn.exceptions.YarnException: Failed while publishing entity at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:548) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:149) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForNewTimelineService(JobHistoryEventHandler.java:1405) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleTimelineEvent(JobHistoryEventHandler.java:742) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.access$1200(JobHistoryEventHandler.java:93) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1795) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$ForwardingEventHandler.handle(JobHistoryEventHandler.java:1791) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) at java.lang.Thread.run(Thread.java:745) Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Call From null to prod-nl-dpnode3.dmdelivery.local:33602 failed on socket timeout exception:t java.lang.Thread.run(Thread.java:745) Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Call From null to prod-nl-dpnode3.dmdelivery.local:33602 failed on socket timeout exception Looking at /var/log/hadoop-yarn/yar/hadoop-yarn-nodemanager I have a lot of lines with: Call exception, tries=7, retries=7, started=8194 ms ago, cancelled=false, msg=Call to xxxxx/192.168.x.x:17020 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: prod-nl-dpnode1.dmdelivery.local/192.168.36.161:17020, details=row 'prod.timelineservice.entity,hive!yarn-cluster!xxxx-34-compactor-vault.contact.license_name=lectiva!^?��@@!^?��d��^?��!MAPREDUCE_TASK_ATTEMPT!^?��!attempt_1542205428050_2307_m_000461_0,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx,17020,1542294270073, seqNum=-1 Looking at /var/log/hadoop-yarn/yarn/hadoop-yarn-timelinereader, I see Connection refused: dpnode1/192.168.36.161:17020 Indeed, there is no hbase on dpnode1. Hbase does run on dpnode5 (or another one, depending on yarn restart), but in any case, the timelinereader does not know which server to reach, and always goes to one seemingly hardcoded hostname. How can I tell yarn to use the right node to connect to hbase? Thanks,

Online	Offline
Last Visited	‎11-22-2019 08:48 AM

Member Since	‎10-13-2016 09:10 AM
Last Visited	‎11-22-2019 08:48 AM
Posts	68
Kudos received	10

Cloudera Community

Re: Hive odbc with prepared statements: ParseExcep...

Re: Fix under replicated blocks very slow

Re: Adding a host to ambari from another DC behind...

Re: NullPointerException (but not always) in Group...

NullPointerException (but not always) in GroupBy i...

Re: Hive odbc with prepared statements: ParseExcep...

Hive odbc with prepared statements: ParseException

Re: ATS hbase does not seem to start

Re: ATS hbase does not seem to start

Re: ATS hbase does not seem to start

Re: ATS2-hbase starts but on the wrong node

Re: ATS2-hbase starts but on the wrong node

ATS2-hbase starts but on the wrong node