Support Questions

yagoaparecidoti · ‎08-23-2022

Hello cloudera community,

we have the following problem:

we are using powerbi with hortonworks odbc driver to connect to hive in cluster ambari 2.6.2.2, hdp 2.6.5

the connection is made successfully, but when making a query on a table that has 23 thousand rows, it returns the following error below:

Erro do OLE DB ou do ODBC : [DataSource.Error] ERROR [HY000] [Hortonworks][Hardy] (35) Error from server: error code: '0' error message: 'Invalid OperationHandle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=0345789f-6c9a-4990-adf5-f823232338]'..

if we make a query putting a limiter of at most 10,000 thousand lines in the select, the result is ok

what could be causing this problem?

PS: there are queries in powerbi with the same ODBC in other tables that have more than 200 thousand rows and the result is OK too

tarak271 · ‎01-20-2023

Invalid OperationHandle: OperationHandle

This exception occurs when there are multiple HiveServer2 instances and access them using Zookeeper/Knox with failover configured
When a query(irrespective of number of rows) took more time and HS2 is not able to respond within the defined timeout, ZK/KNOX will do a failover to the next available HS2
Since the other HS2 is unaware of the Query/Operation Handle, it throws Invalid OperationHandle exception

To solve this problem

Check if we can optimize the query to run faster either by adding a filter or splitting the available data into multiple tables and then query them in separate queries etc
Check if HS2 is utilized beyond its capacity like
- using 200 connections at a given point in time for a 24GB heap of HS2/HMS
- HMS backend database not able to cope up to serve requests from HMS
Check yarn queue has enough capacity to serve the query otherwise query will be in waiting state
Check if HDFS is healthy and Namenode is able to respond to the requests without delays
Sometimes if Ranger needs to check too many files/directories in HDFS before the query gets executed
If Load balancer is used, sticky sessions should be enabled so that one-one relationship gets established for opened connections avoiding failover to another HS2 instance

The above explanation holds good for any version of Hive