Created on 01-13-2016 10:45 AM - edited 09-16-2022 02:57 AM
I am trying the sample impyla code from
http://blog.cloudera.com/blog/2014/04/a-new-python-client-for-impala/
and getting "impala.error.HiveServer2Error: Failed after retrying 3 times".
impyla is installed on the Hadoop (CDH-5.3.2) node I log in to.
Tried:
from impala.dbapi import connect
conn = connect(host='my.impala.host', port=21050)
cursor = conn.cursor()
cursor.execute('SELECT * FROM youval_db.accounts_info LIMIT 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
For "my.impala.host" I used the Impala host I got from Cloudera Manager.
(I tried hosts from the following groups: Impala Catalog Server Default Group, Impala Daemon Default Group and Impala StateStore Default Group.)
I got the same error for all of them.
Also tried with
conn = connect()
That did not work either.
Any suggestions on how to make it work?
Thanks
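A note on the host choice above: port 21050 is the HiveServer2 port served by the Impala daemon (impalad), so only a host from the Impala Daemon group is expected to answer on it; the Catalog Server and StateStore hosts are not. A minimal sketch for checking plain TCP reachability before involving impyla at all (the helper name port_is_open is made up for this example, and it says nothing about authentication):

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host name and connects in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (replace with an Impala Daemon host from Cloudera Manager):
# print(port_is_open('my.impala.host', 21050))
```

If this returns False, the problem is the host, port or network rather than impyla. If it returns True and connect() still fails, authentication settings such as Kerberos are the next thing to look at.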
Created 04-12-2017 10:48 AM
I have been getting the same error when trying to connect to an Impala instance on a kerberized cluster. Is there any particular reason why we get this?
Created on 04-24-2020 10:01 AM - edited 04-24-2020 02:12 PM
Has anyone found an answer for this? I am also getting the same error when I run the code below. This is a Kerberos cluster, and Impala works fine through HUE and ODBC:
--------------------
from impala.dbapi import connect
conn = connect(host='myhost', port=21050)
cursor = conn.cursor()
cursor.execute('SELECT * FROM default.testtable')
print (cursor.description) # prints the result set's schema
results = cursor.fetchall()
---------------------------------------------------------------------------
HiveServer2Error                          Traceback (most recent call last)
<ipython-input-13-82112a6ffca2> in <module>()
      2 conn = connect(host='myhost', port=21050)
      3
----> 4 cursor = conn.cursor()
      5 cursor.execute('SELECT * FROM default.testtable')
      6 print (cursor.description) # prints the result set's schema

/data/opt/anaconda3/lib/python3.7/site-packages/impala/hiveserver2.py in cursor(self, user, configuration, convert_types, dictify, fetch_error)
    122         log.debug('.cursor(): getting new session_handle')
    123
--> 124         session = self.service.open_session(user, configuration)
    125
    126         log.debug('HiveServer2Cursor(service=%s, session_handle=%s, '

/data/opt/anaconda3/lib/python3.7/site-packages/impala/hiveserver2.py in open_session(self, user, configuration)
   1062             username=user,
   1063             configuration=configuration)
-> 1064         resp = self._rpc('OpenSession', req)
   1065         return HS2Session(self, resp.sessionHandle,
   1066                           resp.configuration,

/data/opt/anaconda3/lib/python3.7/site-packages/impala/hiveserver2.py in _rpc(self, func_name, request)
    990     def _rpc(self, func_name, request):
    991         self._log_request(func_name, request)
--> 992         response = self._execute(func_name, request)
    993         self._log_response(func_name, response)
    994         err_if_rpc_not_ok(response)

/data/opt/anaconda3/lib/python3.7/site-packages/impala/hiveserver2.py in _execute(self, func_name, request)
   1021
   1022         raise HiveServer2Error('Failed after retrying {0} times'
-> 1023                                .format(self.retries))
   1024
   1025     def _operation(self, kind, request):

HiveServer2Error: Failed after retrying 3 times
/data/opt/anaconda3/lib/python3.7/site-packages/thrift_sasl/__init__.py in open(self)
     65
     66   def open(self):
---> 67     if not self._trans.isOpen():
     68       self._trans.open()
     69

AttributeError: 'TSocket' object has no attribute 'isOpen'
Created 04-24-2020 12:23 PM
I believe that error should be fixed with the most recent releases of Impyla (0.16.1) and thrift_sasl (0.4.2)
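Since the fix depends on which releases are installed, it can help to print the versions the notebook's interpreter actually sees. A minimal sketch, assuming Python 3.8+ for importlib.metadata (on Python 3.7, as in the traceback above, pip show reports the same information). The helper name installed_versions is made up for this example, and the package names in the usage line are the PyPI distribution names as I understand them:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in the environment this interpreter is using.
            found[name] = None
    return found


# Example: run this inside the same Jupyter kernel that raises the error.
# print(installed_versions(["impyla", "thrift_sasl", "thrift", "thriftpy2"]))
```

Running it from the notebook rather than from a shell matters when several Python environments are installed, because the kernel may not be using the environment that pip just upgraded.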
Created on 04-24-2020 02:08 PM - edited 04-24-2020 08:29 PM
Thanks, you are a genius 🙂 .
Installing thrift-sasl 0.4.2 and impyla 0.16.2 did allow the script to run successfully. However, I now have a different issue: the call cursor.fetchmany(size=3) hangs indefinitely in a Jupyter notebook. A similar pyhive script executes immediately against the same small table.
from impala.dbapi import connect
conn = connect(host='myhost', port=21050, auth_mechanism='GSSAPI', kerberos_service_name='impala')
cursor = conn.cursor()
cursor.execute('SELECT * FROM default.mytable LIMIT 100')
cursor.fetchmany(size=3)
cursor.close()
conn.close()
Cloudera Manager -> Impala Queries monitor shows the query status as Executing, but the query details say Query State: FINISHED.
The hang seems to be in the statement buff = self.sock.recv(sz)
/data/opt/anaconda3/lib/python3.7/site-packages/thriftpy2/transport/socket.py in read(self, sz)
    107         while True:
    108             try:
--> 109                 buff = self.sock.recv(sz)
    110             except socket.error as e:
    111                 if e.errno == errno.EINTR:

KeyboardInterrupt:
After trying various options and setting timeout=100 in the connect statement, the script appears to query the Impala table successfully, but every second or third run it fails with the error below:
/data/opt/anaconda3/lib/python3.7/site-packages/impala/hiveserver2.py in _rpc(self, func_name, request)
    992         response = self._execute(func_name, request)
    993         self._log_response(func_name, response)
--> 994         err_if_rpc_not_ok(response)
    995         return response
    996

/data/opt/anaconda3/lib/python3.7/site-packages/impala/hiveserver2.py in err_if_rpc_not_ok(resp)
    746             resp.status.statusCode != TStatusCode.SUCCESS_WITH_INFO_STATUS and
    747             resp.status.statusCode != TStatusCode.STILL_EXECUTING_STATUS):
--> 748         raise HiveServer2Error(resp.status.errorMessage)
    749
    750

HiveServer2Error: Invalid query handle: b14cce8e19xxxx:5b51463xxxx
Any more thoughts?
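For reference, here is one way to write the script above so that the rows are read before anything is closed and the cursor and connection are always closed, even when a call raises. This is only a sketch and is not a fix for the "Invalid query handle" error; it just rules out leftover sessions from earlier notebook runs as a factor. The helper name fetch_rows is made up for this example, and the connect function is passed in as an argument so the helper can be tried without a cluster:

```python
from contextlib import closing


def fetch_rows(connect_fn, query, size=3, **connect_kwargs):
    """Open a connection, run query, return up to `size` rows, then clean up."""
    # closing() calls .close() on exit, whether or not an exception was raised.
    with closing(connect_fn(**connect_kwargs)) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(query)
            # Rows are fetched while the cursor and connection are still open.
            return cursor.fetchmany(size=size)


# Example with impyla, using the settings from the post above:
# from impala.dbapi import connect
# rows = fetch_rows(connect, 'SELECT * FROM default.mytable LIMIT 100',
#                   host='myhost', port=21050, auth_mechanism='GSSAPI',
#                   kerberos_service_name='impala', timeout=100)
```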
Created 01-15-2021 08:02 AM
Hi, we're experiencing the same issue as above - "Invalid query handle" error on thrift-sasl 0.4.2 with kerberos auth. Everything works fine on thrift-sasl 0.2.1.
Was there any resolution?
Created on 01-15-2021 08:23 AM - edited 01-15-2021 08:25 AM
Different versions of thrift-sasl and impyla seem to work or not work together, and it is not easy to figure out these version mismatches. So we finally abandoned impyla and went with pyodbc and the Cloudera Impala ODBC driver, which was easier to get working and has been working well so far. Check out this link: https://plenium.wordpress.com/2020/05/04/use-pyodbc-with-cloudera-impala-odbc-and-kerberos/
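A rough idea of what the pyodbc route looks like, as a sketch rather than our exact setup. The connection-string keys (Host, Port, AuthMech=1 for Kerberos, KrbFQDN, KrbServiceName) are the ones I understand the Cloudera Impala ODBC driver to use, so check them against the driver's install guide. The helper name impala_odbc_conn_str and the driver value in the usage lines are placeholders:

```python
def impala_odbc_conn_str(host, krb_fqdn, driver, port=21050,
                         krb_service_name="impala"):
    """Build a Kerberos ODBC connection string for an Impala daemon."""
    parts = {
        "Driver": driver,            # driver name or path registered with ODBC
        "Host": host,
        "Port": port,
        "AuthMech": 1,               # assumed: 1 selects Kerberos in this driver
        "KrbFQDN": krb_fqdn,         # fully qualified name of the Impala host
        "KrbServiceName": krb_service_name,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Example (needs pyodbc, the ODBC driver and a valid Kerberos ticket):
# import pyodbc
# conn = pyodbc.connect(
#     impala_odbc_conn_str("myhost", "myhost", "Cloudera ODBC Driver for Impala"),
#     autocommit=True)  # assumed necessary: Impala has no transaction support
# cursor = conn.cursor()
# cursor.execute("SELECT * FROM default.mytable LIMIT 100")
# print(cursor.fetchmany(3))
```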
Created 04-30-2021 12:00 AM
Have you solved this problem?
Created 04-30-2021 06:42 AM
@JasonBourne - if you have the same issue, here's a GitHub issue discussing it and linking to a pull request to fix it:
https://github.com/cloudera/thrift_sasl/issues/28
You can see in the commits (here: https://github.com/cloudera/thrift_sasl/commits/master), they are testing a new release for a fix, but it looks like it's not quite done yet. Hopefully soon.