Created 11-07-2016 04:24 AM
conn_config ={'krb_host':'hostname','krb_service':'hive'} pyhs2.connect(host='hostname', port=10000, authMechanism="KERBEROS", password="********", user='hostname@XXXXXXXXXXXXXXXXXXXX')
Error Encountered:
user='XXXXXXX@XXXXXXXXXXXXXXXXXXXX') File "build\bdist.win-amd64\egg\pyhs2\__init__.py", line 7, in connect File "build\bdist.win-amd64\egg\pyhs2\connections.py", line 46, in __init__ File "build\bdist.win-amd64\egg\pyhs2\cloudera\thrift_sasl.py", line 66, in open thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2
checked below string but still have same problem http://stackoverflow.com/questions/29814207/python-connect-to-hive-use-pyhs2-and-kerberos-authentica... Can please some one help to solve ths problem. Your help is much appreciated
Created 11-07-2016 04:24 AM
@Artem Ervits can you please help me to solve this issue.
it is much appreciated
Created 11-07-2016 04:24 AM
@Dale Bradman can you please help me to solve this issue.
it is much appreciated
Created 11-09-2016 01:53 PM
this may be a dependency issue, please look at the proposed fixes in the following threads
https://github.com/BradRuderman/pyhs2/issues/41
https://github.com/BradRuderman/pyhs2/issues/19
http://stackoverflow.com/questions/30705576/python-cannot-connect-hiveserver2
Created on 01-07-2017 08:56 AM - edited 08-18-2019 04:10 AM
1. You must install kerberos client on your PC.
2. You mast get and kinit a ticket for kerberos.
3. And then you must install Pyhive model.
4. Then test.py like this:
#!/usr/bin/env python # -*- coding: utf-8 -*- # hive util with hive server2 from impala.dbapi import connect class HiveClient: def __init__(self, db_host, port, authMechanism, user, password, database, kbservice): self.conn = connect(host=db_host, port=port, auth_mechanism=authMechanism, user=user, password=password, database=database, kerberos_service_name=kbservice ) def query(self, sql): with self.conn.cursor() as cursor: cursor.execute(sql) return cursor.fetchall() def close(self): self.conn.close() if __name__ == '__main__': hive_client = HiveClient(db_host='namenode02.xxx.com', port=10000, authMechanism='GSSAPI', user='hive', password='', database='data_mp_raw', kbservice='hive') sql = "select dt, hour, vid, sum(1) as play_num from tbl_mp_stream where dt='20161204' and hour='12' and vid is not null group by dt, hour, vid order by play_num desc limit 50" sql = "SHOW TABLES" result = hive_client.query(sql) hive_client.close() for x in result: print(x)
5. The result like this( My python version is 3.5.2 )
Created 01-18-2017 05:06 PM
hive_client = HiveClient(db_host='XXXXXXXXX.scglobaluat.aduat.scotiacapital.com', port=10000, authMechanism='GSSAPI', user='XXXXXXX', password='XXXXXX',database='tsz', kbservice='hive')
The below is error code:
Please let me know where is the issue
Traceback (most recent call last):
File "C:/Users/speesary/connct_server", line 33, in <module> database='tsz', kbservice='hive') File "C:/Users/speesary/connct_server", line 17, in __init__ kerberos_service_name=kbservice
File "C:\Users\speesary\AppData\Roaming\Python\Python27\site-packages\impala\dbapi.py", line 147, in connect auth_mechanism=auth_mechanism)
File "C:\Users\speesary\AppData\Roaming\Python\Python27\site-packages\impala\hiveserver2.py", line 658, in connect transport.open()
File "C:\Users\speesary\AppData\Roaming\Python\Python27\site-packages\thrift_sasl\__init__.py", line 72, in open message=("Could not start SASL: %s" % self.sasl.getError())) thrift.transport.TTransport.TTransportException:
Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2 Process finished with exit code 1
Created 01-07-2017 08:58 AM
pyhs2 can't support Python 3.
Created 03-30-2020 09:54 AM
The DSN-less connection string below FINALLY worked for me, in windows 10. I created a file DSN, then copy/pasted the string into the python code, as a template.
Three lessons that I learned from this struggle:
1) kerberos is CASE SENSITIVE. Your kerberos realm in the string MUST be uppercase.
2) The Cloudera driver doesn't like spaces in between the semicolons in the string. Avoid them.
3) If you don't need connection pooling, turn it off with a pyodbc.pooling = False statement.
import pyodbc
strFileDSNAsAstring = "DRIVER=Cloudera ODBC Driver for Apache Hive;USEUNICODESQLCHARACTERTYPES=1; \
SSL=0;SERVICEPRINCIPALCANONICALIZATION=0;SERVICEDISCOVERYMODE=0;SCHEMA=database;PORT=port; \
KRBSERVICENAME=hive;KRBREALM=uppercaserealm;KRBHOSTFQDN=hostfqdndomain;INVALIDSESSIONAUTORECOVER=1; \
HOST=host;HIVESERVERTYPE=2;GETTABLESWITHQUERY=0;ENABLETEMPTABLE=0;DESCRIPTION=Hive; \
DELEGATEKRBCREDS=0;AUTHMECH=1;ASYNCEXECPOLLINTERVAL=100;APPLYSSPWITHQUERIES=1;CAIssuedCertNamesMismatch=1;"
try:
pyodbc.pooling = False
conn = pyodbc.connect(strFileDSNAsAstring, autocommit=True)
except:
print("failure.")
else:
conn.close()
print("success.")