Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How to connect to HiveServer2 through Knox gateway using python?

Highlighted

How to connect to HiveServer2 through Knox gateway using python?

New Contributor

I am trying to connect to HiveServer2 (HDP2.2) through Knox gateway and looking for a python API and some basic connection settings. Our cluster is Kerberized and SSL enabled for Knox

7 REPLIES 7
Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

Please see this article: http://hortonworks.com/hadoop-tutorial/secure-jdbc-odbc-clients-access-hiveserver2-using-apache-knox...

This describes the configuration needed for ODBC clients (ODBC > HTTP > Knox > HTTP > Hive Server 2), using Thrift's support for JDBC/ODBC access over HTTP with hive.server2.transport.mode set to http.

From Python, you can use a library like pyodbc to create the ODBC Python client. There are also projects like https://github.com/dropbox/PyHive, but I haven't tested connecting with HTTP transport mode.

Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

New Contributor

Thanks @slachterman

I tried using pyodbc with Hortonworks ODBC driver (with below odbc.ini settings). I have mentioned the certificate that we use here to authenticate our Knox connection. However I ended up having invalid sasl status error (is this something to do with my authentication?). Am I missing something in the driver settings

Error:

>>> con = pyodbc.connect('DSN=hive-knox-qa')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pyodbc.Error: ('HY000', '[HY000] [unixODBC][Hortonworks][Hardy] (34) Error from server: invalid sasl status. (34) (SQLDriverConnect)')

/etc/odbc.ini

[hive-knox-qa]
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
HOST=<myKnoxQAServer>
PORT=8443
HiveServerType=2
AuthMech=3
UID=<myUsernameHere>
PWD=<myPasswordHere>
SSL=1
AllowSelfSignedServerCert=1
CAIssuedCertNamesMismatch=1
TrustedCerts=/usr/lib/hive/lib/native/Linux-amd64-64/cacerts.pem
HTTPPath=gateway/default/hive
ThriftTransport=1
Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

Please try with AuthMech=4

Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

New Contributor

Still same error

pyodbc.Error: ('HY000', '[HY000] [unixODBC][Hortonworks][Hardy] (34) Error from server: invalid sasl status. (34) (SQLDriverConnect)')
Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

If HiveServer2 is kerberized, we'll need to use a different form of the connection string (without passing username and password) with AuthMech=1.

From the ODBC Driver User Guide:

Example connection string: Host=<hs2 host>;Port=<hs2 port>; HiveServerType=2;AuthMech=1;Schema=<Hive database>;KrbRealm=<Kerberos Realm>;KrbHostFQDN=<hs2 fully qualified domain name>;KrbServiceName=<hs2 service name>

If there is no default realm configured for your Kerberos setup, then type the value for the Kerberos realm of the HiveServer2 host. Otherwise leave it blank. The Realm is only needed if your Kerberos setup does not define a default realm or if the realm of your HiveServer2 is not the default.

In the Host FQDN field, type the value for the fully qualified domain name of the HiveServer2 host.

In the Service Name field, type the value for the service name of the Hive Server 2. For example, if the principle for the HiveServer2 is "hive/fully.qualified.domain.name@YOUR-REALM.COM", then the value in the service name field should be hive.

Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

New Contributor

@slachterman

I think direct connection to our Hive services is revoked and we can go through Knox only. I got through the SASL error by using HTTP ThriftTransport mode. Thanks for your support :-)

ThriftTransport=2
Highlighted

Re: How to connect to HiveServer2 through Knox gateway using python?

New Contributor

Hi!

Could you please post your solution?

Thanks!

Don't have an account?
Coming from Hortonworks? Activate your account here