Support Questions

Find answers, ask questions, and share your expertise

Hive sasl and python 3.5

Explorer

Hi, 

 

I'm a Hadoop newbie, so don't shoot me yet. I'm trying to read Hive tables with Python, as described in "how to access hive via python".

Specifically, I'm working on a Cloudera BDA under Red Hat 4.4.7 with GCC 4.4.7, with Anaconda Python 3.5.2 installed on a single node and Python 2.6.6 system-wide.

The following packages are installed via Anaconda (so against Python 3.5.2):

- cyrus-sasl-devel

- python-devel

- pyhive

 

When I run the sample code (complete code and error message in my Stack Overflow post):

 

from pyhive import hive

conn = hive.Connection(host="myserver", port=10000)

 

it throws:

 

"Could not start sasl" 

 

I dug through forums and googled a lot, but I didn't find a fix for this issue (I tried uninstalling and reinstalling different versions of the sasl package, and also tried pyhs2, but that still relies on sasl).
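For context, "Could not start sasl" usually means the sasl C extension (or the thrift_sasl wrapper) is missing or failed to build. A quick sanity check along these lines can show which of the relevant packages are even importable; the package names below are the usual PyPI ones and are an assumption, since an Anaconda setup may name them differently:

```python
import importlib.util

def check_importable(packages):
    """Return a dict mapping each package name to whether it can be found."""
    return {pkg: importlib.util.find_spec(pkg) is not None for pkg in packages}

# sasl is the C extension that PyHive's default (SASL) auth path relies on;
# thrift_sasl wraps the Thrift transport around it.
for pkg, found in check_importable(["sasl", "thrift_sasl", "pyhive"]).items():
    print(pkg, "found" if found else "MISSING")
```

If sasl shows up as missing here, the cyrus-sasl-devel / python-devel headers are the usual suspects for a failed build.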

 

Do you have any ideas?

Your help will be greatly appreciated!

Thanks,

Tom

1 ACCEPTED SOLUTION

Explorer

Hi,

 

thanks for the answer. We (I should say, the IT team) found a solution.

We upgraded the Python packages thrift (to version 0.10.0) and PyHive (to version 0.3.0); I don't know why the versions we had weren't the latest.
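As a sanity check (not part of the fix itself), something like this can confirm which versions are actually installed. It is a sketch using importlib.metadata, which needs Python 3.8+; on an older interpreter like the 3.5 setup above, pkg_resources.get_distribution("PyHive").version is the rough equivalent:

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# The fix upgraded these two; the names are the PyPI distribution names.
for pkg in ("thrift", "PyHive"):
    print(pkg, installed_version(pkg) or "not installed")
```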

 

We also added the following property:

<property>
   <name>hive.server2.authentication</name>
   <value>NOSASL</value>
</property>

 

To the following Hive config parameters in Cloudera Manager:

- HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml
- Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml (this one is necessary so that Hue keeps working)

 

 

from pyhive import hive
import pandas as pd
import sys

conn = hive.Connection(host="myserver", auth='NOSASL')

df = pd.read_sql("SELECT * FROM m_ytable", conn)
print(sys.getsizeof(df))
df.head()

This worked without any problems or errors.

 

Best,

Tom


3 REPLIES

Champion
Try adding the saslQop setting to the connection configuration. The actual value will need to match your cluster's Hive configuration.

hive.connect('localhost', configuration={'hive.server2.thrift.sasl.qop': 'auth-conf'})


Contributor

Hi,

Did you find any solution for this? I'm having the same issue accessing Hive through Python.

Thanks,
HadoopHelp