
Hive SASL and Python 3.5

New Contributor

Hi, 

 

I'm a Hadoop newbie, so don't shoot me yet. I'm trying to read Hive tables with Python, as described in "how to access hive via python".

Specifically, I'm working on a Cloudera BDA under Red Hat 4.4.7 with GCC 4.4.7, with Anaconda Python 3.5.2 installed on a single node and Python 2.6.6 system-wide.

The following packages are installed via Anaconda (so against Python 3.5.2):

- cyrus-sasl-devel

- python-devel

- pyhive

 

When I run the sample code (complete code and error message in the linked stackoverflow post):

 

from pyhive import hive

conn = hive.Connection(host="myserver", port=10000)

 

it throws:

 

"Could not start sasl" 

 

I dug through the forums and googled a lot, but I didn't find a fix for this issue (I tried uninstalling and reinstalling different versions of the sasl package, and tried pyhs2, but that still relies on sasl).
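
(For reference, PyHive's default auth goes through SASL PLAIN and, as far as I understand, also needs the sasl and thrift_sasl Python packages next to pyhive; a minimal sketch with placeholder host and user is below.)

# Minimal sketch of PyHive's default SASL (PLAIN) connection path.
# Assumes the sasl and thrift_sasl packages are installed alongside pyhive;
# "myserver" and "hive_user" are placeholders, not values from this thread.
from pyhive import hive

conn = hive.Connection(host="myserver", port=10000,
                       username="hive_user", auth="NONE")
cursor = conn.cursor()
cursor.execute("SHOW TABLES")
print(cursor.fetchall())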

 

Do you have any idea?

 

Your help will be greatly appreciated!

Thanks,

Tom


3 REPLIES

Champion
Try adding the saslQop config to the connection configuration. The actual value will need to match your cluster's Hive configuration.

hive.connect('localhost', configuration={'hive.server2.thrift.sasl.qop': 'auth-conf'})
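
(As a usage note: hive.server2.thrift.sasl.qop is normally one of auth, auth-int, or auth-conf, so pass whatever value your cluster's hive-site.xml sets.)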

New Contributor

Hi,

 

Thanks for the answer. We (I should say, the IT team) found a solution.

 

We upgraded the Python packages thrift (to version 0.10.0) and PyHive (to version 0.3.0); I don't know why the versions we had been using weren't the latest.

 

We added the following property:

<property>
   <name>hive.server2.authentication</name>
   <value>NOSASL</value>
</property>

 

to the following Hive configuration fields in Cloudera Manager:

 

- HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml
- Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml (necessary so that Hue would work)

 

 

import sys
import pandas as pd
from pyhive import hive

# NOSASL matches the hive.server2.authentication value set above
conn = hive.Connection(host="myserver", auth='NOSASL')

df = pd.read_sql("SELECT * FROM m_ytable", conn)
print(sys.getsizeof(df))
df.head()

This worked without any problems or errors.
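
(A small cross-check, not from the original post: the same NOSASL connection can also be verified with a plain cursor and closed when done.)

from pyhive import hive

# Same NOSASL connection as above; "myserver" and m_ytable are the
# placeholder host and table name used earlier in this thread.
conn = hive.Connection(host="myserver", auth='NOSASL')
cursor = conn.cursor()
cursor.execute("SELECT * FROM m_ytable LIMIT 10")
for row in cursor.fetchall():
    print(row)
conn.close()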

 

Best,

Tom

Contributor

Hi,

Did you find any solution for the same?

I am having the same issue accessing Hive through Python.

 

 

Thanks

 HadoopHelp
