06-14-2017 11:48 PM
I'm a Hadoop newbie, so don't shoot me yet. I'm trying to import Hive tables with Python, as described in "How to access Hive via Python".
Specifically, I'm working on a Cloudera BDA under Red Hat 4.4.7 with GCC 4.4.7, with Anaconda Python 3.5.2 installed on a single node and Python 2.6.6 system-wide.
The following packages are installed using Anaconda (and therefore for Python 3.5.2):
When I ran the sample code (complete code and error message: Stack Overflow post):
from pyhive import hive
conn = hive.Connection(host="myserver", port=10000)
I got the error "Could not start sasl".
I dug through forums and googled a lot, but I didn't find a fix for this issue (I tried uninstalling and reinstalling different versions of the sasl package, and I tried pyhs2, but it still relies on sasl).
Do you have any idea?
Your help will be greatly appreciated!
06-15-2017 12:26 PM
06-20-2017 07:07 AM
Thanks for the answer. We (I should say, the IT team) found a solution:
Upgraded the Python packages thrift (to version 0.10.0) and PyHive (to version 0.3.0). I don't know why the versions we used weren't the latest.
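For anyone who wants to confirm they are on at least these versions, here is a minimal sketch. The helper names (`parse_version`, `is_at_least`, `check_installed`) are hypothetical, the parser assumes purely numeric dotted versions, and the lookup relies on `importlib.metadata`, which needs Python 3.8 or newer (so it would not run on the 3.5.2 install mentioned above).

```python
# Minimum versions taken from the post above.
REQUIRED = {"thrift": "0.10.0", "PyHive": "0.3.0"}


def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed, required):
    """Compare numerically; as plain strings, '0.10.0' would sort before '0.9.0'."""
    return parse_version(installed) >= parse_version(required)


def check_installed(required=REQUIRED):
    """Return {package: installed version, or None if not installed}."""
    from importlib import metadata  # Python 3.8+ only

    found = {}
    for name in required:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in check_installed().items():
        if version is None:
            print(name, "is not installed")
        else:
            ok = is_at_least(version, REQUIRED[name])
            print(name, version, "OK" if ok else "older than " + REQUIRED[name])
```

Comparing tuples of integers rather than raw strings matters here because thrift 0.10.0 is newer than 0.9.x, even though it sorts earlier alphabetically.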
Added the following:
<property>
  <name>hive.server2.authentication</name>
  <value>NOSASL</value>
</property>
to the following Hive config parameters in Cloudera Manager:
HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml
Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml (necessary so that HUE would work)
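To double-check that the setting actually ended up in the generated hive-site.xml, something like the sketch below could be used. It assumes the usual Hadoop layout of `<property>` elements inside a `<configuration>` root. The helper name `get_hive_property` and the sample XML are made up for illustration; on a real cluster you would read the deployed file instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real hive-site.xml is generated by Cloudera Manager.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.server2.authentication</name>
    <value>NOSASL</value>
  </property>
</configuration>"""


def get_hive_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        # findtext returns None when the child element is missing.
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


print(get_hive_property(SAMPLE_HIVE_SITE, "hive.server2.authentication"))
```

If this prints anything other than NOSASL for the file HiveServer2 is using, a client connecting with auth='NOSASL' will not match the server-side setting.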
After that, the following code:
from pyhive import hive
import pandas as pd
import sys
conn = hive.Connection(host="myserver", auth='NOSASL')
df = pd.read_sql("SELECT * FROM m_ytable", conn)
print(sys.getsizeof(df))
df.head()
worked without any problem or error.
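One caveat: pd.read_sql with SELECT * pulls the whole table into memory. If you only want a quick look, a small preview through a plain DB-API cursor may be enough. The sketch below is only an illustration: `fetch_preview` is a hypothetical helper, it assumes the table name comes from a trusted source (it is interpolated into the SQL), and the demo uses an in-memory sqlite3 database as a stand-in so it runs without a Hive server. Since PyHive connections follow the DB-API, passing the `conn` from above should work the same way, but I have not verified that on a cluster.

```python
import sqlite3


def fetch_preview(conn, table, limit=5):
    """Fetch up to `limit` rows from `table` through a DB-API cursor.

    The table name is interpolated directly, so it must be trusted input.
    """
    cursor = conn.cursor()
    try:
        cursor.execute("SELECT * FROM {} LIMIT {}".format(table, int(limit)))
        # DB-API: the first item of each description entry is the column name.
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        cursor.close()
    return columns, rows


# Stand-in demo with sqlite3; replace `demo` with the Hive connection in practice.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE m_ytable (id INTEGER, name TEXT)")
demo.executemany(
    "INSERT INTO m_ytable VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
)
columns, rows = fetch_preview(demo, "m_ytable", limit=2)
print(columns)
print(rows)
```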