Support Questions
Find answers, ask questions, and share your expertise

pyhive connection error: thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

Contributor

I'm trying to get a table located in hive (hortonworks) ,to collect some twitter data to implement on a machine learning project, using pyhive since pyhs2 is not supported by python3.6.

Here's my code:

from pyhive import hive
conn = hive.Connection(host='192.168.1.11', port=10000, auth='NOSASL')
import pandas as pd
import sys
df = pd.read_sql("SELECT * FROM my_table", conn)
print(sys.getsizeof(df))
df.head()

When compiling I get this error:

Traceback (most recent call last):
File "C:\Users\PWST112\Desktop\import.py", line 44, in <module>
conn = hive.Connection(host='192.168.1.11', port=10000, auth='NOSASL')
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site-    packages\pyhive\hive.py", line 164, in __init__
response = self._client.OpenSession(open_session_req)
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site- packages\TCLIService\TCLIService.py", line 187, in OpenSession
return self.recv_OpenSession()
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site-packages\TCLIService\TCLIService.py", line 199, in recv_OpenSession
(fname, mtype, rseqid) = iprot.readMessageBegin()
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site-packages\thrift\protocol\TBinaryProtocol.py", line 148, in readMessageBegin
name = self.trans.readAll(sz)
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site-packages\thrift\transport\TTransport.py", line 60, in readAll
chunk = self.read(sz - have)
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site-packages\thrift\transport\TTransport.py", line 161, in read
self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
File "C:\Users\PWST112\AppData\Local\Programs\Python\Python36\lib\site-packages\thrift\transport\TSocket.py", line 132, in read
message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
[Finished in 0.3s]

Here is the PIP list:

beautifulsoup4 (4.6.0)
bleach (2.0.0)
colorama (0.3.9)
cycler (0.10.0)
decorator (4.0.11)
entrypoints (0.2.3)
ez-setup (0.9)
future (0.16.0)
html5lib (0.999999999)
impala (0.2)
ipykernel (4.6.1)
ipython (6.1.0)
ipython-genutils (0.2.0)
ipywidgets (6.0.0)
jedi (0.10.2)
Jinja2 (2.9.6)
jsonschema (2.6.0)
jupyter (1.0.0)
jupyter-client (5.1.0)
jupyter-console (5.1.0)
jupyter-core (4.3.0)
konlpy (0.4.4)
MarkupSafe (1.0)
matplotlib (2.0.2)
mistune (0.7.4)
nbconvert (5.2.1)
nbformat (4.3.0)
nltk (3.2.4)
notebook (5.0.0)
numpy (1.13.1+mkl)
pandas (0.20.3)
pandocfilters (1.4.1)
pickleshare (0.7.4)
pip (9.0.1)
prompt-toolkit (1.0.14)
pure-sasl (0.4.0)
Pygments (2.2.0)
PyHive (0.5.0)
pyhs2 (0.6.0)
pyparsing (2.2.0)
python-dateutil (2.6.0)
pytz (2017.2)
pyzmq (16.0.2)
qtconsole (4.3.0)
sasl (0.2.1)
scikit-learn (0.18.2)
scipy (0.19.1)
setuptools (28.8.0)
simplegeneric (0.8.1)
six (1.10.0)
testpath (0.3.1)
thrift (0.10.0)
thrift-sasl (0.3.0)
tornado (4.5.1)
traitlets (4.3.2)
wcwidth (0.1.7)
webencodings (0.5.1)
wheel (0.30.0)
widgetsnbextension (2.0.0)

Can somebody help?

I have my sandbox configured for "NONE" authentication, since the NOSASL option is not available.

Best regards

1 ACCEPTED SOLUTION

Accepted Solutions

Re: pyhive connection error: thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

Contributor

Used impyla.

Works like a charm 🙂

View solution in original post

2 REPLIES 2

Re: pyhive connection error: thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

Contributor

Used impyla.

Works like a charm 🙂

View solution in original post

Re: pyhive connection error: thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

Explorer

I had a similar problem with pyhive on my horton setup. The failure was always immediate and so it was not a timeout issue that some people on the net were pointing out.

It turned out to be hive.server2.transport.mode. If this is set to binary, it works like a Charm. If it is 'http', PyHive does not work.

Also found https://github.com/dropbox/PyHive/issues/69 which talks about this. HTH.