Not able to connect to Impala/Hive table

New Contributor

Hi All,

I'm not able to access BDF (Hive/Impala) tables from my local machine. I'm using the code below, and it fails with the error that follows. Could you please suggest a fix?

 

from impala.dbapi import connect
conn = connect(host='xxxximpala.rxcorp.com', port=21000,user='xxxxxx', password='xxxxx',auth_mechanism='PLAIN')
cursor = conn.cursor()
cursor.execute('SELECT COUNT(1) FROM my_table LIMIT 100')
results = cursor.fetchall()

 

TTransportException                       Traceback (most recent call last)
<ipython-input-199-b13e7182755e> in <module>
      1 from impala.dbapi import connect
----> 2 conn = connect(host='xxxmpala.rxcorp.com', port=21000,user='xxxx', password='xxxx',auth_mechanism='PLAIN')
      3 cursor = conn.cursor()
      4 cursor.execute('SELECT COUNT(1) FROM my_table LIMIT 100')
      5 #print cursor.description  # prints the result set's schema

~\AppData\Local\Continuum\anaconda3\lib\site-packages\impala\dbapi.py in connect(host, port, database, timeout, use_ssl, ca_cert, auth_mechanism, user, password, kerberos_service_name, use_ldap, ldap_user, ldap_password, use_kerberos, protocol, krb_host, use_http_transport, http_path, auth_cookie_names, http_cookie_names, retries, jwt)
    199                           http_cookie_names=http_cookie_names,
    200                           retries=retries,
--> 201                           jwt=jwt)
    202     return hs2.HiveServer2Connection(service, default_db=database)
    203 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\impala\hiveserver2.py in connect(host, port, timeout, use_ssl, ca_cert, user, password, kerberos_service_name, auth_mechanism, krb_host, use_http_transport, http_path, http_cookie_names, retries, jwt)
    861                                 auth_mechanism, user, password)
    862 
--> 863     transport.open()
    864     protocol = TBinaryProtocolAccelerated(transport)
    865     service = ThriftClient(protocol)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\thrift_sasl\__init__.py in open(self)
     91     # SASL negotiation loop
     92     while True:
---> 93       status, payload = self._recv_sasl_message()
     94       if status not in (self.OK, self.COMPLETE):
     95         raise TTransportException(type=TTransportException.NOT_OPEN,

~\AppData\Local\Continuum\anaconda3\lib\site-packages\thrift_sasl\__init__.py in _recv_sasl_message(self)
    110 
    111   def _recv_sasl_message(self):
--> 112     header = self._trans_read_all(5)
    113     status, length = struct.unpack(">BI", header)
    114     if length > 0:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\thrift_sasl\__init__.py in _trans_read_all(self, sz)
    196     except AttributeError:
    197       read_all = self._trans.read # thriftpy
--> 198     return read_all(sz)
    199 
    200   def close(self):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\thrift\transport\TTransport.py in readAll(self, sz)
     58         have = 0
     59         while (have < sz):
---> 60             chunk = self.read(sz - have)
     61             chunkLen = len(chunk)
     62             have += chunkLen

~\AppData\Local\Continuum\anaconda3\lib\site-packages\thrift\transport\TSocket.py in read(self, sz)
    130         if len(buff) == 0:
    131             raise TTransportException(type=TTransportException.END_OF_FILE,
--> 132                                       message='TSocket read 0 bytes')
    133         return buff
    134 

TTransportException: TSocket read 0 bytes

 

1 ACCEPTED SOLUTION


Hi @Shaswat, without having fully reviewed what else may be wrong, "port=21000" is definitely not correct.

Impala has two "frontend" ports to which clients can connect:

- Port 21000 is used only by "impala-shell".

- Port 21050 is used by all other client applications: JDBC, ODBC, Hue, and Python applications using Impyla, which is what the example above uses. Please see the Impyla docs for more.
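
As a rough illustration of the port change, here is a minimal sketch of the original snippet pointed at 21050. The helper names (impyla_connect_kwargs, run_count) are hypothetical and not part of Impyla; host, user, and password are placeholders, and whether PLAIN is the right auth_mechanism depends on how the cluster is configured.

```python
# Ports as described in the answer above.
IMPALA_SHELL_PORT = 21000   # impala-shell
IMPALA_CLIENT_PORT = 21050  # JDBC, ODBC, Hue, Impyla


def impyla_connect_kwargs(host, user, password,
                          port=IMPALA_CLIENT_PORT, auth_mechanism="PLAIN"):
    """Build keyword arguments for impala.dbapi.connect (hypothetical helper)."""
    if port == IMPALA_SHELL_PORT:
        raise ValueError(
            "port 21000 is the impala-shell port; Impyla should use 21050")
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "auth_mechanism": auth_mechanism,
    }


def run_count(host, user, password):
    """Connect through Impyla and run the count query from the question."""
    # Imported lazily so the helper above can be used without impyla installed.
    from impala.dbapi import connect

    conn = connect(**impyla_connect_kwargs(host, user, password))
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT COUNT(1) FROM my_table")
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling run_count('your-impala-host', 'your-user', 'your-password') would then open the connection on 21050 instead of 21000. The LIMIT clause from the original query is dropped here because COUNT(1) returns a single row anyway.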

 

Best regards

 Miklos

Community Manager

@Shaswat, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.



Regards,

Vidya Sargur,
Community Manager



New Contributor

>>> impala_conn = connect(host='emr-header-3', port=21050, database="dim", user="api",password="",auth_mechanism='PLAIN')
DEBUG:impala.hiveserver2:Connecting to HiveServer2 emr-header-3:21050 with PLAIN authentication mechanism
DEBUG:impala._thrift_api:get_socket: host=emr-header-3 port=21050 use_ssl=False ca_cert=None
DEBUG:impala.hiveserver2:sock=<thriftpy2.transport.socket.TSocket object at 0x7f366e85b2b0>
DEBUG:impala._thrift_api:get_transport: socket=<thriftpy2.transport.socket.TSocket object at 0x7f366e85b2b0> host=emr-header-3 kerberos_service_name=impala auth_mechanism=PLAIN user=api password=fuggetaboutit
Please enter your password:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hadoop/.local/lib/python3.6/site-packages/impala/dbapi.py", line 150, in connect
    http_path=http_path)
  File "/home/hadoop/.local/lib/python3.6/site-packages/impala/hiveserver2.py", line 826, in connect
    transport.open()
  File "/home/hadoop/.local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 93, in open
    status, payload = self._recv_sasl_message()
  File "/home/hadoop/.local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 112, in _recv_sasl_message
    header = self._trans_read_all(5)
  File "/home/hadoop/.local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 198, in _trans_read_all
    return read_all(sz)
  File "/usr/local/lib64/python3.6/site-packages/thriftpy2/transport/socket.py", line 132, in read
    message='TSocket read 0 bytes')
thriftpy2.transport.base.TTransportException: TTransportException(type=4, message='TSocket read 0 bytes')

New Contributor

It works when I use 127.0.0.1 or localhost, but I get the error above when I use the hostname or a remote hostname.

 

thrift 0.16.0
thrift-sasl 0.4.3
thriftpy 0.3.9
thriftpy2 0.4.14
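
One thing that may be worth ruling out when localhost works and the hostname does not is that the name resolves to an address the daemon is not accepting connections on. The sketch below uses only the standard library; the function names are hypothetical, and 21050 is the client port from the accepted answer. Note that "TSocket read 0 bytes" can also appear after a TCP connection has been opened and then closed by the server, so a successful check here does not rule out an authentication or protocol mismatch.

```python
import socket


def resolve_host(host, port):
    """Return the sorted unique IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def can_connect(address, port, timeout=3.0):
    """Return True if a TCP connection to (address, port) can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoint(host, port=21050):
    """Map each address `host` resolves to onto whether the port accepts TCP."""
    try:
        addresses = resolve_host(host, port)
    except socket.gaierror:
        return {}  # the name did not resolve at all
    return {addr: can_connect(addr, port) for addr in addresses}
```

For example, comparing check_endpoint('localhost') with check_endpoint('emr-header-3') shows whether the two names lead to different addresses and whether each one accepts a connection on the port.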

New Contributor
<property>
<name>hive.server2.authentication</name>
<value>NONE</value>
<final>false</final>
<source>programmatically</source>
<source>org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@3fffff43</source>
</property>