HBase thrift python example

Expert Contributor

Hi,

I've been using HBase successfully with the Java client for months, but now I would like to write a Python application to access my data. I've read several posts about Thrift, but there is something I cannot figure out.

I installed a standalone HBase instance on my Linux VM, and it seems that I can start a Thrift server directly with the "hbase thrift start" command. Did I understand correctly? Does HBase provide an embedded Thrift server?

All the posts I've read suggest downloading Thrift and compiling it in order to generate language-specific (Python) bindings. Can't I use the "embedded" Thrift from HBase for that? Does it mean that the downloaded and compiled Thrift will "only" be used to generate a few Python packages that my application will use (but will not be launched itself)?

After going through all the steps from https://acadgild.com/blog/connecting-hbase-with-python-application-using-thrift-server/, when I try to execute table.py, I get the following error:

Traceback (most recent call last):
  File "table.py", line 1, in <module>
    from thrift.transport.TSocket import TSocket
ModuleNotFoundError: No module named 'thrift'

I don't understand what I missed here. Should I install a Thrift package for Python somehow (in addition to my generated bindings)?

Thanks for your help,
Regards,
Sebastien

1 ACCEPTED SOLUTION

Rising Star

Hi,

  • Regarding your first set of questions:

The answer depends on which distribution and version you use, or whether you are using vanilla HBase. If you install HDP 2.4, for example, here is a guide to starting the Thrift server:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.3/bk_installing_manually_book/content/ref-2a6...

  • Regarding your last question:

The error message indicates that the thrift module, which you need on the client side to run your Python program, is not installed.

How you install it depends on how you manage packages. With pip, for example:

pip install thrift

Once that is done, at least this error message will disappear.
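To confirm which packages Python can actually see before running table.py, a small check along these lines may help. It is only a sketch: `missing_modules` is a hypothetical helper, and `hbase` as the name of the generated bindings package is an assumption, so adjust it to whatever your generated package is called.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat any lookup failure as "not importable".
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "thrift" is the runtime library installed by `pip install thrift`.
    # "hbase" is an ASSUMED name for the generated bindings package.
    missing = missing_modules(["thrift", "hbase"])
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All modules found")
```

If the generated bindings are reported as missing even though you generated them, the directory that contains them is probably not on Python's module search path (sys.path or PYTHONPATH).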

REPLIES

Expert Contributor

Thanks for your help, I got rid of the Python error.

However, I still cannot connect from my Python client to the HBase Thrift server (to answer your question, I'm using vanilla HBase, without any other component). I see the following error on the server side:

2017-05-30 08:01:11,816 WARN  [thrift-worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-05-30 08:01:11,916 ERROR [thrift-worker-0] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2017-05-30 08:01:11,917 WARN  [thrift-worker-0] zookeeper.ZKUtil: hconnection-0x3330972d0x0, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:220)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:419)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:905)

It seems that HBase is trying to reach ZooKeeper, but I didn't set up any ZooKeeper. Is it possible to run HBase without ZooKeeper?

Thanks again

Regards

Expert Contributor

In fact, it was my fault: when my machine came back from standby mode, I had not restarted HBase. It works now 🙂

Rising Star
@Sebastien Chausson

Cool, glad to see that you got it up and running yourself! If my answer was helpful you can vote it up or mark it as best answer. 🙂

Rising Star

To answer your question regarding ZooKeeper: HBase needs ZooKeeper. If you didn't set up ZooKeeper yourself, HBase spins up an "internal" ZooKeeper server, which is great for testing but shouldn't be used in production.
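A "connection refused" on localhost:2181, as in the log earlier in this thread, usually just means nothing is listening on that port. A quick way to check from Python, as a rough sketch: `port_is_open` is a hypothetical helper, 2181 is the port shown in the log, and 9090 is assumed to be the Thrift server port (the HBase default unless it has been changed).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 2181: ZooKeeper port taken from the log (quorum=localhost:2181).
    # 9090: ASSUMED Thrift server port; adjust if yours differs.
    for name, port in (("ZooKeeper", 2181), ("HBase Thrift", 9090)):
        state = "reachable" if port_is_open("localhost", port) else "NOT reachable"
        print(f"{name} on localhost:{port} is {state}")
```

If ZooKeeper is not reachable on a standalone setup, HBase itself is most likely not running, since in that mode it is HBase that starts the internal ZooKeeper.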