Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

what protocol is used for the new spark hbase connector?

Master Guru

What protocol is used by the new Spark HBase connector? Is Spark using the HBase Thrift server?

1 ACCEPTED SOLUTION

Super Guru

@Sunile Manjee

Integration between Spark and HBase relies on HBaseContext, which distributes the HBase configuration to the Spark executors. Each executor then talks to the region servers through the standard HBase client, so the protocol used is HBase's native RPC, not Thrift. Please check the following link for more details:

https://hbase.apache.org/book.html#spark

And here is the GitHub link to the HBase-Spark module:

https://github.com/apache/hbase/tree/master/hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark
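As a rough sketch of the pattern described in the HBase book linked above: HBaseContext wraps the SparkContext and the HBase configuration, and its operations (e.g. `hbaseRDD`) run the scan on the executors over native RPC. The table name below is made up for illustration, and this assumes a running HBase cluster reachable from the Spark application.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.spark.HBaseContext
import org.apache.spark.SparkContext

// HBaseContext carries the HBase configuration to the executors;
// each executor opens its own connection and speaks HBase's
// native RPC protocol directly to the region servers.
val sc = new SparkContext("local", "hbase-spark-example")
val conf = HBaseConfiguration.create()
val hbaseContext = new HBaseContext(sc, conf)

// A distributed scan: each partition scans a slice of the table
// over HBase RPC -- no Thrift gateway is involved.
// "my_table" is a hypothetical table name.
val rdd = hbaseContext.hbaseRDD(TableName.valueOf("my_table"), new Scan())
println(rdd.count())
```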


2 REPLIES


@Sunile Manjee,

Depending on what data your Spark application attempts to access, it uses the relevant JVM HBase client APIs: filters, scans, gets, range-gets, etc.

See the code here.
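For illustration, the kinds of JVM client calls mentioned above (gets, scans, filters) might look like this with the standard `hbase-client` API. These are the same client classes the Spark integration ends up invoking on each executor, and all of them go over HBase's native RPC; the table and row-key names here are hypothetical, and a reachable HBase cluster is assumed.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Scan}
import org.apache.hadoop.hbase.filter.PrefixFilter
import org.apache.hadoop.hbase.util.Bytes

// Connect using the cluster configuration on the classpath.
val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
val table = conn.getTable(TableName.valueOf("my_table"))

// A point get by row key...
val result = table.get(new Get(Bytes.toBytes("row-1")))

// ...and a filtered range scan.
val scan = new Scan().setFilter(new PrefixFilter(Bytes.toBytes("row-")))
val scanner = table.getScanner(scan)
scanner.close()
table.close()
conn.close()
```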
