Include latest hbase-spark in CDH

Explorer

Thanks for including hbase-spark in CDH since v5.7.0. Unfortunately, it does not include the latest changes to hbase-spark (see: https://issues.apache.org/jira/browse/HBASE-14789), for example the HBase-Spark DataFrame integration. That means that Python users currently cannot use hbase-spark at all.

12 REPLIES

New Contributor

Thanks for sharing this walkthrough!

@Harsh J Can you help me?

I can't get this HBase-PySpark connection to work with Cloudera CDH 6.1.1. I get the message: "An error occurred while calling o70.load. : java.lang.ClassNotFoundException: Failed to find data source: org.apache.hadoop.hbase.spark. Please find packages at http://spark.apache.org/third-party-projects.html"

Thank you so much

 

Explorer

Hi,

Can anyone provide the command to run Spark (spark-submit, for example) with the connector?

 

I get the error "Failed to find data source: org.apache.hadoop.hbase.spark"
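For reference, this error usually means Spark cannot find the connector classes on its classpath. Below is a minimal sketch of how a spark-submit command that ships an extra jar could be put together, not a confirmed fix for CDH 6.1.1. The jar path, the script name `read_hbase.py`, and the helper `build_spark_submit` are hypothetical placeholders; only the `--master` and `--jars` options are standard spark-submit flags.

```python
import shlex


def build_spark_submit(script, jars, master="yarn"):
    """Return a spark-submit argument list that ships extra jars with the job."""
    cmd = ["spark-submit", "--master", master]
    if jars:
        # --jars expects a single comma-separated list of jar paths
        cmd += ["--jars", ",".join(jars)]
    cmd.append(script)
    return cmd


# Hypothetical placeholders: substitute the real connector jar location
# on your cluster and the name of your own PySpark script.
jars = ["/path/to/hbase-spark.jar"]
cmd = build_spark_submit("read_hbase.py", jars)

# Print the command in a form that can be pasted into a shell
print(shlex.join(cmd))
```

The script passed at the end would be the one that calls the `org.apache.hadoop.hbase.spark` data source. Whether the connector also needs additional HBase client jars depends on the cluster setup.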

 

 

New Contributor

Hello, @amirmam

Did you manage to solve this? I have the same problem with the current version, CDH 6.1.1.

Thanks!