
Storing dataframe into HBase using Spark


I am writing a PySpark DataFrame to an HBase table in CDP 7, following this example in . The components I use are:


- Spark version 3.1.1
- Scala version 2.12.10
- shc-core-1.1.1-2.1-s_2.11.jar
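
One thing I noticed about the jar name: the `-s_2.11` suffix in `shc-core-1.1.1-2.1-s_2.11.jar` marks the Scala binary version the artifact was built for, while Spark 3.1.1 runs on Scala 2.12. As a plain-Python illustration (the helper name is made up, not part of any library), the suffix can be parsed out of a jar name or Maven coordinate and compared against the runtime:

```python
import re

def scala_binary_version(artifact):
    """Extract the Scala binary version suffix (e.g. '2.11') from a jar
    file name or Maven coordinate; return None if no suffix is present."""
    m = re.search(r"s_(\d+\.\d+)(?:\.jar)?$", artifact)
    return m.group(1) if m else None

jar = "shc-core-1.1.1-2.1-s_2.11.jar"
runtime_scala = "2.12"  # Spark 3.1.1 is built against Scala 2.12

suffix = scala_binary_version(jar)
if suffix and suffix != runtime_scala:
    # A 2.11 artifact on a 2.12 runtime fails at class-load time,
    # which matches the NoClassDefFoundError shown below.
    print(f"Mismatch: {jar} targets Scala {suffix}, runtime is {runtime_scala}")
```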


The command that I use:



spark3-submit --packages com.hortonworks:shc-core:1.1.1-2.1-s_2.11 --repositories --files /etc/hbase/conf/hbase-site.xml




However, I got this error. It is quite long, so I have put a snippet of it below:


error snippet:



Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/", line 45, in <module>
File "/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/", line 24, in main
writeDF.write.options(catalog=writeCatalog, newtable=5).format(dataSourceFormat).save()
File "/opt/cloudera/parcels/SPARK3-", line 1107, in save
File "/opt/cloudera/parcels/SPARK3-", line 1305, in __call__
File "/opt/cloudera/parcels/SPARK3-", line 111, in deco
File "/opt/cloudera/parcels/SPARK3-", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
: java.lang.NoClassDefFoundError: scala/Product$class
at org.apache.spark.sql.execution.datasources.hbase.HBaseRelation.<init>(HBaseRelation.scala:73)
at org.apache.spark.sql.execution.datasources.hbase.DefaultSource.createRelation(HBaseRelation.scala:59)




What should I do to fix the error? I tried to find another connector but only found the SHC connector. I am not using any Maven repository here, so I am not sure whether there are missing dependencies or some other problem.
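
For reference, I also looked at the Apache hbase-spark connector (from the hbase-connectors project) that ships with CDP 7; as far as I understand (this is my assumption, not something I have tested), it is addressed with the data source format `org.apache.hadoop.hbase.spark` and takes a `hbase.columns.mapping` option describing the rowkey and column-family mappings. A small helper to build that mapping string (the helper name and the table name are made up for illustration):

```python
def hbase_columns_mapping(columns):
    """Build the 'hbase.columns.mapping' string used by the hbase-spark
    data source, e.g. 'key STRING :key, name STRING cf:name'.
    Each entry is (df_column, sql_type, hbase_target); ':key' marks the rowkey."""
    return ", ".join(f"{name} {typ} {target}" for name, typ, target in columns)

mapping = hbase_columns_mapping([
    ("key",  "STRING", ":key"),     # DataFrame column 'key' -> HBase rowkey
    ("name", "STRING", "cf:name"),  # 'name' -> column family 'cf', qualifier 'name'
])

# With the connector jars on the classpath, the write would look roughly like:
# (df.write
#    .format("org.apache.hadoop.hbase.spark")
#    .option("hbase.columns.mapping", mapping)
#    .option("hbase.table", "my_table")  # hypothetical table name
#    .save())
```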
