New Contributor
Posts: 2
Registered: ‎02-08-2018

How to make HBase find scala-library?


 I'm on Spark 1.6.0, HBase 1.2.0 (cdh 5.7).


I can read an entire HBase table from the scala spark-shell just by doing:


user@node:~$ export SPARK_CLASSPATH=$(hbase classpath)
user@node:~$ spark-shell --master local


val df = sqlContext.read.format("org.apache.hadoop.hbase.spark")
                              .option("hbase.columns.mapping", "rowkey STRING :key, anothercol STRING cf:anothercol")
                              .option("hbase.table", "<tablename>")
                              .load()


But whenever I filter the rows to retrieve based on a string, such as by doing:


df.where("rowkey >= \"1-2018\"").show()

I get java.lang.NoClassDefFoundError: scala/collection/immutable/StringOps


Filtering actually works fine with a numeric type, for instance, or if I cache the entire table upon loading it. Also note that if I simply do


import scala.collection.immutable.StringOps

the import runs fine in the shell. I suspect that, because the hbase-spark library ships as part of HBase, this is simply HBase not being able to find scala-library.jar on its own classpath.
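For reference, the caching workaround mentioned above can be sketched like this against the same df. This is only a sketch of the behaviour described in this post: the assumption is that caching makes the string filter run inside Spark instead of being pushed down through the connector.

```scala
// Sketch: materialize the table in Spark before filtering, so the string
// comparison is evaluated by Spark itself rather than pushed down into the
// hbase-spark connector's server-side filtering (where the error surfaces).
df.cache()                                // keep the rows in Spark memory
df.count()                                // force the full scan once
df.where("rowkey >= \"1-2018\"").show()   // the filter now runs locally
```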


Is there any way I can make this class available to hbase-spark?
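One thing I would try (a sketch only; the parcel path below is an assumption, so locate the jar your own install ships) is appending scala-library.jar explicitly to the classpath handed to the shell:

```shell
# Locate the scala-library jar shipped with Spark (the search root is an
# assumption; adjust it to wherever your CDH parcels or Spark libs live).
SCALA_JAR=$(find /opt/cloudera/parcels -name 'scala-library*.jar' 2>/dev/null | head -n 1)

# Put it next to the HBase classpath before launching spark-shell,
# and also ship it to executors via --jars.
export SPARK_CLASSPATH="$(hbase classpath):$SCALA_JAR"
spark-shell --master local --jars "$SCALA_JAR"
```

This is launch configuration rather than a confirmed fix; whether it resolves the NoClassDefFoundError depends on where the class is actually being loaded.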



Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

Re: How to make HBase find scala-library?

Do you face this even with --master yarn? I've not been able to reproduce this on recent CDH versions, but I only tried with --master yarn.
New Contributor
Posts: 1
Registered: ‎06-13-2018

Re: How to make HBase find scala-library?

I am facing the exact same issue. I tried --master yarn too, but the issue persists. Were you able to resolve this? If so, please let me know the solution.

New Contributor
Posts: 2
Registered: ‎02-08-2018

Re: How to make HBase find scala-library?

I could not solve this specific error, so I gave up on the hbase-spark connector. I recently wrote about other possible strategies for connecting to HBase from PySpark; I hope it helps:
