
HBase access in Spark code in a Kerberized Cluster


I want to access HBase from my Spark code.


To access HBase inside Spark, I have used UserGroupInformation:


Configuration conf = HBaseConfiguration.create();
// NOTE: the property name was lost from the original post; given the value
// 2181 (the ZooKeeper client port), it was presumably
// hbase.zookeeper.property.clientPort
conf.set("", "2181");




I am unable to access HBase from the Spark code. I get the error below, and the job eventually fails:


709d40c03cb, negotiated timeout = 60000
17/08/06 18:10:01 WARN spark.SparkContext: Killing executors is only supported in coarse-grained mode
17/08/06 18:10:01 WARN spark.ExecutorAllocationManager: Unable to reach the cluster manager to kill executor driver!
17/08/06 18:10:36 INFO client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=48580 ms ago, cancelled=false, msg=



But if I try to access HBase alone (without Spark) using a simple Java program, I am able to access HBase in the Kerberized cluster.


I am using CDH 5.8 with Spark 1.6.


I am executing the Spark job by passing the principal and keytab, and inside the Spark code I use UserGroupInformation for HBase access:


 spark-submit --keytab /home/centos/spark_on_yarn.keytab --principal spark/  --class SparkReadAndPrint --deploy-mode client  --master local /home/centos/newspark.jar /user/centos/bank.txt
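For context, the UserGroupInformation-based login I am using looks roughly like the sketch below. This is a minimal illustration, not my exact code: the ZooKeeper quorum hosts, the principal `hbase/host@REALM`, the keytab path, and the table name `bank` are placeholders, and the lookup is just a sample Get.

```java
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.security.UserGroupInformation;

public class HBaseKerberosSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("hbase.security.authentication", "kerberos");
        // placeholder ZooKeeper quorum for the cluster
        conf.set("hbase.zookeeper.quorum", "zk-host1,zk-host2");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        // log in from the keytab and run the HBase calls as that user
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
                "hbase/host@REALM", "/home/centos/hbase.keytab");

        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("bank"))) {
                // sample read to confirm the authenticated connection works
                table.get(new Get("row1".getBytes()));
            }
            return null;
        });
    }
}
```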


I also tried passing hbase-site.xml via the --files parameter:


 spark-submit --keytab /home/centos/spark_on_yarn.keytab --principal spark/  --files "hbase-site.xml,hbase.keytab,hdfs-site.xml,hdfs.keytab"  --class SparkReadAndPrint --deploy-mode client  --master local /home/centos/newspark.jar /user/centos/bank.txt


I even checked the /etc/spark/ configuration directory, and hbase-site.xml is present on the config path.


It would be great if you could answer the queries below:

a) What is the issue with the above approach?


b) Is there any other way to access HBase from Spark code? 


c) I am using the principal and keytab generated by Cloudera Manager for both Spark and HBase access.

    Is there another approach, such as creating a single keytab containing both principals?


d) What is the best approach for accessing multiple services such as Spark, HBase, and Kafka in a Kerberized cluster? Do we need to create a single keytab and principal for accessing these services?


It would be great if you could clarify the points above. Thanks in advance.






Re: HBase access in Spark code in a Kerberized Cluster

Cloudera Employee

Can you try setting the Spark classpath to include the HBase classpath before running spark-submit?


export SPARK_CLASSPATH=`hbase classpath`:$CLASSPATH

Re: HBase access in Spark code in a Kerberized Cluster


Hi Gopa,


Is this issue resolved? If so, what was the solution?

Re: HBase access in Spark code in a Kerberized Cluster


I am facing the same problem. Did you resolve it?
