Spark local mode to kerberized cluster.

New Contributor

We have our secured (Kerberized) development clusters. We are now trying to run some integration tests that connect IntelliJ IDEA to HBase, HDFS, and Hive. Everything worked perfectly with simple authentication, but with security enabled we are facing issues while writing RDDs/DataFrames to HDFS. We have followed the recommended guidelines:

In spark-defaults.conf (on an HDP cluster: /etc/spark/conf/spark-defaults.conf):
spark.history.kerberos.enabled true
spark.history.kerberos.keytab /your/path/to/spark.headless.keytab
spark.history.kerberos.principal your-principal@YOUR.DOMAIN

In spark-env.sh, make sure you have:
export HADOOP_CONF_DIR=/your/path/to/hadoop/conf

In core-site.xml:
hadoop.security.authentication: kerberos
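
For clarity, that last setting is an XML property in core-site.xml; its conventional form (with the lowercase value) is:

<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>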

But we still have no success so far, and the program fails with the error below:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]

Note: I even tried the UserGroupInformation class, which helped me run a few cases, but it still failed while writing RDDs/DataFrames to HDFS.


Re: Spark local mode to kerberized cluster.

Expert Contributor

Hi @Ashish Singh,

Can you show the command that you used to submit your Spark application?

Michel

Re: Spark local mode to kerberized cluster.

Mentor

@Ashish Singh,

If you have a keytab file to authenticate to the cluster, this is one way I've done it:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

val conf: Configuration = new Configuration()
conf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(conf)
// Log in from the keytab before creating the FileSystem client.
UserGroupInformation.loginUserFromKeytab("user-name", "path/to/keytab/on/local/machine")
val fs = FileSystem.get(conf)

I believe to do this you might also need some of the cluster's client configuration XML files, namely core-site.xml, hdfs-site.xml, and mapred-site.xml. On the cluster these are usually under /etc/hadoop/conf/.
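
One minimal sketch, if you keep those files outside the classpath, is to add them to the Configuration explicitly (the paths below are placeholders for wherever you copied the client configs; Path is org.apache.hadoop.fs.Path, imported above):

// Load the cluster's client configs into the same conf used for login.
conf.addResource(new Path("/your/path/to/conf/core-site.xml"))
conf.addResource(new Path("/your/path/to/conf/hdfs-site.xml"))
conf.addResource(new Path("/your/path/to/conf/mapred-site.xml"))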

Alternatively, you would put those XML files under a directory in your project and mark it as a Resources directory in IntelliJ so they are picked up from the classpath.
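
And since your failure shows up while writing to HDFS, it may also help to run the write inside the logged-in user's context. A minimal sketch, reusing the conf from above (principal, keytab, and output path are placeholders):

import java.security.PrivilegedExceptionAction

val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
  "your-principal@YOUR.DOMAIN", "/path/to/your.keytab")
// Any Hadoop client call made inside run() uses the Kerberos credentials.
ugi.doAs(new PrivilegedExceptionAction[Unit] {
  override def run(): Unit = {
    val fs = FileSystem.get(conf)
    fs.create(new Path("/tmp/kerberos-smoke-test")).close()
  }
})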