How can I use Apache Spark to query a Kerberos-secured Hive table?


I am trying to use Scala with Apache Spark, running locally, to query a Hive table that is secured with Kerberos. I have no trouble connecting and querying the data programmatically without Spark. The problem appears only when I try to connect and query through Spark.


My code, which works when run locally without Spark:


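    import java.sql.DriverManager
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.security.UserGroupInformation

    // Kerberos-related JVM properties (user is the principal, keytab is the keytab path)
    System.setProperty("kerberos.keytab", keytab)
    System.setProperty("kerberos.principal", user)
    System.setProperty("", krb5.conf)
    System.setProperty("", jaas.conf)
    val conf = new Configuration
    conf.set("", "Kerberos")
    UserGroupInformation.createProxyUser("user", UserGroupInformation.getLoginUser)
    UserGroupInformation.loginUserFromKeytab(user, keytab)
    // Refresh the credentials, depending on how the login was performed
    if (UserGroupInformation.isLoginKeytabBased) {
      UserGroupInformation.getLoginUser.reloginFromKeytab()
    } else if (UserGroupInformation.isLoginTicketBased) {
      UserGroupInformation.getLoginUser.reloginFromTicketCache()
    }
    // Query Hive over JDBC
    val con = DriverManager.getConnection("jdbc:hive://", user, password)
    val rs = con.prepareStatement("select * from table limit 5").executeQuery()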



Does anyone know how I can pass the keytab, krb5.conf, and jaas.conf to my Spark initialization function so that Spark can authenticate with Kerberos and obtain a TGT?


My Spark initialization function:


    conf = new SparkConf().setAppName("mediumData")
      .set("", "localhost")
      .set("spark.ui.enabled", "true") // enable the Spark UI
    sparkSession = SparkSession.builder.config(conf).enableHiveSupport().getOrCreate()



I do not have configuration files such as hive-site.xml or core-site.xml.

Thank you!
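
One possible approach to the question above, offered as a minimal sketch rather than a confirmed fix: perform the Kerberos login through Hadoop's `UserGroupInformation` before the `SparkSession` is created, and pass the Hive metastore settings that would normally live in hive-site.xml directly to the session builder. Everything concrete here is an assumption for illustration: the object and helper names (`KerberizedHiveSketch`, `hiveSettings`, `login`, `buildSession`), the principal, the file paths, the metastore URI, the `local[*]` master, and the table name `my_table`. It also assumes the `spark-hive` module is on the classpath. As far as I know, a separate jaas.conf is usually not needed on this path because `UserGroupInformation` handles the keytab login itself, but that may differ in your environment.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.sql.SparkSession

object KerberizedHiveSketch {

  // Hive metastore settings that would otherwise come from hive-site.xml.
  // The arguments are placeholders for your cluster's real values.
  def hiveSettings(metastoreUri: String, metastorePrincipal: String): Map[String, String] =
    Map(
      "hive.metastore.uris"               -> metastoreUri,
      "hive.metastore.sasl.enabled"       -> "true",
      "hive.metastore.kerberos.principal" -> metastorePrincipal
    )

  // Log in from the keytab before any Hadoop or Hive client is created.
  def login(principal: String, keytabPath: String, krb5ConfPath: String): Unit = {
    // Standard JVM property that tells the Kerberos libraries where krb5.conf is.
    System.setProperty("java.security.krb5.conf", krb5ConfPath)
    val hadoopConf = new Configuration()
    hadoopConf.set("hadoop.security.authentication", "kerberos")
    // Without this call, UserGroupInformation ignores hadoopConf.
    UserGroupInformation.setConfiguration(hadoopConf)
    UserGroupInformation.loginUserFromKeytab(principal, keytabPath)
  }

  // Build a local, Hive-enabled session and apply the given settings to it.
  def buildSession(settings: Map[String, String]): SparkSession = {
    val builder = SparkSession.builder()
      .appName("mediumData")
      .master("local[*]") // assumption: local mode
      .enableHiveSupport()
    settings.foldLeft(builder) { case (b, (k, v)) => b.config(k, v) }.getOrCreate()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical values; replace with your own principal, paths, and hosts.
    login("user@EXAMPLE.COM", "/path/to/user.keytab", "/path/to/krb5.conf")
    val spark = buildSession(
      hiveSettings("thrift://metastore-host:9083", "hive/_HOST@EXAMPLE.COM"))
    spark.sql("select * from my_table limit 5").show() // my_table is hypothetical
  }
}
```

The login is kept separate from the session construction so that it clearly runs first; the Hive client picks up the credentials of the logged-in user when the session first contacts the metastore. When submitting to a cluster instead of running locally, Spark's own keytab and principal options are the more usual route.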


