Welcome to the Cloudera Community

SparkSQL and Hive with Kerberos and Sentry

Hello,

With both Sentry and Kerberos enabled, the usual best practice is to lock down the Hive metastore with a restrictive proxy user list and to close the Hive Server 1 port with a firewall.

However, when spark-shell is started, the shell tries to connect to the metastore on port 9083. The connection fails, either because the user is not in the proxy user list or because the port is firewalled, and the shell reports an error.
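To tell the two failure causes apart, a quick TCP check from the Spark host can help. The following is only a minimal sketch: the helper name `port_open` is made up for illustration, and the host names in the usage note are placeholders that are not taken from any real cluster. It only tests whether the port accepts a TCP connection; it does not perform any Kerberos or Thrift handshake.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds.

    Hypothetical helper for illustration only. It does not authenticate,
    so a True result says nothing about proxy user restrictions.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the with-block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For example, `port_open("metastore.example.com", 9083)` and `port_open("hiveserver2.example.com", 10000)` (both host names are assumptions) show which of the two ports is reachable from the Spark host. If port 9083 is reachable but spark-shell still fails, the proxy user restriction is the more likely cause than the firewall.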

The Spark shell does not try to connect to port 10000, which is very strange: with Hive Server 2 and a Hive gateway defined on the Spark server, I would expect it to connect to the Hive Server 2 port.

 

I can't find much documentation on this. I also verified that the Spark Thrift Server is not provided in the CDH Hadoop distribution, so it does not seem possible to access the Hive metastore with SparkSQL on a secure cluster.

 

The only way seems to be to bypass the security and use a proxy user.

 

Any idea?

 

Kind Regards
