Spark SQL action fails in Kerberos secured cluster

Expert Contributor

Hello everyone,

I need to run a Spark SQL action on a Hive table, but I am having problems with authentication (the cluster is Kerberos-secured).

I first tried with hive2 credentials, because they work with my other hive2 actions, but I got a failure (I suppose this type of credential can only be used with hive2 actions?):

 

2018-04-06 08:37:21,831 [Driver] INFO  org.apache.hadoop.hive.ql.session.SessionState  - No Tez session required at this point. hive.execution.engine=mr.
2018-04-06 08:37:22,117 [Driver] INFO  hive.metastore  - Trying to connect to metastore with URI thrift://trmas-fc2d552a.azcloud.local:9083
2018-04-06 08:37:22,153 [Driver] ERROR org.apache.thrift.transport.TSaslTransport  - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
	at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
[...]

I've also tried with hcat credentials, but with those the action went into a START_RETRY state with the following error:

 

JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException

This is the workflow.xml:

 

<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
    <credentials>
        <credential name="hive2_credentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
        <credential name="hcat_cred" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
    </credentials>
    <start to="spark_action"/>
    <action cred="hcat_cred" name="spark_action">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/icon0104/output"/>
            </prepare>
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>OozieSpark</name>
            <class>my.Main</class>
            <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
            <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
        </spark>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>
    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="END_NODE"/>
</workflow-app>

 

This is the hive-site.xml:

 

<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
  </property>
  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>300</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.auto.convert.join.noconditionaltask.size</name>
    <value>20971520</value>
  </property>
  <property>
    <name>hive.optimize.bucketmapjoin.sortedmerge</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.smbjoin.cache.rows</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/var/log/hive/operation_logs</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>-1</value>
  </property>
  <property>
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>67108864</value>
  </property>
  <property>
    <name>hive.exec.copyfile.maxsize</name>
    <value>33554432</value>
  </property>
  <property>
    <name>hive.exec.reducers.max</name>
    <value>1099</value>
  </property>
  <property>
    <name>hive.vectorized.groupby.checkinterval</name>
    <value>4096</value>
  </property>
  <property>
    <name>hive.vectorized.groupby.flush.percent</name>
    <value>0.1</value>
  </property>
  <property>
    <name>hive.compute.query.using.stats</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.vectorized.execution.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.vectorized.execution.reduce.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.merge.mapfiles</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.merge.mapredfiles</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.cbo.enable</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.fetch.task.conversion</name>
    <value>minimal</value>
  </property>
  <property>
    <name>hive.fetch.task.conversion.threshold</name>
    <value>268435456</value>
  </property>
  <property>
    <name>hive.limit.pushdown.memory.usage</name>
    <value>0.1</value>
  </property>
  <property>
    <name>hive.merge.sparkfiles</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.merge.smallfiles.avgsize</name>
    <value>16777216</value>
  </property>
  <property>
    <name>hive.merge.size.per.task</name>
    <value>268435456</value>
  </property>
  <property>
    <name>hive.optimize.reducededuplication</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.optimize.reducededuplication.min.reducer</name>
    <value>4</value>
  </property>
  <property>
    <name>hive.map.aggr</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.map.aggr.hash.percentmemory</name>
    <value>0.5</value>
  </property>
  <property>
    <name>hive.optimize.sort.dynamic.partition</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
  </property>
  <property>
    <name>spark.executor.memory</name>
    <value>268435456</value>
  </property>
  <property>
    <name>spark.driver.memory</name>
    <value>268435456</value>
  </property>
  <property>
    <name>spark.executor.cores</name>
    <value>4</value>
  </property>
  <property>
    <name>spark.yarn.driver.memoryOverhead</name>
    <value>26</value>
  </property>
  <property>
    <name>spark.yarn.executor.memoryOverhead</name>
    <value>26</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.initialExecutors</name>
    <value>1</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.minExecutors</name>
    <value>1</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.maxExecutors</name>
    <value>2147483647</value>
  </property>
  <property>
    <name>hive.metastore.execute.setugi</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
  <property>
    <name>hive.zookeeper.namespace</name>
    <value>hive_zookeeper_namespace_CD-HIVE-LTqXUcrR</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hive.cluster.delegation.token.store.class</name>
    <value>org.apache.hadoop.hive.thrift.MemoryTokenStore</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>hive/_HOST@AZCLOUD.LOCAL</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@AZCLOUD.LOCAL</value>
  </property>
  <property>
    <name>hive.server2.use.SSL</name>
    <value>true</value>
  </property>
  <property>
    <name>spark.shuffle.service.enabled</name>
    <value>true</value>
  </property>
</configuration>

In Oozie configuration I have the following credentials classes enabled:

 

hcat=org.apache.oozie.action.hadoop.HCatCredentials,hbase=org.apache.oozie.action.hadoop.HbaseCredentials,hive2=org.apache.oozie.action.hadoop.Hive2Credentials 
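For reference, this list is the value of the oozie.credentials.credentialclasses property in oozie-site.xml. A quick way to confirm what the server actually loaded, assuming shell access to the Oozie server host (/etc/oozie/conf is a common default path and may differ in your deployment):

# Show the credential classes configured on the Oozie server
grep -A1 'oozie.credentials.credentialclasses' /etc/oozie/conf/oozie-site.xml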

 

Can anyone help? What am I missing?

15 REPLIES

Mentor
> Do I need to contact an administrator, or are there other ways to get this keytab file?

A keytab stores a form of your password in it. If you already have the password on hand, you can place it into a keytab yourself, without requiring any further rights:

~> ktutil
> addent -password -p your-principal@REALM -k 1 -e aes256-cts
Password for your-principal@REALM: [enter your password when prompted]
> wkt your-principal.keytab
> quit
~> ls
your-principal.keytab
~> klist -ekt your-principal.keytab
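
Once written, the keytab can be sanity-checked by obtaining a ticket with it. A minimal test, assuming the standard MIT Kerberos client tools:

~> kinit -kt your-principal.keytab your-principal@REALM   # log in using only the keytab
~> klist                                                  # should now show a valid TGT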


As to the original issue stated in this thread, see
http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Hue-Oozie-Spark-Kerberos-Delegation...

Expert Contributor

 

@Harsh J

 

Thank you very much for the answer. I was able to generate the keytab file and put it in the lib folder of the Oozie application, but when I run the workflow I get the following error:

 

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1130)
	at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:562)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:178)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:90)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:81)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:57)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:235)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
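
From what I understand, "Pre-authentication information was invalid (24)" usually means the key in the keytab does not match what the KDC has (wrong password, key version, or encryption type). A possible way to narrow this down from the edge node, assuming the MIT Kerberos tools are available (the principal below is the one from the log):

# Inspect the key versions (KVNO) and enctypes stored in the keytab
klist -ekt icon0104.keytab
# Try to log in with the keytab alone; if this fails with the same error,
# regenerate the keytab with ktutil using the current password, adding
# entries for additional enctypes (e.g. aes128-cts, rc4-hmac) if needed
kinit -kt icon0104.keytab icon0104@AZCLOUD.LOCAL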

I am also a bit confused by this whole procedure: shouldn't Oozie be able to obtain a Kerberos delegation token on behalf of my user, without me having to provide a keytab file?

I have also tried (again) to use hcat credentials, specifying the configuration you suggested in the other post, with the following workflow:

 

<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
    <credentials>
        <credential name="hive2_credentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
        <credential name="hcat_cred" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
    </credentials>
    <start to="spark_action"/>
    <action cred="hcat_cred" name="spark_action">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="hdfs://trmas-6b8bc78c.azcloud.local:8020/user/icon0104/spark_hive_output"/>
            </prepare>
            <configuration>
                <property>
                    <name>spark.yarn.security.tokens.hive.enabled</name>
                    <value>false</value>
                </property>
            </configuration>
            <master>yarn-cluster</master>
            <name>OozieSparkAction</name>
            <class>my.Main</class>
            <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
            <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
        </spark>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>
    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="END_NODE"/>
</workflow-app>

 

But the Spark action goes into a START_RETRY state with the same error:

JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException

 

Thanks for the support!

Mentor
Can you check your Oozie server log around the time of the START_RETRY failure? The HCat (HMS) credentials are obtained by the Oozie server communicating directly with the configured HMS URI before the jobs are submitted to the cluster - so the Oozie server log and the HMS log would have more details behind the generic 'TTransportException' message that appears in the frontend.
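
For example, something along these lines from the Oozie server host (a sketch; the log path below is a common default and may differ in your deployment):

# Look for credential/delegation-token errors around the retry time
grep -iE 'hcat|delegation|TTransportException' /var/log/oozie/oozie.log* | tail -n 50
# Check that the Oozie host can reach the metastore thrift port at all
nc -vz trmas-fc2d552a.azcloud.local 9083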

Expert Contributor

@Harsh J

In Cloudera Manager I went to the Oozie server instance and checked the logs from there, but there is nothing useful in stdout and stderr. Are these the logs you were talking about?

[Screenshot attachment: oozie_Server_logs]

Also, I am not sure where I can find the HMS logs; can you provide some details?

Mentor
Apologies for the lack of details.

The role logs typically lie under the component-named directories under /var/log. For Oozie this would therefore be /var/log/oozie/ on the Oozie server host, and /var/log/hive/ for Hive on the HMS host.
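
For example (file names vary by release and by Cloudera Manager settings):

# On the Oozie server host
ls -lt /var/log/oozie/ | head
tail -n 200 /var/log/oozie/oozie.log
# On the Hive Metastore host
ls -lt /var/log/hive/ | head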

Expert Contributor

@Harsh J

 

Thank you. Unfortunately I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces though (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.
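
In the meantime, one check that needs only the edge node, assuming YARN log aggregation is enabled, is pulling the aggregated launcher logs with the yarn CLI (the application ID below is a placeholder; the real one is visible in the YARN web UI):

# Fetch the aggregated container logs for the Oozie launcher / Spark job
yarn logs -applicationId application_1522890000000_0042 | less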