Spark SQL action fails in Kerberos secured cluster

Expert Contributor

Hello everyone,

I need to run a Spark SQL action on a Hive table, and I am having problems with authentication (the cluster is Kerberos-secured).

I first tried hive2 credentials, because they work with my other hive2 actions, but I got a failure (I suppose this type of credential can only be used with hive2 actions?):

2018-04-06 08:37:21,831 [Driver] INFO  org.apache.hadoop.hive.ql.session.SessionState  - No Tez session required at this point. hive.execution.engine=mr.
2018-04-06 08:37:22,117 [Driver] INFO  hive.metastore  - Trying to connect to metastore with URI thrift://trmas-fc2d552a.azcloud.local:9083
2018-04-06 08:37:22,153 [Driver] ERROR org.apache.thrift.transport.TSaslTransport  - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
	at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
[...]

I also tried hcat credentials, but with those the action went into a START_RETRY state with the following error:

JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException

This is the workflow.xml:

<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
    <credentials>
        <credential name="hive2_credentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
        <credential name="hcat_cred" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
    </credentials>
    <start to="spark_action"/>
    <action cred="hcat_cred" name="spark_action">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/icon0104/output"/>
            </prepare>
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>OozieSpark</name>
            <class>my.Main</class>
            <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
            <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
        </spark>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>
    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="END_NODE"/>
</workflow-app>


This is the hive-site.xml:


<?xml version="1.0" encoding="UTF-8"?>

<!--Autogenerated by Cloudera Manager-->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
  </property>
  <property>
    <name>hive.metastore.client.socket.timeout</name>
    <value>300</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.auto.convert.join</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.auto.convert.join.noconditionaltask.size</name>
    <value>20971520</value>
  </property>
  <property>
    <name>hive.optimize.bucketmapjoin.sortedmerge</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.smbjoin.cache.rows</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/var/log/hive/operation_logs</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>-1</value>
  </property>
  <property>
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>67108864</value>
  </property>
  <property>
    <name>hive.exec.copyfile.maxsize</name>
    <value>33554432</value>
  </property>
  <property>
    <name>hive.exec.reducers.max</name>
    <value>1099</value>
  </property>
  <property>
    <name>hive.vectorized.groupby.checkinterval</name>
    <value>4096</value>
  </property>
  <property>
    <name>hive.vectorized.groupby.flush.percent</name>
    <value>0.1</value>
  </property>
  <property>
    <name>hive.compute.query.using.stats</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.vectorized.execution.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.vectorized.execution.reduce.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.merge.mapfiles</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.merge.mapredfiles</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.cbo.enable</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.fetch.task.conversion</name>
    <value>minimal</value>
  </property>
  <property>
    <name>hive.fetch.task.conversion.threshold</name>
    <value>268435456</value>
  </property>
  <property>
    <name>hive.limit.pushdown.memory.usage</name>
    <value>0.1</value>
  </property>
  <property>
    <name>hive.merge.sparkfiles</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.merge.smallfiles.avgsize</name>
    <value>16777216</value>
  </property>
  <property>
    <name>hive.merge.size.per.task</name>
    <value>268435456</value>
  </property>
  <property>
    <name>hive.optimize.reducededuplication</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.optimize.reducededuplication.min.reducer</name>
    <value>4</value>
  </property>
  <property>
    <name>hive.map.aggr</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.map.aggr.hash.percentmemory</name>
    <value>0.5</value>
  </property>
  <property>
    <name>hive.optimize.sort.dynamic.partition</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
  </property>
  <property>
    <name>spark.executor.memory</name>
    <value>268435456</value>
  </property>
  <property>
    <name>spark.driver.memory</name>
    <value>268435456</value>
  </property>
  <property>
    <name>spark.executor.cores</name>
    <value>4</value>
  </property>
  <property>
    <name>spark.yarn.driver.memoryOverhead</name>
    <value>26</value>
  </property>
  <property>
    <name>spark.yarn.executor.memoryOverhead</name>
    <value>26</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.initialExecutors</name>
    <value>1</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.minExecutors</name>
    <value>1</value>
  </property>
  <property>
    <name>spark.dynamicAllocation.maxExecutors</name>
    <value>2147483647</value>
  </property>
  <property>
    <name>hive.metastore.execute.setugi</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.zookeeper.quorum</name>
    <value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value>
  </property>
  <property>
    <name>hive.zookeeper.client.port</name>
    <value>2181</value>
  </property>
  <property>
    <name>hive.zookeeper.namespace</name>
    <value>hive_zookeeper_namespace_CD-HIVE-LTqXUcrR</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hive.cluster.delegation.token.store.class</name>
    <value>org.apache.hadoop.hive.thrift.MemoryTokenStore</value>
  </property>
  <property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.server2.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hive.metastore.kerberos.principal</name>
    <value>hive/_HOST@AZCLOUD.LOCAL</value>
  </property>
  <property>
    <name>hive.server2.authentication.kerberos.principal</name>
    <value>hive/_HOST@AZCLOUD.LOCAL</value>
  </property>
  <property>
    <name>hive.server2.use.SSL</name>
    <value>true</value>
  </property>
  <property>
    <name>spark.shuffle.service.enabled</name>
    <value>true</value>
  </property>
</configuration>


In the Oozie configuration I have the following credential classes enabled:


hcat=org.apache.oozie.action.hadoop.HCatCredentials,hbase=org.apache.oozie.action.hadoop.HbaseCredentials,hive2=org.apache.oozie.action.hadoop.Hive2Credentials 


Can anyone help? What am I missing?


Mentor
> Do I need to contact an administrator, or are there other ways to get this keytab file?

A keytab stores a form of your password in it. When you already have the password on hand, you may place it into a keytab yourself, without requiring any further rights:

~> ktutil
> addent -password -p your-principal@REALM -k 1 -e aes256-cts
Password for your-principal@REALM: [enter your password when prompted]
> wkt your-principal.keytab
> quit
~> ls
your-principal.keytab
~> klist -ekt your-principal.keytab
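
To confirm the keytab actually works before handing it to Oozie, try a login with it (a quick sanity check; substitute your own principal and realm):

~> kinit -kt your-principal.keytab your-principal@REALM
~> klist

If the kinit fails with a pre-authentication error, the password typed into ktutil (or the key version/encryption type) does not match what the KDC expects.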


As to the original issue stated in this thread, see
http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Hue-Oozie-Spark-Kerberos-Delegation...

Expert Contributor

@Harsh J

Thank you very much for the answer. I was able to generate the keytab file and put it in the lib folder of the Oozie application, but when I run the workflow I get the following error:


Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1130)
	at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:562)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:178)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:90)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:81)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:57)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:235)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

I am also a bit confused about this whole procedure: shouldn't Oozie be able to obtain a Kerberos delegation token on behalf of my user, without me having to provide a keytab file?


I have also tried (again) to use hcat credentials, specifying the configuration you suggested in the other post, with the following workflow:


<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
    <credentials>
        <credential name="hive2_credentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
        <credential name="hcat_cred" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
    </credentials>

    <start to="spark_action"/>

    <action cred="hcat_cred" name="spark_action">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="hdfs://trmas-6b8bc78c.azcloud.local:8020/user/icon0104/spark_hive_output"/>
            </prepare>
            <configuration>
                <property>
                    <name>spark.yarn.security.tokens.hive.enabled</name>
                    <value>false</value>
                </property>
            </configuration>
            <master>yarn-cluster</master>
            <name>OozieSparkAction</name>
            <class>my.Main</class>
            <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
            <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
        </spark>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>

    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <end name="END_NODE"/>
</workflow-app>

 
But the Spark action goes into the START_RETRY state with the same error:

JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException
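
One variation I have not tried yet, in case the <configuration> block above only reaches the Oozie launcher job rather than the actual Spark submission (an assumption on my part, not something I have confirmed), would be to pass the same flag directly through spark-opts:

<spark-opts>--conf spark.yarn.security.tokens.hive.enabled=false --files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>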

Thanks for the support!

Mentor
Can you check your Oozie server log around the time of the START_RETRY failure? The HCat (HMS) credentials are obtained by the Oozie server communicating directly with the configured HMS URI before the jobs are submitted to the cluster - so the Oozie server log and the HMS log would have more details behind the generic 'TTransportException' message that appears in the frontend.
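
As a quick way to rule out plain network problems first (a hedged suggestion; these commands would have to run on the Oozie server host, and nc may not be installed there):

~> nc -vz trmas-fc2d552a.azcloud.local 9083

If that connects, the TTransportException is more likely SASL/Kerberos related than network related.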

Expert Contributor

@Harsh J

In Cloudera Manager I went to the Oozie Server instance and checked the logs from there, but there is nothing useful in stdout or stderr. Are these the logs you were talking about?

[Screenshot attached: oozie_Server_logs]


Also, I am not sure where I can find the HMS logs; can you provide some details?


Mentor
Apologies for the lack of details.

The role logs typically lie under the component-named directories under /var/log. For Oozie this would therefore be /var/log/oozie/ on the Oozie server host, and /var/log/hive/ for Hive on the HMS host.
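
For example, something along these lines on the respective hosts (a sketch; exact log file names vary by install, so adjust the globs):

~> grep -i "delegation token" /var/log/oozie/*.log*
~> grep -i "TTransportException" /var/log/hive/*.log*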

Expert Contributor

@Harsh J


Thank you. Unfortunately I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.