
Spark SQL action fails in Kerberos secured cluster

Re: Spark SQL action fails in Kerberos secured cluster

Master Guru
> Do I need to contact an administrator or are other ways to get this
keytab file?

A keytab stores a derived form of your password. If you already have the
password on hand, you can place it into a keytab yourself, without requiring
any further privileges:

~> ktutil
> addent -password -p your-principal@REALM -k 1 -e aes256-cts
Password for your-principal@REALM: [enter your password when prompted]
> wkt your-principal.keytab
> quit
~> ls
your-principal.keytab
~> klist -ekt your-principal.keytab
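
Before uploading the keytab anywhere, it is worth confirming that it actually authenticates. A quick sanity check, assuming the MIT Kerberos client tools (kinit/klist) are available on the host:

```shell
# Try a fresh login using only the keytab; no password prompt should appear.
kinit -kt your-principal.keytab your-principal@REALM
# On success, the credential cache now holds a TGT for the principal.
klist
```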


As for your original issue stated in this thread, see
http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Hue-Oozie-Spark-Kerberos-Delegation...

Re: Spark SQL action fails in Kerberos secured cluster

Expert Contributor

@Harsh J

Thank you very much for the answer. I was able to generate the keytab file and put it in the lib folder of the Oozie application, but when I run the workflow I get the following error:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1130)
	at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:562)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:178)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:90)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:81)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:57)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:235)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
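
For reference, "Pre-authentication information was invalid (24)" on a keytab login almost always means the key stored in the keytab does not match the KDC's: a mistyped password during ktutil, a non-matching encryption type, or a stale key version number (kvno), e.g. after a password change. A way to compare the two sides, assuming MIT Kerberos client tools:

```shell
# Key versions and encryption types stored in the keytab:
klist -ekt icon0104.keytab
# Current key version known to the KDC (needs a valid ticket first):
kinit icon0104@AZCLOUD.LOCAL
kvno icon0104@AZCLOUD.LOCAL
# The kvno and enctype reported for the keytab must match what the KDC
# returns; if they differ, regenerate the keytab.
```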

I am also a bit confused about this whole procedure: shouldn't Oozie be able to obtain a Kerberos delegation token on behalf of my user without me having to provide a keytab file?

I have also tried (again) to use hcat credentials, specifying the configuration you suggested in the other post, with the following workflow:

<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
<credentials>
<credential name="hive2_credentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>

        <credential name="hcat_cred" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
</credentials>


	<start to="spark_action"/>
    
    <action cred="hcat_cred" name="spark_action">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
               <delete path="hdfs://trmas-6b8bc78c.azcloud.local:8020/user/icon0104/spark_hive_output"/>               
            </prepare>
            <configuration>
                <property>
                    <name>spark.yarn.security.tokens.hive.enabled</name>
                    <value>false</value>
                </property>                
            </configuration>

            <master>yarn-cluster</master>		      
            <name>OozieSparkAction</name>
            <class>my.Main</class>
            <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>            
            <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
        </spark>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>
    
    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <end name="END_NODE"/>
</workflow-app>
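
For completeness, the ${jobTracker} and ${nameNode} variables referenced in the workflow are normally supplied through a job.properties file alongside it. The host name below is taken from the workflow's own prepare path, and the ports are the usual defaults (8020 for the NameNode, 8032 for the YARN ResourceManager), so adjust them to your cluster:

```properties
nameNode=hdfs://trmas-6b8bc78c.azcloud.local:8020
jobTracker=trmas-6b8bc78c.azcloud.local:8032
# Path assumed from the lib/ location mentioned earlier in the thread:
oozie.wf.application.path=${nameNode}/user/icon0104/oozie/ooziespark
# Needed so the Spark sharelib jars are on the launcher classpath:
oozie.use.system.libpath=true
```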

But the Spark action goes into the START_RETRY state with the following error:

JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException

Thanks for the support!

Re: Spark SQL action fails in Kerberos secured cluster

Master Guru
Can you check your Oozie server log around the time of the START_RETRY failure? The HCat (HMS) credentials are obtained by the Oozie server communicating directly with the configured HMS URI before the jobs are submitted to the cluster - so the Oozie server log and the HMS log would have more details behind the generic 'TTransportException' message that appears in the frontend.

Re: Spark SQL action fails in Kerberos secured cluster

Expert Contributor

@Harsh J

In Cloudera Manager I went to the Oozie server instance and checked the logs from there, but there is nothing useful in stdout or stderr. Are these the logs you were talking about?

oozie_Server_logs

Also, I am not sure where I can find the HMS logs. Can you provide some details?


Re: Spark SQL action fails in Kerberos secured cluster

Master Guru
Apologies for the lack of details.

The role logs typically lie under the component-named directories under
/var/log. For Oozie this would therefore be /var/log/oozie/ on the Oozie
server host, and /var/log/hive/ for Hive on the HMS host.
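
On hosts where shell access is available, something like the following narrows down the credential failure. Exact log file names vary between CM versions, so this is only a sketch:

```shell
# On the Oozie server host: look for the HCat credential acquisition
# failure around the START_RETRY timestamp.
grep -iE "hcat|delegation|TTransportException" /var/log/oozie/*.log | tail -n 50
# On the Hive Metastore host: check whether the token request arrived at all.
grep -iE "delegation|kerberos" /var/log/hive/*.log | tail -n 50
```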

Re: Spark SQL action fails in Kerberos secured cluster

Expert Contributor

@Harsh J

Thank you. Unfortunately, I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.