Posts: 1,730
Kudos: 356
Solutions: 273
Registered: 07-31-2013

Re: Spark SQL action fails in Kerberos secured cluster

> Do I need to contact an administrator or are other ways to get this
keytab file?

A keytab stores a form of your password. Since you already know the password, you can place it into a keytab yourself, without requiring any further rights:

~> ktutil
> addent -password -p your-principal@REALM -k 1 -e aes256-cts
Password for your-principal@REALM: [enter your password when prompted]
> wkt your-principal.keytab
> quit
~> ls
your-principal.keytab
~> klist -ekt your-principal.keytab
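
If you want to confirm the keytab actually works before shipping it anywhere, a quick sanity test is to obtain a ticket with it (substitute your own principal and keytab name):

~> kinit -kt your-principal.keytab your-principal@REALM
~> klist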


As for the original issue stated in this thread, see
http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Hue-Oozie-Spark-Kerberos-Delegation...
Expert Contributor
Posts: 69
Registered: 11-24-2017

Re: Spark SQL action fails in Kerberos secured cluster

 

@Harsh J

Thank you very much for the answer. I was able to generate the keytab file and put it in the lib folder of the Oozie application, but when I run it I get the following error:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1130)
	at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:562)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:178)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:90)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:81)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:57)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:235)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
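
Note: "Pre-authentication information was invalid (24)" generally means the key stored in the keytab does not match what the KDC has, e.g. a mistyped password or a stale key version/encryption type. A quick way to test the keytab locally from the edge node (assuming it is in the current directory):

~> kinit -kt icon0104.keytab icon0104@AZCLOUD.LOCAL
~> klist

If this fails with the same error, the keytab needs to be regenerated with the correct password and encryption type.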

I am also a bit confused about this whole procedure: shouldn't Oozie be able to obtain a Kerberos delegation token on behalf of my user, without the need for me to provide a keytab file?

I have also tried (again) to use hcat credentials, specifying the configuration you suggested in the other post, with the following workflow:

<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">

    <credentials>
        <credential name="hive2_credentials" type="hive2">
            <property>
                <name>hive2.jdbc.url</name>
                <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
            </property>
            <property>
                <name>hive2.server.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>

        <credential name="hcat_cred" type="hcat">
            <property>
                <name>hcat.metastore.uri</name>
                <value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
            </property>
            <property>
                <name>hcat.metastore.principal</name>
                <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
            </property>
        </credential>
    </credentials>

    <start to="spark_action"/>
    
    <action cred="hcat_cred" name="spark_action">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
               <delete path="hdfs://trmas-6b8bc78c.azcloud.local:8020/user/icon0104/spark_hive_output"/>               
            </prepare>
            <configuration>
                <property>
                    <name>spark.yarn.security.tokens.hive.enabled</name>
                    <value>false</value>
                </property>                
            </configuration>

            <master>yarn-cluster</master>
            <name>OozieSparkAction</name>
            <class>my.Main</class>
            <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
            <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
        </spark>
        <ok to="END_NODE"/>
        <error to="KILL_NODE"/>
    </action>
    
    <kill name="KILL_NODE">
        <message>${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <end name="END_NODE"/>
</workflow-app>
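
Note: a minimal job.properties for submitting a workflow like this would look roughly as follows; the values below are placeholders and assumptions (only the name node address already appears in the workflow), not necessarily the actual ones used:

# job.properties (placeholder values, for illustration only)
nameNode=hdfs://trmas-6b8bc78c.azcloud.local:8020
jobTracker=<resourcemanager-host>:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/icon0104/oozie/ooziespark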

 

But the Spark action goes into the START_RETRY state with the same error:

JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException

 

Thanks for the support!

Posts: 1,730
Kudos: 356
Solutions: 273
Registered: 07-31-2013

Re: Spark SQL action fails in Kerberos secured cluster

Can you check your Oozie server log around the time of the START_RETRY failure? The HCat (HMS) credentials are obtained by the Oozie server communicating directly with the configured HMS URI before the jobs are submitted to the cluster - so the Oozie server log and the HMS log would have more details behind the generic 'TTransportException' message that appears in the frontend.
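
For reference, the hcat/hive2 credential types only work if the corresponding credential classes are registered on the Oozie server; in CM-managed clusters this is normally already set, but it is worth confirming. A sketch of the relevant oozie-site.xml entries:

<property>
    <name>oozie.credentials.credentialclasses</name>
    <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials,hive2=org.apache.oozie.action.hadoop.Hive2Credentials</value>
</property>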
Expert Contributor
Posts: 69
Registered: 11-24-2017

Re: Spark SQL action fails in Kerberos secured cluster

@Harsh J

In Cloudera Manager I went to the Oozie Server instance and checked the logs from there, but there is nothing useful in Stdout and Stderr. Are these the logs you were talking about?

oozie_Server_logs

Also, I am not sure where I can find the HMS logs; can you provide some details?

Posts: 1,730
Kudos: 356
Solutions: 273
Registered: 07-31-2013

Re: Spark SQL action fails in Kerberos secured cluster

Apologies for the lack of details.

The role logs typically lie under the component-named directories under
/var/log. For Oozie this would therefore be /var/log/oozie/ on the Oozie
server host, and /var/log/hive/ for Hive on the HMS host.
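
For example, from a shell on those hosts (the exact HMS log file name may differ depending on how CM names it):

~> grep -iE "HCatException|TTransportException" /var/log/oozie/oozie.log | tail -50
~> grep -i "delegation token" /var/log/hive/*HIVEMETASTORE*.log* | tail -50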
Expert Contributor
Posts: 69
Registered: 11-24-2017

Re: Spark SQL action fails in Kerberos secured cluster

@Harsh J

Thank you. Unfortunately I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.
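
Note: for failures that do reach YARN (like the earlier keytab login error), the launcher's container logs can also be pulled from the edge node with the YARN CLI, using the application id shown in the Oozie/YARN web UI (the id below is a placeholder):

~> yarn logs -applicationId application_XXXXXXXXXXXXX_XXXX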
