Created on 04-06-2018 01:59 AM - edited 09-16-2022 06:04 AM
Hello everyone
I need to run a Spark-SQL action on a Hive table, but I am having problems with authentication (the cluster is Kerberos-secured).
I first tried hive2 credentials, because they work with my other hive2 actions, but I got a failure (I suppose this type of credential can only be used with hive2 actions?):
2018-04-06 08:37:21,831 [Driver] INFO org.apache.hadoop.hive.ql.session.SessionState - No Tez session required at this point. hive.execution.engine=mr.
2018-04-06 08:37:22,117 [Driver] INFO hive.metastore - Trying to connect to metastore with URI thrift://trmas-fc2d552a.azcloud.local:9083
2018-04-06 08:37:22,153 [Driver] ERROR org.apache.thrift.transport.TSaslTransport - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    [...]
I've also tried with hcat credentials, but with those the action went into the START_RETRY state with the following error:
JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException
This is the workflow.xml:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
<credentials>
<credential name="hive2_credentials" type="hive2">
<property>
<name>hive2.jdbc.url</name>
<value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
</property>
<property>
<name>hive2.server.principal</name>
<value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
</property>
</credential>
<credential name="hcat_cred" type="hcat">
<property>
<name>hcat.metastore.uri</name>
<value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
</property>
<property>
<name>hcat.metastore.principal</name>
<value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
</property>
</credential>
</credentials>
<start to="spark_action"/>
<action cred="hcat_cred" name="spark_action">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/icon0104/output"/>
</prepare>
<master>yarn-cluster</master>
<mode>cluster</mode>
<name>OozieSpark</name>
<class>my.Main</class>
<jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
<spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
</spark>
<ok to="END_NODE"/>
<error to="KILL_NODE"/>
</action>
<kill name="KILL_NODE">
<message>${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>
<end name="END_NODE"/>
</workflow-app>
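For context, my.Main is just a small Spark-SQL job that reads a Hive table and writes its result under /user/icon0104/output (hence the delete in the prepare block). A minimal sketch of its shape (the table name and output format are placeholders, and it assumes Spark 2.x's SparkSession; on Spark 1.x it would be a HiveContext instead):

import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() makes Spark load hive-site.xml (shipped via --files)
    // and connect to the metastore; that connection is where the SASL/Kerberos
    // handshake from the log above fails.
    val spark = SparkSession.builder()
      .appName("OozieSpark")
      .enableHiveSupport()
      .getOrCreate()

    // Placeholder query; the real job selects from an actual Hive table.
    spark.sql("SELECT * FROM some_db.some_table")
      .write
      .mode("overwrite")
      .csv("/user/icon0104/output")

    spark.stop()
  }
}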
This is the hive-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!--Autogenerated by Cloudera Manager-->
<configuration>
<property>
<name>hive.metastore.uris</name>
<value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
</property>
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>300</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.warehouse.subdir.inherit.perms</name>
<value>true</value>
</property>
<property>
<name>hive.auto.convert.join</name>
<value>false</value>
</property>
<property>
<name>hive.auto.convert.join.noconditionaltask.size</name>
<value>20971520</value>
</property>
<property>
<name>hive.optimize.bucketmapjoin.sortedmerge</name>
<value>false</value>
</property>
<property>
<name>hive.smbjoin.cache.rows</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.logging.operation.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/var/log/hive/operation_logs</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>-1</value>
</property>
<property>
<name>hive.exec.reducers.bytes.per.reducer</name>
<value>67108864</value>
</property>
<property>
<name>hive.exec.copyfile.maxsize</name>
<value>33554432</value>
</property>
<property>
<name>hive.exec.reducers.max</name>
<value>1099</value>
</property>
<property>
<name>hive.vectorized.groupby.checkinterval</name>
<value>4096</value>
</property>
<property>
<name>hive.vectorized.groupby.flush.percent</name>
<value>0.1</value>
</property>
<property>
<name>hive.compute.query.using.stats</name>
<value>false</value>
</property>
<property>
<name>hive.vectorized.execution.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.vectorized.execution.reduce.enabled</name>
<value>false</value>
</property>
<property>
<name>hive.merge.mapfiles</name>
<value>true</value>
</property>
<property>
<name>hive.merge.mapredfiles</name>
<value>false</value>
</property>
<property>
<name>hive.cbo.enable</name>
<value>false</value>
</property>
<property>
<name>hive.fetch.task.conversion</name>
<value>minimal</value>
</property>
<property>
<name>hive.fetch.task.conversion.threshold</name>
<value>268435456</value>
</property>
<property>
<name>hive.limit.pushdown.memory.usage</name>
<value>0.1</value>
</property>
<property>
<name>hive.merge.sparkfiles</name>
<value>true</value>
</property>
<property>
<name>hive.merge.smallfiles.avgsize</name>
<value>16777216</value>
</property>
<property>
<name>hive.merge.size.per.task</name>
<value>268435456</value>
</property>
<property>
<name>hive.optimize.reducededuplication</name>
<value>true</value>
</property>
<property>
<name>hive.optimize.reducededuplication.min.reducer</name>
<value>4</value>
</property>
<property>
<name>hive.map.aggr</name>
<value>true</value>
</property>
<property>
<name>hive.map.aggr.hash.percentmemory</name>
<value>0.5</value>
</property>
<property>
<name>hive.optimize.sort.dynamic.partition</name>
<value>false</value>
</property>
<property>
<name>hive.execution.engine</name>
<value>mr</value>
</property>
<property>
<name>spark.executor.memory</name>
<value>268435456</value>
</property>
<property>
<name>spark.driver.memory</name>
<value>268435456</value>
</property>
<property>
<name>spark.executor.cores</name>
<value>4</value>
</property>
<property>
<name>spark.yarn.driver.memoryOverhead</name>
<value>26</value>
</property>
<property>
<name>spark.yarn.executor.memoryOverhead</name>
<value>26</value>
</property>
<property>
<name>spark.dynamicAllocation.enabled</name>
<value>true</value>
</property>
<property>
<name>spark.dynamicAllocation.initialExecutors</name>
<value>1</value>
</property>
<property>
<name>spark.dynamicAllocation.minExecutors</name>
<value>1</value>
</property>
<property>
<name>spark.dynamicAllocation.maxExecutors</name>
<value>2147483647</value>
</property>
<property>
<name>hive.metastore.execute.setugi</name>
<value>true</value>
</property>
<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value>
</property>
<property>
<name>hive.zookeeper.client.port</name>
<value>2181</value>
</property>
<property>
<name>hive.zookeeper.namespace</name>
<value>hive_zookeeper_namespace_CD-HIVE-LTqXUcrR</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hive.cluster.delegation.token.store.class</name>
<value>org.apache.hadoop.hive.thrift.MemoryTokenStore</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@AZCLOUD.LOCAL</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST@AZCLOUD.LOCAL</value>
</property>
<property>
<name>hive.server2.use.SSL</name>
<value>true</value>
</property>
<property>
<name>spark.shuffle.service.enabled</name>
<value>true</value>
</property>
</configuration>
In the Oozie configuration I have the following credential classes enabled:
hcat=org.apache.oozie.action.hadoop.HCatCredentials,hbase=org.apache.oozie.action.hadoop.HbaseCredentials,hive2=org.apache.oozie.action.hadoop.Hive2Credentials
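For reference, I submit the workflow with a job.properties along these lines (the ResourceManager port and the application path here are the usual defaults, shown for illustration rather than copied exactly):

nameNode=hdfs://trmas-6b8bc78c.azcloud.local:8020
jobTracker=trmas-6b8bc78c.azcloud.local:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/icon0104/oozie/ooziespark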
Can anyone help? What am I missing?
Created 04-16-2018 07:58 AM
Thank you very much for the answer. I was able to generate the keytab file and put it in the lib folder of the Oozie application, but when I run the workflow I get the following error:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1130)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:562)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:178)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:90)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:81)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:57)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:235)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
I am also a bit confused by this whole procedure: shouldn't Oozie be able to obtain a Kerberos delegation token on behalf of my user, without me having to provide a keytab file?
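To rule out a problem with the keytab itself, I can try reproducing the exact login call from the stack trace outside of Oozie. A minimal sketch (the object name is made up; it assumes the cluster's Hadoop configuration is on the classpath and the keytab is in the working directory):

import org.apache.hadoop.security.UserGroupInformation

object KeytabCheck {
  def main(args: Array[String]): Unit = {
    // The same call that fails inside SparkSubmit. If it also throws
    // "Pre-authentication information was invalid (24)" here, the keytab
    // itself is bad (typically a key-version/kvno mismatch with the KDC,
    // e.g. after a password change or keytab regeneration), rather than
    // anything Oozie-specific.
    UserGroupInformation.loginUserFromKeytab(
      "icon0104@AZCLOUD.LOCAL",
      "icon0104.keytab")
    println("Logged in as: " + UserGroupInformation.getLoginUser)
  }
}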
I have also tried (again) to use hcat credentials, specifying the configuration you suggested in the other post, with the following workflow:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf">
<credentials>
<credential name="hive2_credentials" type="hive2">
<property>
<name>hive2.jdbc.url</name>
<value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value>
</property>
<property>
<name>hive2.server.principal</name>
<value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
</property>
</credential>
<credential name="hcat_cred" type="hcat">
<property>
<name>hcat.metastore.uri</name>
<value>thrift://trmas-fc2d552a.azcloud.local:9083</value>
</property>
<property>
<name>hcat.metastore.principal</name>
<value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value>
</property>
</credential>
</credentials>
<start to="spark_action"/>
<action cred="hcat_cred" name="spark_action">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="hdfs://trmas-6b8bc78c.azcloud.local:8020/user/icon0104/spark_hive_output"/>
</prepare>
<configuration>
<property>
<name>spark.yarn.security.tokens.hive.enabled</name>
<value>false</value>
</property>
</configuration>
<master>yarn-cluster</master>
<name>OozieSparkAction</name>
<class>my.Main</class>
<jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar>
<spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>
</spark>
<ok to="END_NODE"/>
<error to="KILL_NODE"/>
</action>
<kill name="KILL_NODE">
<message>${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>
<end name="END_NODE"/>
</workflow-app>
But the Spark action goes into the START_RETRY state with the same error:
JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException
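In case the action-level <configuration> block is not what the Spark launcher actually picks up, one thing I can still try is passing the same flag directly in spark-opts (as I understand it, spark.yarn.security.tokens.hive.enabled=false tells Spark's YARN client not to fetch a Hive delegation token itself, leaving that to Oozie's hcat credential):

<spark-opts>--conf spark.yarn.security.tokens.hive.enabled=false --files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts>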
Thanks for the support!
Created 04-17-2018 04:23 AM
In Cloudera Manager I went to the Oozie server instance and checked the logs from there, but there is nothing useful in stdout or stderr. Are these the logs you were talking about?
Also, I am not sure where I can find the HMS (Hive Metastore) logs; can you provide some details?
Created 04-17-2018 06:37 AM
Thank you. Unfortunately I have access only to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces, though (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.