Created on 04-06-2018 01:59 AM - edited 09-16-2022 06:04 AM
Hello everyone,
I need to run a Spark SQL action on a Hive table, and I am having problems with authentication (the cluster is Kerberos-secured).
I first tried with hive2 credentials, since those work with my other hive2 actions, but I got a failure (I suppose this type of credential can only be used with hive2 actions?):
2018-04-06 08:37:21,831 [Driver] INFO org.apache.hadoop.hive.ql.session.SessionState - No Tez session required at this point. hive.execution.engine=mr.
2018-04-06 08:37:22,117 [Driver] INFO hive.metastore - Trying to connect to metastore with URI thrift://trmas-fc2d552a.azcloud.local:9083
2018-04-06 08:37:22,153 [Driver] ERROR org.apache.thrift.transport.TSaslTransport - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    [...]
I've also tried with hcat credentials, but with those the action goes into a START_RETRY state with the following error:
JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException
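As a basic sanity check before digging further (nothing cluster-specific here beyond the hostname and port taken from my hive-site.xml), this is what I can run from the edge node to confirm the metastore Thrift port is reachable and that my ticket cache is valid:

# Check that the Hive Metastore Thrift port is reachable from this host
nc -vz trmas-fc2d552a.azcloud.local 9083

# Confirm the current Kerberos ticket cache is valid
klist

Both of these look fine on my side, so connectivity alone does not seem to be the problem.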
This is the workflow.xml:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf"> <credentials> <credential name="hive2_credentials" type="hive2"> <property> <name>hive2.jdbc.url</name> <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value> </property> <property> <name>hive2.server.principal</name> <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value> </property> </credential> <credential name="hcat_cred" type="hcat"> <property> <name>hcat.metastore.uri</name> <value>thrift://trmas-fc2d552a.azcloud.local:9083</value> </property> <property> <name>hcat.metastore.principal</name> <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value> </property> </credential> </credentials> <start to="spark_action"/> <action cred="hcat_cred" name="spark_action"> <spark xmlns="uri:oozie:spark-action:0.2"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <prepare> <delete path="${nameNode}/user/icon0104/output"/> </prepare> <master>yarn-cluster</master> <mode>cluster</mode> <name>OozieSpark</name> <class>my.Main</class> <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar> <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xml</spark-opts> </spark> <ok to="END_NODE"/> <error to="KILL_NODE"/> </action> <kill name="KILL_NODE"> <message>${wf:errorMessage(wf:lastErrorNode())}</message> </kill> <end name="END_NODE"/> </workflow-app>
This is the hive-site.xml:
<?xml version="1.0" encoding="UTF-8"?> <!--Autogenerated by Cloudera Manager--> <configuration> <property> <name>hive.metastore.uris</name> <value>thrift://trmas-fc2d552a.azcloud.local:9083</value> </property> <property> <name>hive.metastore.client.socket.timeout</name> <value>300</value> </property> <property> <name>hive.metastore.warehouse.dir</name> <value>/user/hive/warehouse</value> </property> <property> <name>hive.warehouse.subdir.inherit.perms</name> <value>true</value> </property> <property> <name>hive.auto.convert.join</name> <value>false</value> </property> <property> <name>hive.auto.convert.join.noconditionaltask.size</name> <value>20971520</value> </property> <property> <name>hive.optimize.bucketmapjoin.sortedmerge</name> <value>false</value> </property> <property> <name>hive.smbjoin.cache.rows</name> <value>10000</value> </property> <property> <name>hive.server2.logging.operation.enabled</name> <value>true</value> </property> <property> <name>hive.server2.logging.operation.log.location</name> <value>/var/log/hive/operation_logs</value> </property> <property> <name>mapred.reduce.tasks</name> <value>-1</value> </property> <property> <name>hive.exec.reducers.bytes.per.reducer</name> <value>67108864</value> </property> <property> <name>hive.exec.copyfile.maxsize</name> <value>33554432</value> </property> <property> <name>hive.exec.reducers.max</name> <value>1099</value> </property> <property> <name>hive.vectorized.groupby.checkinterval</name> <value>4096</value> </property> <property> <name>hive.vectorized.groupby.flush.percent</name> <value>0.1</value> </property> <property> <name>hive.compute.query.using.stats</name> <value>false</value> </property> <property> <name>hive.vectorized.execution.enabled</name> <value>true</value> </property> <property> <name>hive.vectorized.execution.reduce.enabled</name> <value>false</value> </property> <property> <name>hive.merge.mapfiles</name> <value>true</value> </property> <property> <name>hive.merge.mapredfiles</name> <value>false</value> </property> <property> <name>hive.cbo.enable</name> <value>false</value> </property> <property> <name>hive.fetch.task.conversion</name> <value>minimal</value> </property> <property> <name>hive.fetch.task.conversion.threshold</name> <value>268435456</value> </property> <property> <name>hive.limit.pushdown.memory.usage</name> <value>0.1</value> </property> <property> <name>hive.merge.sparkfiles</name> <value>true</value> </property> <property> <name>hive.merge.smallfiles.avgsize</name> <value>16777216</value> </property> <property> <name>hive.merge.size.per.task</name> <value>268435456</value> </property> <property> <name>hive.optimize.reducededuplication</name> <value>true</value> </property> <property> <name>hive.optimize.reducededuplication.min.reducer</name> <value>4</value> </property> <property> <name>hive.map.aggr</name> <value>true</value> </property> <property> <name>hive.map.aggr.hash.percentmemory</name> <value>0.5</value> </property> <property> <name>hive.optimize.sort.dynamic.partition</name> <value>false</value> </property> <property> <name>hive.execution.engine</name> <value>mr</value> </property> <property> <name>spark.executor.memory</name> <value>268435456</value> </property> <property> <name>spark.driver.memory</name> <value>268435456</value> </property> <property> <name>spark.executor.cores</name> <value>4</value> </property> <property> <name>spark.yarn.driver.memoryOverhead</name> <value>26</value> </property> <property> <name>spark.yarn.executor.memoryOverhead</name> <value>26</value> 
</property> <property> <name>spark.dynamicAllocation.enabled</name> <value>true</value> </property> <property> <name>spark.dynamicAllocation.initialExecutors</name> <value>1</value> </property> <property> <name>spark.dynamicAllocation.minExecutors</name> <value>1</value> </property> <property> <name>spark.dynamicAllocation.maxExecutors</name> <value>2147483647</value> </property> <property> <name>hive.metastore.execute.setugi</name> <value>true</value> </property> <property> <name>hive.support.concurrency</name> <value>true</value> </property> <property> <name>hive.zookeeper.quorum</name> <value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value> </property> <property> <name>hive.zookeeper.client.port</name> <value>2181</value> </property> <property> <name>hive.zookeeper.namespace</name> <value>hive_zookeeper_namespace_CD-HIVE-LTqXUcrR</value> </property> <property> <name>hbase.zookeeper.quorum</name> <value>trmas-6b8bc78c.azcloud.local,trmas-c9471d78.azcloud.local,trmas-fc2d552a.azcloud.local</value> </property> <property> <name>hbase.zookeeper.property.clientPort</name> <value>2181</value> </property> <property> <name>hive.cluster.delegation.token.store.class</name> <value>org.apache.hadoop.hive.thrift.MemoryTokenStore</value> </property> <property> <name>hive.server2.enable.doAs</name> <value>false</value> </property> <property> <name>hive.metastore.sasl.enabled</name> <value>true</value> </property> <property> <name>hive.server2.authentication</name> <value>kerberos</value> </property> <property> <name>hive.metastore.kerberos.principal</name> <value>hive/_HOST@AZCLOUD.LOCAL</value> </property> <property> <name>hive.server2.authentication.kerberos.principal</name> <value>hive/_HOST@AZCLOUD.LOCAL</value> </property> <property> <name>hive.server2.use.SSL</name> <value>true</value> </property> <property> <name>spark.shuffle.service.enabled</name> <value>true</value> </property> </configuration>
In the Oozie configuration I have the following credential classes enabled:
hcat=org.apache.oozie.action.hadoop.HCatCredentials,hbase=org.apache.oozie.action.hadoop.HbaseCredentials,hive2=org.apache.oozie.action.hadoop.Hive2Credentials
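For what it's worth, this is how I checked that value on the host (the oozie-site.xml path below is the usual CDH client-config location, so treat it as an assumption; on a CM-managed cluster the effective server config may live elsewhere):

# Look for the enabled credential classes in the Oozie configuration
# (path is the typical CDH default and may differ on other setups)
grep -A2 "oozie.credentials.credentialclasses" /etc/oozie/conf/oozie-site.xml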
Can anyone help? What am I missing?
Created 04-16-2018 07:58 AM
Thank you very much for the answer. I was able to generate the keytab file and put it in the lib folder of the Oozie application, but when I run the workflow I get the following error:
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: icon0104@AZCLOUD.LOCAL from keytab icon0104.keytab javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1130)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:562)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:154)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:178)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:90)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:81)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:57)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:235)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
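From what I have read, "Pre-authentication information was invalid" usually means the keytab no longer matches the KDC (for example the key version number changed after a password reset). A way to test the keytab directly from the edge node, independent of Oozie (the principal and file name are the ones from my error above):

# List the entries and key version numbers (KVNOs) stored in the keytab
klist -kt icon0104.keytab

# Try a manual login with the keytab; if this fails with the same
# pre-authentication error, the keytab itself is stale or wrong
kinit -kt icon0104.keytab icon0104@AZCLOUD.LOCAL
klist

If kinit fails here with the same error, the keytab presumably needs to be regenerated.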
I am also a bit confused by this whole procedure: shouldn't Oozie be able to obtain a Kerberos delegation token on behalf of my user, without me having to provide a keytab file?
I have also tried (again) to use hcat credentials, specifying the configuration you suggested in the other post, with the following workflow:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="oozie_spark_wf"> <credentials> <credential name="hive2_credentials" type="hive2"> <property> <name>hive2.jdbc.url</name> <value>jdbc:hive2://trmas-fc2d552a.azcloud.local:10000/default;ssl=true</value> </property> <property> <name>hive2.server.principal</name> <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value> </property> </credential> <credential name="hcat_cred" type="hcat"> <property> <name>hcat.metastore.uri</name> <value>thrift://trmas-fc2d552a.azcloud.local:9083</value> </property> <property> <name>hcat.metastore.principal</name> <value>hive/trmas-fc2d552a.azcloud.local@AZCLOUD.LOCAL</value> </property> </credential> </credentials> <start to="spark_action"/> <action cred="hcat_cred" name="spark_action"> <spark xmlns="uri:oozie:spark-action:0.2"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <prepare> <delete path="hdfs://trmas-6b8bc78c.azcloud.local:8020/user/icon0104/spark_hive_output"/> </prepare> <configuration> <property> <name>spark.yarn.security.tokens.hive.enabled</name> <value>false</value> </property> </configuration> <master>yarn-cluster</master> <name>OozieSparkAction</name> <class>my.Main</class> <jar>/home/icon0104/oozie/ooziespark/lib/ooziespark-1.0.jar</jar> <spark-opts>--files ${nameNode}/user/icon0104/oozie/ooziespark/hive-site.xmlL</spark-opts> </spark> <ok to="END_NODE"/> <error to="KILL_NODE"/> </action> <kill name="KILL_NODE"> <message>${wf:errorMessage(wf:lastErrorNode())}</message> </kill> <end name="END_NODE"/> </workflow-app>
But the Spark action goes into the START_RETRY state with the same error:
JA009: org.apache.hive.hcatalog.common.HCatException : 9001 : Exception occurred while processing HCat request : TException while getting delegation token.. Cause : org.apache.thrift.transport.TTransportException
Thanks for the support!
Created 04-17-2018 04:23 AM
In Cloudera Manager I went to the Oozie server instance and checked the logs from there, but there is nothing useful in stdout or stderr. Are these the logs you were talking about?
Also, I am not sure where I can find the HMS (Hive Metastore) logs; can you provide some details?
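From what I have seen on other CDH clusters, the role logs usually live under /var/log on each role's host, something like the following (I can't verify these exact paths on this cluster, so treat them as assumptions):

# Oozie server log, on the Oozie server host (typical CDH default)
ls /var/log/oozie/

# Hive Metastore log, on the HMS host (typical CDH default; the exact
# file name includes the host and role name on CM-managed clusters)
ls /var/log/hive/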
Created 04-17-2018 06:37 AM
Thank you. Unfortunately I only have access to the edge node (I can't SSH to the masters and workers). I do have access to the web interfaces though (CM, Hue, YARN, etc.), so if there is anything I can check from there, let me know.
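One thing I can still run from the edge node is the YARN log-aggregation CLI, once I grab the application ID of the Oozie launcher from the YARN web UI (the ID below is just a placeholder):

# Fetch the aggregated container logs of the launcher / Spark application;
# substitute the real application ID taken from the YARN web UI
yarn logs -applicationId application_1523000000000_0001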