01-29-2016 06:41 PM
I am running a workflow of Spark actions from Oozie, but it fails with "com/fasterxml/jackson/module/scala/DefaultScalaModule".
Here is the log:
Using properties file: null
Parsed arguments:
  master                  local[4]
  deployMode              client
  executorMemory          null
  executorCores           null
  totalExecutorCores      null
  propertiesFile          null
  driverMemory            null
  driverCores             null
  driverExtraClassPath    /opt/cloudera/parcels/CDH/jars/guava-16.0.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/jets3t-0.9.0.jar
  driverExtraLibraryPath  null
  driverExtraJavaOptions  null
  supervise               false
  queue                   null
  numExecutors            null
  files                   null
  pyFiles                 null
  archives                null
  mainClass               com.gridx.spark.MeterReadingLoader
  primaryResource         file:/user/admin/wei-workspaces/lib/spark-all.jar
  name                    MeterReadingLoader
  childArgs               [-i 1]
  jars                    [a lot of jars here]
  packages                null
  packagesExclusions      null
  repositories            null
  verbose                 true
Spark properties used, including those specified through --conf and those from the properties file null:
  spark.executor.extraClassPath -> /opt/cloudera/parcels/CDH/jars/guava-16.0.1.jar:/opt/cloudera/parcels/CDH/jars/jackson-databind-2.3.1.jar
Main class: com.gridx.spark.MeterReadingLoader
Arguments: -i 1
System properties:
SPARK_SUBMIT -> true
spark.app.name -> MeterReadingLoader
spark.jars -> [a lot of jars here]
spark.submit.deployMode -> client
spark.executor.extraClassPath -> /opt/cloudera/parcels/CDH/jars/guava-16.0.1.jar:/opt/cloudera/parcels/CDH/jars/jackson-databind-2.3.1.jar
spark.master -> local[4]
spark.driver.extraClassPath -> /opt/cloudera/parcels/CDH/jars/guava-16.0.1.jar:/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars/jets3t-0.9.0.jar
Classpath elements:
file:/user/admin/wei-workspaces/lib/spark-all.jar
file:/opt/cloudera/parcels/CDH/jars/guava-16.0.1.jar
... [a lot of jars here]
Warning: Local jar /user/admin/wei-workspaces/lib/spark-all.jar does not exist, skipping.
... [skipping a lot of jars]
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, com/fasterxml/jackson/module/scala/DefaultScalaModule$
java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/DefaultScalaModule$
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
    at org.apache.spark.SparkContext.textFile(SparkContext.scala:825)
    at com.gridx.spark.MeterReadingLoader$.load(MeterReadingLoader.scala:110)
    at com.gridx.spark.MeterReadingLoader$.main(MeterReadingLoader.scala:98)
    at com.gridx.spark.MeterReadingLoader.main(MeterReadingLoader.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:185)
    at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:176)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
    at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:46)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:378)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:296)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
    at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.module.scala.DefaultScalaModule$
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 35 more
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
There are so many additional jars in the --jars option. Why does Oozie add these jars? I wonder whether this causes the problem.
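For what it's worth, a common way to rule out the missing class itself is to ship jackson-module-scala with the workflow, either by dropping the jar into the workflow's lib/ directory on HDFS or by pointing --jars at it in the Spark action. A minimal sketch, assuming the jar is uploaded next to spark-all.jar (the jar file name and version below are placeholders; use the build matching the Scala and Jackson versions already on the classpath):

    <spark xmlns="uri:oozie:spark-action:0.1">
        ...
        <!-- Placeholder jar path/version: ships the Jackson Scala module so the
             driver can resolve com.fasterxml.jackson.module.scala.DefaultScalaModule -->
        <spark-opts>--jars hdfs:///user/admin/wei-workspaces/lib/jackson-module-scala_2.10-2.3.1.jar</spark-opts>
        ...
    </spark>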
09-08-2016 12:05 AM - edited 09-08-2016 02:54 AM
Hello Harsh,
I am using CDH 5.7.0 with Kerberos authentication. While trying to execute a Spark Oozie action, I get the following authentication error when it tries to connect to the metastore:
2016-09-07 23:04:37,712 INFO [Thread-9] client.ClientWrapper (Logging.scala:logInfo(58)) - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.7.0
2016-09-07 23:04:37,928 INFO [Thread-9] hive.metastore (HiveMetaStoreClient.java:open(376)) - Trying to connect to metastore with URI thrift://<MASKED>:9083
2016-09-07 23:04:37,954 ERROR [Thread-9] transport.TSaslTransport (TSaslTransport.java:open(315)) - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    ...
    at java.lang.Thread.run(Thread.java:745)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    ...
    ... 49 more
2016-09-07 23:04:37,957 WARN [Thread-9] hive.metastore (HiveMetaStoreClient.java:open(429)) - Failed to connect to the MetaStore Server...
2016-09-07 23:04:37,957 INFO [Thread-9] hive.metastore (HiveMetaStoreClient.java:open(460)) - Waiting 1 seconds before next connection attempt.
2016-09-07 23:04:38,958 INFO [Thread-9] hive.metastore (HiveMetaStoreClient.java:open(376)) - Trying to connect to metastore with URI thrift://<MASKED>:9083
2016-09-07 23:04:38,959 ERROR [Thread-9] transport.TSaslTransport (TSaslTransport.java:open(315)) - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
In the same Oozie workflow, another shell action works fine. Also, if I do not generate a Kerberos ticket before submitting the Oozie workflow, the workflow does not start at all, which suggests Oozie is picking up the Kerberos authentication but is not able to pass it on to Spark. Here are the code and more logs:
workflow.xml (there might be some extra parameters):
<workflow-app name="test_Wf" xmlns="uri:oozie:workflow:0.5"> <global> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <configuration> <property> <name>mapreduce.job.queuename</name> <value>${queueName}</value> </property> <property> <name>oozie.launcher.yarn.app.mapreduce.am.env</name> <value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark</value> </property> </configuration> </global> <start to="spark-node"/> <action name='spark-node'> <spark xmlns="uri:oozie:spark-action:0.1"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <job-xml>/user/TESTUSER/hive-site.xml</job-xml> <configuration> <property> <name>mapreduce.job.queuename</name> <value>root.dm.test</value> </property> <property> <name>oozie.launcher.yarn.app.mapreduce.am.env</name> <value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark</value> </property> <property> <name>conf</name> <value>PYSPARK_PYTHON=/common/python/bin/python2.7</value> </property> <property> <name>conf</name> <value>PYSPARK_DRIVER_PYTHON=/common/python/bin/python2.7</value> </property> </configuration> <master>yarn-cluster</master> <mode>cluster</mode> <name>TestSparkNew</name> <class>clear</class> <jar>hdfs://<nameNode>:8020/user/TESTUSER/oozie/test.py</jar> <spark-opts>--queue root.dm.test --files /etc/hive/conf/hive-site.xml --conf spark.yarn.security.tokens.hive.enabled=false --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/dhcommon/dhpython/python/bin/python2.7 --conf spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=/dhcommon/dhpython/python/bin/python2.7 --conf spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p1464.1349/lib/hadoop/lib/native:/opt/cloudera/parcels/GPLEXTRAS-5.7.0-1.cdh5.7.0.p0.40/lib/hadoop/lib/native --conf spark.executorEnv.PYTHONPATH=/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip:/common/python/bin/python2.7</spark-opts> <arg></arg> </spark> <ok to="End" /> <error to="Kill" /> </action> <kill name="Kill"> <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> </kill> <end name="End"/> </workflow-app>
test.py:
#! /usr/bin/env /common/python/bin/python2.7
import os, sys, traceback
from pyspark import SparkContext
from pyspark.sql import HiveContext
from pyspark.sql.types import *

print 'inside file'
# BREAKS HERE ...
sc = SparkContext()
sqlContext = HiveContext(sc)
# working
# sc = SparkContext("local", "SimpleApp")
# ...
job_properties.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration> <property><name>nameNode</name><value>MASKED</value></property> <property><name>oozie.wf.application.path</name><value>${nameNode}/user/TESTUSER/oozie</value></property> <property><name>oozie.use.system.libpath</name><value>true</value></property> <property><name>queueName</name><value>root.dm.test</value></property> <property><name>jobTracker</name><value>MASKED:8032</value></property> </configuration>
When it executes, here is the stdout log of the Spark driver (most of it, minus the complete jar lists etc.; I can provide them if required):
Log Upload Time: Wed Sep 07 23:05:09 -0400 2016
Log Length: 301807
2016-09-07 23:04:25,398 INFO [main] yarn.ApplicationMaster (SignalLogger.scala:register(47)) - Registered signal handlers for [TERM, HUP, INT]
2016-09-07 23:04:26,080 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - ApplicationAttemptId: appattempt_1473078373726_0204_000001
2016-09-07 23:04:26,686 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: TESTUSER
2016-09-07 23:04:26,686 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: TESTUSER
2016-09-07 23:04:26,687 INFO [main] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(TESTUSER); users with modify permissions: Set(TESTUSER)
2016-09-07 23:04:26,706 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Starting the user application in a separate Thread
2016-09-07 23:04:26,718 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization
2016-09-07 23:04:26,719 INFO [main] yarn.ApplicationMaster (Logging.scala:logInfo(58)) - Waiting for spark context initialization ...
inside file
2016-09-07 23:04:29,222 INFO [Thread-9] spark.SparkContext (Logging.scala:logInfo(58)) - Running Spark version 1.6.0
2016-09-07 23:04:29,243 INFO [Thread-9] spark.SparkContext (Logging.scala:logInfo(58)) - Spark configuration:
spark.app.name=SimpleApp
spark.authenticate=false
spark.driver.cores=2
spark.driver.extraClassPath=RoaringBitmap-0.5.11.jar:ST4-4.0.4.jar:activation-1.1.jar .....
spark.driver.extraLibraryPath=/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p1464.1349/lib/hadoop/lib/native:/opt/cloudera/parcels/GPLEXTRAS-5.7.0-1.cdh5.7.0.p0.40/lib/hadoop/lib/native
spark.driver.memory=2g
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.executorIdleTimeout=60
spark.dynamicAllocation.initialExecutors=1
spark.dynamicAllocation.maxExecutors=36
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.schedulerBacklogTimeout=1
spark.eventLog.dir=hdfs://MASKED/user/spark/applicationHistory
spark.eventLog.enabled=true
spark.executor.cores=5
spark.executor.extraClassPath=RoaringBitmap-0.5.11.jar:ST4-4.0.4.jar:activation-1.1.jar ....
spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///etc/spark/conf/log4j.properties
spark.executor.extraLibraryPath=/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p1464.1349/lib/hadoop/lib/native:/opt/cloudera/parcels/GPLEXTRAS-5.7.0-1.cdh5.7.0.p0.40/lib/hadoop/lib/native
spark.executor.memory=15g
spark.executorEnv.PYTHONPATH={{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.9-src.zip
spark.hadoop.cloneConf=true
spark.kryoserializer.buffer.max=1g
spark.logConf=true
spark.master=yarn-cluster
spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_HOSTS=host03.company.co.uk,host04.company.co.uk
spark.org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.param.PROXY_URI_BASES=http://host03.company.co.uk:8088/proxy/application_1473078373726_0204,http://host04.company.co.uk:8088/proxy/application_1473078373726_0204
spark.python.worker.memory=2g
spark.rdd.compress=true
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.serializer.objectStreamReset=100
spark.shuffle.service.enabled=true
spark.shuffle.service.port=7337
spark.submit.deployMode=cluster
spark.ui.filters=org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
spark.ui.port=0
spark.yarn.am.extraLibraryPath=/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p1464.1349/lib/hadoop/lib/native:/opt/cloudera/parcels/GPLEXTRAS-5.7.0-1.cdh5.7.0.p0.40/lib/hadoop/lib/native
spark.yarn.app.attemptId=1
spark.yarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1473078373726_0204/container_e31_1473078373726_0204_01_000001
spark.yarn.app.id=application_1473078373726_0204
spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=/dhcommon/dhpython/python/bin/python2.7
spark.yarn.appMasterEnv.PYSPARK_PYTHON=/dhcommon/dhpython/python/bin/python2.7
spark.yarn.config.gatewayPath=/opt/cloudera/parcels
spark.yarn.config.replacementPath={{HADOOP_COMMON_HOME}}/../../..
spark.yarn.historyServer.address=http://host03.company.co.uk:18088
spark.yarn.isPython=true
spark.yarn.jar=local:/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p1464.1349/lib/spark/lib/spark-assembly.jar
spark.yarn.localizeConfig=false
spark.yarn.secondary.jars=RoaringBitmap-0.5.11.jar,ST4-4.0.4.jar,activation-1.1.jar ...
spark.yarn.security.tokens.hive.enabled=false
2016-09-07 23:04:29,256 INFO [Thread-9] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing view acls to: TESTUSER
2016-09-07 23:04:29,256 INFO [Thread-9] spark.SecurityManager (Logging.scala:logInfo(58)) - Changing modify acls to: TESTUSER
2016-09-07 23:04:29,256 INFO [Thread-9] spark.SecurityManager (Logging.scala:logInfo(58)) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(TESTUSER); users with modify permissions: Set(TESTUSER)
2016-09-07 23:04:29,402 INFO [Thread-9] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriver' on port 15030.
2016-09-07 23:04:29,690 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-3] slf4j.Slf4jLogger (Slf4jLogger.scala:applyOrElse(80)) - Slf4jLogger started
2016-09-07 23:04:29,720 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Starting remoting
2016-09-07 23:04:29,823 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-2] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.8.6:25240]
2016-09-07 23:04:29,824 INFO [sparkDriverActorSystem-akka.actor.default-dispatcher-3] Remoting (Slf4jLogger.scala:apply$mcV$sp(74)) - Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem@192.168.8.6:25240]
2016-09-07 23:04:29,829 INFO [Thread-9] util.Utils (Logging.scala:logInfo(58)) - Successfully started service 'sparkDriverActorSystem' on port 25240.
2016-09-07 23:04:29,848 INFO [Thread-9] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering MapOutputTracker
2016-09-07 23:04:29,863 INFO [Thread-9] spark.SparkEnv (Logging.scala:logInfo(58)) - Registering BlockManagerMaster
2016-09-07 23:04:29,875 INFO [Thread-9] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /u12/hadoop/yarn/nm/usercache/TESTUSER/appcache/application_1473078373726_0204/blockmgr-8a4bc500-5821-4ef9-91b1-b2cf86796217
2016-09-07 23:04:29,875 INFO [Thread-9] storage.DiskBlockManager (Logging.scala:logInfo(58)) - Created local directory at /u11/hadoop/yarn/nm/usercache/TESTUSER/appcache/application_1473078373726_0204/blockmgr-66a45cd3-382a-42ac-ba1a-1173b0c4b1bd
.....
2016-09-07 23:04:30,996 INFO [ContainerLauncher-0] impl.ContainerManagementProtocolProxy (ContainerManagementProtocolProxy.java:newProxy(260)) - Opening proxy : host07.company.co.uk:8041
2016-09-07 23:04:36,713 INFO [dispatcher-event-loop-54] cluster.YarnClusterSchedulerBackend (Logging.scala:logInfo(58)) - Registered executor NettyRpcEndpointRef(null) (host07.company.co.uk:10352) with ID 1
2016-09-07 23:04:36,732 INFO [SparkListenerBus] spark.ExecutorAllocationManager (Logging.scala:logInfo(58)) - New executor 1 has registered (new total is 1)
2016-09-07 23:04:36,748 INFO [dispatcher-event-loop-57] storage.BlockManagerMasterEndpoint (Logging.scala:logInfo(58)) - Registering block manager host07.company.co.uk:21215 with 7.8 GB RAM, BlockManagerId(1, host07.company.co.uk, 21215)
2016-09-07 23:04:36,754 INFO [Thread-9] cluster.YarnClusterSchedulerBackend (Logging.scala:logInfo(58)) - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2016-09-07 23:04:36,755 INFO [Thread-9] cluster.YarnClusterScheduler (Logging.scala:logInfo(58)) - YarnClusterScheduler.postStartHook done
2016-09-07 23:04:37,558 INFO [Thread-9] hive.HiveContext (Logging.scala:logInfo(58)) - Initializing execution hive, version 1.1.0
2016-09-07 23:04:37,711 INFO [Thread-9] client.ClientWrapper (Logging.scala:logInfo(58)) - Inspected Hadoop version: 2.6.0-cdh5.7.0
2016-09-07 23:04:37,712 INFO [Thread-9] client.ClientWrapper (Logging.scala:logInfo(58)) - Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.7.0
2016-09-07 23:04:37,928 INFO [Thread-9] hive.metastore (HiveMetaStoreClient.java:open(376)) - Trying to connect to metastore with URI thrift://host04.company.co.uk:9083
2016-09-07 23:04:37,954 ERROR [Thread-9] transport.TSaslTransport (TSaslTransport.java:open(315)) - SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
One more observation: if I do a spark-submit without Oozie, I can see "spark.yarn.keytab" and "spark.yarn.principal" in Spark UI > Environment > Spark Properties; however, these are missing in the Oozie case, so that might be the core issue.
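For reference, in the plain spark-submit case it is the --principal/--keytab flags that populate spark.yarn.principal and spark.yarn.keytab on YARN; roughly something like this (principal and keytab path are placeholders):

    spark-submit \
        --master yarn-cluster \
        --principal TESTUSER@EXAMPLE.COM \
        --keytab /path/to/testuser.keytab \
        test.py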
Could you please advise what the issue could be? Thanks a lot!
09-09-2016 01:42 AM
I resolved this error by adding hcat "credentials"; here is the final workflow.xml:
<workflow-app name="TestWf" xmlns="uri:oozie:workflow:0.5"> <credentials> <credential name="hcat" type="hcat"> <property> <name>hcat.metastore.uri</name> <value>thrift://<HOST>:9083</value> </property> <property> <name>hcat.metastore.principal</name> <value>hive/<HOST>@<DOMAIN></value> </property> </credential> </credentials> <start to="spark-3b4d"/> <kill name="Kill"> <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> </kill> <action name="spark-3b4d" cred="hcat"> <spark xmlns="uri:oozie:spark-action:0.1"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <job-xml>/user/TESTUSER/hive-site.xml</job-xml> <configuration> <property> <name>mapreduce.job.queuename</name> <value>root.testqueue</value> </property> <property> <name>oozie.use.system.libpath</name> <value>true</value> </property> <property> <name>oozie.launcher.yarn.app.mapreduce.am.env</name> <value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark</value> </property> </configuration> <master>yarn-cluster</master> <mode>cluster</mode> <name>MySpark</name> <class>clear</class> <jar>hdfs://HOST:8020/user/TESTUSER/oozie/test.py</jar> <spark-opts>--queue root.dm.office_depot --files=/etc/hive/conf/hive-site.xml --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/common/python/python2.7 --conf spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON=/common/python/python2.7</spark-opts> </spark> <ok to="End"/> <error to="Kill"/> </action> <end name="End"/> </workflow-app>
05-25-2018 09:25 AM
Hi,
can you please help me find the right hcat credentials?
I can't find the correct ones; I keep running into the same issue.
Thank you
Regards
Armando
05-31-2018 01:47 PM
<credential name='my-hcat-creds' type='hcat'>
<property>
<name>hcat.metastore.uri</name>
<value>${hcat_uri}</value>
</property>
<property>
<name>hcat.metastore.principal</name>
<value>${hcat_principal}</value>
</property>
</credential>
Here, hcat_uri should be thrift://<replace with host name of metastore>:9083,
and hcat_principal should be hive/_HOST@<replace with domain>.COM.
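For example, in the job.properties these two parameters might look like the following (host name and realm are placeholders for illustration):

    hcat_uri=thrift://metastore-host.example.com:9083
    hcat_principal=hive/_HOST@EXAMPLE.COM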
06-21-2018 01:30 PM
Hi mukeshchandra,
thank you.
We solved the issue by adding the oozie user in the HCat metastore.
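If it helps others: as far as I understand it, "adding the oozie user" boils down to the Hadoop proxyuser configuration on the metastore side, so that Oozie may impersonate the submitting user when it requests the HCat delegation token. A rough sketch of the core-site.xml properties used by the Hive Metastore (the wildcard values are illustrative only; tighten them per your security policy):

    <property>
        <name>hadoop.proxyuser.oozie.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.oozie.groups</name>
        <value>*</value>
    </property>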
Regards
Armando