
Run Oozie Shell Action (spark2-submit) cannot access hive warehouse

New Contributor

We are trying to run an Oozie shell action in a Kerberos environment (Cloudera Enterprise 5.16.2) that calls spark2-submit. In the Spark job we use HiveContext(sc) to run "show databases", and the result lists only the default database.

Result

moth_1-1646808521914.png

 

Here is my script:

job.properties

nameNode=hdfs://namenode:8020
jobTracker=yarnRM
queueName=jobqueuename
concurrency_level=1
execution_order=FIFO
oozie.coord.application.path=${nameNode}/workflow/test/
workflowAppPath=${oozie.coord.application.path}
oozie.use.system.libpath=True
 
workflow.xml
<workflow-app name="spark-test" xmlns="uri:oozie:workflow:0.5">
<start to="spark-node"/>
<kill name="Kill"/>
<action name="spark-node">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>mapreduce.job.queuename</name>
<value>nrtcdr.imeiimsi</value>
</property>
</configuration>
<exec>sparksqltest.sh</exec>
<argument>${wf:user()}</argument>
<argument>${NOMINAL_TIME}</argument>
<argument>principal</argument>
<argument>keytab.keytab</argument>
<argument>${JOB_QUEUE}</argument>
<file>/scripts/sparksqltest.sh#sparksqltest.sh</file>
<file>/configs/keytab.keytab#keytab.keytab</file>
<file>/scripts/SparkSQLTest.jar#SparkSQLTest.jar</file>
<capture-output/>
</shell>
<ok to="End"/>
<error to="Kill"/>
</action>
<end name="End"/>
</workflow-app>
 
sparksqltest.sh
#!/bin/bash
WF_ID="$1"
NOMINAL_TIME="$2"
PRINCIPAL="${3}"
KEYTAB="${4}"
# Fifth <argument> from workflow.xml; used by --queue below.
JOB_QUEUE="${5}"
VALIDATE_JAR="`ls -1 SparkSQLTest.jar`"
 
export LIBHDFS_OPTS="-Dlog4j.configuration=./log4j.properties"
kinit -k -t "${KEYTAB}" "${PRINCIPAL}"
spark2-submit \
--name SparkSQL-test \
--master yarn --deploy-mode client \
--principal "${PRINCIPAL}" --keytab "${KEYTAB}" \
--queue "${JOB_QUEUE}" \
--class SparkSQLTest \
"${VALIDATE_JAR}"
 
SparkSQLTest.scala
val sc = new SparkContext(new SparkConf().setAppName("spark-test"))
var conf: SparkConf = new SparkConf()
val spark = SparkSession.builder.config(conf).enableHiveSupport().getOrCreate()
spark.sql("show databases").show()
val sqlContext = new HiveContext(sc)
sqlContext.sql("show databases").show()
 
There is no error; the job simply lists only the default database. Please suggest what I am doing wrong.
moth_0-1646808491787.png

 

 

 

 


Cloudera Employee

Hello @moth 

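Can you please verify that a Hive gateway role (CM > Hive Service > Instances > Add role instance > Gateway) and a Spark2 gateway role (CM > Spark2 Service > Instances > Add role instance > Gateway) are installed on every NodeManager host? Oozie shell actions run as YARN jobs, so spark2-submit may run on any NodeManager host in the cluster, and the corresponding gateways need to be installed on each of them. The gateway roles deploy the client configuration (including hive-site.xml) to the host; without it, Spark typically cannot locate the Hive metastore and falls back to a local one that contains only the default database.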
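
To see which host the action landed on and whether the client configuration is there, a small check could be added near the top of sparksqltest.sh. This is only a sketch: the function name check_client_conf is hypothetical, and the two paths are assumed to be the usual CDH client-configuration locations, so adjust them if your cluster deploys the files elsewhere.

```shell
#!/bin/bash
# Hypothetical helper: report whether a client configuration file is
# readable on the host where the Oozie shell action is running.
# Returns 0 if the file is readable, 1 otherwise.
check_client_conf() {
  local conf_file="$1"
  if [ -r "${conf_file}" ]; then
    echo "found: ${conf_file} on ${HOSTNAME}"
    return 0
  fi
  echo "MISSING: ${conf_file} on ${HOSTNAME}" >&2
  return 1
}

# Assumed default locations of the gateway-deployed client configs;
# override via the environment if they differ on your cluster.
HIVE_SITE="${HIVE_SITE:-/etc/hive/conf/hive-site.xml}"
SPARK2_CONF="${SPARK2_CONF:-/etc/spark2/conf/spark-defaults.conf}"

# Log the result without aborting the action.
check_client_conf "${HIVE_SITE}" || true
check_client_conf "${SPARK2_CONF}" || true
```

If the action's stderr log shows a MISSING line for hive-site.xml, that host most likely lacks the Hive gateway role.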

New Contributor

I have the same problem, any solution?