Oozie spark action vs spark-submit (command line)

An Oozie Spark action workflow takes longer to finish than spark-submit:

I'm trying to run a Spark SQL Scala jar. When I execute the jar with spark-submit it takes 2 min, but when I execute the same jar through an Oozie Spark action the job runs for 12 min.

I'm running the jar with the same number of executors, executor memory, driver memory, driver cores, and YARN mode in both spark-submit and the Oozie Spark action.

I even tried different memory settings for the Oozie Spark action (OOZIE_MAP_REDUCE_JAVA_OPTS), but nothing worked.

Do we need to give Oozie more memory when running an Oozie Spark action job, or are there any Oozie-specific Spark settings we should set?

Thanks for the help!!

Re: Oozie spark action vs spark-submit (command line)

@Bmwer

The Oozie Spark action will use the same resources as the spark-submit command. Additionally, Oozie runs a launcher job that internally submits the Spark job, so the action's wall-clock time also includes scheduling and running that launcher. You may want to compare both runs, see where exactly the job is spending its time, and try to mitigate that.
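
As a rough sketch of that comparison, you could note four timestamps for the Oozie run (launcher start, Spark application start, Spark application end, launcher end) and split the total into phases. The function name `breakdown` and the timestamps below are made up for illustration; the real values would have to be read from your own cluster, for example from the YARN ResourceManager UI.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def breakdown(launcher_start, spark_start, spark_end, launcher_end):
    """Split an Oozie Spark action's wall-clock time into three phases (seconds).

    Hypothetical helper: all four timestamps are assumed to be copied by hand
    from the cluster's job history, in the FMT format above.
    """
    t = [datetime.strptime(s, FMT)
         for s in (launcher_start, spark_start, spark_end, launcher_end)]
    return {
        # Time the launcher spent before the Spark application started
        "before_spark_app": (t[1] - t[0]).total_seconds(),
        # Time the Spark application itself was running
        "spark_app": (t[2] - t[1]).total_seconds(),
        # Time between the Spark application ending and the launcher ending
        "after_spark_app": (t[3] - t[2]).total_seconds(),
    }

# Illustrative, made-up timestamps (not taken from the run described above)
example = breakdown(
    "2020-01-01 10:00:00",
    "2020-01-01 10:03:00",
    "2020-01-01 10:05:00",
    "2020-01-01 10:05:30",
)
for phase, seconds in example.items():
    print(f"{phase}: {seconds:.0f} s")
```

If the `spark_app` phase is close to the spark-submit run time, the extra minutes are being spent around the Spark job (for example waiting for containers) rather than inside it; if `spark_app` itself is much longer, the Spark configuration actually applied under Oozie is worth comparing against the command-line run.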
