
How to pass --files to spark action with oozie

Rising Star

Hi,

I created a Spark application that is configured through several properties files, which I specify at runtime with the --files option of the spark-submit command.

These local files are automatically copied to spark containers so that my job running in executors can read them to adjust its behavior.

Great, this works like a charm.

Now I want to schedule this spark-submit action every hour with Oozie, but I couldn't find how to pass these configuration files to my Spark job through Oozie. I guess I have to copy the files to HDFS, then have Oozie launch the Spark action and pass it those HDFS files, but I cannot figure out how to achieve this.

Does anyone have a clue about this?

Thanks a lot for your help,
Sebastien

2 REPLIES

New Contributor

Hi @Sebastien Chausson,

I'm facing the same issue. Did you find a way to pass files to an Oozie Spark action?

Thanks

Contributor

@Sebastien Chausson

You can do this by adding --files in the spark-opts tag of your spark action.

<spark-opts>--executor-memory 20G --num-executors 50 --files hdfs://(complete hdfs path)</spark-opts>
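For context, here is a rough sketch of where that tag could sit in a complete workflow.xml. It assumes the properties files were first copied to HDFS. Everything not in the line above is a placeholder I made up for illustration: the workflow and action names, the main class, the jar location, the /apps/myjob/conf paths, and the ${jobTracker} / ${nameNode} parameters (which would come from your job.properties). Adjust the schema versions to whatever your Oozie release supports.

```xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="spark-files-wf">
    <start to="spark-node"/>

    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>cluster</mode>
            <name>MySparkJob</name>
            <!-- Hypothetical main class and jar location -->
            <class>com.example.MySparkJob</class>
            <jar>${nameNode}/apps/myjob/lib/my-spark-job.jar</jar>
            <!-- The files option takes a comma-separated list of paths;
                 the two HDFS paths below are placeholders -->
            <spark-opts>--executor-memory 20G --num-executors 50 --files ${nameNode}/apps/myjob/conf/app.properties,${nameNode}/apps/myjob/conf/db.properties</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Spark action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

With this layout the job should be able to open the files by their base names (app.properties, db.properties) in the container working directory, the same way it does when launched with spark-submit from the command line.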

Alternatively, you could use a shell action and run your spark-submit command directly from it.
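A minimal sketch of that shell-action variant, under a few assumptions: submit.sh is a hypothetical wrapper script that contains your usual spark-submit command with its --files option, it has been uploaded to HDFS together with the properties files, and spark-submit is installed on whichever cluster node ends up running the action. The paths and the ${jobTracker} / ${nameNode} parameters are placeholders.

```xml
<action name="spark-submit-shell">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Hypothetical wrapper script that runs spark-submit -->
        <exec>submit.sh</exec>
        <!-- Each file element copies an HDFS file into the working directory
             of the action; the part after # is the local name -->
        <file>${nameNode}/apps/myjob/bin/submit.sh#submit.sh</file>
        <file>${nameNode}/apps/myjob/conf/app.properties#app.properties</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Because the properties file lands next to the script, the script can reference it with a relative path in its own --files argument. The action assumes the surrounding workflow defines the end and fail nodes it transitions to.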