
How to pass --files to spark action with oozie

Hi,

I created a Spark application that is configured through a set of properties files, which I specify at runtime with the --files option of the spark-submit command.

These local files are automatically copied to spark containers so that my job running in executors can read them to adjust its behavior.

Great, this works like a charm.

Now I want to schedule this spark-submit action every hour with Oozie, but I couldn't find how to pass these configuration files to my Spark job through Oozie.

I guess I have to copy the files to HDFS, then ask Oozie to launch the Spark action and pass it those HDFS files, but I cannot figure out how to achieve this.

Does anyone have a clue about this?

Thanks a lot for your help,
Sebastien

2 REPLIES

Re: How to pass --files to spark action with oozie

Hi @Sebastien Chausson,

I'm facing the same issue. Did you find a way to pass files to an Oozie Spark action?

Thanks

Re: How to pass --files to spark action with oozie

@Sebastien Chausson

You can do this by adding --files to the spark-opts tag of your Spark action:

<spark-opts>--executor-memory 20G --num-executors 50 --files hdfs://(complete hdfs path)</spark-opts>
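For context, here is a rough sketch of where that tag sits inside a complete Spark action in workflow.xml. The action name, application name, class, jar location, and the ${jobTracker} / ${nameNode} / ${appDir} properties are all hypothetical placeholders, and the HDFS path is left elided as above, so adapt them to your own job.

```xml
<!-- Sketch only: names, class, and paths below are placeholders, not a tested setup -->
<action name="spark-with-files">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>MySparkJob</name>
        <class>com.example.MyApp</class>
        <jar>${appDir}/lib/my-app.jar</jar>
        <!-- The properties files must already be uploaded to HDFS -->
        <spark-opts>--executor-memory 20G --num-executors 50 --files hdfs://(complete hdfs path)</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Several files can be passed as a comma-separated list after --files, the same way as on the spark-submit command line.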

As an alternative, you could use a shell action and run your spark-submit command directly from it.
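A minimal sketch of that shell-action variant could look like the following. It assumes a hypothetical wrapper script run_spark.sh that contains your spark-submit command (including --files app.properties), and a hypothetical ${appDir} property pointing at the workflow directory in HDFS. The file elements ask Oozie to ship the script and the properties file into the action's working directory, so the script can refer to them by their short names.

```xml
<!-- Sketch only: run_spark.sh, app.properties, and ${appDir} are hypothetical -->
<action name="spark-submit-shell">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Script that runs spark-submit with the --files option -->
        <exec>run_spark.sh</exec>
        <file>${appDir}/run_spark.sh#run_spark.sh</file>
        <file>${appDir}/conf/app.properties#app.properties</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Note that with this approach the node running the shell action needs a working Spark client, which may not be the case on every cluster.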