How to pass --files to spark action with oozie
- Labels: Apache Hadoop, Apache Oozie, Apache Spark
Created ‎04-13-2017 03:38 PM
Hi,
I created a Spark application that is configured through a set of properties files, which I specify at runtime with the --files option of the spark-submit command.
These local files are automatically copied to the Spark containers, so the job running in the executors can read them to adjust its behavior.
Great, this works like a charm.
Now I want to schedule this spark-submit action every hour with Oozie, but I couldn't find how to pass these configuration files to my Spark job through Oozie. I guess I have to copy the files to HDFS and have Oozie launch the Spark action with those HDFS files, but I cannot figure out how to achieve this.
Does anyone have a clue about this?
Thanks a lot for your help,
Sebastien
Created ‎06-20-2017 02:58 PM
I'm facing the same issue. Did you find a way to pass files to an Oozie Spark action?
Thanks
Created ‎06-29-2017 09:45 AM
You can do this by adding --files in the spark-opts tag of your spark action.
<spark-opts>--executor-memory 20G --num-executors 50 --files hdfs://(complete hdfs path)</spark-opts>
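For context, here is a minimal sketch of how that spark-opts line might sit inside a complete workflow.xml. The workflow name, class name, jar location, and properties file path are all hypothetical placeholders, and the schema versions shown are assumptions that may need adjusting for your Oozie release.

```xml
<workflow-app name="spark-files-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-node"/>

    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>cluster</mode>
            <!-- Hypothetical application name, main class, and jar location -->
            <name>MySparkJob</name>
            <class>com.example.MySparkApp</class>
            <jar>${nameNode}/apps/myapp/lib/myapp.jar</jar>
            <!-- The files option points at a copy of the properties file
                 already uploaded to HDFS (hypothetical path) -->
            <spark-opts>--files ${nameNode}/apps/myapp/conf/app.properties</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Spark action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The properties files have to be uploaded to HDFS beforehand, since the Oozie launcher does not run on the machine where your local copies live. Several files can be given as a comma-separated list, as with spark-submit.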
As an alternative, you could use a shell action and run your spark-submit command directly from it.
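A rough sketch of the shell action variant, assuming a wrapper script (here hypothetically named run_spark.sh) that contains your spark-submit command, and assuming spark-submit is available on the cluster nodes where the action runs. All paths and names are placeholders.

```xml
<action name="spark-shell-node">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Hypothetical wrapper script that calls spark-submit -->
        <exec>run_spark.sh</exec>
        <!-- Ship the script and the properties file from HDFS into the
             working directory of the action (hypothetical paths) -->
        <file>${nameNode}/apps/myapp/run_spark.sh#run_spark.sh</file>
        <file>${nameNode}/apps/myapp/conf/app.properties#app.properties</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

With this layout the script can refer to app.properties by its bare name when passing it to the --files option, because Oozie places it in the action's working directory.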
