I have a pool of Pig scripts, and I keep adding new scripts to it.
I would like to schedule an Oozie job that executes the Pig scripts, and whenever I add a new script to the pool, the job should pick it up and execute only that new script.
Is this possible with Oozie?
If so, what approach should I take?
This is just a rough idea, so check whether it is feasible for your use case:
Oozie shell action (triggers a poller-style script that monitors a directory for new Pig scripts) >> a new Pig script becomes available >> the poller triggers that Pig script
In that case, though, the Oozie shell action only triggers the poller. Do you want each Pig script to be scheduled through Oozie itself? That part is not clear to me.
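To make the "poller" part concrete, here is a minimal sketch in Python of how new scripts could be detected. It assumes the pool is a local directory and that the names of scripts already handled are kept in a plain text file; `find_new_scripts`, `script_dir` and `seen_file` are names made up for this example. If your pool lives on HDFS, the directory listing would have to come from somewhere else (for example `hdfs dfs -ls`).

```python
import os


def find_new_scripts(script_dir, seen_file):
    """Return the .pig files in script_dir that are not yet listed in seen_file.

    Newly found names are appended to seen_file so that the next poll
    skips them. Both arguments are hypothetical local paths.
    """
    seen = set()
    if os.path.exists(seen_file):
        with open(seen_file) as f:
            seen = {line.strip() for line in f if line.strip()}

    # Only files ending in .pig count as scripts; sort for a stable order.
    current = sorted(n for n in os.listdir(script_dir) if n.endswith(".pig"))
    new = [n for n in current if n not in seen]

    if new:
        with open(seen_file, "a") as f:
            for name in new:
                f.write(name + "\n")
    return new
```

Each poll would call `find_new_scripts(...)` and trigger only the names it returns, so scripts that have already run are not executed again.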
Thanks for your input.
You can design the application like this (a high-level idea that you may adapt):
Poller (a shell script that monitors the directory where you place your Pig scripts) >> once the poller notices a new Pig script in the directory, it generates the Oozie files (for example job.properties, workflow.xml, and any additional inputs your script needs) >> finally, it calls the oozie command to run the workflow.xml that points to the newly added Pig script
(This approach will need some more tweaks if your Pig scripts' arguments are more dynamic in nature.)
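As a rough illustration of the last two steps (generate the Oozie files, then call the oozie command), here is a minimal Python sketch. It is only one way to do it and makes some assumptions: workflow.xml is written once and refers to the script through a property I have called `pigScript` (a made-up name), so only job.properties has to be generated per script; the host names and paths you pass in are your own. The `oozie job -oozie <url> -config <file> -run` form is the standard Oozie CLI call for submitting and starting a job.

```python
import subprocess


def write_job_properties(path, script_name, app_path, name_node, job_tracker):
    """Write a job.properties file for one Pig script."""
    props = {
        "nameNode": name_node,
        "jobTracker": job_tracker,
        "oozie.use.system.libpath": "true",
        "oozie.wf.application.path": app_path,
        # Hypothetical property: workflow.xml is assumed to use ${pigScript}
        # inside the <script> element of its Pig action.
        "pigScript": script_name,
    }
    with open(path, "w") as f:
        for key, value in props.items():
            f.write(f"{key}={value}\n")


def oozie_run_command(oozie_url, properties_path):
    """Build the CLI call that submits and starts the workflow."""
    return ["oozie", "job", "-oozie", oozie_url,
            "-config", properties_path, "-run"]


def submit(oozie_url, properties_path):
    """Run the oozie CLI; raises CalledProcessError if it exits non-zero."""
    subprocess.run(oozie_run_command(oozie_url, properties_path), check=True)
```

The poller would call `write_job_properties(...)` and then `submit(...)` once for each newly detected script. If the scripts take different arguments, those would need to be added as extra properties as well.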
Hope this helps.