Created 05-19-2016 08:46 AM
I have a Pig script that I want to run automatically every day or every week on a Hadoop cluster. How can I do this, and what is the best solution? Please help.
Created 05-19-2016 09:23 AM
You can achieve this in two ways.
1. Create a wrapper shell script that calls "pig <pig script path>". Then add a Unix cron entry that runs the wrapper on the schedule you need.
2. Use the Oozie scheduler. You can either define a Pig action in a workflow and trigger it with a recurring coordinator (see the links below), or create an Oozie shell action that calls the same wrapper shell script from point 1.
http://rogerhosto.com/apache-oozie-shell-script-example/
https://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html#a3.2.3_Pig_Action
http://blog.cloudera.com/blog/2013/01/how-to-schedule-recurring-hadoop-jobs-with-apache-oozie/
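For the first option, here is a rough sketch of what the wrapper and the cron entries could look like. The file name run_pig.sh, the paths, and the log directory are all made up for illustration, so adjust them to your cluster. It assumes the pig client is installed on the node where cron runs.

```shell
#!/bin/sh
# run_pig.sh: hypothetical wrapper around the pig command, meant to be called from cron.
#
# Example crontab entries (edit with `crontab -e`; all paths are placeholders):
#   every day at 02:00:
#     0 2 * * * /home/hadoop/bin/run_pig.sh /home/hadoop/scripts/report.pig
#   every Monday at 02:00:
#     0 2 * * 1 /home/hadoop/bin/run_pig.sh /home/hadoop/scripts/report.pig

# Assumed defaults. cron starts jobs with a minimal PATH, so an absolute
# path to pig may be needed here.
PIG_BIN="${PIG_BIN:-pig}"
LOG_DIR="${LOG_DIR:-/tmp/pig_logs}"

run_pig_job() {
    script="$1"
    mkdir -p "$LOG_DIR" || return 1
    log_file="$LOG_DIR/pig_$(date +%Y%m%d_%H%M%S).log"

    # -f tells pig which script file to execute; stdout and stderr go to the log.
    "$PIG_BIN" -f "$script" >"$log_file" 2>&1
    status=$?

    echo "pig exited with status $status, log: $log_file"
    return "$status"
}

if [ "$#" -ge 1 ]; then
    run_pig_job "$1"
else
    echo "usage: run_pig.sh <pig script path>" >&2
fi
```

Make the wrapper executable with chmod +x before adding the cron entry. Each run writes its own timestamped log, which helps when a scheduled run fails and nobody was watching the terminal.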
Thanks