Automate the process of Pig, Hive, Sqoop.

I have data in HDFS (Azure HDInsight) in CSV format. I am using Pig to process this data. After processing, the summarised data is stored in Hive, and the Hive table is then exported to an RDBMS using Sqoop. Now I need to automate this whole process. Is it possible to write a separate method for each of these three tasks in a MapReduce job, then run that job so the tasks execute one after another?

To create the MapReduce job, I want to use the .NET SDK. Is this possible? If yes, please suggest some steps and reference links. Thank you.

Accepted Solutions
Re: Automate the process of Pig, Hive, Sqoop.

@Ishvari Dhimmar

Have you evaluated Oozie? I believe you would need to run these steps repeatedly at some interval. Oozie supports all the components mentioned above (Pig, Hive, and Sqoop), and each can be defined as a separate action in an Oozie workflow.

You do not need to create a separate MR job (using the .NET SDK) if you go this route.
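
To illustrate how the three steps could be chained as separate Oozie actions, here is a minimal sketch in Python that generates a workflow.xml. It is not a tested HDInsight deployment: the node names (pig-step, hive-step, sqoop-step), the script file names, the Sqoop command, and the schema versions are assumptions for illustration, and ${jobTracker} / ${nameNode} are expected to come from your job.properties.

```python
import xml.etree.ElementTree as ET

# Schema versions are assumptions; use the ones your Oozie release supports.
WF_NS = "uri:oozie:workflow:0.5"
HIVE_NS = "uri:oozie:hive-action:0.2"
SQOOP_NS = "uri:oozie:sqoop-action:0.2"


def add_action(root, name, kind, children, next_node, xmlns=None):
    """Append one <action>: on success go to next_node, on error go to 'fail'."""
    action = ET.SubElement(root, "action", name=name)
    body = ET.SubElement(action, kind, {"xmlns": xmlns} if xmlns else {})
    # Cluster endpoints are left as parameters resolved from job.properties.
    common = [("job-tracker", "${jobTracker}"), ("name-node", "${nameNode}")]
    for tag, text in common + children:
        ET.SubElement(body, tag).text = text
    ET.SubElement(action, "ok", to=next_node)
    ET.SubElement(action, "error", to="fail")


def build_workflow(pig_script, hive_script, sqoop_command):
    """Return workflow XML that runs Pig, then Hive, then Sqoop, in order."""
    root = ET.Element("workflow-app", {"xmlns": WF_NS, "name": "pig-hive-sqoop-wf"})
    ET.SubElement(root, "start", to="pig-step")
    add_action(root, "pig-step", "pig", [("script", pig_script)], "hive-step")
    add_action(root, "hive-step", "hive", [("script", hive_script)],
               "sqoop-step", HIVE_NS)
    # The Sqoop <command> is written without the leading "sqoop" word.
    add_action(root, "sqoop-step", "sqoop", [("command", sqoop_command)],
               "end", SQOOP_NS)
    kill = ET.SubElement(root, "kill", name="fail")
    ET.SubElement(kill, "message").text = (
        "Step failed: ${wf:errorMessage(wf:lastErrorNode())}"
    )
    ET.SubElement(root, "end", name="end")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names and Sqoop arguments; replace with your own.
    print(build_workflow(
        "process.pig",
        "load_summary.hql",
        "export --connect ${jdbcUrl} --table summary --export-dir ${exportDir}",
    ))
```

The generated file would be uploaded to HDFS next to the Pig and Hive scripts and submitted with a job.properties file. Running it at a fixed interval would be done with an Oozie coordinator that points at this workflow.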

Re: Automate the process of Pig, Hive, Sqoop.

Thanks for the reply, it really helps. I wrote MapReduce job by mistake; I should have said HiveJob, PigJob, and SqoopJob. Thanks again. I have just gone through Oozie, but I didn't find an exact link for it. If I write a Pig script, then want to transfer that data into Hive, and then export it to SQL Server using Sqoop, how do I connect all these steps using Oozie? Can you provide a reference link?