Does anyone know how to define a data pipeline to load data into a Hive table using Apache Falcon?

Assumptions:

1) The Hive table is already created and partitioned.

2) The data is already on HDFS, in directories that follow the correct date-time pattern, and it lands on an hourly basis.

My questions are:

1) Do you have a link to such an example? If yes, please share it with me.

2) Do we need to write a Hive script?

If you have implemented an example like this, please share it with me.

Re: Does anyone know how to define a data pipeline to load data into a Hive table using Apache Falcon?

@Manoj Dhake, here is a link on defining a Falcon process that works with multiple Hive tables, which might help you: https://dzone.com/articles/apache-falcon-defining-a-process-with-multiple-hiv

To answer your second question: Falcon supports three process engines (Oozie, Hive, and Pig), and you can choose one of them based on your requirement. For the Oozie engine, you have to define an Oozie workflow. If your requirement is just to load data from HDFS into a partitioned Hive table, then writing a Hive script and using Falcon's Hive engine would be easier.
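As a rough illustration of the Hive-engine option, a process entity could look something like the sketch below. This is only a sketch under assumptions, not a tested definition: the process name, cluster name, feed names, validity dates, and script path are all hypothetical placeholders, and it assumes the cluster entity and both feed entities have already been submitted to Falcon.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a Falcon process that runs a Hive script once per hour.
     Every name, date, and path here is a hypothetical placeholder. -->
<process name="hourly-hive-load" xmlns="uri:falcon:process:0.1">
    <clusters>
        <!-- Assumes a cluster entity named "primaryCluster" already exists -->
        <cluster name="primaryCluster">
            <validity start="2016-01-01T00:00Z" end="2017-01-01T00:00Z"/>
        </cluster>
    </clusters>

    <parallel>1</parallel>
    <order>FIFO</order>
    <!-- Matches the hourly landing of the data on HDFS -->
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>

    <inputs>
        <!-- Assumes an HDFS feed "rawHourlyFeed" with a date-time path pattern -->
        <input name="input" feed="rawHourlyFeed" start="now(0,0)" end="now(0,0)"/>
    </inputs>
    <outputs>
        <!-- Assumes a feed "hiveTableFeed" describing the partitioned Hive table -->
        <output name="output" feed="hiveTableFeed" instance="now(0,0)"/>
    </outputs>

    <!-- engine="hive" tells Falcon to run the script at this HDFS path -->
    <workflow engine="hive" path="/apps/falcon/scripts/load_partition.hql"/>

    <retry policy="periodic" delay="minutes(10)" attempts="3"/>
</process>
```

The Hive script at the workflow path could be as small as a single `ALTER TABLE ... ADD PARTITION` or `LOAD DATA INPATH` statement for the current hour. How the input path and partition values reach the script as variables depends on the Falcon version and on whether the feeds are HDFS or catalog (table) feeds, so check that against the Falcon documentation for your release.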
