Oozie - Schedule Flume job

Explorer

Dear Friends

 

We are planning to use Flume to transfer files of 2-5 GB (in different directories) on a weekly basis, and we want to make sure we are notified (preferably by email) if any of our Flume jobs fail.

Can we use Oozie (a workflow or coordinator in Hue) for this? If not, is there any other Hadoop tool that provides this functionality (job scheduling and error notification)?

 

Any help or links would be much appreciated.

 

Thanks much in advance and please let me know if you need more info.

 

Kind regards

Andy

Re: Oozie - Schedule Flume job

Champion Alumni

Hello,

 

 

- For job scheduling you can use Oozie. However, Oozie applies after you have ingested your data into HDFS (for Pig, Hive, and Java jobs).

- To receive emails from your platform when anything goes wrong, you have to configure SMTP ==> http://www.cloudera.com/content/cloudera/en/documentation/archives/cloudera-manager-4/v4-5-4/Clouder...

- Flume: Flume does not run on a schedule. It processes data as soon as it receives it.

  • your Flume configuration states, for example, that each time a file appears in a certain directory, Flume puts it in HDFS (if you use the spooldir source)
  • your Flume configuration also states when the file in HDFS is rolled (if you receive a lot of small files, you can roll the HDFS file less often in order to get bigger files)
  • ...
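
To make the two points above concrete, here is a minimal sketch of an agent configuration that combines a spooldir source with an HDFS sink. The agent name (agent1), the component names (src1, ch1, sink1), both directory paths, and the roll values are assumptions chosen for illustration, not settings from this thread. The property keys themselves are the ones documented in the Flume User Guide.

```properties
# Hypothetical agent "agent1": one source, one channel, one sink.
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Spooling directory source: picks up each file dropped into spoolDir.
# /data/incoming is an assumed local path.
agent1.sources.src1.type = spooldir
agent1.sources.src1.spoolDir = /data/incoming
agent1.sources.src1.channels = ch1

# File channel: buffers events on disk so they survive an agent restart.
agent1.channels.ch1.type = file

# HDFS sink: writes events under an assumed HDFS path.
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.channel = ch1
agent1.sinks.sink1.hdfs.path = /flume/weekly
agent1.sinks.sink1.hdfs.fileType = DataStream

# Rolling: a value of 0 disables that trigger.
# Here files roll on size only (128 MB), so many small inputs
# end up in fewer, larger HDFS files.
agent1.sinks.sink1.hdfs.rollInterval = 0
agent1.sinks.sink1.hdfs.rollCount = 0
agent1.sinks.sink1.hdfs.rollSize = 134217728

# Close a file that has received no data for 10 minutes,
# so the last file of a batch does not stay open indefinitely.
agent1.sinks.sink1.hdfs.idleTimeout = 600
```

With a configuration along these lines, nothing needs to be scheduled: the agent stays running and ingests files whenever they land in the spooling directory.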

 

Alina

 

GHERMAN Alina

Re: Oozie - Schedule Flume job

New Contributor

Thanks for the detailed explanation of Flume. Can you please tell me how to keep Flume running all the time?

Re: Oozie - Schedule Flume job

Master Guru
Flume is a service and by its definition runs continuously and provides metrics for monitoring just like any other service does.

Have you tried the Flume tutorial in the QuickStart VM? https://www.cloudera.com/documentation/other/tutorial/CDH5/topics/ht_flume_to_hdfs.html

The Flume User Guide can be found at http://archive.cloudera.com/cdh5/cdh/5/flume-ng/FlumeUserGuide.html

Re: Oozie - Schedule Flume job

Rising Star

What you'll usually find is that a Flume agent, not to be confused with Flume itself, is set up and started through Cloudera Manager and runs as a service there.
