
Hadoop tool(s) for weekly file transfer - Flume?



Dear Friends

 

I need to transfer files of 2-5 GB each (located in different directories).

  1. Can I use Flume?
  2. If not, is another Hadoop tool available? Is there a recommended tool or set of best practices?
  3. Is it possible to automate each Flume file transfer job (schedule it weekly, either directly or with another tool)?
  4. Can I set up Flume (or a job scheduling tool that runs the Flume job) so that I get notified, preferably by email, if a specific file transfer job fails?

 

Any help or links would be much appreciated.

 

Thanks much in advance and please let me know if you need more info.

 

Kind regards

Andy

 


Re: Hadoop tool(s) for weekly file transfer - Flume?

Flume is a great tool for transferring records from large files, but not large files themselves. For example, if your files are CSV or some other format that has a single record per line, then Flume will be able to handle that just fine. If you need to send the files as-is, then Flume isn't a good fit.
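To make the record-versus-file distinction concrete, here is a rough Python sketch of the line-oriented model. This is not Flume code; the function names `line_events` and `batches` are made up for illustration, and the only assumption is that each line of the input is one self-contained record.

```python
import io
from typing import BinaryIO, Iterable, Iterator, List


def line_events(stream: BinaryIO) -> Iterator[bytes]:
    """Yield one event body per non-empty line (hypothetical helper)."""
    for raw in stream:
        body = raw.rstrip(b"\r\n")
        if body:
            yield body


def batches(events: Iterable, batch_size: int) -> Iterator[List]:
    """Group events into small batches, so only one batch is held at a time."""
    batch: List = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


if __name__ == "__main__":
    # A CSV file splits cleanly into small, independent records.
    csv_data = io.BytesIO(b"id,name\n1,alice\n2,bob\n")
    for batch in batches(line_events(csv_data), batch_size=2):
        print(batch)
```

A CSV file breaks down into many small events like this, however large the file is. A file that has to arrive byte-for-byte intact has no such record boundary, so the whole 2-5 GB would have to travel as a single unit.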

In that case, I'd recommend checking out Apache NiFi (http://nifi.incubator.apache.org), which is in the ASF incubator.

-Joey

Re: Hadoop tool(s) for weekly file transfer - Flume?


Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera.

 

Shortly afterwards, the integrations with CDH were also completed, so NiFi is now a fully supported and integrated component.

 

Hence the existing answer already points you in the right direction: NiFi is likely the best fit for these use cases.