Explorer
Posts: 17
Registered: ‎09-09-2014

Hadoop tool(s) for weekly file transfer - Flume?

Dear Friends

I need to transfer files of 2-5 GB each (in different directories).

  1. Can I use Flume? 
  2. If not, any other Hadoop tool available? Is there a recommended tool/best practices?
  3. Is it possible to automate each Flume file transfer job (schedule it weekly, either directly or with another tool)?
  4. Can I set up Flume (or any other job scheduling tool that runs the Flume job) so that I get notified (preferably by email) if a specific file transfer job fails?

Any help/link much appreciated.

Thanks much in advance and please let me know if you need more info.

Kind regards

Andy

Cloudera Employee
Posts: 26
Registered: ‎07-08-2013

Re: Hadoop tool(s) for weekly file transfer - Flume?

Flume is a great tool for transferring records from large files, but not large files themselves. For example, if your files are CSV or some other format that has a single record per line, then Flume will be able to handle that just fine. If you need to send the files as-is, then Flume isn't a good fit.
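
To illustrate the distinction, here is a minimal Python sketch (not Flume code; `line_events` and the sample data are made up for illustration) of how a line-oriented file breaks down into separate records. This is roughly the unit that a line-based Flume source would ship as individual events:

```python
import io


def line_events(stream):
    """Yield one record per non-empty line of a text stream."""
    for line in stream:
        body = line.rstrip("\r\n")  # drop the line terminator only
        if body:
            yield body


# Hypothetical sample data: a tiny CSV with one record per line.
csv_data = "id,name\n1,alice\n2,bob\n"
events = list(line_events(io.StringIO(csv_data)))

# Each line travels on its own; the file as a whole is never one unit.
print(len(events), "records from one file")
```

A file that only makes sense as a whole (an archive, an image, a binary dump) has no such per-line records to split into, which is why it doesn't map well onto this model.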

If you need to send the files as-is, I'd recommend checking out Apache NiFi (http://nifi.incubator.apache.org), which is in the ASF incubator.

-Joey
Cloudera Employee
Posts: 40
Registered: ‎01-07-2019

Re: Hadoop tool(s) for weekly file transfer - Flume?

Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera.

Shortly after, the integrations with CDH were also completed, so NiFi is now a fully supported and integrated component.

Hence the existing answer already points you in the right direction: NiFi is likely the best fit for solving these use cases.