Flume: how to create (HDFS) target dir from ingested filename?

Guru

Hi,

I'm curious if it is possible to solve this problem with Flume:

I have a SpoolingDir source whose spool directory receives files with names in the format "prefixA.prefixB.importantPart.csv".

Each file should be written to HDFS under its original filename, in the directory "hdfs://basepath/importantPart/", so that the absolute path of a file is "hdfs://basepath/importantPart/prefixA.prefixB.importantPart.csv".

a) How can I parse the filename to extract "importantPart" and build the HDFS output path from it? Is this possible with Flume at all?

b) How can I preserve the original filename, so that the HDFS sink writes to a file with the same name? Again, is this possible at all?

Yes, I know Flume isn't the right tool for this kind of "file copy" job, since it works on events. Still, I'd like to know whether it is possible, or whether someone has already done it.

Any hint is highly appreciated.

1 ACCEPTED SOLUTION

Re: Flume: how to create (HDFS) target dir from ingested filename?

Master Guru
You could do (a) with the Spooling Directory source, since it lets each event carry the original filename (a custom sink wrapper can then look for it). Doing (b), however, doesn't fit Flume's event delivery mechanism and, AFAICT, it's not possible to do directly.
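
For what it's worth, the filename parsing behind (a) could look roughly like the sketch below. It is plain Java with no Flume APIs. The class and method names are hypothetical, and it assumes the file's basename has already been read from an event header by whatever custom component (sink wrapper or interceptor) you end up writing. It also assumes "importantPart" is always the token directly before the extension.

```java
import java.util.Optional;

public class TargetPathSketch {

    // Hypothetical helper: returns the dot-separated token just before the
    // extension, e.g. "importantPart" for "prefixA.prefixB.importantPart.csv".
    static Optional<String> importantPart(String basename) {
        String[] parts = basename.split("\\.");
        // Assumed layout: prefixA, prefixB, importantPart, extension.
        if (parts.length < 4) {
            return Optional.empty();
        }
        String part = parts[parts.length - 2];
        return part.isEmpty() ? Optional.empty() : Optional.of(part);
    }

    // Builds "<basePath>/<importantPart>/<basename>", or empty if the
    // basename does not match the assumed layout.
    static Optional<String> targetPath(String basePath, String basename) {
        return importantPart(basename)
                .map(p -> basePath + "/" + p + "/" + basename);
    }

    public static void main(String[] args) {
        String basename = "prefixA.prefixB.importantPart.csv";
        System.out.println(
                targetPath("hdfs://basepath", basename).orElse("unparseable"));
    }
}
```

Names that don't match the assumed layout return an empty Optional, so the caller can decide whether to drop such events or route them to a fallback directory. This only covers the path derivation in (a); it does not change the limitation described for (b).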
2 REPLIES

Re: Flume: how to create (HDFS) target dir from ingested filename?

Guru
Hi,
Many thanks for your explanation. I'll look into the custom sink wrapper approach.