Flume: how to create (HDFS) target dir from ingested filename?
Labels: Apache Flume, HDFS
Created on ‎12-12-2013 01:23 AM - edited ‎09-16-2022 01:51 AM
Hi,
I'm curious if it is possible to solve this problem with Flume:
I have a SpoolingDir source into which files with names in the format "prefixA.prefixB.importantPart.csv" will be moved.
Each file should be written to HDFS under its original filename, in the directory "hdfs://basepath/importantPart/", so that the absolute path for a file is "hdfs://basepath/importantPart/prefixA.prefixB.importantPart.csv".
a) How can I parse the filename to extract "importantPart" and build the output HDFS path from it? Is this possible at all with Flume?
b) How can I preserve the original filename, so that the HDFS sink writes to a file with the same name? Again, is this possible at all?
Yes, I know Flume isn't the right tool for this kind of "file copy" job, since it works on events rather than files. Nevertheless, I'd be interested to know whether it is possible, or whether someone has done this already.
Any hint is highly appreciated.
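To make part (a) concrete, here is a minimal plain-Java sketch of the filename parsing, not a tested Flume setup. It assumes the name always has exactly four dot-separated tokens, as in the example above. The class and method names are made up for illustration. In Flume, logic like this would presumably sit in a custom interceptor that reads the filename from an event header (the spooling directory source can add one via its `fileHeader` option, or `basenameHeader` in later releases) and writes the extracted token into another header for the HDFS sink path to reference.

```java
import java.util.Optional;

// Hypothetical helper, not part of Flume: derives the HDFS target path
// from a basename of the form "prefixA.prefixB.importantPart.csv".
final class HdfsTargetPathSketch {

    // Assumption: exactly four dot-separated tokens; the third one is the
    // directory name. Anything else yields an empty result.
    static Optional<String> importantPart(String basename) {
        String[] tokens = basename.split("\\.");
        if (tokens.length != 4 || tokens[2].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(tokens[2]);
    }

    // Builds "<basePath>/<importantPart>/<basename>", tolerating a
    // trailing slash on basePath.
    static Optional<String> targetPath(String basePath, String basename) {
        String base = basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        return importantPart(basename)
                .map(part -> base + "/" + part + "/" + basename);
    }

    public static void main(String[] args) {
        String name = "prefixA.prefixB.importantPart.csv";
        System.out.println(targetPath("hdfs://basepath", name).orElse("<no match>"));
    }
}
```

Part (b) is the harder half: as far as I know, the stock HDFS sink adds its own counter and suffix to the files it writes, so keeping the exact original filename would likely need custom sink code rather than configuration alone.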
Created ‎12-28-2013 08:56 PM
Created ‎01-06-2014 01:30 AM
Many thanks for your explanation. I'll check out the custom sink wrapper approach.
