I'm curious if it is possible to solve this problem with Flume:
I have a SpoolDir source, i.e. a directory into which files with names in the format "prefixA.prefixB.importantPart.csv" will be moved.
The files shall be put into HDFS (with their original filenames) into the corresponding directory "hdfs://basepath/importantPart/", so that the absolute path for a file is "hdfs://basepath/importantPart/prefixA.prefixB.importantPart.csv".
a) How can I parse the filename to extract "importantPart" and build the output HDFS path from it? Is this possible with Flume at all?
b) How can I preserve the original filename so that the HDFS sink writes to a file with the same name? Again, is this possible at all?
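For what it's worth, here is how far I got with a plain agent configuration. This is only a sketch with placeholder names and paths: the spooling-directory source can put the filename into a "basename" header (basenameHeader = true), and the HDFS sink accepts %{header} escapes in its path and file prefix. The %{importantPart} escape below assumes some interceptor has already set that header, because no stock component parses it out of the filename:

```properties
# Sketch only -- agent/channel names and local paths are placeholders.
agent.sources = spool
agent.channels = mem
agent.sinks = sink

agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /data/incoming
# Puts the original filename into a "basename" event header.
agent.sources.spool.basenameHeader = true
agent.sources.spool.channels = mem

agent.channels.mem.type = memory

agent.sinks.sink.type = hdfs
agent.sinks.sink.channel = mem
agent.sinks.sink.hdfs.fileType = DataStream
# (b) reuse the original filename as the file prefix; note the sink
# still appends its own counter/suffix, so the name is not preserved exactly.
agent.sinks.sink.hdfs.filePrefix = %{basename}
# (a) this assumes an "importantPart" header that something else must set.
agent.sinks.sink.hdfs.path = hdfs://basepath/%{importantPart}/
```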
Yes, I know Flume isn't the right tool for such "file copy" use cases, since it works on events, but I'm nevertheless interested in whether it is possible, or whether someone has already done this.
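My current guess for (a) is that the extraction would need a custom interceptor, since as far as I can tell the stock regex extractor interceptor matches against the event body, not against headers. Here is a minimal sketch of just the parsing logic such an interceptor could apply to the "basename" header before copying the result into a new header (class and method names are my own invention, not Flume API):

```java
// Sketch of the filename parsing a hypothetical custom interceptor could do:
// take the "basename" header (set by the spooling-directory source when
// basenameHeader = true) and pull out "importantPart" so it can be stored
// in a new header and referenced by the sink as %{importantPart}.
public class ImportantPartParser {

    // Returns the third dot-separated component of the filename, or null
    // if the name does not match the expected
    // "prefixA.prefixB.importantPart.csv" shape.
    public static String extractImportantPart(String basename) {
        if (basename == null) {
            return null;
        }
        String[] parts = basename.split("\\.");
        return parts.length == 4 ? parts[2] : null;
    }

    public static void main(String[] args) {
        // "prefixA.prefixB.sales.csv" -> "sales"
        System.out.println(extractImportantPart("prefixA.prefixB.sales.csv"));
    }
}
```

In a real interceptor this would run inside intercept(Event), putting the returned value into the event's header map so the HDFS sink path escape can pick it up.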
Any hint is highly appreciated.