I see there's already a Flume morphline Solr sink, but how can I use a morphline with Flume to write to HDFS without Solr? I'm currently using Flume to partition Avro data (AvroFlumeEventSerializer), but I'd like to use the ExtractAvroPath morphline to flatten the complex types so Impala can query them. Is this possible?
OK, I see: I can embed Java in the config file to open a file and write the records. But won't I end up opening lots of files? How can I batch the writes and roll the file the way the Flume HDFS sink does?
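To make the batching and rolling concern concrete, here is a minimal sketch of the behaviour I have in mind. It is not how the Flume HDFS sink is implemented: `RollingWriter`, the `part-N.txt` file names and the record-count threshold are all hypothetical, and it writes to a local directory with `java.nio` rather than through the HDFS API. The point is only that one file stays open across records and a new file is started after a fixed number of them.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical helper (not part of Flume or morphlines): keeps a single
 * file open and rolls to a new one after maxRecords records.
 */
class RollingWriter implements AutoCloseable {
    private final Path dir;        // assumed to be an existing local directory
    private final int maxRecords;  // roll threshold, counted in records
    private BufferedWriter out;    // currently open file, or null
    private int recordsInFile = 0;
    private int fileIndex = 0;

    RollingWriter(Path dir, int maxRecords) {
        if (maxRecords < 1) {
            throw new IllegalArgumentException("maxRecords must be >= 1");
        }
        this.dir = dir;
        this.maxRecords = maxRecords;
    }

    /** Appends one record as a line, opening a file lazily and rolling when full. */
    void write(String record) throws IOException {
        if (out == null) {
            Path file = dir.resolve("part-" + fileIndex + ".txt");
            out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        }
        out.write(record);
        out.newLine();
        recordsInFile++;
        if (recordsInFile >= maxRecords) {
            roll();
        }
    }

    /** Closes the current file so the next write starts a new one. */
    private void roll() throws IOException {
        out.close();
        out = null;
        recordsInFile = 0;
        fileIndex++;
    }

    /** Closes the file that is still open, if any. */
    @Override
    public void close() throws IOException {
        if (out != null) {
            out.close();
            out = null;
        }
    }
}
```

With a threshold of 2, writing three records would leave two files, the first holding two records and the second holding one. What I don't see is where an object like this would live inside a morphline so that it survives from one record to the next.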
Thanks, I'll have a look.
I do, however, think you missed a trick by not having this as a stock command. Some of my colleagues dismissed morphlines as only applicable to Solr. They wanted to do the same as me (have Flume flatten Avro when ingesting the data) but bypassed morphlines and have Pig scripts running instead.
HDFS is the backbone of Hadoop, so a standard way for morphlines to write to it outside Solr would have been great!