I'm currently evaluating NiFi for a project where we're likely to use lots of CSVs for a while (until they're phased out). The standard answer of ReplaceText, etc. isn't really suitable, as I'll need to deal with optionally quoted fields and the like. I've set up a simple test flow:
ExecuteSQL (to pull data directly from SQL) -> SplitAvro (to split the data up) -> UpdateAttribute (update filename) -> ConvertAvroToJson -> ExecuteScript (to convert JSON to the required CSV format using the CSV module in Ruby) -> PutFile.
Since there may be a fair few of
these that are subtly different, I was hoping to combine the
ConvertAvroToJson and ExecuteScript steps into one ExecuteScript that reads the
Avro and converts it straight into my required format. I also wanted to
use it as an excuse to see how to pull in additional dependencies for an
ExecuteScript. What I'm unclear on is how to properly import the Avro jars. I originally assumed that, since they were included in NiFi's Avro processors, I could just do:
require 'java'
# blah blah
Java::OrgApacheAvroGeneric::GenericDatumReader.new
but I keep getting java.lang.NoClassDefFoundError: Could not initialize class org.apache.avro.generic.GenericData no matter what I try (I've tried pointing the ExecuteScript at a directory of jars, assuming I could use the built-in ones in the Avro processors, etc.). Any pointers would be really appreciated.
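
For context, the JSON-to-CSV step that the ExecuteScript currently performs could look roughly like the sketch below. It is not the actual script from the flow: the helper name json_to_csv, the column list, and the sample record are made up for illustration, and the NiFi-specific part (reading and writing the flow file through the session) is left out. It only shows why the CSV module is preferable to ReplaceText here: fields are quoted only when they need to be.

```ruby
require 'csv'
require 'json'

# Hypothetical helper (not part of NiFi): converts JSON text holding either
# a single flat record or an array of flat records into CSV text.
def json_to_csv(json_text, columns)
  records = JSON.parse(json_text)
  # Assumption: the input may be one object or an array of objects.
  records = [records] unless records.is_a?(Array)

  CSV.generate do |csv|
    csv << columns                                 # header row
    records.each do |record|
      csv << columns.map { |name| record[name] }   # keep a fixed column order
    end
  end
end

# Made-up sample record: the name contains a comma, so it gets quoted.
sample = '[{"id": 1, "name": "Smith, John", "city": "Leeds"}]'
puts json_to_csv(sample, %w[id name city])
```

With the default CSV options, only fields containing a separator, a quote, or a line break are wrapped in quotes, which covers the "optionally quoted fields" requirement.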