How do I add additional columns to Flow File content

Cloudera Employee

The goal is to read data from an RDBMS table and store it in HDFS in Avro format with an additional column. Say the source table has 5 columns; as part of the ingestion I would like to add a column, say "ingest_datetime", holding the current time, before NiFi stores the file in HDFS. In the end, HDFS should have an Avro file with 6 columns.

Currently I am using ExecuteSQL --> PutHDFS processors

1 ACCEPTED SOLUTION

Master Guru

Currently there aren't any processors that perform direct manipulation of Avro, although we definitely would like to have some.

Possible options to work around this...

  • Use ConvertAvroToJson followed by the new JOLT transform processor followed by ConvertJsonToAvro (involves a lot of conversion and may lose some of the initial schema)
  • Use ExecuteScript processor to manipulate the Avro (I am not sure if any of NiFi's supported scripting languages have good Avro support)
  • Write a custom Java processor to manipulate the Avro (https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions)

Happy to help answer any questions if going the custom Java processor route.
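
Whichever of these routes is taken, the manipulation itself comes down to two steps: append a field to the Avro schema, and set a value for that field on every record. Below is a minimal sketch of that logic in Python, using plain dicts in place of real Avro data. It makes no NiFi or Avro library calls; the function name, the sample schema, and the choice of a string-typed ISO timestamp are all assumptions for illustration only.

```python
import copy
from datetime import datetime, timezone


def add_ingest_column(schema, records, name="ingest_datetime", now=None):
    """Return (new_schema, new_records) with one extra string field appended.

    `schema` is an Avro record schema as a dict; `records` is a list of dicts.
    The inputs are left unmodified.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.isoformat()

    # Step 1: extend the schema with the new field.
    new_schema = copy.deepcopy(schema)
    new_schema["fields"].append({"name": name, "type": "string"})

    # Step 2: set the same timestamp on every record.
    new_records = [dict(rec, **{name: stamp}) for rec in records]
    return new_schema, new_records


# Hypothetical two-column source, standing in for the ExecuteSQL output.
source_schema = {
    "type": "record",
    "name": "source_row",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
    ],
}
source_records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

new_schema, new_records = add_ingest_column(source_schema, source_records)
```

In a real flow, the reading and writing of the Avro container would still have to be done by an Avro library or by the processors on either side; this sketch only covers the transformation in between.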

4 REPLIES 4

Cloudera Employee

Thanks for the response.

Do you know when the new JOLT transform processor is going to be released? The existing 0.6.1 and 0.7 releases do not have the new processor you are talking about, but the NIFI-361 ticket mentions it.

Master Guru

@Sreekanth Munigati, as @Bryan Bende mentioned, there is no direct way of manipulating Avro data, but in your case you can try modifying the SQL executed by the ExecuteSQL processor so that the additional column is added in the SQL itself.
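
As a rough illustration of that query pattern, here is a small Python sketch that selects all source columns plus a computed "ingest_datetime" column. An in-memory SQLite database stands in for the real RDBMS, and the table name source_table and columns c1 to c5 are hypothetical. The exact current-timestamp expression varies between databases, so treat CURRENT_TIMESTAMP as an assumption to check against your own system.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the source RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_table (c1 INTEGER, c2 TEXT, c3 TEXT, c4 REAL, c5 TEXT)"
)
conn.execute("INSERT INTO source_table VALUES (1, 'a', 'b', 2.5, 'c')")

# The kind of query that would go into the ExecuteSQL processor:
# every source column plus one extra column computed at query time.
query = "SELECT t.*, CURRENT_TIMESTAMP AS ingest_datetime FROM source_table t"

cursor = conn.execute(query)
columns = [d[0] for d in cursor.description]  # result-set column names
rows = cursor.fetchall()
conn.close()
```

Because the extra column is already part of the result set, the Avro produced by ExecuteSQL would carry 6 fields and the ExecuteSQL --> PutHDFS flow would not need an extra processor.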