How do I add additional columns to FlowFile content?
Labels: Apache NiFi
Created 07-18-2016 08:54 PM
The goal is to read data from an RDBMS table and store it in HDFS in Avro format with an additional column. Say the source table has 5 columns; as part of the ingestion I would like to add an extra column, e.g. "ingest_datetime", populated with the current time, before NiFi stores the file in HDFS, so the Avro file in HDFS ends up with 6 columns.
Currently I am using ExecuteSQL --> PutHDFS processors.
Created 07-18-2016 09:06 PM
Currently there aren't any processors that perform direct manipulation of Avro, although we definitely would like to have some.
Possible options to work around this:
- Use ConvertAvroToJSON followed by the new JOLT transform processor, followed by ConvertJSONToAvro (involves a lot of conversion and may lose some of the initial schema)
- Use the ExecuteScript processor to manipulate the Avro (I am not sure if any of NiFi's supported scripting languages have good Avro support)
- Write a custom Java processor to manipulate the Avro (https://cwiki.apache.org/confluence/display/NIFI/Maven+Projects+for+Extensions)
Happy to help answer any questions if you go the custom Java processor route.
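If it helps to see what that route boils down to, here is a rough sketch (untested, assuming only the plain Apache Avro Java library on the classpath) of the record rewriting a custom processor or ExecuteScript body could perform. The class name AddIngestDatetime, the string-typed ingest_datetime field, and the choice to drop field docs/defaults are all illustrative assumptions; in a real processor this logic would run against the FlowFile streams inside session.write().

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class AddIngestDatetime {

    // Reads an Avro datafile (as produced by ExecuteSQL), adds an
    // "ingest_datetime" string field to every record, and writes a new
    // Avro datafile with the widened schema.
    public static void addColumn(InputStream in, OutputStream out) throws Exception {
        try (DataFileStream<GenericRecord> reader =
                     new DataFileStream<>(in, new GenericDatumReader<GenericRecord>())) {

            Schema oldSchema = reader.getSchema();

            // New record schema = original fields + ingest_datetime.
            // (Doc strings and defaults are dropped in this sketch; the exact
            //  Schema.Field constructor differs between Avro versions.)
            List<Schema.Field> fields = new ArrayList<>();
            for (Schema.Field f : oldSchema.getFields()) {
                fields.add(new Schema.Field(f.name(), f.schema(), null, null));
            }
            fields.add(new Schema.Field("ingest_datetime",
                    Schema.create(Schema.Type.STRING), null, null));

            Schema newSchema = Schema.createRecord(oldSchema.getName(),
                    null, oldSchema.getNamespace(), false);
            newSchema.setFields(fields);

            String ingestTime = java.time.Instant.now().toString();

            try (DataFileWriter<GenericRecord> writer =
                         new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(newSchema))) {
                writer.create(newSchema, out);
                while (reader.hasNext()) {
                    GenericRecord source = reader.next();
                    GenericRecord target = new GenericData.Record(newSchema);
                    // Copy the original columns, then add the new one.
                    for (Schema.Field f : oldSchema.getFields()) {
                        target.put(f.name(), source.get(f.name()));
                    }
                    target.put("ingest_datetime", ingestTime);
                    writer.append(target);
                }
            }
        }
    }
}
```

Because the output is written with DataFileWriter, the widened schema stays embedded in the file, so a downstream PutHDFS just stores the already-augmented Avro.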
Created 07-19-2016 06:49 PM
Thanks for the response.
Do you know when the new JOLT transform processor is going to be released? The existing 0.6.1 or 0.7 does not have this new processor you are talking about, but the NIFI-361 ticket mentions it.
Created 07-19-2016 07:00 PM
It is in the 0.7.0 release, part of the standard bundle.
Created 07-19-2016 01:51 PM
@Sreekanth Munigati, as @Bryan Bende mentioned, there is no direct way of manipulating Avro data, but in your case you can try modifying the SQL being executed by the ExecuteSQL processor to add the additional column in the SQL itself.
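For example (the table name and the exact timestamp function are placeholders and depend on your source database), a query like `SELECT t.*, CURRENT_TIMESTAMP AS ingest_datetime FROM source_table t` in ExecuteSQL's query property should come back with six columns, so the Avro that ExecuteSQL produces already contains the ingest_datetime field and no extra transformation step is needed before PutHDFS.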
