How to do a JOLT replace on all JSON keys in NiFi
Labels: Apache NiFi
Created ‎03-13-2018 12:17 PM
Hi,
I have a requirement for transforming JSON in NiFi that seems simple, but I haven't been able to solve it:
Input JSON:
{ "agent-submit-time" : -1, "agent-end-time" : 123445, "agent-name" : "Marie Bayer-Smith" }
Desired:
{ "agent_submit_time" : -1, "agent_end_time" : 123445, "agent_name" : "Marie Bayer-Smith" }
- I don't want to use the ReplaceText processor, since replacing "-" with "_" might affect values too.
- I need this to be able to infer an Avro schema from incoming JSON records (Avro does not allow dashes in field names).
- Since I already use a Jolt processor for another transformation on the JSON, it makes sense to do this in the same processor and avoid unnecessary NiFi overhead.
I think I need the JoltTransformJSON processor for this, as it is very powerful (though the syntax evades me), but I'm open to other options too.
Created ‎03-13-2018 01:24 PM
Maybe one more thing:
- I would prefer not to hardcode anything; since the JSON has many keys, I am specifically looking for a dynamic way.
I came across JOLT functions like =toLower and =join and was hoping for something like =replace, but that one seems to be missing.
Created ‎03-13-2018 05:12 PM
You can use the following JOLT spec in the JoltTransformJSON processor:
[ { "operation": "shift", "spec": { "*-*-*": "&(0,1)_&(0,2)_&(0,3)", "*-*": "&(0,1)_&(0,2)", "*": "&" } } ]
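Note that you have to add an entry for each "level" of name matching. The spec above covers keys with one or two dashes, so if a key could contain more dashes than the longest pattern (three or more here), you'd need an extra line at the top for each additional dash, following the same pattern.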
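Since the spec needs one line per dash count, writing it by hand gets tedious for longer key names. One option is to generate it. The following is a minimal Python sketch that runs outside NiFi; the helper name build_dash_spec is made up for illustration. It emits the same pattern as the spec above for any maximum number of dashes, and the printed JSON can then be pasted into the processor's Jolt specification.

```python
import json


def build_dash_spec(max_dashes):
    """Build a shift spec covering keys with 1..max_dashes dashes.

    Hypothetical helper: it only reproduces the pattern from the thread,
    it does not run JOLT itself.
    """
    spec = {}
    # Longest pattern first, mirroring the order used in the thread.
    for dashes in range(max_dashes, 0, -1):
        pieces = dashes + 1
        pattern = "-".join(["*"] * pieces)
        target = "_".join(f"&(0,{i})" for i in range(1, pieces + 1))
        spec[pattern] = target
    # Keys without any dash pass through unchanged.
    spec["*"] = "&"
    return [{"operation": "shift", "spec": spec}]


# Print a spec that handles keys with up to three dashes.
print(json.dumps(build_dash_spec(3), indent=2))
```

Calling build_dash_spec(2) gives back exactly the spec posted above, so the generated variants should behave the same way for longer keys, though that is worth confirming against real data.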
Created ‎03-13-2018 07:32 PM
@Matt Burgess I don't get the syntax, but it works like a charm! Thank you
Created ‎03-13-2018 07:55 PM
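JOLT can be fairly complicated. In this case, each * on the left is a non-greedy wildcard match, and the &(x,y) on the right grabs the individual matched pieces: x is how many levels up to look (0 is the current key) and y is which * to take (1 is the first one).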
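To make the matching a bit more concrete, here is a rough Python emulation of what happens to a single key. It is only a sketch under the assumption stated above (each * behaves like a non-greedy capture group); it is not JOLT's actual implementation, and the function names are invented for this example.

```python
import re

# Matches a bare "&" or a "&(0,n)" reference in an output template.
REF = re.compile(r"&(?:\(0,(\d+)\))?")


def star_to_regex(pattern):
    # Each * becomes a non-greedy capture group; everything else is literal.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*?)".join(literal_parts) + "$")


def expand(template, match):
    # &(0,n) is replaced by the n-th captured piece; a bare & (n = 0)
    # stands for the whole matched key.
    return REF.sub(lambda m: match.group(int(m.group(1) or 0)), template)


def shift_key(key, rules):
    # Try the rules in order and use the first pattern that matches.
    for pattern, template in rules:
        match = star_to_regex(pattern).match(key)
        if match:
            return expand(template, match)
    return None


RULES = [
    ("*-*-*", "&(0,1)_&(0,2)_&(0,3)"),
    ("*-*", "&(0,1)_&(0,2)"),
    ("*", "&"),
]

for key in ["agent-submit-time", "agent-name", "firstname", "a-b-c-d"]:
    print(key, "->", shift_key(key, RULES))
```

In this sketch "agent-submit-time" splits into agent, submit and time, while "a-b-c-d" comes out as "a_b_c-d" because the last group has to swallow the rest of the key. That is the reason an extra pattern line is needed for every additional dash.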
Created ‎03-28-2019 02:01 PM
@Matt Burgess It works fine if there is just one object in the input tree. If there are more, it merges their values into arrays rather than keeping separate records, like:
{ "agent_submit_time" : [ -1, -1 ], "agent_end_time" : [ 123445, 123445 ], "agent_name" : [ "Marie Bayer-Smith", "Marie Bayer-Smith" ] }
I would like it to be something like:
[
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
},
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
}
]
How can I do that? I tried replacing "*": "&" with "@": "[&]", which keeps the records separate, but then the conversion of - to _ doesn't take place.
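One possible approach for array input, untested here against NiFi, is the common JOLT idiom of matching the array index with an outer "*" and prefixing every output path with "[&1]", so each record lands at its own index in the output array. The Python snippet below only holds that spec as a dict and prints it as JSON for pasting into the processor; the variable name array_spec is made up, and the spec itself is an assumption to verify rather than a confirmed answer from this thread.

```python
import json

# Assumed variant of the accepted spec for input that is an array of records.
array_spec = [
    {
        "operation": "shift",
        "spec": {
            # The outer * matches each array index (0, 1, ...).
            "*": {
                # [&1] refers one level up, i.e. to that array index, so the
                # renamed field is written into the record at the same index.
                "*-*-*": "[&1].&(0,1)_&(0,2)_&(0,3)",
                "*-*": "[&1].&(0,1)_&(0,2)",
                "*": "[&1].&",
            }
        },
    }
]

print(json.dumps(array_spec, indent=2))
```

As with the original spec, keys with more than two dashes would need additional pattern lines.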
Created on ‎03-13-2018 05:57 PM - edited ‎08-18-2019 02:32 AM
Hi Jasper
You can use the JoltTransformJSON processor to get this done. Here is how your processor configuration should look:
[screenshot: processor configuration]
Here is the complete Jolt specification:
[ { "operation": "shift", "spec": { "agent-submit-time": "agent_submit_time", "agent-end-time": "agent_end_time", "agent-name": "agent_name", "*": { "@": "&" } } } ]
Here is a snippet of the output I got using your input:
[screenshot: output]
Hope that helps.
Created ‎03-13-2018 06:13 PM
@Rahul Soni Thanks for your answer but I really don't want to hardcode all of my 50+ JSON fields
Created ‎08-14-2018 06:24 PM
How can I do this for a more complex example like the one below? Please help.
{
"firstname" : "John",
"address" : {
"Convert-this" : [ "not-this", "also-not-this " ],
"Again-convert-this" : [ "but-not-this", "also-not-this", "no-for-this-too" ]
}
}
Created ‎08-14-2018 06:42 PM
This one is only more complex because you want to convert the field names at the second level rather than the first. You match "address" first, then use the above spec for each field inside it, and also transfer any fields at the top level over as-is (namely "firstname"). The spec (which is specific to this example) is:
[ { "operation": "shift", "spec": { "address": { "*-*-*": "&(0,1)_&(0,2)_&(0,3)", "*-*": "&(0,1)_&(0,2)", "*": "&" }, "*": "&" } } ]
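When testing a spec like this, it can help to have a reference result to compare the processor output against. The following is a plain Python sketch, not JOLT, with a made-up function name; it renames dashes in keys at every depth and leaves all values (including strings inside arrays) untouched. Note that it keeps the renamed fields nested under "address". Shift output paths are written from the root of the output, so it is worth checking whether the spec above does the same or places them at the top level.

```python
def rename_keys(node):
    """Return a copy of node with '-' replaced by '_' in every dict key."""
    if isinstance(node, dict):
        return {key.replace("-", "_"): rename_keys(value) for key, value in node.items()}
    if isinstance(node, list):
        # List items are visited so nested objects get renamed too,
        # but plain string values are returned unchanged.
        return [rename_keys(item) for item in node]
    return node


record = {
    "firstname": "John",
    "address": {
        "Convert-this": ["not-this", "also-not-this "],
        "Again-convert-this": ["but-not-this", "also-not-this", "no-for-this-too"],
    },
}

print(rename_keys(record))
```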
