Created 03-13-2018 12:17 PM
Hi,
I have a requirement for transforming JSON on Nifi which seems simple but I haven't been able to solve:
Input json:
{ "agent-submit-time" : -1, "agent-end-time" : 123445, "agent-name" : "Marie Bayer-Smith" }
Desired:
{ "agent_submit_time" : -1, "agent_end_time" : 123445, "agent_name" : "Marie Bayer-Smith" }
-I don't want to use the ReplaceText processor, since replacing "-" with "_" might affect values too
-I need this to be able to infer AVRO schema on incoming JSON records (AVRO does not like the dashes at all)
-since I already use a Jolt processor for another transformation in the JSON it makes sense to include it in the same processor to prevent unnecessary Nifi overhead.
I think the JoltTransformJSON processor is the right tool for this, as it is very powerful (though its syntax eludes me), but I am open to other options too.
Created 03-13-2018 05:12 PM
You can use the following JOLT spec in the JoltTransformJSON processor:
[ { "operation": "shift", "spec": { "*-*-*": "&(0,1)_&(0,2)_&(0,3)", "*-*": "&(0,1)_&(0,2)", "*": "&" } } ]
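Note that you have to add an entry for each "level" of name matching. The spec above covers names with up to two dashes, so if your names can contain more, you'd need an extra line at the top for each additional dash, following the same pattern (for example "*-*-*-*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)").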
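Since the entries follow a fixed pattern, they can be generated rather than typed by hand. The following is a minimal Python sketch, assuming you only need to produce the spec text and paste it into the processor; `build_dash_spec` and `max_dashes` are hypothetical names introduced for illustration and are not part of Jolt or NiFi.

```python
import json


def build_dash_spec(max_dashes: int) -> list:
    """Build a Jolt shift spec that renames keys containing up to max_dashes dashes."""
    spec = {}
    # Emit the pattern with the most dashes first, mirroring the hand-written spec.
    for dashes in range(max_dashes, 0, -1):
        parts = dashes + 1
        pattern = "-".join(["*"] * parts)
        target = "_".join(f"&(0,{i})" for i in range(1, parts + 1))
        spec[pattern] = target
    # Keys without any dash are copied over unchanged.
    spec["*"] = "&"
    return [{"operation": "shift", "spec": spec}]


if __name__ == "__main__":
    # Print a spec that covers names with up to three dashes.
    print(json.dumps(build_dash_spec(3), indent=2))
```

Calling `build_dash_spec(2)` reproduces the three entries of the spec above; a larger argument adds one entry per extra dash.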
Created 03-13-2018 01:24 PM
Maybe, one more thing:
-I would prefer not to hardcode anything; since the JSON has many keys, I am specifically looking for a dynamic way
I came across JOLT functions like =toLower and =join and was hoping for a =replace, but that one seems to be missing.
Created 03-13-2018 07:32 PM
@Matt Burgess I don't get the syntax, but it works like a charm! Thank you
Created 03-13-2018 07:55 PM
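JOLT can be fairly complicated. In this case, each * on the left is a non-greedy wildcard match, and each &(x,y) on the right grabs one of the matched pieces: x is how many levels up the spec to look (0 is the current key) and y is which * to take the match from.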
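To make the matching concrete, here is a rough Python emulation of how the three patterns behave on a single key. It is only an illustration of the idea (non-greedy wildcards plus numbered references), not Jolt's actual implementation; `SPEC` and `rename_key` are names made up for this sketch.

```python
import re

# The same left/right pairs as in the spec, tried from most to least specific.
SPEC = {
    "*-*-*": "&(0,1)_&(0,2)_&(0,3)",
    "*-*": "&(0,1)_&(0,2)",
    "*": "&",
}


def rename_key(key: str, spec: dict = SPEC) -> str:
    """Return the output name that the first matching pattern would produce."""
    for pattern, target in spec.items():
        # Each * becomes a non-greedy capture group; everything else is literal.
        regex = "^" + "(.+?)".join(re.escape(p) for p in pattern.split("*")) + "$"
        match = re.match(regex, key)
        if match is None:
            continue

        def substitute(ref):
            # &(0,n) is what the n-th * matched; a bare & is the whole key.
            return match.group(int(ref.group(1))) if ref.group(1) else key

        return re.sub(r"&(?:\(0,(\d+)\))?", substitute, target)
    return key


if __name__ == "__main__":
    for name in ("agent-submit-time", "agent-name", "firstname"):
        print(name, "->", rename_key(name))
```

In this emulation a key with more dashes than the longest pattern, such as `a-b-c-d`, only has its first two dashes replaced, because the last wildcard absorbs the remainder. That is the reason for adding one pattern per extra dash.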
Created 03-28-2019 02:01 PM
@Matt Burgess It works fine if there is just one object in the input tree, but if there are more, it merges the values into arrays rather than keeping separate records. Like
{ "agent_submit_time" : [ -1, -1 ], "agent_end_time" : [ 123445, 123445 ], "agent_name" : [ "Marie Bayer-Smith", "Marie Bayer-Smith" ] }
I would like it to be something like
[
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
},
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
}
]
How can I do that? I tried, but couldn't get it to work: replacing "*": "&" with "@": "[&]" keeps the records separate, but then the transformation of - to _ doesn't take place.
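One commonly used Jolt pattern for this situation is to match each array index with an outer "*" and prefix every output path with "[&1].", so that each record is written to its own array slot. The Python sketch below only generates such a spec; it assumes the input is a top-level JSON array of flat records, it has not been run against NiFi here, and `build_array_spec` is a hypothetical helper name.

```python
import json


def build_array_spec(max_dashes: int = 2) -> list:
    """Build a shift spec for a top-level array of records with dashed keys."""
    inner = {}
    for dashes in range(max_dashes, 0, -1):
        parts = dashes + 1
        pattern = "-".join(["*"] * parts)
        renamed = "_".join(f"&(0,{i})" for i in range(1, parts + 1))
        # [&1] reuses the array index matched by the outer "*", one level up.
        inner[pattern] = "[&1]." + renamed
    # Keys without dashes keep their name but still go to the record's own slot.
    inner["*"] = "[&1].&"
    # The outer "*" matches each array index (0, 1, ...).
    return [{"operation": "shift", "spec": {"*": inner}}]


if __name__ == "__main__":
    print(json.dumps(build_array_spec(), indent=2))
```

With the default argument this prints a spec whose inner entries are "*-*-*": "[&1].&(0,1)_&(0,2)_&(0,3)", "*-*": "[&1].&(0,1)_&(0,2)" and "*": "[&1].&".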
Created on 03-13-2018 05:57 PM - edited 08-18-2019 02:32 AM
Hi Jasper
You can use the JoltTransformJSON processor to get this done.
Here is the complete Jolt specification:
[ { "operation": "shift", "spec": { "agent-submit-time": "agent_submit_time", "agent-end-time": "agent_end_time", "agent-name": "agent_name", "*": { "@": "&" } } } ]
Hope that helps.
Created 03-13-2018 06:13 PM
@Rahul Soni Thanks for your answer but I really don't want to hardcode all of my 50+ JSON fields
Created 08-14-2018 06:24 PM
How can I do this for a more complex example like the one below? Please help.
{
"firstname" : "John",
"address" : {
"Convert-this" : [ "not-this", "also-not-this " ],
"Again-convert-this" : [ "but-not-this", "also-not-this", "no-for-this-too" ]
}
}
Created 08-14-2018 06:42 PM
This one is only more complex because you want to convert the field names at the second level, not the first. So you want to match "address" first, then use the above spec for each field in there, and also transfer any fields at the top level over as-is (namely "firstname"). The spec (which is specific to this example) is:
[ { "operation": "shift", "spec": { "address": { "*-*-*": "&(0,1)_&(0,2)_&(0,3)", "*-*": "&(0,1)_&(0,2)", "*": "&" }, "*": "&" } } ]
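To make the intended result concrete (field names converted at any depth, string values left alone), here is a plain-Python sketch of the same key-only rename. It is not Jolt and is not something the JoltTransformJSON processor runs; `rename_keys` is a name chosen for illustration. Note that this sketch keeps the renamed fields nested under "address"; whether a given Jolt spec does the same depends on the output paths on its right-hand side.

```python
import json


def rename_keys(node):
    """Replace dashes with underscores in every object key, leaving values untouched."""
    if isinstance(node, dict):
        return {key.replace("-", "_"): rename_keys(value) for key, value in node.items()}
    if isinstance(node, list):
        # Recurse into list items so nested objects are handled; plain strings pass through.
        return [rename_keys(item) for item in node]
    return node


if __name__ == "__main__":
    record = {
        "firstname": "John",
        "address": {
            "Convert-this": ["not-this", "also-not-this "],
            "Again-convert-this": ["but-not-this", "also-not-this", "no-for-this-too"],
        },
    }
    print(json.dumps(rename_keys(record), indent=2))
```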