How to do JOLT replace on all JSON keys in Nifi

Super Collaborator

Hi,

I have a requirement for transforming JSON in NiFi that seems simple, but I haven't been able to solve it:

Input json:

{
	"agent-submit-time" : -1,
	"agent-end-time" : 123445,
	"agent-name" : "Marie Bayer-Smith"
}

Desired:

{
	"agent_submit_time" : -1,
	"agent_end_time" : 123445,
	"agent_name" : "Marie Bayer-Smith"
}

- I don't want to use the ReplaceText processor, since replacing "-" with "_" might affect values too.

- I need this to be able to infer an Avro schema from incoming JSON records (Avro does not allow dashes in field names).

- Since I already use a Jolt processor for another transformation on the JSON, it makes sense to do this in the same processor and avoid unnecessary NiFi overhead.

I think I need the JoltTransformJSON processor for this, as it is very powerful (although the syntax evades me), but I am open to other options too.

1 ACCEPTED SOLUTION

Master Guru

You can use the following JOLT spec in the JoltTransformJSON processor:

[
  {
    "operation": "shift",
    "spec": {
      "*-*-*": "&(0,1)_&(0,2)_&(0,3)",
      "*-*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]

Note that you have to add an entry for each "level" of name matching, so if your keys could contain more dashes than the patterns above cover (for example, 4 dashes), you'd need an extra line at the top for that case, following the same pattern.

10 REPLIES

Super Collaborator

@Shu

Maybe, one more thing:

- I would prefer not to do any hardcoding. Since the JSON has many keys, I am specifically looking for a dynamic way.

I came across JOLT functions like =toLower / =join and was kind of hoping for =replace, but that one seems to be missing.

Super Collaborator

@Matt Burgess I don't get the syntax, but it works like a charm! Thank you

Master Guru

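JOLT can be fairly complicated. In this case, each * on the left is a non-greedy match, and the &(x,y) references on the right grab the individual matched pieces: &(0,1) is the text matched by the first * in the current key, &(0,2) the text matched by the second, and so on.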
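To make the matching easier to follow, here is a minimal Python sketch that emulates the three spec lines with regular expressions. It is not Jolt and makes no claim about Jolt's internals: each * is simply modelled as a non-greedy regex group and each &(0,n) as group n. The names RULES, rename_key and shift are hypothetical, chosen only for this illustration.

```python
import re

# Hypothetical emulation of the three spec lines; this is NOT Jolt itself.
# Assumption: each "*" behaves like a non-greedy regex group, and "&(0,n)"
# refers to the n-th group matched in the current key.
RULES = [
    (re.compile(r"^(.+?)-(.+?)-(.+?)$"), r"\g<1>_\g<2>_\g<3>"),  # "*-*-*"
    (re.compile(r"^(.+?)-(.+?)$"), r"\g<1>_\g<2>"),              # "*-*"
    (re.compile(r"^(.+)$"), r"\g<1>"),                           # "*": "&"
]


def rename_key(key):
    """Return the output key produced by the first rule that matches."""
    for pattern, template in RULES:
        match = pattern.match(key)
        if match:
            return match.expand(template)
    return key  # nothing matched (e.g. an empty key): keep it unchanged


def shift(record):
    """Rename every top-level key; values are copied untouched."""
    return {rename_key(key): value for key, value in record.items()}


if __name__ == "__main__":
    sample = {
        "agent-submit-time": -1,
        "agent-end-time": 123445,
        "agent-name": "Marie Bayer-Smith",
    }
    print(shift(sample))
```

Only the keys are rewritten, so the dash in the value "Marie Bayer-Smith" is preserved. In this emulation a key with three dashes, such as a-b-c-d, comes out as a_b_c-d, which is the kind of case that needs an extra, longer pattern at the top.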

New Contributor

@Matt Burgess It works fine if there is just one object in the input tree, but if there are more, it merges them into arrays rather than keeping separate records, like this:

{
  "agent_submit_time" : [ -1, -1 ],
  "agent_end_time" : [ 123445, 123445 ],
  "agent_name" : [ "Marie Bayer-Smith", "Marie Bayer-Smith" ]
}

I would like it to be something like this:


[
{
  "agent_submit_time" : -1,
  "agent_end_time" : 123445,
  "agent_name" : "Marie Bayer-Smith"
},
{
  "agent_submit_time" : -1,
  "agent_end_time" : 123445,
  "agent_name" : "Marie Bayer-Smith"
}
]

How can I do that? I tried, but couldn't get it to work: replacing "*": "&" with "@": "[&]" separates the records, but then the transformation of - to _ doesn't take place.


Hi Jasper

You can use the JoltTransformJSON processor to get it done. Here is how your processor configuration should look:

64596-jolt-specification.png

The complete Jolt specification follows:

[
  {
    "operation": "shift",
    "spec": {
      "agent-submit-time": "agent_submit_time",
      "agent-end-time": "agent_end_time",
      "agent-name": "agent_name",
      "*": {
        "@": "&"
      }
    }
  }
]

Here is a snippet of the output I got using your input:

64597-screen-shot-2018-03-13-at-15315-pm.png

Hope that helps.

Super Collaborator

@Rahul Soni Thanks for your answer, but I really don't want to hardcode all of my 50+ JSON fields.

New Contributor

@Matt Burgess

How can I do this for a more complex example like the one below? Please help.


{
  "firstname" : "John",
  "address" : {
    "Convert-this": [ "not-this", "also-not-this " ],
    "Again-convert-this": [ "but-not-this", "also-not-this", "no-for-this-too" ]
  }
}

Master Guru

This one is only more complex because you want to convert the field names at the second level rather than the first. So you want to match "address" first, then use the above spec for each field in there, and also transfer any fields at the top level over as-is (namely "firstname"). The spec, which is specific to this example, is:

[
  {
    "operation": "shift",
    "spec": {
      "address": {
        "*-*-*": "&(0,1)_&(0,2)_&(0,3)",
        "*-*": "&(0,1)_&(0,2)",
        "*": "&"
      },
      "*": "&"
    }
  }
]
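
For comparison, the behaviour asked for across this thread (any number of dashes, nested objects, arrays of records, values left alone) can be written down as a short recursive function outside Jolt. This is only a sketch of the intended result in plain Python, not a Jolt spec or a NiFi processor configuration. The helper name rename_keys is hypothetical, and how such a function would be wired into a flow is not covered here.

```python
def rename_keys(node, old="-", new="_"):
    """Recursively replace `old` with `new` in dict keys only.

    Hypothetical helper for illustration; it is not part of Jolt or NiFi.
    """
    if isinstance(node, dict):
        # Rename each key, then descend into the value.
        return {
            key.replace(old, new): rename_keys(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        # Keep arrays as arrays, so separate records stay separate.
        return [rename_keys(item, old, new) for item in node]
    # Scalars, including strings that contain dashes, are returned as-is.
    return node


if __name__ == "__main__":
    nested = {
        "firstname": "John",
        "address": {
            "Convert-this": ["not-this", "also-not-this "],
            "Again-convert-this": ["but-not-this", "also-not-this", "no-for-this-too"],
        },
    }
    print(rename_keys(nested))
```

Unlike the pattern-per-dash-count approach, this sketch renames keys at every depth and keeps the original nesting, so the renamed fields stay under "address".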