Dynamically renaming JSON field names in NiFi

Expert Contributor

I have JSON content streaming in, and I would like to rename the field names. Specifically, I need to remove "." because I am inserting into MongoDB, which does not support periods in key names.

Requirements:

  • Only field names are modified, not content. So {"host.ip":"192.168.1.1"} -> {"host_ip":"192.168.1.1"}
  • Field names are unknown, so I can't replace them explicitly.
  • I would prefer not to split the JSON array into individual flow files.

Has anyone tackled an issue like this before?


Master Guru

If you knew the schema of the incoming content, I believe you can use schema aliases in conjunction with ConvertRecord, but since you mention field names are unknown, I'm guessing you won't know the schema(s) either 🙂

You can do this with the JoltTransformJSON processor, although I don't think it supports arbitrary numbers of periods as its matching operator is as non-greedy as possible. Here is a spec that works for 1-3 periods:

[
  {
    "operation": "shift",
    "spec": {
      "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
      "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
      "*.*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]

Note you could continue this pattern for any discrete number of periods. Also note that the above spec works for "flat" JSON files. For nested fields you'd have to go "one level deeper" and apply the same pattern. Here's a spec that works for 1-3 periods, 1-2 fields deep:

[
  {
    "operation": "shift",
    "spec": {
      "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
      "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
      "*.*": "&(0,1)_&(0,2)",
      "*": {
        "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
        "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
        "*.*": "&1.&(0,1)_&(0,2)",
        "*": "&"
      }
    }
  }
]
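
Continuing the flat pattern by hand gets tedious, so here is a rough sketch of generating the first (flat) spec for any maximum number of periods. It is written in Python purely for illustration, and build_flat_spec is a made-up helper name, not part of NiFi or Jolt. It only reproduces the entries as posted; it does not check how JoltTransformJSON behaves with them.

```python
import json


def build_flat_spec(max_periods):
    """Build the flat shift spec shown in this reply for 1..max_periods periods.

    Hypothetical helper: it only mirrors the posted pattern and makes no
    claim about how Jolt evaluates the result.
    """
    spec = {}
    # Longest pattern first, matching the order used in the posted spec.
    for periods in range(max_periods, 0, -1):
        parts = periods + 1  # N periods split a key into N + 1 pieces
        key = ".".join(["*"] * parts)
        value = "_".join(f"&(0,{i})" for i in range(1, parts + 1))
        spec[key] = value
    spec["*"] = "&"  # keys without periods pass through unchanged
    return [{"operation": "shift", "spec": spec}]


if __name__ == "__main__":
    # Paste the printed JSON into the processor's Jolt Specification property.
    print(json.dumps(build_flat_spec(3), indent=2))
```

With max_periods set to 3, this yields the same four entries as the flat spec earlier in this reply.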

If the incoming JSON is truly "wild west", you could use Groovy in an ExecuteScript processor along with a JsonSlurper (and JsonOutput) to change the keys at arbitrary depths with arbitrary numbers of periods.
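
The thread does not include such a script, so the following is only a sketch of the recursive idea, written in Python for illustration rather than as a ready-made Groovy ExecuteScript body. The function name rename_keys and the sample input are assumptions; reading and writing the flow file content is left out entirely.

```python
import json


def rename_keys(node, old=".", new="_"):
    """Return a copy of node with `old` replaced by `new` in every object key.

    Values are never modified, only keys. Caveat: two keys that become
    identical after renaming (e.g. "a.b" and "a_b") collapse into one entry.
    """
    if isinstance(node, dict):
        return {
            key.replace(old, new): rename_keys(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        # Arrays (including a top-level array) are walked element by element,
        # so nothing needs to be split into separate flow files first.
        return [rename_keys(item, old, new) for item in node]
    return node  # strings, numbers, booleans and null pass through untouched


if __name__ == "__main__":
    sample = '[{"host.ip": "192.168.1.1", "net": {"src.geo.city": "Oslo"}}]'
    print(json.dumps(rename_keys(json.loads(sample))))
```

Because the walk handles any depth and any number of periods per key, it avoids the fixed-pattern limits of the Jolt specs. The same structure should carry over to Groovy with JsonSlurper for parsing and JsonOutput for serializing.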

Expert Contributor

Thank you @Matt Burgess. I will give this a shot. My alternate strategy is to infer an Avro schema, then use @Timothy Spann's attribute cleaner, and then convert back to JSON: https://github.com/tspannhw/nifi-attributecleaner-processor

Expert Contributor

Unfortunately, the Jolt spec isn't working as intended. It is replacing the field names with their values. On to Plan B...

Master Guru

Sorry about that, I had a copy-paste error. Each of those "*...." entries should be followed by a ".&", so:

"*.*.*.*":"&(0,1)_&(0,2)_&(0,3)_&(0,4).&"

Master Guru

No wait, that won't work. Hmm, it worked in the Jolt preview. Do you have an array of objects at the top level, vs the single object you have in your example?

Contributor

Hello @mburgess ,

I have a similar problem, and I am new to NiFi.

In my case, I do know the structure of my JSON flowfile.

How can I use schema aliases in conjunction with ConvertRecord for my problem?

A screenshot of my JSON structure is attached.

Thank you so much.

Screenshot 2022-02-02 123236.png

Community Manager

@rafy, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.



Regards,

Vidya Sargur,
Community Manager

