Dynamically renaming JSON field names in NiFi

I have JSON content streaming in, and I would like to rename the field names. Specifically, I need to remove the "." characters, because I am inserting into MongoDB, which does not support periods in key names.

Requirements:

  • Only field names are modified, not content. So {"host.ip":"192.168.1.1"} -> {"host_ip":"192.168.1.1"}
  • Field names are unknown, so I can't replace them explicitly.
  • I would prefer not to split the JSON array into individual flow files.

Has anyone tackled an issue like this before?

5 REPLIES

Re: Dynamically renaming JSON field names in NiFi

If you knew the schema of the incoming content, I believe you can use schema aliases in conjunction with ConvertRecord, but since you mention field names are unknown, I'm guessing you won't know the schema(s) either :)

You can do this with the JoltTransformJSON processor, although I don't think it supports arbitrary numbers of periods as its matching operator is as non-greedy as possible. Here is a spec that works for 1-3 periods:

[
  {
    "operation": "shift",
    "spec": {
      "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
      "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
      "*.*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]

Note you could continue this pattern for any discrete number of periods. Also note that the above spec only works for "flat" JSON files. For nested fields you'd have to go "one level deeper" and apply the same pattern. Here's a spec that works for 1-3 periods, 1-2 fields deep:

[
  {
    "operation": "shift",
    "spec": {
      "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
      "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
      "*.*": "&(0,1)_&(0,2)",
      "*": {
        "*.*.*.*": "&1.&(0,1)_&(0,2)_&(0,3)_&(0,4)",
        "*.*.*": "&1.&(0,1)_&(0,2)_&(0,3)",
        "*.*": "&1.&(0,1)_&(0,2)",
        "*": "&1.&"
      }
    }
  }
]
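
Since the pattern can be continued for any discrete number of periods, the flat spec could be generated rather than typed by hand. The following is only a sketch in Python: the function name `build_flat_shift_spec` and its `max_periods` parameter are made up for illustration, and it simply reproduces the key/value pattern of the flat spec above without checking how Jolt evaluates it.

```python
import json


def build_flat_shift_spec(max_periods):
    """Build a flat Jolt shift spec covering keys with 1..max_periods periods.

    Hypothetical helper: it only mirrors the hand-written pattern,
    e.g. "*.*" -> "&(0,1)_&(0,2)".
    """
    spec = {}
    # Emit the longest pattern first, matching the order of the hand-written spec.
    for periods in range(max_periods, 0, -1):
        segments = periods + 1
        key = ".".join(["*"] * segments)
        value = "_".join(f"&(0,{i})" for i in range(1, segments + 1))
        spec[key] = value
    # Keys without any period are passed through unchanged.
    spec["*"] = "&"
    return [{"operation": "shift", "spec": spec}]


if __name__ == "__main__":
    # Paste the printed JSON into the JoltTransformJSON spec property.
    print(json.dumps(build_flat_shift_spec(3), indent=2))
```

Calling it with 3 yields the same four entries as the flat spec above; a larger argument extends the pattern to more periods.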

If the incoming JSON is truly "wild west", you could use Groovy in an ExecuteScript processor along with a JsonSlurper (and JsonOutput) to change the keys at arbitrary depths with arbitrary numbers of periods.
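
The reply suggests Groovy with JsonSlurper and JsonOutput. As a rough illustration of the same recursive idea, here is a standalone Python sketch instead (not Groovy, and not wired into ExecuteScript). The function name `rename_keys` and the sample record are assumptions for illustration; reading from and writing to the flow file is not shown.

```python
import json


def rename_keys(node, old=".", new="_"):
    """Return a copy of parsed JSON with `old` replaced by `new` in every key.

    Values are left untouched; only dictionary keys are rewritten,
    at any depth and with any number of periods.
    """
    if isinstance(node, dict):
        return {
            key.replace(old, new): rename_keys(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        # Arrays (including a top-level array) are walked element by element.
        return [rename_keys(item, old, new) for item in node]
    # Scalars (strings, numbers, booleans, null) are returned as-is.
    return node


if __name__ == "__main__":
    # Made-up sample input: a top-level array with one nested object.
    records = json.loads('[{"host.ip": "192.168.1.1", "net": {"src.port": 80}}]')
    print(json.dumps(rename_keys(records)))
```

Because the whole array is processed in one pass, nothing needs to be split into individual flow files. One caveat of this approach: if an object already contains both "a.b" and "a_b", the renamed keys collide and one value is lost.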

Re: Dynamically renaming JSON field names in NiFi

Thank you @Matt Burgess. I will give this a shot. My alternate strategy is to infer an Avro schema, then use @Timothy Spann's attribute cleaner, and then convert back to JSON. https://github.com/tspannhw/nifi-attributecleaner-processor

Re: Dynamically renaming JSON field names in NiFi

Unfortunately the Jolt spec isn't working as intended. It is replacing the field names with their values. On to Plan B...

Re: Dynamically renaming JSON field names in NiFi

Sorry about that, I had a copy-paste error. Each of those "*...." entries should be followed by a ".&", so

"*.*.*.*":"&(0,1)_&(0,2)_&(0,3)_&(0,4).&"

Re: Dynamically renaming JSON field names in NiFi

No wait, that won't work. Hmm, it worked in the Jolt preview. Do you have an array of objects at the top level, vs the single object you have in your example?