Created 02-28-2018 09:34 PM
I have JSON content streaming in, and I would like to rename the fields. Specifically, I need to remove "." from the field names, because I am inserting into MongoDB, which does not support periods in key names.
Requirements:
Has anyone tackled an issue like this before?
Created 02-28-2018 09:58 PM
If you knew the schema of the incoming content, I believe you can use schema aliases in conjunction with ConvertRecord, but since you mention field names are unknown, I'm guessing you won't know the schema(s) either 🙂
You can do this with the JoltTransformJSON processor, although I don't think it supports an arbitrary number of periods, because its wildcard matching is as non-greedy as possible. Here is a spec that works for 1-3 periods:
[
  {
    "operation": "shift",
    "spec": {
      "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
      "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
      "*.*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]
Note that you could continue this pattern for any discrete number of periods. Also note that the above spec works for "flat" JSON files. For nested fields you'd have to go "one level deeper" and apply the same pattern. Here's a spec that works for 1-3 periods, 1-2 fields deep:
[
  {
    "operation": "shift",
    "spec": {
      "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
      "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
      "*.*": "&(0,1)_&(0,2)",
      "*": {
        "*.*.*.*": "&(0,1)_&(0,2)_&(0,3)_&(0,4)",
        "*.*.*": "&(0,1)_&(0,2)_&(0,3)",
        "*.*": "&1.&(0,1)_&(0,2)",
        "*": "&"
      }
    }
  }
]
If the incoming JSON is truly "wild west", you could use Groovy in an ExecuteScript processor along with a JsonSlurper (and JsonOutput) to change the keys at arbitrary depths with arbitrary numbers of periods.
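For anyone who wants to see the shape of that "arbitrary depth" approach, here is a minimal sketch of the recursive rename. It is written in plain Python rather than Groovy, and it deliberately leaves out all NiFi wiring (reading and writing the flowfile inside ExecuteScript), so treat it as an illustration of the logic only. The function name rename_keys and the choice of "_" as the replacement character are assumptions made for this example, not something from the posts above.

```python
import json


def rename_keys(node, old=".", new="_"):
    """Return a copy of `node` with `old` replaced by `new` in every object key.

    Hypothetical helper for illustration: walks dicts and lists recursively,
    so it handles any nesting depth and any number of periods per key.
    Values (including strings that contain periods) are left untouched.
    """
    if isinstance(node, dict):
        return {
            key.replace(old, new): rename_keys(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rename_keys(item, old, new) for item in node]
    return node  # scalars: str, int, float, bool, None


if __name__ == "__main__":
    # Made-up sample input, not taken from the thread.
    raw = '{"a.b": 1, "c": {"d.e.f": [{"g.h": 2}]}}'
    cleaned = rename_keys(json.loads(raw))
    print(json.dumps(cleaned))
```

One caveat with any blanket rename like this: if an object already contains both "a.b" and "a_b", the two keys collapse into one and a value is silently lost, so it may be worth checking for collisions before inserting into MongoDB.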
Created 02-28-2018 10:51 PM
Thank you @Matt Burgess. I will give this a shot. My alternate strategy is to infer an Avro schema, then use @Timothy Spann's attribute cleaner, and then convert back to JSON. https://github.com/tspannhw/nifi-attributecleaner-processor
Created 02-28-2018 11:02 PM
Unfortunately, the Jolt spec isn't working as intended. It is replacing the field names with their values. On to Plan B...
Created 02-28-2018 11:58 PM
Sorry about that, I had a copy-paste error. Each of those "*...." entries should be followed by a ".&", so:
"*.*.*.*":"&(0,1)_&(0,2)_&(0,3)_&(0,4).&"
Created 03-01-2018 12:00 AM
No wait, that won't work. Hmm, it worked in the Jolt preview. Do you have an array of objects at the top level, vs the single object you have in your example?
Created 02-02-2022 03:33 AM
Hello @mburgess ,
I have a similar problem here, and I am a newbie in NiFi.
Actually, I know the structure of my JSON flowfile.
Please, how can I use schema aliases in conjunction with ConvertRecord for my problem?
A screenshot of my JSON structure is attached here.
Thank you so much.
Created 02-02-2022 10:59 PM
@rafy, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
Regards,
Vidya Sargur,