Created 07-02-2018 09:28 PM
Hello,
I'm trying to convert JSON to Avro in NiFi. The problem is that some of the field names contain ":", which is not allowed in Avro. (The JSON was translated from XML that had namespaces.)
EX:
{
"sourceTimeStamp": "2018-06-21T17:35:48Z",
"msgType": "blah",
"Ref": 71768908,
"fdm:Information": { ... }
}
Would it be easier to just go from XML (with namespaces) to Avro?
Any suggestions? Thanks.
Created 07-02-2018 09:54 PM
If each field name has only one colon, you could use the following transform in JoltTransformJSON to change the names to be valid in Avro:
[
  {
    "operation": "shift",
    "spec": {
      "*:*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]
If you are translating from XML using a stylesheet, perhaps you could do a similar thing with the XSLT. Also, as of NiFi 1.7.0 there is an XML record reader, so you could use UpdateRecord to copy the values from the namespaced fields into fields with Avro-valid names.
Created 07-03-2018 01:24 AM
Matt,
This works great, but is there a way to apply the same shift operation recursively to nested fields with colons as well?
EX:
{
"sourceTimeStamp": "2018-06-21T17:35:48Z",
"msgType": "blah",
"Ref": 71768908,
"fdm:Information": {
"fdm:first":{
"fdm:last": etc
}
}
}
Thanks for the XML tips for 1.7.
Created 07-10-2018 02:02 AM
You can't do it recursively, but you can do it manually if you know the maximum depth of the nested fields. You just need a similar spec to match each nested field at the level where it occurs. So instead of "*": "&", you'd match the nested fields underneath with a sub-spec that is pretty much identical to the top-level one.
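As a rough sketch of that idea for the three-level example above (fdm:Information, fdm:first, fdm:last), a nested spec might look like the following. It is untested against the poster's data and makes two assumptions: the maximum depth is three, and colon-named fields at the first two levels hold objects rather than scalar values (a scalar there is not expected to be copied through by this spec). Each "&(N,M)" refers to the M-th wildcard match of the key N levels up, so the renamed parent names are rebuilt in the output path.

```json
[
  {
    "operation": "shift",
    "spec": {
      "*:*": {
        "*:*": {
          "*:*": "&(2,1)_&(2,2).&(1,1)_&(1,2).&(0,1)_&(0,2)",
          "*": "&(2,1)_&(2,2).&(1,1)_&(1,2).&"
        },
        "*": "&(1,1)_&(1,2).&"
      },
      "*": "&"
    }
  }
]
```

With this spec, a field such as fdm:Information / fdm:first / fdm:last should come out as fdm_Information / fdm_first / fdm_last, while fields without colons keep their names at each level. Anything nested deeper than the third level is copied as-is, so each extra level of depth needs one more nested sub-spec of the same shape.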