Converting JSON to Avro (field names have special chars ":") in NiFi
Labels: Apache NiFi
Created ‎07-02-2018 09:28 PM
Hello,
I'm trying to convert JSON to Avro in NiFi. The problem is that some of the field names contain ":", which is not allowed in Avro. (The JSON was translated from XML that had namespaces.)
EX:
{
"sourceTimeStamp": "2018-06-21T17:35:48Z",
"msgType": "blah",
"Ref": 71768908,
"fdm:Information": { ... }
}
Would it be easier to just go from XML (with namespaces) to Avro?
Any suggestions? Thanks.
Created ‎07-02-2018 09:54 PM
If they only have one colon, you could use the following transform in JoltTransformJSON to change the names to be valid in Avro:
[ { "operation": "shift", "spec": { "*:*": "&(0,1)_&(0,2)", "*": "&" } } ]
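To illustrate what that spec is meant to do: "*:*" matches any key containing a colon, "&(0,1)" and "&(0,2)" refer to the text matched by the first and second "*" in that key, and the plain "*" rule passes every other field through unchanged. Applied to the sample from the question, the expected result would look roughly like this (key order may differ, and the elided value is left as posted):

```json
{
  "sourceTimeStamp": "2018-06-21T17:35:48Z",
  "msgType": "blah",
  "Ref": 71768908,
  "fdm_Information": { ... }
}
```

Note that only the top-level key is renamed here; anything inside "fdm:Information" is copied as-is.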
If you are translating from XML using a stylesheet, perhaps you could do a similar rename in the XSLT. Also, as of NiFi 1.7.0 there is an XML record reader, so you could use UpdateRecord to copy the values from the namespaced fields into fields with Avro-valid names.
Created ‎07-03-2018 01:24 AM
Matt,
This works great, but is there a way to apply the same shift operation recursively to nested fields with colons as well?
EX:
{
"sourceTimeStamp": "2018-06-21T17:35:48Z",
"msgType": "blah",
"Ref": 71768908,
"fdm:Information": {
"fdm:first":{
"fdm:last": etc
}
}
}
Thanks for the XML tips for 1.7.
Created ‎07-10-2018 02:02 AM
You can't do it recursively, but you can do it manually if you know the maximum depth of the nested fields. You just need a similar spec for each nested field, at the level where it matches. So instead of mapping "*:*" straight to an output name, you'd give it a sub-spec that matches the nested fields underneath, pretty much identical to the top-level one.
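As a rough sketch of that idea, here is what a spec for the three-level example above might look like. It assumes that namespaced fields at the first two levels always hold objects (a namespaced field holding a plain value at those levels would need its own handling), that no arrays are involved, and that each key has exactly one colon. In the output paths, "&(1,1)" and "&(2,1)" refer to the wildcard matches one and two levels up, so each renamed child is written under its renamed parent. Treat it as untested and try it against sample data first.

```json
[
  {
    "operation": "shift",
    "spec": {
      "*:*": {
        "*:*": {
          "*:*": "&(2,1)_&(2,2).&(1,1)_&(1,2).&(0,1)_&(0,2)",
          "*": "&(2,1)_&(2,2).&(1,1)_&(1,2).&"
        },
        "*": "&(1,1)_&(1,2).&"
      },
      "*": "&"
    }
  }
]
```

The intent is that "fdm:Information" / "fdm:first" / "fdm:last" ends up as "fdm_Information" / "fdm_first" / "fdm_last", with non-namespaced fields kept under their renamed parents. Each extra level of nesting needs another copy of the inner block with the level numbers in the "&(N,M)" references increased by one.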
