Created 04-18-2019 06:17 PM
I have the following example JSON in a flowfile (data has been randomly imputed for this example) :
{ "TrackingRequestFirstIp":"192.10.01.01", "TrackingRequestLastIp":"100.120.123.456", "TrackingRequestCreationTime":"05:49:55", "TrackingRequestCreationDate":"20171013", "TrackingRequestTrackingNumber":"12349876" }
Here is my AVRO schema:
{ "name": "IpReports", "type": "record", "namespace": "com.acme.avro", "fields": [ {"name": "DataSourceInformation", "type":{ "type":"map", "values": { "name":"DataSourceInformation", "type":"record", "fields": [ {"name":"DataSourceUuid", "type":["null","string"], "default":null}, {"name":"DataOwnerUuid", "type":["null","string"], "default":null}, {"name":"RecordUuid", "type":["null","string"], "default":null] }} }, {"name": "TrackingRequestInformation", "type": { "type":"map", "values": { "name":"TrackingRequestInformation", "type":"record", "fields": [ {"name": "TrackingRequestCreationTime", "type":["null", "string"], "default":null, "aliases":["time_cst"]}, {"name": "TrackingRequestCreationDate", "type":["null", "string"], "default":null, "aliases":["date_cst"]}, {"name": "TrackingRequestTrackingNumber", "type":["null", "string"], "default":null, "aliases":["fedex_airbill"]}, {"name": "TrackingRequestFirstIp", "type":["null", "string"], "default":null, "aliases":["source_ip"]}, {"name": "TrackingRequestLastIp", "type":["null", "string"], "default":null, "aliases":["dest_ip"]}, {"name": "TrackingRequestFirstCityName", "type":["null", "string"], "default":null, "aliases":["source_ip_city"]}, {"name": "TrackingRequestFirstIpLat", "type":["null", "double"], "default":null, "aliases":["source_ip_lat"]}, {"name": "TrackingRequestFirstIpLong", "type":["null", "double"], "default":null, "aliases":["source_ip_long"]}, {"name": "TrackingRequestFirstIpCountryName", "type":["null", "string"], "default":null, "aliases":["source_ip_country"]}, {"name": "TrackingRequestFirstIpCountryCode", "type":["null", "string"], "default":null, "aliases":["source_ip_country_iso"]}, {"name": "TrackingRequestFirstIpPostalCode", "type":["null", "string"], "default":null, "aliases":["source_ip_country_postal"]}, {"name": "TrackingRequestFirstIpGeoLocation", "type":["null", "string"], "default":null, "aliases":["source_location"]}, {"name": "TrackingRequestLastCityName", "type":["null", "string"], "default":null, "aliases":["dest_ip_city"]}, {"name": "TrackingRequestLastIpLat", "type":["null", "double"], "default":null, "aliases":["dest_ip_lat"]}, {"name": "TrackingRequestLastIpLong", "type":["null", "double"], "default":null, "aliases":["dest_ip_long"]}, {"name": "TrackingRequestLastIpCountryName", "type":["null", "string"], "default":null, "aliases":["dest_ip_country"]}, {"name": "TrackingRequestLastIpCountryCode", "type":["null", "string"], "default":null, "aliases":["dest_ip_country_iso"]}, {"name": "TrackingRequestLastIpPostalCode", "type":["null", "string"], "default":null, "aliases":["dest_ip_country_postal"]}, {"name": "TrackingRequestLastIpGeoLocation", "type":["null", "string"], "default":null, "aliases":["dest_location"]}] }} }] }
Further on in my nifi process group, I will be enriching the original JSON and adding the missing fields.
I would like to convert the original JSON above into the JSON below using my AVRO schema:
{ "DataSourceInformation": { "DataSourceUuid":null, "DataOwnerUuid":null, "RecordUuid":null }, "TrackingRequestInformation": { "TrackingRequestFirstIp":"192.10.01.01", "TrackingRequestLastIp":"100.120.123.456", "TrackingRequestCreationTime":"05:49:55", "TrackingRequestCreationDate":"20171013", "TrackingRequestTrackingNumber":"12349876", "TrackingRequestFirstIpLat":null, "TrackingRequestFirstIpLong":null, ... ... ... ... ... "TrackingRequestLastIpGeoLocation":null } }
However, when I use a convert record processor, I get the following output:
[ { "DataSourceInformation" : null, "TrackingRequestInformation" : null } ]
Any thoughts on what I am doing wrong or suggestions on how to correct it?
Created 04-18-2019 07:15 PM
Try JoltTransformJSON with the following spec:
[ { "operation": "shift", "spec": { "*": "TrackingRequestInformation.&" } }, { "operation": "default", "spec": { "TrackingRequestInformation": { "TrackingRequestFirstIp": null, "TrackingRequestLastIp": null, "TrackingRequestCreationTime": null, "TrackingRequestCreationDate": null, "TrackingRequestTrackingNumber": null, "TrackingRequestFirstCityName": null, "TrackingRequestFirstIpLat": null, "TrackingRequestFirstIpLong": null, "TrackingRequestFirstIpCountryName": null, "TrackingRequestFirstIpCountryCode": null, "TrackingRequestFirstIpPostalCode": null, "TrackingRequestFirstIpGeoLocation": null, "TrackingRequestLastCityName": null, "TrackingRequestLastIpLat": null, "TrackingRequestLastIpLong": null, "TrackingRequestLastIpCountryName": null, "TrackingRequestLastIpCountryCode": null, "TrackingRequestLastIpPostalCode": null, "TrackingRequestLastIpGeoLocation": null }, "DataSourceInformation": { "DataSourceUuid": null, "DataOwnerUuid": null, "RecordUuid": null } } } ]
This spec moves the tracking data into TrackingRequestInformation, adds a default DataSourceInformation object, and adds any fields that aren't in the original input, seems to output what you describe above.
Created 04-18-2019 07:15 PM
Try JoltTransformJSON with the following spec:
[ { "operation": "shift", "spec": { "*": "TrackingRequestInformation.&" } }, { "operation": "default", "spec": { "TrackingRequestInformation": { "TrackingRequestFirstIp": null, "TrackingRequestLastIp": null, "TrackingRequestCreationTime": null, "TrackingRequestCreationDate": null, "TrackingRequestTrackingNumber": null, "TrackingRequestFirstCityName": null, "TrackingRequestFirstIpLat": null, "TrackingRequestFirstIpLong": null, "TrackingRequestFirstIpCountryName": null, "TrackingRequestFirstIpCountryCode": null, "TrackingRequestFirstIpPostalCode": null, "TrackingRequestFirstIpGeoLocation": null, "TrackingRequestLastCityName": null, "TrackingRequestLastIpLat": null, "TrackingRequestLastIpLong": null, "TrackingRequestLastIpCountryName": null, "TrackingRequestLastIpCountryCode": null, "TrackingRequestLastIpPostalCode": null, "TrackingRequestLastIpGeoLocation": null }, "DataSourceInformation": { "DataSourceUuid": null, "DataOwnerUuid": null, "RecordUuid": null } } } ]
This spec moves the tracking data into TrackingRequestInformation, adds a default DataSourceInformation object, and adds any fields that aren't in the original input, seems to output what you describe above.
Created 04-18-2019 07:50 PM
@Matt Burgess I certainly need some practice using the Jolt spec, but this does exactly what I needed. Thank you for you help!
Created 04-19-2019 02:21 AM
The JOLT DSL takes a lot of practice, but once it clicks, it's like seeing the Matrix lol. I'm no expert but I've put my time in 😉