Support Questions

Find answers, ask questions, and share your expertise

How do I create a nested JSON from a flat JSON that conforms to my schema in NiFi?

avatar
Explorer

I have the following example JSON in a flowfile (data has been randomly imputed for this example) :

{
"TrackingRequestFirstIp":"192.10.01.01",
"TrackingRequestLastIp":"100.120.123.456",
"TrackingRequestCreationTime":"05:49:55",
"TrackingRequestCreationDate":"20171013",
"TrackingRequestTrackingNumber":"12349876"
}

Here is my AVRO schema:

{
  "name": "IpReports",
  "type": "record",
  "namespace": "com.acme.avro",
  "fields": [
    {"name": "DataSourceInformation", 
      "type":{
        "type":"map", 
        "values": {
          "name":"DataSourceInformation",
          "type":"record", 
          "fields": [
            {"name":"DataSourceUuid", "type":["null","string"], "default":null},
            {"name":"DataOwnerUuid", "type":["null","string"], "default":null},
            {"name":"RecordUuid", "type":["null","string"], "default":null]
        }}
    },
    {"name": "TrackingRequestInformation", 
      "type": {
        "type":"map", 
        "values": {
          "name":"TrackingRequestInformation",
          "type":"record",
          "fields": [
            {"name": "TrackingRequestCreationTime", "type":["null", "string"], "default":null, "aliases":["time_cst"]},
            {"name": "TrackingRequestCreationDate", "type":["null", "string"], "default":null, "aliases":["date_cst"]},
            {"name": "TrackingRequestTrackingNumber", "type":["null", "string"], "default":null, "aliases":["fedex_airbill"]},
            {"name": "TrackingRequestFirstIp", "type":["null", "string"], "default":null, "aliases":["source_ip"]},
            {"name": "TrackingRequestLastIp", "type":["null", "string"], "default":null, "aliases":["dest_ip"]},
            {"name": "TrackingRequestFirstCityName", "type":["null", "string"], "default":null, "aliases":["source_ip_city"]},
            {"name": "TrackingRequestFirstIpLat", "type":["null", "double"], "default":null, "aliases":["source_ip_lat"]},
            {"name": "TrackingRequestFirstIpLong", "type":["null", "double"], "default":null, "aliases":["source_ip_long"]},
            {"name": "TrackingRequestFirstIpCountryName", "type":["null", "string"], "default":null, "aliases":["source_ip_country"]},
            {"name": "TrackingRequestFirstIpCountryCode", "type":["null", "string"], "default":null, "aliases":["source_ip_country_iso"]},
            {"name": "TrackingRequestFirstIpPostalCode", "type":["null", "string"], "default":null, "aliases":["source_ip_country_postal"]},
            {"name": "TrackingRequestFirstIpGeoLocation", "type":["null", "string"], "default":null, "aliases":["source_location"]},
            {"name": "TrackingRequestLastCityName", "type":["null", "string"], "default":null, "aliases":["dest_ip_city"]},
            {"name": "TrackingRequestLastIpLat", "type":["null", "double"], "default":null, "aliases":["dest_ip_lat"]},
            {"name": "TrackingRequestLastIpLong", "type":["null", "double"], "default":null, "aliases":["dest_ip_long"]},
            {"name": "TrackingRequestLastIpCountryName", "type":["null", "string"], "default":null, "aliases":["dest_ip_country"]},
            {"name": "TrackingRequestLastIpCountryCode", "type":["null", "string"], "default":null, "aliases":["dest_ip_country_iso"]},
            {"name": "TrackingRequestLastIpPostalCode", "type":["null", "string"], "default":null, "aliases":["dest_ip_country_postal"]},
            {"name": "TrackingRequestLastIpGeoLocation", "type":["null", "string"], "default":null, "aliases":["dest_location"]}]
        }}
      }]
}

Further on in my nifi process group, I will be enriching the original JSON and adding the missing fields.

I would like to convert the original JSON above into the JSON below using my AVRO schema:

{ 
"DataSourceInformation": {
    "DataSourceUuid":null,
    "DataOwnerUuid":null,
    "RecordUuid":null
    },
"TrackingRequestInformation": { 
    "TrackingRequestFirstIp":"192.10.01.01", 
    "TrackingRequestLastIp":"100.120.123.456", 
    "TrackingRequestCreationTime":"05:49:55", 
    "TrackingRequestCreationDate":"20171013", 
    "TrackingRequestTrackingNumber":"12349876",
    "TrackingRequestFirstIpLat":null,
    "TrackingRequestFirstIpLong":null,
    ...   ...   ...   ...   ...
    "TrackingRequestLastIpGeoLocation":null
    }
}


However, when I use a convert record processor, I get the following output:

[ {
  "DataSourceInformation" : null,
  "TrackingRequestInformation" : null
} ]


Any thoughts on what I am doing wrong or suggestions on how to correct it?

1 ACCEPTED SOLUTION

avatar
Master Guru

Try JoltTransformJSON with the following spec:

[
  {
    "operation": "shift",
    "spec": {
      "*": "TrackingRequestInformation.&"
    }
  },
  {
    "operation": "default",
    "spec": {
      "TrackingRequestInformation": {
        "TrackingRequestFirstIp": null,
        "TrackingRequestLastIp": null,
        "TrackingRequestCreationTime": null,
        "TrackingRequestCreationDate": null,
        "TrackingRequestTrackingNumber": null,
        "TrackingRequestFirstCityName": null,
        "TrackingRequestFirstIpLat": null,
        "TrackingRequestFirstIpLong": null,
        "TrackingRequestFirstIpCountryName": null,
        "TrackingRequestFirstIpCountryCode": null,
        "TrackingRequestFirstIpPostalCode": null,
        "TrackingRequestFirstIpGeoLocation": null,
        "TrackingRequestLastCityName": null,
        "TrackingRequestLastIpLat": null,
        "TrackingRequestLastIpLong": null,
        "TrackingRequestLastIpCountryName": null,
        "TrackingRequestLastIpCountryCode": null,
        "TrackingRequestLastIpPostalCode": null,
        "TrackingRequestLastIpGeoLocation": null
      },
      "DataSourceInformation": {
        "DataSourceUuid": null,
        "DataOwnerUuid": null,
        "RecordUuid": null
      }
    }
  }
]

This spec moves the tracking data into TrackingRequestInformation, adds a default DataSourceInformation object, and adds any fields that aren't in the original input, seems to output what you describe above.

View solution in original post

3 REPLIES 3

avatar
Master Guru

Try JoltTransformJSON with the following spec:

[
  {
    "operation": "shift",
    "spec": {
      "*": "TrackingRequestInformation.&"
    }
  },
  {
    "operation": "default",
    "spec": {
      "TrackingRequestInformation": {
        "TrackingRequestFirstIp": null,
        "TrackingRequestLastIp": null,
        "TrackingRequestCreationTime": null,
        "TrackingRequestCreationDate": null,
        "TrackingRequestTrackingNumber": null,
        "TrackingRequestFirstCityName": null,
        "TrackingRequestFirstIpLat": null,
        "TrackingRequestFirstIpLong": null,
        "TrackingRequestFirstIpCountryName": null,
        "TrackingRequestFirstIpCountryCode": null,
        "TrackingRequestFirstIpPostalCode": null,
        "TrackingRequestFirstIpGeoLocation": null,
        "TrackingRequestLastCityName": null,
        "TrackingRequestLastIpLat": null,
        "TrackingRequestLastIpLong": null,
        "TrackingRequestLastIpCountryName": null,
        "TrackingRequestLastIpCountryCode": null,
        "TrackingRequestLastIpPostalCode": null,
        "TrackingRequestLastIpGeoLocation": null
      },
      "DataSourceInformation": {
        "DataSourceUuid": null,
        "DataOwnerUuid": null,
        "RecordUuid": null
      }
    }
  }
]

This spec moves the tracking data into TrackingRequestInformation, adds a default DataSourceInformation object, and adds any fields that aren't in the original input, seems to output what you describe above.

avatar
Explorer

@Matt Burgess I certainly need some practice using the Jolt spec, but this does exactly what I needed. Thank you for you help!

avatar
Master Guru

The JOLT DSL takes a lot of practice, but once it clicks, it's like seeing the Matrix lol. I'm no expert but I've put my time in 😉