Support Questions

Find answers, ask questions, and share your expertise

Converting Nested JSON to Flat JSON using JOLT

avatar
Explorer

Hi team,

I'm struggling while converting nested JSON to Flat JSON using JOLT.

The Nested JSON is in the below format:

 

{
  "data": {
    "getCarListing": {
      "edges": [
        {
          "node": {
            "id": "1001",
            "identifier": 500001,
            "carClusterIds": [
              {
                "element": {
                  "name": "SEDAN1000",
                  "businessRelations": [
                    {
                      "countries": [
                        {
                          "countryCode": "US"
                        }
                      ],
                      "code": "US01"
                    },
                    {
                      "countries": [
                        {
                          "countryCode": "CZ"
                        },
                        {
                          "countryCode": "SK"
                        }
                      ],
                      "code": "CZ01"
                    },
                    {
                      "countries": [
                        {
                          "countryCode": "CA"
                        }
                      ],
                      "code": "CA01"
                    }
                  ]
                },
                "metadata": [
                  {
                    "name": "clusterId",
                    "value": "200011"
                  }
                ]
              },
              {
                "element": {
                  "name": "SUV2020",
                  "businessRelations": [
                    {
                      "countries": [
                        {
                          "countryCode": "MX"
                        }
                      ],
                      "code": "MX01"
                    }
                  ]
                },
                "metadata": [
                  {
                    "name": "clusterId",
                    "value": "200012"
                  }
                ]
              }
            ],
            "parentItems": [
              {
                "identifier": 400050
              }
            ]
          }
        },
        {
          "node": {
            "id": "1002",
            "identifier": 500002,
            "carClusterIds": [
              {
                "element": {
                  "name": "TRUCK500",
                  "businessRelations": [
                    {
                      "countries": [
                        {
                          "countryCode": "FR"
                        }
                      ],
                      "code": "FR01"
                    },
                    {
                      "countries": [
                        {
                          "countryCode": "DE"
                        }
                      ],
                      "code": "DE01"
                    }
                  ]
                },
                "metadata": [
                  {
                    "name": "clusterId",
                    "value": "200021"
                  }
                ]
              },
              {
                "element": {
                  "name": "COUPE3000",
                  "businessRelations": [
                    {
                      "countries": [
                        {
                          "countryCode": "JP"
                        }
                      ],
                      "code": "JP01"
                    }
                  ]
                },
                "metadata": [
                  {
                    "name": "clusterId",
                    "value": "200022"
                  }
                ]
              }
            ],
            "parentItems": [
              {
                "identifier": 400051
              }
            ]
          }
        }
      ]
    }
  }
}

 

I want the output in this format:

 

[
  {
    "Element" : "SEDAN1000",
    "BR" : "US01",
    "CountryCode" : "US"
  },
  {
    "Element" : "SEDAN1000",
    "BR" : "CZ01",
    "CountryCode" : "CZ"
  },
  {
    "Element" : "SEDAN1000",
    "BR" : "CZ01",
    "CountryCode" : "SK"
  },
  {
    "Element" : "SEDAN1000",
    "BR" : "CA01",
    "CountryCode" : "CA"
  },
  {
    "Element" : "SUV2020",
    "BR" : "MX01",
    "CountryCode" : "MX"
  }
]

 

Can anyone please help me with the JOLT spec that can be used to derive the nested JSON to the required Flat JSON.

Thanks in Advance...

 

 

1 ACCEPTED SOLUTION

avatar
Super Guru

Hi @Sid17 ,

The following spec should do:

 

[
  {
    "operation": "shift",
    "spec": {
      "data": {
        "getCarListing": {
          "edges": {
            "*": {
              "node": {
                "carClusterIds": {
                  "*": {
                    "element": "[]"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
  ,
  {
    "operation": "shift",
    "spec": {
      "*": {
        "businessRelations": {
          "*": {
            "countries": {
              "*": {
                "countryCode": "[&5].[&3].[&1].CountryCode",
                "@(2,code)": "[&5].[&3].[&1].BR",
                "@(4,name)": "[&5].[&3].[&1].Element"
              }
            }
          }
        }
      }
    }
 },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": "[]"
        }
      }
    }
  }
]

The concept is the same as before: You need to group fields at level of the leaf field taking into consideration the parent array and any sub arrays positions for grouping.

The first transformation is used just to simplify the grouping (second transformation ) by removing any parent array where no information is needed but you can have both 1st and 2ed transformation as one if you can count up to the right array\sub-array index.

As I mentioned before, the recommendation for such pattern is to use JSLT transformation as its much simpler to write and require much less lines to achieve the same result.

Also another option that might simplify the spec as well by doing the following:

1- Take advantage of the FlattenJson processor with the following configuration:

SAMSAL_0-1734468371312.png

 

This processor will flatten all nested objects as one big object using (.) as separator as configured and will included any arrays as well. The result of this processor after passing the input json will look like this:

 

{
	"data.getCarListing.edges.0.node.id": "1001",
	"data.getCarListing.edges.0.node.identifier": 500001,
	"data.getCarListing.edges.0.node.carClusterIds.0.element.name": "SEDAN1000",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.0.countries.0.countryCode": "US",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.0.code": "US01",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.1.countries.0.countryCode": "CZ",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.1.countries.1.countryCode": "SK",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.1.code": "CZ01",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.2.countries.0.countryCode": "CA",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.2.code": "CA01",
	"data.getCarListing.edges.0.node.carClusterIds.0.metadata.0.name": "clusterId",
	"data.getCarListing.edges.0.node.carClusterIds.0.metadata.0.value": "200011",
	"data.getCarListing.edges.0.node.carClusterIds.1.element.name": "SUV2020",
	"data.getCarListing.edges.0.node.carClusterIds.1.element.businessRelations.0.countries.0.countryCode": "MX",
	"data.getCarListing.edges.0.node.carClusterIds.1.element.businessRelations.0.code": "MX01",
	"data.getCarListing.edges.0.node.carClusterIds.1.metadata.0.name": "clusterId",
	"data.getCarListing.edges.0.node.carClusterIds.1.metadata.0.value": "200012",
	"data.getCarListing.edges.0.node.parentItems.0.identifier": 400050,
	"data.getCarListing.edges.1.node.id": "1002",
	"data.getCarListing.edges.1.node.identifier": 500002,
	"data.getCarListing.edges.1.node.carClusterIds.0.element.name": "TRUCK500",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.0.countries.0.countryCode": "FR",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.0.code": "FR01",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.1.countries.0.countryCode": "DE",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.1.code": "DE01",
	"data.getCarListing.edges.1.node.carClusterIds.0.metadata.0.name": "clusterId",
	"data.getCarListing.edges.1.node.carClusterIds.0.metadata.0.value": "200021",
	"data.getCarListing.edges.1.node.carClusterIds.1.element.name": "COUPE3000",
	"data.getCarListing.edges.1.node.carClusterIds.1.element.businessRelations.0.countries.0.countryCode": "JP",
	"data.getCarListing.edges.1.node.carClusterIds.1.element.businessRelations.0.code": "JP01",
	"data.getCarListing.edges.1.node.carClusterIds.1.metadata.0.name": "clusterId",
	"data.getCarListing.edges.1.node.carClusterIds.1.metadata.0.value": "200022",
	"data.getCarListing.edges.1.node.parentItems.0.identifier": 400051
}

 

Then , understanding how you can reference part of the fields in Jolt to get the index will help you group things accordingly to then produce the final result:

[
  {
    "operation": "shift",
    "spec": {
      "data.getCarListing.edges.*.node.carClusterIds.*.element.businessRelations.*.countries.*.countryCode": {
        "@(1,data\\.getCarListing\\.edges\\.&(0,1)\\.node\\.carClusterIds\\.&(0,2)\\.element\\.businessRelations\\.&(0,3)\\.code)": "&(1,1).&(1,2).&(1,3).&(1,4).BR",
        "@(1,data\\.getCarListing\\.edges\\.&(0,1)\\.node\\.carClusterIds\\.&(0,2)\\.element\\.name)": "&(1,1).&(1,2).&(1,3).&(1,4).Element",
        "@": "&(1,1).&(1,2).&(1,3).&(1,4).countryCode"
      }
    }
  },

  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": {
            "*": "[]"
          }
        }
      }
    }
    }
]

This will be simpler in the  sense that you count horizontally vs vertically to get the index.

Try it out and see how it works. This will help you understand jolt more and give you more options for future transformation.

 

Hope that helps.

 

 

 

 

View solution in original post

4 REPLIES 4

avatar
Explorer

Also, FYI I have tried using this JOLT

 

[
  {
    "operation": "shift",
    "spec": {
      "data": {
        "getCarListing": {
          "edges": {
            "*": {
              "node": {
                "carClusterIds": {
                  "*": {
                    "element": {
                      "businessRelations": {
                        "*": {
                          "countries": {
                            "*": {
                              "countryCode": "[&6].[&1].CountryCode",
                              "@(2,code)": "[&6].[&1].BR",
                              "@(4,name)": "[&6].[&1].Element"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
 },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": {
            "@": "[]"
          }
        }
      }
    }
    }

]

 

Which is giving output in the form:

 

[ [ "US01", "CZ01", "CA01", "FR01", "DE01" ], [ "SEDAN1000", "SEDAN1000", "SEDAN1000", "TRUCK500", "TRUCK500" ], [ "US", "CZ", "CA", "FR", "DE" ], "CZ01", "SEDAN1000", "SK", [ "MX01", "JP01" ], [ "SUV2020", "COUPE3000" ], [ "MX", "JP" ] ]

 

Which is closer to the expected output but need some formatting. 

avatar
Master Mentor

@Sid17 
Can you try this JOLT

Spoiler
[
{
"operation": "shift",
"spec": {
"data": {
"getCarListing": {
"edges": {
"*": {
"node": {
"carClusterIds": {
"*": {
"element": {
"name": "[].element"
},
"businessRelations": {
"*": {
"countries": {
"*": {
"countryCode": "[].businessRelations[].countryCode"
}
}
}
}
}
}
}
}
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"[]": "ONE"
}
}
]

Hope it works 

avatar
Super Guru

Hi @Sid17 ,

The following spec should do:

 

[
  {
    "operation": "shift",
    "spec": {
      "data": {
        "getCarListing": {
          "edges": {
            "*": {
              "node": {
                "carClusterIds": {
                  "*": {
                    "element": "[]"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
  ,
  {
    "operation": "shift",
    "spec": {
      "*": {
        "businessRelations": {
          "*": {
            "countries": {
              "*": {
                "countryCode": "[&5].[&3].[&1].CountryCode",
                "@(2,code)": "[&5].[&3].[&1].BR",
                "@(4,name)": "[&5].[&3].[&1].Element"
              }
            }
          }
        }
      }
    }
 },
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": "[]"
        }
      }
    }
  }
]

The concept is the same as before: You need to group fields at level of the leaf field taking into consideration the parent array and any sub arrays positions for grouping.

The first transformation is used just to simplify the grouping (second transformation ) by removing any parent array where no information is needed but you can have both 1st and 2ed transformation as one if you can count up to the right array\sub-array index.

As I mentioned before, the recommendation for such pattern is to use JSLT transformation as its much simpler to write and require much less lines to achieve the same result.

Also another option that might simplify the spec as well by doing the following:

1- Take advantage of the FlattenJson processor with the following configuration:

SAMSAL_0-1734468371312.png

 

This processor will flatten all nested objects as one big object using (.) as separator as configured and will included any arrays as well. The result of this processor after passing the input json will look like this:

 

{
	"data.getCarListing.edges.0.node.id": "1001",
	"data.getCarListing.edges.0.node.identifier": 500001,
	"data.getCarListing.edges.0.node.carClusterIds.0.element.name": "SEDAN1000",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.0.countries.0.countryCode": "US",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.0.code": "US01",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.1.countries.0.countryCode": "CZ",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.1.countries.1.countryCode": "SK",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.1.code": "CZ01",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.2.countries.0.countryCode": "CA",
	"data.getCarListing.edges.0.node.carClusterIds.0.element.businessRelations.2.code": "CA01",
	"data.getCarListing.edges.0.node.carClusterIds.0.metadata.0.name": "clusterId",
	"data.getCarListing.edges.0.node.carClusterIds.0.metadata.0.value": "200011",
	"data.getCarListing.edges.0.node.carClusterIds.1.element.name": "SUV2020",
	"data.getCarListing.edges.0.node.carClusterIds.1.element.businessRelations.0.countries.0.countryCode": "MX",
	"data.getCarListing.edges.0.node.carClusterIds.1.element.businessRelations.0.code": "MX01",
	"data.getCarListing.edges.0.node.carClusterIds.1.metadata.0.name": "clusterId",
	"data.getCarListing.edges.0.node.carClusterIds.1.metadata.0.value": "200012",
	"data.getCarListing.edges.0.node.parentItems.0.identifier": 400050,
	"data.getCarListing.edges.1.node.id": "1002",
	"data.getCarListing.edges.1.node.identifier": 500002,
	"data.getCarListing.edges.1.node.carClusterIds.0.element.name": "TRUCK500",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.0.countries.0.countryCode": "FR",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.0.code": "FR01",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.1.countries.0.countryCode": "DE",
	"data.getCarListing.edges.1.node.carClusterIds.0.element.businessRelations.1.code": "DE01",
	"data.getCarListing.edges.1.node.carClusterIds.0.metadata.0.name": "clusterId",
	"data.getCarListing.edges.1.node.carClusterIds.0.metadata.0.value": "200021",
	"data.getCarListing.edges.1.node.carClusterIds.1.element.name": "COUPE3000",
	"data.getCarListing.edges.1.node.carClusterIds.1.element.businessRelations.0.countries.0.countryCode": "JP",
	"data.getCarListing.edges.1.node.carClusterIds.1.element.businessRelations.0.code": "JP01",
	"data.getCarListing.edges.1.node.carClusterIds.1.metadata.0.name": "clusterId",
	"data.getCarListing.edges.1.node.carClusterIds.1.metadata.0.value": "200022",
	"data.getCarListing.edges.1.node.parentItems.0.identifier": 400051
}

 

Then , understanding how you can reference part of the fields in Jolt to get the index will help you group things accordingly to then produce the final result:

[
  {
    "operation": "shift",
    "spec": {
      "data.getCarListing.edges.*.node.carClusterIds.*.element.businessRelations.*.countries.*.countryCode": {
        "@(1,data\\.getCarListing\\.edges\\.&(0,1)\\.node\\.carClusterIds\\.&(0,2)\\.element\\.businessRelations\\.&(0,3)\\.code)": "&(1,1).&(1,2).&(1,3).&(1,4).BR",
        "@(1,data\\.getCarListing\\.edges\\.&(0,1)\\.node\\.carClusterIds\\.&(0,2)\\.element\\.name)": "&(1,1).&(1,2).&(1,3).&(1,4).Element",
        "@": "&(1,1).&(1,2).&(1,3).&(1,4).countryCode"
      }
    }
  },

  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": {
            "*": "[]"
          }
        }
      }
    }
    }
]

This will be simpler in the  sense that you count horizontally vs vertically to get the index.

Try it out and see how it works. This will help you understand jolt more and give you more options for future transformation.

 

Hope that helps.

 

 

 

 

avatar
Explorer

Thank you @SAMSAL (JOLT expert), The spec you posted worked properly and I'm able to see the flat JSON as expected.