How to convert complicated JSON nested array with JOLT?

I'm trying to convert nested arrays into objects, producing one object for each inner array of "values" and using the entries of "headers" as the keys. I can't work out how to get the index of each value field and use the matching header as the key in my spec. This is my input JSON file:

{
 "meta": {
   "regId": "us",
   "cId": "SomeProduct",
   "weId": 15
 },
 "data": {
   "name": "R",
   "details": {
     "headers": [
       "id",
       "cityId",
       "cityName"
     ],
     "values": [
       [
         1539,
         17,
         "Moskow"
       ],
       [
         1539,
         17,
         "Berlin"
       ],
       [
         1539,
         17,
         "Vienna"
       ]
     ]
   }
 }
}

This is my desired JSON output:

[
    {
        "regId": "us",
        "cId": "SomeProduct",
        "weId": 15,
        "name": "R",
        "id": 1539,
        "cityId": 17,
        "cityName": "Moskow"
    },
    {
        "regId": "us",
        "cId": "SomeProduct",
        "weId": 15,
        "name": "R",
        "id": 1539,
        "cityId": 17,
        "cityName": "Berlin"
    },
    {
        "regId": "us",
        "cId": "SomeProduct",
        "weId": 15,
        "name": "R",
        "id": 1539,
        "cityId": 17,
        "cityName": "Vienna"
    }
]
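
To make the intended mapping concrete, here is a minimal sketch of it in plain Python rather than JOLT. It is only meant to illustrate the logic: the function name denormalize and the variable names are made up for this example, and it assumes every row in "values" has the same length and order as "headers".

```python
import json

def denormalize(doc):
    """Return one flat record per row of data.details.values."""
    meta = doc["meta"]
    data = doc["data"]
    headers = data["details"]["headers"]

    records = []
    for row in data["details"]["values"]:
        record = dict(meta)                # copy regId, cId, weId
        record["name"] = data["name"]
        # Pair each header with the value at the same index in this row.
        record.update(zip(headers, row))
        records.append(record)
    return records

# Same structure as the input file shown above.
sample = {
    "meta": {"regId": "us", "cId": "SomeProduct", "weId": 15},
    "data": {
        "name": "R",
        "details": {
            "headers": ["id", "cityId", "cityName"],
            "values": [
                [1539, 17, "Moskow"],
                [1539, 17, "Berlin"],
                [1539, 17, "Vienna"],
            ],
        },
    },
}

records = denormalize(sample)
print(json.dumps(records, indent=2))
```

The number of output objects follows the number of rows in "values", and the keys come from "headers" by position, which is the behaviour I would like the JOLT spec to reproduce.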

This is my current spec:

[
  {
    "operation": "shift",
    "spec": {
      "meta": {
        "*": "&"
      },
      "data": {
        "name": "&",
        "details": {
          "values": {
            "*": {
              "*": "@(3,headers[&])"
            }
          }
        }
      }
    }
  }
]

I will be receiving data in multiple schemas. Is it better to use InferAvroSchema and store the schemas in the Schema Library? Which processors would that actually involve?

So I have multiple files coming into the flow with multiple JSON schemas, and I need to denormalize every one of them into a queryable ORC form.


Thanks in advance,

N