Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Extract Data from JSON Array and Merge with Parent Attribute that is not an array

Solved Go to solution
Highlighted

Extract Data from JSON Array and Merge with Parent Attribute that is not an array

Contributor

Hi,

I have the following JSON

{
  "ARTGEntryJsonResult": {
    "AnnualChargeExemptWaverDate": null
    "Conditions": [
      ""
    ],
    "ConsumerInformation": {
      "DocumentLink": ""
    },
    "EntryType": "Medicine",
    "LicenceClass": "",
    "LicenceId": "152567"
    },
    "Products": [
      {
        "AdditionalInformation": [],
        "Components": [
          {
            "DosageForm": "Drug delivery system, transdermal",
            "RouteOfAdministration": "Transdermal",
            "VisualIdentification": "Dull, homogenous"
          }
        ],
        "Containers": [
          {
            "Closure": "",
            "Conditions": [
              "Store at room temperature"
            ],
            "LifeTime": "2 Years",
            "Material": null,
            "Temperature": "Store below 25 degrees Celsius",
            "Type": "Sachet"
          }
        ],
        "EffectiveDate": "2017-09-18",
        "GMDNCode": "",
        "GMDNTerm": "",
        "Ingredients": [
          {
            "Name": "Fentanyl",
            "Strength": "6.3000 mg"
          }
        ],
        "Name": "FENTANYL SANDOZ ",
        "Packs": [
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "1"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "10"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "2"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "3"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "4"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "5"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "7"
          },
          {
            "PoisonSchedule": "(S8) Controlled Drug",
            "Size": "8"
          }
        ],
        "SpecificIndications": [
          "Management of chronic pain requiring opioid analgesia."
        ],
        "StandardIndications": [],
        "Type": "Single Medicine Product",
        "Warnings": []
      }
    ],
    
  }
}

I am able to extract it using the splitJson + EvaluateJsonPath but i get the Array Fields Data in a single row.

Example of what i managed to get considering this JSON file:

LicenceId Name PoisonSchedule Size

152567 FENTANYL SANDOZ "(S8) Controlled Drug","(S8) Controlled Drug","(S8) Controlled Drug" 1,10,2

What i actually want is :

LicenceId Name PoisonSchedule Size

152567 FENTANYL SANDOZ (S8) Controlled Drug 1

152567 FENTANYL SANDOZ (S8) Controlled Drug 10

152567 FENTANYL SANDOZ (S8) Controlled Drug 2

Any ideas are much appreciated !!!

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Extract Data from JSON Array and Merge with Parent Attribute that is not an array

Super Guru

Adrian Oprea,

Hi i have the same input json as you in generate flow file processor to test the entire flow,

{
  "ARTGEntryJsonResult": {
    "AnnualChargeExemptWaverDate": "null",
    "Conditions": [
      ""
    ],
    "ConsumerInformation": {
      "DocumentLink": ""
    },
    "EntryType": "Medicine",
    "LicenceClass": "",
    "LicenceId": "152567"
  },
  "Products": [
    {
      "AdditionalInformation": [],
      "Components": [
        {
          "DosageForm": "Drug delivery system, transdermal",
          "RouteOfAdministration": "Transdermal",
          "VisualIdentification": "Dull, homogenous"
        }
      ],
      "Containers": [
        {
          "Closure": "",
          "Conditions": [
            "Store at room temperature"
          ],
          "LifeTime": "2 Years",
          "Material": null,
          "Temperature": "Store below 25 degrees Celsius",
          "Type": "Sachet"
        }
      ],
      "EffectiveDate": "2017-09-18",
      "GMDNCode": "",
      "GMDNTerm": "",
      "Ingredients": [
        {
          "Name": "Fentanyl",
          "Strength": "6.3000 mg"
        }
      ],
      "Name": "FENTANYL SANDOZ ",
      "Packs": [
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "1"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "10"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "2"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "3"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "4"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "5"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "7"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "8"
        }
      ],
      "SpecificIndications": [
        "Management of chronic pain requiring opioid analgesia."
      ],
      "StandardIndications": [],
      "Type": "Single Medicine Product",
      "Warnings": []
    }
  ]
}

JoltTransformation:-

i use this processor because we need to extract the name attribute from products array

[
  {
    "operation": "shift",
    "spec": {
      "*": "&",
      "Products": {
        "*": "Products"
      }
    }
    }
  ]

after this processor

we get output like below:-

{
  "ARTGEntryJsonResult" : {
    "AnnualChargeExemptWaverDate" : "null",
    "Conditions" : [ "" ],
    "ConsumerInformation" : {
      "DocumentLink" : ""
    },
    "EntryType" : "Medicine",
    "LicenceClass" : "",
    "LicenceId" : "152567"
  },
  "Products" : {
    "AdditionalInformation" : [ ],
    "Components" : [ {
      "DosageForm" : "Drug delivery system, transdermal",
      "RouteOfAdministration" : "Transdermal",
      "VisualIdentification" : "Dull, homogenous"
    } ],
    "Containers" : [ {
      "Closure" : "",
      "Conditions" : [ "Store at room temperature" ],
      "LifeTime" : "2 Years",
      "Material" : null,
      "Temperature" : "Store below 25 degrees Celsius",
      "Type" : "Sachet"
    } ],
    "EffectiveDate" : "2017-09-18",
    "GMDNCode" : "",
    "GMDNTerm" : "",
    "Ingredients" : [ {
      "Name" : "Fentanyl",
      "Strength" : "6.3000 mg"
    } ],
    "Name" : "FENTANYL SANDOZ ",
    "Packs" : [ {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "1"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "10"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "2"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "3"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "4"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "5"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "7"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "8"
    } ],
    "SpecificIndications" : [ "Management of chronic pain requiring opioid analgesia." ],
    "StandardIndications" : [ ],
    "Type" : "Single Medicine Product",
    "Warnings" : [ ]
  }
}

without product as an array, now its easy to get Name attribute from the json message.

Configs for jolt:-

40571-jolt-packs.png

Evaluate Json path expression:-

to extract licenseid and name attributes from the content

i added the below properties as

licenceid as $.ARTGEntryJsonResult.LicenceId

name as $.Products.Name

EJ configs:-

40570-packs-ej2.png

Split Packs Array using splitjson processor:-

change JsonPath Expression property to

$.Products.Packs

once you split the packs array then the every message in array will be one seperate flow file, as in my input we are having 8 messages in packs array so we get 8 flowfiles having licenceid,name attributes associated with each flowfile.

Splijson Config:-

40569-packs-sj.png

Extract PoisonSchedule,Size using evaluatejson path:-

now we need to extract all the contents of flowfile as attributes as we are having

{
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "1"
    }

so we need to add 2 properties in processor and change Destination property to flowfile-attribute

PoisonSchedule as $.PoisonSchedule

Size as $.Size

Configs:-

40567-packs-ej.png

Right now we are having all the desired contents of message as attributes we can use replace text processor(if you need output as text) (or) attributestojson processor(if you want json message)

ReplaceText Processor:-

we are having all the list of attributes

${licenseid},${name},${PoisonSchedule},${Size}

here i kept , as seperator you can mention what ever you like, so keep them in Replacement Value property and change Replacement strategy to AlwaysReplace.

Output:-

flowfile1:-

152567,FENTANYL SANDOZ ,(S8) Controlled Drug,4

flowfile2:-

152567,FENTANYL SANDOZ ,(S8) Controlled Drug,3

Configs Replacetext:-

40566-packs-replace.png

AttributesToJSON:-

if you want to convert the results as json documents then use this processor and in attributes list keep property as

licenseid,name,PoisonSchedule,Size

it will converts the attributes as json message

Output:-

{"Size":"10","PoisonSchedule":"(S8) Controlled Drug","name":"FENTANYL SANDOZ ","licenseid":"152567"}

if you want to merge these flowfiles together,use Mergecontent Processor and change the properties as for your requirements.

Flow Screenshot:-

is attached in comments

Hope this Helps...!!

2 REPLIES 2

Re: Extract Data from JSON Array and Merge with Parent Attribute that is not an array

Super Guru

Adrian Oprea,

Hi i have the same input json as you in generate flow file processor to test the entire flow,

{
  "ARTGEntryJsonResult": {
    "AnnualChargeExemptWaverDate": "null",
    "Conditions": [
      ""
    ],
    "ConsumerInformation": {
      "DocumentLink": ""
    },
    "EntryType": "Medicine",
    "LicenceClass": "",
    "LicenceId": "152567"
  },
  "Products": [
    {
      "AdditionalInformation": [],
      "Components": [
        {
          "DosageForm": "Drug delivery system, transdermal",
          "RouteOfAdministration": "Transdermal",
          "VisualIdentification": "Dull, homogenous"
        }
      ],
      "Containers": [
        {
          "Closure": "",
          "Conditions": [
            "Store at room temperature"
          ],
          "LifeTime": "2 Years",
          "Material": null,
          "Temperature": "Store below 25 degrees Celsius",
          "Type": "Sachet"
        }
      ],
      "EffectiveDate": "2017-09-18",
      "GMDNCode": "",
      "GMDNTerm": "",
      "Ingredients": [
        {
          "Name": "Fentanyl",
          "Strength": "6.3000 mg"
        }
      ],
      "Name": "FENTANYL SANDOZ ",
      "Packs": [
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "1"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "10"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "2"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "3"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "4"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "5"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "7"
        },
        {
          "PoisonSchedule": "(S8) Controlled Drug",
          "Size": "8"
        }
      ],
      "SpecificIndications": [
        "Management of chronic pain requiring opioid analgesia."
      ],
      "StandardIndications": [],
      "Type": "Single Medicine Product",
      "Warnings": []
    }
  ]
}

JoltTransformation:-

i use this processor because we need to extract the name attribute from products array

[
  {
    "operation": "shift",
    "spec": {
      "*": "&",
      "Products": {
        "*": "Products"
      }
    }
    }
  ]

after this processor

we get output like below:-

{
  "ARTGEntryJsonResult" : {
    "AnnualChargeExemptWaverDate" : "null",
    "Conditions" : [ "" ],
    "ConsumerInformation" : {
      "DocumentLink" : ""
    },
    "EntryType" : "Medicine",
    "LicenceClass" : "",
    "LicenceId" : "152567"
  },
  "Products" : {
    "AdditionalInformation" : [ ],
    "Components" : [ {
      "DosageForm" : "Drug delivery system, transdermal",
      "RouteOfAdministration" : "Transdermal",
      "VisualIdentification" : "Dull, homogenous"
    } ],
    "Containers" : [ {
      "Closure" : "",
      "Conditions" : [ "Store at room temperature" ],
      "LifeTime" : "2 Years",
      "Material" : null,
      "Temperature" : "Store below 25 degrees Celsius",
      "Type" : "Sachet"
    } ],
    "EffectiveDate" : "2017-09-18",
    "GMDNCode" : "",
    "GMDNTerm" : "",
    "Ingredients" : [ {
      "Name" : "Fentanyl",
      "Strength" : "6.3000 mg"
    } ],
    "Name" : "FENTANYL SANDOZ ",
    "Packs" : [ {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "1"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "10"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "2"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "3"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "4"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "5"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "7"
    }, {
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "8"
    } ],
    "SpecificIndications" : [ "Management of chronic pain requiring opioid analgesia." ],
    "StandardIndications" : [ ],
    "Type" : "Single Medicine Product",
    "Warnings" : [ ]
  }
}

without product as an array, now its easy to get Name attribute from the json message.

Configs for jolt:-

40571-jolt-packs.png

Evaluate Json path expression:-

to extract licenseid and name attributes from the content

i added the below properties as

licenceid as $.ARTGEntryJsonResult.LicenceId

name as $.Products.Name

EJ configs:-

40570-packs-ej2.png

Split Packs Array using splitjson processor:-

change JsonPath Expression property to

$.Products.Packs

once you split the packs array then the every message in array will be one seperate flow file, as in my input we are having 8 messages in packs array so we get 8 flowfiles having licenceid,name attributes associated with each flowfile.

Splijson Config:-

40569-packs-sj.png

Extract PoisonSchedule,Size using evaluatejson path:-

now we need to extract all the contents of flowfile as attributes as we are having

{
      "PoisonSchedule" : "(S8) Controlled Drug",
      "Size" : "1"
    }

so we need to add 2 properties in processor and change Destination property to flowfile-attribute

PoisonSchedule as $.PoisonSchedule

Size as $.Size

Configs:-

40567-packs-ej.png

Right now we are having all the desired contents of message as attributes we can use replace text processor(if you need output as text) (or) attributestojson processor(if you want json message)

ReplaceText Processor:-

we are having all the list of attributes

${licenseid},${name},${PoisonSchedule},${Size}

here i kept , as seperator you can mention what ever you like, so keep them in Replacement Value property and change Replacement strategy to AlwaysReplace.

Output:-

flowfile1:-

152567,FENTANYL SANDOZ ,(S8) Controlled Drug,4

flowfile2:-

152567,FENTANYL SANDOZ ,(S8) Controlled Drug,3

Configs Replacetext:-

40566-packs-replace.png

AttributesToJSON:-

if you want to convert the results as json documents then use this processor and in attributes list keep property as

licenseid,name,PoisonSchedule,Size

it will converts the attributes as json message

Output:-

{"Size":"10","PoisonSchedule":"(S8) Controlled Drug","name":"FENTANYL SANDOZ ","licenseid":"152567"}

if you want to merge these flowfiles together,use Mergecontent Processor and change the properties as for your requirements.

Flow Screenshot:-

is attached in comments

Hope this Helps...!!

Re: Extract Data from JSON Array and Merge with Parent Attribute that is not an array

Super Guru

Adrian Oprea,

Flow:-

40572-packs-flow.png

Don't have an account?
Coming from Hortonworks? Activate your account here