How to Combine Data from Two Flows Based on Common Attribute

New Contributor

I have a JSON object in the following format:

{
	"Objects": [
		{
			"Item": {
				"LastUpdateTime": "2018-03-23T02:36:09.000Z",
				"Identification": {
					"id": "123"
				},
				"Inventory": {
					"Elements": {
						"Element": [
							{
								"Height": 10,
								"Weight": 56,
								"Features": {
									"Feature": {
										"FeatureId": "456",
										"Color": "white"
									}
								}
							},
							{
								"Height": 14,
								"Weight": 46,
								"Features": {
									"Feature": {
										"FeatureId": "789",
										"Color": "orange"
									}
								}
							},
							{
								"Height": 40,
								"Weight": 68,
								"Features": {
									"Feature": {
										"FeatureId": "343",
										"Color": "yellow"
									}
								}
							}
						]
					}
				},
				"Status": {
					"Id": "123",
					"ElementsStatus": {
						"ElementStatus": [
							{
								"FeatureId": "456",
								"Status": "In-Stock"
							},
							{
								"FeatureId": "789",
								"Status": "Out Of Stock"
							},
							{
								"FeatureId": "343",
								"Status": "Out Of Stock"
							}
						]
					}
				}
			}
		}
	]
}

I would like to create another object from it in the following format:

[{'Id': 123,
  'FeatureId': '456',
  'Color': 'white',
  'Height': 10,
  'Weight': 56,
  'Status': 'In-Stock'},
 {'Id': 123,
  'FeatureId': '789',
  'Color': 'orange',
  'Height': 14,
  'Weight': 46,
  'Status': 'Out Of Stock'},
 {'Id': 123,
  'FeatureId': '343',
  'Color': 'yellow',
  'Height': 40,
  'Weight': 68,
  'Status': 'Out Of Stock'}]

"Objects" is an array of multiple items and their statuses. Each item in Inventory>Elements>Element should have an entry in Status>ElementsStatus>ElementStatus with the same "FeatureId".

My current flow uses a SplitJson processor to split each element of the "Objects" array into its own flowfile, followed by two EvaluateJsonPath processors that extract $.Objects.Item.Inventory.Elements.Element and $.Objects.Status.ElementsStatus.ElementStatus from each flowfile. In the case above, each EvaluateJsonPath outputs 3 flowfiles.

The challenge is to piece that data back together and output a format similar to the one shown above.

Any ideas?

1 ACCEPTED SOLUTION

Re: How to Combine Data from Two Flows Based on Common Attribute

I wrote up a quick Chain spec you can use in a JoltTransformJSON processor. That way you can skip the Split/Merge pattern and work on the entire JSON object at once:

[
  {
    "operation": "shift",
    "spec": {
      "Objects": {
        "*": {
          "Item": {
            "Inventory": {
              "Elements": {
                "Element": {
                  "*": {
                    "Height": "[&1].Height",
                    "Weight": "[&1].Weight",
                    "Features": {
                      "Feature": {
                        "*": "[&3].&"
                      }
                    }
                  }
                }
              }
            },
            "Status": {
              "ElementsStatus": {
                "ElementStatus": {
                  "*": {
                    "@(3,Id)": "[&1].Id",
                    "Status": "[&1].Status"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
]

Note that this assumes the Element and ElementStatus arrays are parallel, meaning the first object in the Element array corresponds to the first object in the ElementStatus array (i.e. their FeatureId fields match). If that is not true, you'd either need a more complicated JOLT spec or perhaps a scripted solution using ExecuteScript.
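
For the non-parallel case, the matching logic could look roughly like the sketch below. It is plain Python that joins the two arrays on FeatureId, using a trimmed copy of the sample from the question with the status entries deliberately reordered. The function name `join_by_feature_id` is made up for illustration, and the sketch does not show how to wire this into ExecuteScript (reading and writing the flowfile content is left out).

```python
import json


def join_by_feature_id(doc):
    """Flatten each item, pairing elements with statuses by FeatureId."""
    rows = []
    for obj in doc["Objects"]:
        item = obj["Item"]
        status = item["Status"]
        # Index the status entries by FeatureId so array order does not matter.
        status_by_id = {
            s["FeatureId"]: s["Status"]
            for s in status["ElementsStatus"]["ElementStatus"]
        }
        for element in item["Inventory"]["Elements"]["Element"]:
            feature = element["Features"]["Feature"]
            rows.append({
                "Id": status["Id"],
                "FeatureId": feature["FeatureId"],
                "Color": feature["Color"],
                "Height": element["Height"],
                "Weight": element["Weight"],
                # None when no status entry shares this FeatureId.
                "Status": status_by_id.get(feature["FeatureId"]),
            })
    return rows


# Trimmed sample: the status entries are in a different order than the elements.
sample = {
    "Objects": [{
        "Item": {
            "Inventory": {"Elements": {"Element": [
                {"Height": 10, "Weight": 56,
                 "Features": {"Feature": {"FeatureId": "456", "Color": "white"}}},
                {"Height": 14, "Weight": 46,
                 "Features": {"Feature": {"FeatureId": "789", "Color": "orange"}}},
            ]}},
            "Status": {"Id": "123", "ElementsStatus": {"ElementStatus": [
                {"FeatureId": "789", "Status": "Out Of Stock"},
                {"FeatureId": "456", "Status": "In-Stock"},
            ]}},
        }
    }]
}

print(json.dumps(join_by_feature_id(sample), indent=2))
```

Building the lookup dictionary first keeps the join independent of array order, and an element with no matching status entry ends up with a null Status instead of picking up a neighbour's value.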

Re: How to Combine Data from Two Flows Based on Common Attribute

New Contributor

@Matt Burgess I am new here and not sure of the best way to get a response. I saw that you answer many questions and thought you might be able to help.

Re: How to Combine Data from Two Flows Based on Common Attribute

New Contributor

The Jolt transformation did it for me. I didn't know how to use it or what it's meant to do. My document had more objects, which I masked for public use, so I edited your transformation and things turned out pretty well.
What is the use of & in Jolt syntax? I noticed that it can bring an object up to a higher level based on the number you insert. Is there a good reference or set of examples for the transformations?

Re: How to Combine Data from Two Flows Based on Common Attribute

I use the Advanced UI in the JoltTransformJSON processor or this webapp to test out specs. There are also a bunch of examples and documentation in the javadoc, but it can be a bit difficult to follow. You can also search the jolt tag on StackOverflow for a number of questions, answers, and examples.

Re: How to Combine Data from Two Flows Based on Common Attribute

New Contributor

Thanks, that was fast. Yes, the arrays are parallel. I will check this out and let you know.