Created 06-06-2018 04:45 PM
Hi,
i need to transform the input JSON below
{ "TransactionId": "-1", "Source": "ihrs", "Name": "tag-data", "Id": "126", "Records": [ { "Name": "tag-master", "PayLoad": { "file-name": "testfile.json", "plant": "1086", "collector": "testcollector", "tag-name": "testtag", "tag-description": "test desc" } }, { "Name": "tag-detail", "PayLoad": { "tag-value": "98998.55", "tag-timestamp": "2018-01-02T16:09:39.000000+00:00", "tag-quality": "100" } }, { "Name": "tag-detail", "PayLoad": { "tag-value": "91009.47", "tag-timestamp": "2018-01-02T16:09:340000000+00:00", "tag-quality": "100" } }, { "Name": "tag-detail", "PayLoad": { "tag-value": "91021.80", "tag-timestamp": "2018-01-02T16:09:41.000000+00:00", "tag-quality": "100" } } ] }
to output like below.
{ "TransactionId": "-1", "Source": "ihrs", "Name": "tag-data", "Id": "100", "Records": [ { "Name": "tag-master", "PayLoad": { "file-name": "testfile.json", "plant": "1000", "collector": "testcollector", "tag-name": "testtag", "tag-description": "test desc", "Children": [ { "Name": "tag-detail", "PayLoad": { "tag-value": "98998.55", "tag-timestamp": "2018-01-02T16:09:39.000000+00:00", "tag-quality": "100" } }, { "Name": "tag-detail", "PayLoad": { "tag-value": "98998.55", "tag-timestamp": "2018-01-02T16:09:39.000000+00:00", "tag-quality": "100" } } ] } } ] }
i need to move the details records under master as array.. basically first element of the record array (tag master)is parent to next 3 elements(tag details). i tried to use the convert record processor with JSON converters giving correct input and output schemas for files. it didnt convert. it looks like i may have to write JOLT which i need some help with. Thanks in advance!!!
attached files for reference..
Regards, Sai
Created 06-06-2018 06:41 PM
Try this Chain spec in JoltTransformJSON:
[ { "operation": "shift", "spec": { "Records": { "*": { "Name": { "tag-master": { "@2": "Records[&3]" }, "*": { "@2": "Records[#3].PayLoad.Children[]" } } } }, "*": "&" } } ]
It matches tag-master separately, puts it back into the Records array as-is, then for each tag-detail match, puts it into a Children array in the tag-master's PayLoad object.
Created 06-06-2018 06:41 PM
Try this Chain spec in JoltTransformJSON:
[ { "operation": "shift", "spec": { "Records": { "*": { "Name": { "tag-master": { "@2": "Records[&3]" }, "*": { "@2": "Records[#3].PayLoad.Children[]" } } } }, "*": "&" } } ]
It matches tag-master separately, puts it back into the Records array as-is, then for each tag-detail match, puts it into a Children array in the tag-master's PayLoad object.
Created 06-07-2018 08:57 PM
Thank you.
Created 06-11-2018 08:55 PM
Hi @Matt Burgess,
I have another challenge that i need to merge these smaller JSON files in to 1 file. lets say i need to merge 1000 files that i get from above JOLT in to one single json file and load in to HDFS.
how can i do that as the input files will be combined one after other using the MergeContent processor and the resulting file wont be a valid one.
any idea how to solve this.?
Regards,
Sai
Created 06-11-2018 09:04 PM
i think i will try to do this as suggested by Bryan some where else..
You can use MergeContent and set the Delimiter Strategy to "Text" and then enter [ , ] for the header, demarcator, and footer respectively...i think it should work.
Created 06-12-2018 01:47 PM
You're probably better served with MergeRecord, that will output the merged JSON records as an array
Created 06-19-2018 11:31 AM
Hi @Matt Burgess,
Can you please explain how can we do this using mergerecord as I am not able to to do it..
Created 06-19-2018 05:36 PM
I recommend using MergeRecord before the JoltTransformJSON as the Jolt transform should be able to be applied to the whole JSON array (after your smaller JSON objects have been merged). You'll want to use a JsonTreeRecordReader and provide an Avro schema that matches your input data above. mergerecord-example.xml is an example template where I generate data similar to yours, use MergeRecord to bundle them 20 at a time, then run the Jolt spec on it, it includes the associated Avro schema and hopefully all config to get you up and going.