01-03-2019
07:18 AM
@anarasimham Thanks for the quick reply. The reason I was using MergeRecord with the Defragment strategy is that the requirement is to keep all the nested JSON objects extracted from one input file together in a single output file. Using the filename as the correlation attribute, the problem is:
1. Defragment strategy: I don't know upfront how many JSON objects will be extracted from the nested arrays, so I have no value to use for fragment.count.
2. Bin-packing strategy: Again, I don't know a predictable minimum time it will take to finish processing an input file, which I could use as the bin age.
I have multiple JSON files, each with the schema below:
[
{
"field1":"value1",
"nested-array":[
{
"inner-field1":"inner-value1",
"inner-field2":"inner-value2",
...
},
{
"inner-field1":"inner-value3",
"inner-field2":"inner-value4",
...
},
{
"inner-field1":"inner-value5",
"inner-field2":"inner-value6",
...
}
]
},
{
"field1":"value1",
"nested-array":[
{
"inner-field1":"inner-value7",
"inner-field2":"inner-value8",
...
},
{
"inner-field1":"inner-value9",
"inner-field2":"inner-value10",
...
}
]
}
]
I need to write the nested-array objects from ALL JSON files to a single CSV file, like below:
"inner-field1","inner-field2",..
"inner-value1","inner-value2",..
"inner-value3","inner-value4",..
"inner-value5","inner-value6",..
"inner-value7","inner-value8",..
"inner-value9","inner-value10",..
...
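Outside of NiFi, the desired flattening can be sketched in a few lines of Python. This is only an illustration of the target output, assuming the schema shown above; the function name and the two-document sample input are hypothetical, not part of the actual flow.

```python
import csv
import io
import json

def flatten_to_csv(json_documents, fieldnames):
    """Flatten the 'nested-array' objects from every parsed JSON
    document into one CSV string with a single header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, quoting=csv.QUOTE_ALL)
    writer.writeheader()
    for doc in json_documents:       # one parsed JSON array per input file
        for obj in doc:              # top-level objects ("field1", "nested-array")
            for inner in obj["nested-array"]:
                writer.writerow({k: inner.get(k, "") for k in fieldnames})
    return out.getvalue()

# Two documents shaped like the schema in the post.
docs = [
    json.loads('[{"field1":"value1","nested-array":['
               '{"inner-field1":"inner-value1","inner-field2":"inner-value2"}]}]'),
    json.loads('[{"field1":"value1","nested-array":['
               '{"inner-field1":"inner-value7","inner-field2":"inner-value8"}]}]'),
]
print(flatten_to_csv(docs, ["inner-field1", "inner-field2"]))
```

The key point is that all documents share one writer, so the header is emitted once and every nested record lands in the same CSV, regardless of which seed file it came from.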
regards, Hemal
01-02-2019
10:22 AM
Hi All, below is my use case.
Flow:
1. I have multiple zip files and read them from a folder.
2. I use the CompressContent processor to unzip the content, which contains multiple JSON files.
3. Each JSON file is an array of JSON objects; I use the SplitJson processor to extract the individual JSON objects.
4. Each JSON object contains a nested JSON array; I extract each nested array object and write them to a single file using the MergeRecord processor.
MergeRecord is configured with the Defragment strategy, a CSVReader, a CSVRecordSetWriter, and a schema registry, and I update fragment.identifier (using an UpdateAttribute processor before MergeRecord) to the filename, so that all records from a single seed file are kept in one file.
My questions:
1. How do I set fragment.count? Giving a round figure, say 1000, creates multiple files with 1000 records each, but the remainder stays in the queue.
2. How can I get summary stats, such as the number of nested-array records extracted across all JSON files?
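One possible way to get a usable fragment.count upfront is to count the nested-array entries per seed file before any splitting, for example in a scripted processor placed early in the flow. This is only a sketch under that assumption, not part of the original flow; the function name is hypothetical, and the sample document below matches the schema described in the thread.

```python
import json

def nested_record_count(json_text):
    """Total number of 'nested-array' entries across all top-level
    objects in one JSON file -- the value that could be written to
    fragment.count (and summed across files for summary stats)."""
    return sum(len(obj["nested-array"]) for obj in json.loads(json_text))

sample = '''
[
  {"field1": "value1",
   "nested-array": [{"inner-field1": "a"}, {"inner-field1": "b"}, {"inner-field1": "c"}]},
  {"field1": "value1",
   "nested-array": [{"inner-field1": "d"}, {"inner-field1": "e"}]}
]
'''
print(nested_record_count(sample))  # 3 + 2 = 5 for the sample above
```

Summing this count over all processed files would also answer the summary-stats question, since it is exactly the number of nested-array records extracted in total.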