Support Questions

CRISSAEGRIM · ‎02-17-2023

My use-case is:

1) Have API credentials

2) Use UpdateAttribute to set (1) the schema and (2) the S3 bucket/location (my list of reports)

3) Query API endpoint for report

4) API endpoint paginates and gets more records

5) Call MergeRecord

6) Save to S3

Since steps 3 through 6 are the same for every report, I'm re-using the processors as shown below (screenshot). My problem is that (5) MergeRecord will try to merge records with different schemas together, which is obviously a problem.

How can I restructure this? I'd like to re-use processors as much as possible, but still be able to add more schemas as my needs evolve.

CRISSAEGRIM · ‎02-21-2023

I used the Correlation Attribute Name property, setting it to `${schema.name}`, and it's working as expected.

Quote from documentation:

> If specified, two FlowFiles will be binned together only if they have the same value for this Attribute. If not specified, FlowFiles are bundled by the order in which they are pulled from the queue.
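To make the quoted behaviour concrete, here is a minimal Python sketch of binning by a correlation attribute. It is only an illustration, not NiFi's actual implementation: it assumes each FlowFile carries a `schema.name` attribute (set by UpdateAttribute in step 2), the `bin_by_attribute` helper and the `orders`/`users` report names are hypothetical, and it ignores bin size and age limits.

```python
from collections import defaultdict


def bin_by_attribute(flowfiles, correlation_attribute=None):
    """Group FlowFile-like dicts into bins (hypothetical helper).

    With a correlation attribute, two FlowFiles share a bin only when they
    have the same value for that attribute. Without one, everything lands
    in a single bin in the order it was pulled from the queue.
    """
    bins = defaultdict(list)
    for ff in flowfiles:
        if correlation_attribute is None:
            key = None
        else:
            key = ff["attributes"].get(correlation_attribute)
        bins[key].append(ff)
    return dict(bins)


# Hypothetical queue: pages from two different reports, interleaved.
queue = [
    {"attributes": {"schema.name": "orders"}, "records": [{"id": 1}]},
    {"attributes": {"schema.name": "users"}, "records": [{"name": "a"}]},
    {"attributes": {"schema.name": "orders"}, "records": [{"id": 2}]},
]

# Bin on schema.name, then merge the records inside each bin.
bins = bin_by_attribute(queue, "schema.name")
merged = {
    name: [record for ff in ffs for record in ff["records"]]
    for name, ffs in bins.items()
}
```

With the attribute set, `merged` holds one record list per schema name, so the shared MergeRecord step never mixes reports; calling `bin_by_attribute(queue)` without it puts all three FlowFiles into a single bin.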

MergeRecord based on schema; only merge records of same schema