
Comparing two flowfiles' attributes, one pair at a time

Expert Contributor

Hi,

When we ingest data, we receive a control file for each data file. The control file is a JSON file containing an md5 value.

That md5 hash value must match the hash of the data file we are currently ingesting; otherwise we will not ingest it.

So far I have done the following:

1. Fetched data1.xml and data1_control.json from the SFTP server

2. Used RouteOnAttribute to split the flow in two: one branch for the data1.xml file and one for the control file

3. Used HashContent to get the md5 hash value of the data1.xml file

4. Used EvaluateJsonPath to extract the md5 field from the control file into a flowfile attribute

Now I am stuck. I tried putting the control file's md5 value into PutDistributedMapCache and then using DetectDuplicate, but it would not work.

How can this be solved?

1 ACCEPTED SOLUTION

@Simon Jespersen

Try this approach:

- Fetch only the JSON control files (data1_control.json). Use a file filter regex for this.

- Use EvaluateJsonPath to extract the md5 value into an attribute, hash1.

- Use UpdateAttribute to generate the name of the data file and store it in an attribute, file_to_get. Since you have the control file name (data1_control.json), you can derive the data file name (data1.xml) using NiFi Expression Language.

- In the same flow, fetch the corresponding data file, file_to_get, with a fetch processor. Now you have the content of this file in your flow file.

- Use HashContent to compute the md5 of that content and store it in an attribute, hash2.

- Use RouteOnAttribute to keep only flow files where hash1 equals hash2.
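
To make the logic of the steps above concrete, here is a rough sketch of the same check in plain Python, outside NiFi. It is not NiFi configuration, only an illustration of the pairing and comparison. The function names are hypothetical, and it assumes the control file's JSON key is called md5 and that files follow the data1_control.json / data1.xml naming pattern from this thread.

```python
import hashlib
import json
from pathlib import Path


def data_file_for(control_name: str) -> str:
    """Derive the data file name from the control file name.

    Assumes the naming pattern from the thread:
    data1_control.json -> data1.xml
    """
    suffix = "_control.json"
    if not control_name.endswith(suffix):
        raise ValueError(f"not a control file: {control_name}")
    return control_name[: -len(suffix)] + ".xml"


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_ingest(control_path: Path) -> bool:
    """True when the md5 in the control file matches the data file's md5."""
    control = json.loads(control_path.read_text(encoding="utf-8"))
    hash1 = control["md5"]  # assumed key name in the control JSON
    # The data file is expected next to the control file (file_to_get).
    data_path = control_path.with_name(data_file_for(control_path.name))
    hash2 = md5_of(data_path)
    # Hex digests can differ in letter case, so compare case-insensitively.
    return hash1.lower() == hash2.lower()
```

The point of the design is that everything starts from the control file: because the data file name is derived from it, both hashes end up on the same unit of work and no cache or cross-flowfile lookup is needed.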

I hope this helps


3 REPLIES 3


Expert Contributor

@Abdelkrim Hadjidj This was exactly what I was looking for. Thank you very much for this beautiful and simple approach.


@Simon Jespersen Happy to help 🙂 Thanks