Created 10-04-2017 01:01 PM
Hi,
When we ingest data, we receive a control file for each data file. The control file is a JSON file containing an md5 value.
The md5 hash value must match the hash of the file we are currently ingesting; otherwise we will not ingest it.
So far I have done the following:
1. Fetched data1.xml and data1_control.json from the SFTP server
2. Used RouteOnAttribute to split the flow in two: one for the data1.xml file and one for the control file
3. Used HashContent to get the md5 hash value of the data1.xml file
4. Used EvaluateJsonPath to get the md5 value into a flowfile attribute
Now I am stuck. I tried putting the md5 value from my control file into PutDistributedMapCache and used DetectDuplicate, but it would not work.
How can this be solved?
Created 10-08-2017 02:22 PM
Try this approach:
- Fetch only the JSON files (data1_control.json). Use a filter regex for this.
- Use EvaluateJsonPath to get the md5 into an attribute hash1.
- Use UpdateAttribute to generate the name of the data file and store it in an attribute file_to_get. Since you have the control file name (data1_control.json), you can generate the data file name (data1.xml) using the NiFi Expression Language.
- In the same flow, fetch the corresponding data file file_to_get with a fetch processor. Now you have the content of this file in your flow file, and the hash1 attribute is still attached to it.
- Use HashContent to get the md5 of the content and store it in attribute hash2.
- Use RouteOnAttribute to keep only the flow files where hash1 equals hash2.
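To make the logic of these steps concrete, here is a minimal sketch of the same check in plain Python, outside NiFi. It is only an illustration of the comparison, not NiFi configuration. The function names are hypothetical, and it assumes the control file stores the hash under a top-level key called "md5", which the thread does not confirm.

```python
import hashlib
import json
from pathlib import Path

CONTROL_SUFFIX = "_control.json"  # naming convention taken from the example file names


def data_file_for(control_name: str) -> str:
    """Derive the data file name from the control file name.

    data1_control.json -> data1.xml (mirrors the UpdateAttribute step).
    """
    if not control_name.endswith(CONTROL_SUFFIX):
        raise ValueError(f"not a control file: {control_name}")
    return control_name[: -len(CONTROL_SUFFIX)] + ".xml"


def md5_of(path: Path) -> str:
    """Compute the md5 hex digest of a file's content (mirrors HashContent)."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_ingest(control_path: Path) -> bool:
    """Return True only when the control file's md5 matches the data file."""
    # Assumption: the hash is stored under the key "md5" (mirrors EvaluateJsonPath).
    hash1 = json.loads(control_path.read_text())["md5"]
    data_path = control_path.with_name(data_file_for(control_path.name))
    hash2 = md5_of(data_path)
    # Compare case-insensitively, since hex digests may be upper or lower case.
    return hash1.lower() == hash2.lower()
```

Calling should_ingest(Path("data1_control.json")) plays the role of the final RouteOnAttribute step: a data file whose hash does not match is simply not ingested. Starting from the control file is what removes the need for a cache, because both hashes end up on the same unit of work.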
I hope this helps
Created 10-10-2017 02:15 PM
@Abdelkrim Hadjidj This was exactly what I was looking for. Thank you very much for this beautiful and simple approach.
Created 10-10-2017 02:21 PM
@Simon Jespersen happy to help 🙂 Thanks