Created on 08-22-2018 03:50 AM - edited 08-17-2019 05:53 PM
Hi
I have a question about checking whether a URL has already been called by the InvokeHTTP processor.
Below is the NiFi data flow:
Step 1: GetHTTP
Step 2: SplitJson on $.checks
Step 3: EvaluateJsonPath on $.link
Step 4: InvokeHTTP on the extracted $.link
Step 5: PutHDFS and PublishKafkaRecord
The issue I have is that when I first call the GetHTTP processor, the JSON returned looks like this:
First Call Returns
{"storeId": "136678",
"dob": "20180122",
"checks": [
{
"id": "20971531",
"printableId": "80001",
"marker": 636704886835750777,
"link": "https://abcd.com/136678/20180122/20971531"
} ]}
The next time I call it, new information is appended to the bottom of the file.
Second Call Returns
{"storeId": "136678",
"dob": "20180122",
"checks": [
{
"id": "20971531",
"printableId": "80001",
"marker": 636704886835750777,
"link": "https://abcd.com/136678/20180122/20971531"
},
{
"id": "20971535",
"printableId": "10001",
"marker": 636704886835789652,
"link": "https://abcd.com/136678/20180122/20971535"
}
]}
Issue:
The second call invokes the link from the first call again, even though that link has already been called and its result placed in HDFS and published to the Kafka record queue.
Question:
Is there any way to check whether the link from the first call has already been invoked successfully, so that it is not invoked again in the second call?
Hope that made sense
Many thanks
Tim
Created on 09-24-2018 12:21 PM - edited 08-17-2019 05:53 PM
Hi All
Being new to NiFi, it has taken me a while to come up with a solution to this issue. Seeing as there have been a few views on this question, I thought I would update members on what I have come up with and see if there is any feedback on how I could improve the solution regarding design and robustness.
There are two parts to the design:
Part one: Create a logging solution
Part two: Invoke the URL from the new logging solution
Part One: Create a logging solution
When I first call the master URL I get the JSON file below:
{"storeId": "136678", "dob": "20180122", "checks": [{"id": "20971531", "printableId": "80001", "IsClosed": "true", "marker": 636704886835750777, "link": "https://abcd.com/136678/20180122/20971531"}]}
Part One Steps
Resulting JSON file
{"Link": "https://abcd.com/136678/20180122/20971531", "PrintableId": "80001", "StoreId": "136678", "DOB": "20180122", "ProcessedFlag": "N"}
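The Part One transformation can be sketched outside NiFi as follows. This is only an illustration in Python, not the actual flow: the function name build_link_records is made up for the example, and the file name pattern <storeId>_<dob>_<printableId>.json is inferred from the file names used later in this post.

```python
import json


def build_link_records(master_response: dict) -> dict:
    """Return {log file name: log record}, one entry per check (hypothetical helper)."""
    store_id = master_response["storeId"]
    dob = master_response["dob"]
    records = {}
    for check in master_response["checks"]:
        # Assumed naming pattern: <storeId>_<dob>_<printableId>.json
        file_name = f"{store_id}_{dob}_{check['printableId']}.json"
        records[file_name] = {
            "Link": check["link"],
            "PrintableId": check["printableId"],
            "StoreId": store_id,
            "DOB": dob,
            "ProcessedFlag": "N",  # the link has not been invoked yet
        }
    return records


first_call = {
    "storeId": "136678",
    "dob": "20180122",
    "checks": [
        {
            "id": "20971531",
            "printableId": "80001",
            "marker": 636704886835750777,
            "link": "https://abcd.com/136678/20180122/20971531",
        }
    ],
}

for name, record in build_link_records(first_call).items():
    print(name, json.dumps(record))
```

Keying each record by a name derived from the check itself is what makes the later "skip if the file already exists" behaviour possible.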
Part Two Steps
The file 136678_20180122_80001.json in the directory apps/LinksProcessed now holds the record below, and the detail transaction JSON file has been placed in the HDFS directory apps/SalesTransactions.
{"Link": "https://abcd.com/136678/20180122/20971531", "PrintableId": "80001", "StoreId": "136678", "DOB": "20180122", "ProcessedFlag": "Y"}
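The Part Two logic (invoke the link, then flip the flag from N to Y) might look roughly like this in Python. It is a sketch under assumptions: process_link and fake_fetch are hypothetical names, and fake_fetch stands in for the real HTTP call so the example runs offline.

```python
from typing import Callable


def process_link(record: dict, fetch: Callable[[str], str]) -> tuple:
    """Invoke the link at most once; return (updated log record, detail payload or None)."""
    if record["ProcessedFlag"] == "Y":
        return record, None  # already invoked, so nothing to do
    detail = fetch(record["Link"])  # in the flow this is the InvokeHTTP step
    updated = {**record, "ProcessedFlag": "Y"}  # copy, leaving the input untouched
    return updated, detail


def fake_fetch(url: str) -> str:
    # Hypothetical stand-in for the HTTP request
    return f'{{"fetchedFrom": "{url}"}}'


record = {
    "Link": "https://abcd.com/136678/20180122/20971531",
    "PrintableId": "80001",
    "StoreId": "136678",
    "DOB": "20180122",
    "ProcessedFlag": "N",
}

updated, detail = process_link(record, fake_fetch)
again, nothing = process_link(updated, fake_fetch)
print(updated["ProcessedFlag"], detail, nothing)
```

The second call returns no payload because the record is already flagged, which is the behaviour wanted for links handled in an earlier pass.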
What happens when we call the master URL again?
We get this data:
{"storeId": "136678", "dob": "20180122", "checks": [{"id": "20971531", "printableId": "80001", "IsClosed": "true", "marker": 636704886835750777, "link": "https://abcd.com/136678/20180122/20971531"}, {"id": "20971531", "printableId": "80001", "IsClosed": "true", "marker": 636704886835750777, "link": "https://abcd.com/136678/20180122/20971531"}, {"id": "20971531", "printableId": "80001", "IsClosed": "true", "marker": 636704886835750777, "link": "https://abcd.com/136678/20180122/20971531"}]}
Two new transactions have been appended to the file (printableId 10001 and 10002 in the listings below), so we now have three transactions, one of which we processed in the first pass.
Part one
This places two new JSON files in the apps/LinksProcessed directory. The first file is not updated, nor does PutHDFS raise an error, because Conflict Resolution Strategy is set to ignore. Therefore apps/LinksProcessed will look like this:
apps/LinksProcessed/136678_20180122_80001.json
apps/LinksProcessed/136678_20180122_10001.json
apps/LinksProcessed/136678_20180122_10002.json
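The effect of the ignore setting described above can be imitated on a local filesystem. The sketch below is an analogy in Python, not HDFS or NiFi code: put_if_absent is a hypothetical helper, and a temporary directory stands in for the log directory.

```python
import json
import tempfile
from pathlib import Path


def put_if_absent(directory: Path, file_name: str, record: dict) -> bool:
    """Write the record unless the file already exists; return True if it was written."""
    path = directory / file_name
    try:
        # Mode "x" refuses to open an existing file, so an earlier log file is never overwritten
        with open(path, "x", encoding="utf-8") as handle:
            json.dump(record, handle)
        return True
    except FileExistsError:
        return False  # comparable to "ignore": skip quietly, no error raised


with tempfile.TemporaryDirectory() as tmp:
    links_dir = Path(tmp)
    first_pass = ["136678_20180122_80001.json"]
    second_pass = [
        "136678_20180122_80001.json",
        "136678_20180122_10001.json",
        "136678_20180122_10002.json",
    ]
    for name in first_pass:
        put_if_absent(links_dir, name, {"ProcessedFlag": "N"})
    # Only names that did not exist yet are written on the second pass
    written = [n for n in second_pass if put_if_absent(links_dir, n, {"ProcessedFlag": "N"})]
    print(written)
```

On the second pass only the two new names are written; the file from the first pass is left exactly as it was.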
Part two
The ListHDFS processor only lists files added to the directory since its last run. Therefore only the files 136678_20180122_10001.json and 136678_20180122_10002.json (the two new ones) are picked up. This means that only the URLs for these transactions are invoked, and not the first one, which was already processed in the first pass.
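As a rough mental model of this incremental listing, the Python sketch below remembers the newest modification time it has seen and returns only files newer than that. This is an assumption made for illustration, not NiFi's actual implementation; IncrementalLister and the integer timestamps are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class IncrementalLister:
    """Toy model of an incremental listing that remembers the newest timestamp seen."""
    last_seen: int = -1

    def list_new(self, files: dict) -> list:
        # files maps file name -> modification time (any increasing integer)
        new_files = sorted(name for name, mtime in files.items() if mtime > self.last_seen)
        if files:
            self.last_seen = max(self.last_seen, max(files.values()))
        return new_files


lister = IncrementalLister()
directory = {"136678_20180122_80001.json": 100}
first_listing = lister.list_new(directory)

# Two new log files arrive after the second call to the master URL
directory["136678_20180122_10001.json"] = 200
directory["136678_20180122_10002.json"] = 201
second_listing = lister.list_new(directory)

print(first_listing)
print(second_listing)
```

In this model a file that is rewritten gets a newer timestamp and would be listed again, so it matters that files from earlier passes are left untouched.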
Resulting HDFS contents
Invoke URL JSON files:
apps/LinksProcessed/136678_20180122_80001.json
apps/LinksProcessed/136678_20180122_10001.json
apps/LinksProcessed/136678_20180122_10002.json
Detail sales transaction JSON files:
apps/SalesTransactions/136678_20180122_80001.json
apps/SalesTransactions/136678_20180122_10001.json
apps/SalesTransactions/136678_20180122_10002.json
I hope that is of use to people. As I said, I am new to NiFi, so I am happy to receive any feedback that would improve the solution.
Tim