Created on 07-17-2017 02:59 PM - edited 08-18-2019 01:58 AM
Hi all,
I am having problems with DetectDuplicate. It is not working as expected, or I do not know how to configure it correctly. Am I missing something?
Imagine this simple JSON list:
[ { "ID": 101 }, { "ID": 102 }, { "ID": 103 }, { "ID": 104 }, { "ID": 105 }, { "ID": 106 }, { "ID": 107 }, { "ID": 108 }, { "ID": 109 }, { "ID": 110 } ]
Looking at the above JSON list, we expect every item to be stored as a non-duplicate entry in the Distributed Map Cache Server. But that is not what is happening.
Here is the DetectDuplicate properties configuration:
When I start the process flow, look at what happens:
Only the first ID is detected as a non-duplicate, as you can see in the LogAttribute - Non Duplicate data provenance:
What am I doing wrong? Am I missing some configuration?
Here is the template: detect-duplicate.xml
Any help will be much appreciated!
Thank you in advance.
Created 07-18-2017 02:16 AM
Hi @Gabriel Queiroz,
If you'd like to use the ID FlowFile attribute in DetectDuplicate's 'Cache Entry Identifier' property, you need to use NiFi Attribute Expression Language syntax. Currently you have configured it as '$ID', but you need '${ID}' (wrap the attribute name in curly brackets).
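The difference between the two values can be sketched in plain Python (this is not NiFi's actual implementation; the helper names below are hypothetical, for illustration only):

```python
# Minimal sketch (plain Python, not NiFi internals) of why '$ID' and '${ID}'
# behave differently in DetectDuplicate's 'Cache Entry Identifier' property.

import re

def evaluate(expression, attributes):
    # Rough stand-in for NiFi Expression Language: replace ${name} with the
    # FlowFile attribute value; anything else is treated as literal text.
    return re.sub(r"\$\{(\w+)\}", lambda m: attributes.get(m.group(1), ""), expression)

def detect_duplicates(expression, flowfiles):
    # Mimics DetectDuplicate: the first FlowFile whose evaluated key is unseen
    # is a non-duplicate; later FlowFiles with the same key are duplicates.
    seen = set()
    results = []
    for attrs in flowfiles:
        key = evaluate(expression, attrs)
        results.append("duplicate" if key in seen else "non-duplicate")
        seen.add(key)
    return results

flowfiles = [{"ID": str(i)} for i in range(101, 111)]

# '$ID' is a literal string, so every FlowFile maps to the same cache key:
# only the very first one comes out as a non-duplicate.
literal = detect_duplicates("$ID", flowfiles)

# '${ID}' resolves per FlowFile, so each distinct ID is a non-duplicate.
expression_language = detect_duplicates("${ID}", flowfiles)
```

This matches the symptom in the question: with the literal '$ID', only the first item is routed to non-duplicate.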
Created on 07-18-2017 03:27 PM - edited 08-18-2019 01:58 AM
Hi @kkawamura,
you are saving me again!
In this question https://community.hortonworks.com/questions/110551/how-to-remove-a-cache-entry-identifier-from-distr... you sent an example https://gist.github.com/ijokarumawak/14d560fec5a052b3a157b38a11955772 and that example is here in my NiFi. I looked at it several times... but I didn't pay attention to this:
My fault! I'm ashamed!
Thank you very much again @kkawamura!
Created 04-27-2020 04:37 AM
Hi All,
I am having problems with DetectDuplicate. It is not working as expected, or I do not know how to configure it correctly. Am I missing something?
Imagine this simple JSON document:
{
"data": {
"alertaID": "xxxxx",
"app": "BSS",
"node": "Weblogic",
"severity": "critical",
"type": "com.bea/CM49-Server/CM49-Server/JVMRuntime/HeapFreePercent",
"hashField1": "BSS_Pcriticalcom.bea/CM49-Server/CM49-Server/JVMRuntime/HeapFreePercentWeblogic",
"hashField2": "criticalWeblogic",
"hashField3": "criticalBSS_P",
}
}
Looking at the above JSON, we expect each distinct record to be treated as a non-duplicate in the Redis Distributed Map Cache Client, based on the cache entry identifier. But that is not what is happening.
Here is the DetectDuplicate properties configuration:
Cache Entry Identifier:
$.data.app::$.data.severity::$.data.type::$.data.node
Age Off Duration: 5 mins
I expect inputs with the same values for data.app, data.severity, data.type, and data.node to be considered duplicates until the Age Off Duration expires, and any input with a different value for any of those fields to be considered a non-duplicate.
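The expected behavior described above can be sketched in plain Python (this is not NiFi or Redis code; the class and method names are hypothetical, chosen just for this illustration):

```python
# Hedged sketch of the desired behavior: a record is a duplicate when its
# composite key (app, severity, type, node) was already seen within the
# age-off window; after the window expires it counts as new again.

import time

class TtlDuplicateDetector:
    def __init__(self, age_off_seconds):
        self.age_off = age_off_seconds
        self.cache = {}  # composite key -> timestamp of first sighting

    def is_duplicate(self, record, now=None):
        now = time.time() if now is None else now
        data = record["data"]
        # Composite key built like the Cache Entry Identifier in the question.
        key = "::".join([data["app"], data["severity"], data["type"], data["node"]])
        seen_at = self.cache.get(key)
        if seen_at is not None and now - seen_at < self.age_off:
            return True   # same key within the age-off window -> duplicate
        self.cache[key] = now  # new key, or old entry aged off -> non-duplicate
        return False
```

For example, with a 5-minute (300-second) age-off, a second record with the same four field values within that window would be a duplicate, while one arriving after the window would be a non-duplicate again.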
Created 04-27-2020 06:08 AM
As this is an older post you would have a better chance of receiving a resolution by starting a new thread. This will also provide the opportunity to provide details specific to your environment that could aid others in providing a more accurate answer to your question.