<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question DetectDuplicate is not working as expected in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181855#M144033</link>
    <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am having problems with DetectDuplicate. Either it is not working as expected or I do not know how to configure it. Am I missing something?&lt;/P&gt;&lt;P&gt;Imagine this simple JSON list:&lt;/P&gt;&lt;PRE&gt;[ {
  "ID": 101
}, {
  "ID": 102
}, {
  "ID": 103
}, {
  "ID": 104
}, {
  "ID": 105
}, {
  "ID": 106
}, {
  "ID": 107
}, {
  "ID": 108
}, {
  "ID": 109
}, {
  "ID": 110
} ]
&lt;/PRE&gt;&lt;P&gt;Given the JSON list above, we expect every item to be treated as a non-duplicate by the Distributed Map Cache Server, but that is not what happens.&lt;/P&gt;&lt;P&gt;Here is the DetectDuplicate properties configuration:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20473-11-detect-duplicate-properties.png" style="width: 799px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19196i6CC42ECFAA410CA1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20473-11-detect-duplicate-properties.png" alt="20473-11-detect-duplicate-properties.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When I start the flow, look at what happens:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20474-12-result.png" style="width: 1339px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19197i8B21B5F642712F27/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20474-12-result.png" alt="20474-12-result.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Only the first ID is detected as a non-duplicate, as you can see in the LogAttribute - Non Duplicate data provenance:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20475-13-data-provenance-log-attribute-non-duplicate.png" style="width: 617px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19198i692BAE5F5E0FD87C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20475-13-data-provenance-log-attribute-non-duplicate.png" alt="20475-13-data-provenance-log-attribute-non-duplicate.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;What am I doing wrong?
Am I missing some configuration setting?&lt;/P&gt;&lt;P&gt;Here is the template: &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/20476-detect-duplicate.xml" target="_blank"&gt;detect-duplicate.xml&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Any help will be much appreciated!&lt;/P&gt;&lt;P&gt;Thank you in advance.&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 08:58:29 GMT</pubDate>
    <dc:creator>gabrielfqueiroz</dc:creator>
    <dc:date>2019-08-18T08:58:29Z</dc:date>
    <item>
      <title>DetectDuplicate is not working as expected</title>
      <link>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181855#M144033</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am having problems with DetectDuplicate. Either it is not working as expected or I do not know how to configure it. Am I missing something?&lt;/P&gt;&lt;P&gt;Imagine this simple JSON list:&lt;/P&gt;&lt;PRE&gt;[ {
  "ID": 101
}, {
  "ID": 102
}, {
  "ID": 103
}, {
  "ID": 104
}, {
  "ID": 105
}, {
  "ID": 106
}, {
  "ID": 107
}, {
  "ID": 108
}, {
  "ID": 109
}, {
  "ID": 110
} ]
&lt;/PRE&gt;&lt;P&gt;Given the JSON list above, we expect every item to be treated as a non-duplicate by the Distributed Map Cache Server, but that is not what happens.&lt;/P&gt;&lt;P&gt;Here is the DetectDuplicate properties configuration:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20473-11-detect-duplicate-properties.png" style="width: 799px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19196i6CC42ECFAA410CA1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20473-11-detect-duplicate-properties.png" alt="20473-11-detect-duplicate-properties.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When I start the flow, look at what happens:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20474-12-result.png" style="width: 1339px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19197i8B21B5F642712F27/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20474-12-result.png" alt="20474-12-result.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Only the first ID is detected as a non-duplicate, as you can see in the LogAttribute - Non Duplicate data provenance:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20475-13-data-provenance-log-attribute-non-duplicate.png" style="width: 617px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19198i692BAE5F5E0FD87C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20475-13-data-provenance-log-attribute-non-duplicate.png" alt="20475-13-data-provenance-log-attribute-non-duplicate.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;What am I doing wrong?
Am I missing some configuration setting?&lt;/P&gt;&lt;P&gt;Here is the template: &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/20476-detect-duplicate.xml" target="_blank"&gt;detect-duplicate.xml&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Any help will be much appreciated!&lt;/P&gt;&lt;P&gt;Thank you in advance.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 08:58:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181855#M144033</guid>
      <dc:creator>gabrielfqueiroz</dc:creator>
      <dc:date>2019-08-18T08:58:29Z</dc:date>
    </item>
    <item>
      <title>Re: DetectDuplicate is not working as expected</title>
      <link>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181856#M144034</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/20327/gabrielfqueiroz.html" nodeid="20327"&gt;@Gabriel Queiroz&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;If you'd like to use the ID FlowFile attribute as the DetectDuplicate processor's 'Cache Entry Identifier', you need to use &lt;A href="https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#structure"&gt;NiFi Attribute Expression Language&lt;/A&gt; syntax. You currently have it configured as '$ID', but it needs to be '${ID}' (wrap the attribute name in curly braces). Without the braces, '$ID' is treated as literal text, so every FlowFile gets the same cache key and only the first one is reported as a non-duplicate.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jul 2017 09:16:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181856#M144034</guid>
      <dc:creator>kkawamura</dc:creator>
      <dc:date>2017-07-18T09:16:37Z</dc:date>
    </item>
    <item>
      <title>Re: DetectDuplicate is not working as expected</title>
      <link>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181857#M144035</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/3908/kkawamura.html" nodeid="3908" target="_blank"&gt;@kkawamura&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;You are saving me again!&lt;/P&gt;&lt;P&gt;In this question &lt;A href="https://community.hortonworks.com/questions/110551/how-to-remove-a-cache-entry-identifier-from-distri.html#comment-113808" rel="nofollow noopener noreferrer" target="_blank"&gt;https://community.hortonworks.com/questions/110551/how-to-remove-a-cache-entry-identifier-from-distri.html#comment-113808&lt;/A&gt; you sent an example &lt;A href="https://gist.github.com/ijokarumawak/14d560fec5a052b3a157b38a11955772" rel="nofollow noopener noreferrer" target="_blank"&gt;https://gist.github.com/ijokarumawak/14d560fec5a052b3a157b38a11955772&lt;/A&gt;. Your example is here in my NiFi, and I looked at it several times... but I didn't pay attention to this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="20487-kkawamura-remove-cache-example.png" style="width: 800px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19195i7862E66C4F98C318/image-size/medium?v=v2&amp;amp;px=400" role="button" title="20487-kkawamura-remove-cache-example.png" alt="20487-kkawamura-remove-cache-example.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;My fault! I'm ashamed!&lt;/P&gt;&lt;P&gt;Thank you very much again &lt;A rel="user" href="https://community.cloudera.com/users/3908/kkawamura.html" nodeid="3908" target="_blank"&gt;@kkawamura&lt;/A&gt;!&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 08:58:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/181857#M144035</guid>
      <dc:creator>gabrielfqueiroz</dc:creator>
      <dc:date>2019-08-18T08:58:10Z</dc:date>
    </item>
    <item>
      <title>Re: DetectDuplicate is not working as expected</title>
      <link>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/294864#M217476</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am having problems with DetectDuplicate. Either it is not working as expected or I do not know how to configure it. Am I missing something?&lt;/P&gt;&lt;P&gt;Imagine this simple JSON document:&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt;"data": {&lt;BR /&gt;"alertaID": "xxxxx",&lt;BR /&gt;"app": "BSS",&lt;BR /&gt;"node": "Weblogic",&lt;BR /&gt;"severity": "critical",&lt;BR /&gt;"type": "com.bea/CM49-Server/CM49-Server/JVMRuntime/HeapFreePercent",&lt;BR /&gt;"hashField1": "BSS_Pcriticalcom.bea/CM49-Server/CM49-Server/JVMRuntime/HeapFreePercentWeblogic",&lt;BR /&gt;"hashField2": "criticalWeblogic",&lt;BR /&gt;"hashField3": "criticalBSS_P",&lt;BR /&gt;}&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;Given the JSON above, we expect every item to be treated as a non-duplicate by the Redis Distributed Map Cache Client, based on the Cache Entry Identifier. But that is not what happens.&lt;/P&gt;&lt;P&gt;Here is the DetectDuplicate properties configuration:&lt;/P&gt;&lt;P&gt;Cache Entry Identifier:&lt;/P&gt;&lt;P&gt;$.data.app::$.data.severity::$.data.type::$.data.node&lt;/P&gt;&lt;P&gt;Age Off Duration: 5mins&lt;/P&gt;&lt;P&gt;I expect inputs with the same values for data.app, data.severity, data.type, and data.node to be considered duplicates until the Age Off Duration expires, and any input with a different value in any of those fields to be considered a non-duplicate.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Apr 2020 11:37:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/294864#M217476</guid>
      <dc:creator>PoonamB</dc:creator>
      <dc:date>2020-04-27T11:37:35Z</dc:date>
    </item>
    <item>
      <title>Re: DetectDuplicate is not working as expected</title>
      <link>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/294876#M217480</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/77320"&gt;@PoonamB&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to share details specific to your environment that could help others provide a more accurate answer to your question.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Apr 2020 13:08:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/DetectDuplicate-is-not-working-as-expected/m-p/294876#M217480</guid>
      <dc:creator>cjervis</dc:creator>
      <dc:date>2020-04-27T13:08:23Z</dc:date>
    </item>
  </channel>
</rss>

