How to Remove a Cache Entry Identifier From DistributedMapCacheServer

New Contributor

Hi all!

NiFi newbie attacks again!

Today I have a question about using DistributedMapCacheServer.

We have the following scenario:

- We will have a lot of incoming data, but we don't want to process two (or more) objects with the same identifier at the same time. If a second object with the same identifier arrives in the data flow (as a FlowFile), we have to discard it until the first one has run through the whole process.

Then we found the DetectDuplicate processor, and it's working perfectly (as you can see in the image below; the template is here -> detect-duplicate-v1.xml).

16756-flow-1.png

But the problem is that after the whole process has executed, we have to free the identifier from the DistributedMapCacheServer.

We know that the DetectDuplicate processor has a property to clear the cache (Age Off Duration), but it uses time instead of an event to clear the cache, so that property doesn't suit our use case.
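
To make the mismatch concrete, here is a small in-memory model in Python of a time-based age-off. It is only a sketch of the behavior described above, not NiFi code: the `AgeOffCache` class, its `seen_before` method, and the injected clock are hypothetical names made up for illustration.

```python
class AgeOffCache:
    """Toy model of duplicate detection with a time-based age-off (not NiFi code)."""

    def __init__(self, age_off_seconds, clock):
        self.age_off_seconds = age_off_seconds
        self.clock = clock      # callable returning the current time in seconds
        self.entries = {}       # identifier -> time at which it was cached

    def seen_before(self, identifier):
        """Return True if the identifier counts as a duplicate right now."""
        now = self.clock()
        cached_at = self.entries.get(identifier)
        if cached_at is not None and now - cached_at < self.age_off_seconds:
            return True         # still cached: treated as a duplicate
        # New identifier, or the old entry has aged off: cache it (again).
        self.entries[identifier] = now
        return False


# A fake clock, so the example does not need to sleep.
current_time = [0]
cache = AgeOffCache(age_off_seconds=60, clock=lambda: current_time[0])

first = cache.seen_before("order-42")          # first arrival, processing starts
current_time[0] = 10
during = cache.seen_before("order-42")         # arrives while the first is in flight
current_time[0] = 61
after_age_off = cache.seen_before("order-42")  # let through only because time passed
```

In this model the third arrival is let through whether or not the first object has finished processing, which is exactly why a timer alone does not fit the scenario.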

16760-detectduplicate-properties.png

Now we are trying to find a way to remove the identifier from the DistributedMapCacheServer, as in the flow below.

16757-flow-2-ideal-flow.png

I searched the NiFi docs and the internet and didn't find any processor that removes a cached identifier. The only thing I found was @Matt Burgess's article https://community.hortonworks.com/articles/71837/working-with-a-nifi-distributedmapcache.html but it uses a script and I don't know how to use it.

Am I missing something?

Any help will be much appreciated!

1 ACCEPTED SOLUTION

Re: How to Remove a Cache Entry Identifier From DistributedMapCacheServer

Hello @Gabriel Queiroz

I'm surprised to learn that there's no existing processor that removes a key from the distributed map cache. Would you submit a JIRA issue to request that functionality, if possible?

In the meantime, if you encounter such shortcomings, you can in most cases address them by writing a custom processor with ExecuteScript or InvokeScriptedProcessor. Those processors let you write a custom processor using your favorite scripting engine.

I've written an example that uses Groovy to remove a key from the distributed map cache. I think it will work for your use case.

https://gist.github.com/ijokarumawak/14d560fec5a052b3a157b38a11955772
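
The idea behind such a script, removing the key when the flow finishes rather than when a timer fires, can be modeled in a few lines of Python. This is a sketch under assumptions, not the linked gist and not NiFi's API: `InFlightCache`, `try_claim`, and `release` are hypothetical names, and a plain dict stands in for the DistributedMapCacheServer.

```python
class InFlightCache:
    """Toy stand-in for a map cache that tracks identifiers currently in flight."""

    def __init__(self):
        self._keys = {}

    def try_claim(self, identifier):
        """Store the identifier if absent; True means the caller may process it."""
        if identifier in self._keys:
            return False                 # already in flight: discard the duplicate
        self._keys[identifier] = True
        return True

    def release(self, identifier):
        """Remove the identifier; True if it was actually cached."""
        return self._keys.pop(identifier, None) is not None


cache = InFlightCache()
results = [
    cache.try_claim("order-42"),   # first object: accepted
    cache.try_claim("order-42"),   # same identifier while in flight: rejected
    cache.release("order-42"),     # end of the flow: the identifier is freed
    cache.try_claim("order-42"),   # the next object with that identifier: accepted
    cache.release("unknown-id"),   # releasing a key that was never cached
]
```

In the real flow, DetectDuplicate plays the role of `try_claim` at the start, and the scripted processor plays the role of `release` as the last step.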

Re: How to Remove a Cache Entry Identifier From DistributedMapCacheServer

New Contributor

Hi @kkawamura,

I updated my template using your solution and it's working perfectly!

Here is the template -> detect-duplicate-v2-with-remove-cache.xml

How can I submit a JIRA?

Thank you again!

Re: How to Remove a Cache Entry Identifier From DistributedMapCacheServer

Hi @Gabriel Queiroz

To submit a JIRA, please go to the login page and sign up for an account. Then you'll see a red 'Create' button.

https://issues.apache.org/jira/login.jsp

19381-jira-sign-up.png

Re: How to Remove a Cache Entry Identifier From DistributedMapCacheServer

New Contributor

Great, thank you very much!

Re: How to Remove a Cache Entry Identifier From DistributedMapCacheServer

New Contributor

Hi @Gabriel Queiroz

I have created a processor that removes the entry from the cache on an event basis. Let me know if I can add this to NiFi.

Thanks,

Ghanashyam