Created 02-07-2019 03:09 PM
Hi, I'm currently working with the DistributedMapCache...
Found this article which gave me an idea of all this, great! https://community.hortonworks.com/articles/71837/working-with-a-nifi-distributedmapcache.html
Now I'm trying to use the DistributedMapCache (DMC) in my NiFi flow without executing the script. I found old questions here, but maybe there are new possibilities in 1.7 or 1.8? Referring to the above article, how can I do the following in NiFi...
1. remove a specific key (like the "remove" command)?
2. get a list of existing keys (like the "keys" command), maybe including their values?
Further questions:
3. How can I clear all content of the DMC at once? I tried disabling the Controller Service for the DMC server, but after re-enabling it the data still exists.
4. In practice, if one wants to use different tables in the DMC, would they all be in one DMC? Or would I configure a DMC server and DMC client service for each table?
5. At the moment I work on a local installation. Does all of this work without difficulty on a NiFi cluster (which will be coming soon, I hope)?
6. Technically, in the configuration of the DMC client service, the hostname and port of the DMC server have to be specified as fixed values. Which one should I choose in a cluster?
If someone has further information on this subject, I would be glad to get it. Thanks all!
Created 02-11-2019 03:20 PM
1-3: The processors that use a DMC client use the DMC in a very specific manner, so they create, read, update, and delete cache entries only as it applies to their own operations. There isn't currently a generic processor that lets you call arbitrary cache API methods; that's what the scripting components are for.
4: We don't have the concept of tables in DMC, only key/value pairs. A table can probably be implemented by namespacing the key, though I'm not sure if the processors you're using support custom keys.
5: The DMC operation in a cluster is very similar to how it works on a local installation, except there is a DMC server created on each node in the cluster. However, a DMC client still has to choose a single host:port to connect to, and the individual DMC servers are not coordinated across the cluster, meaning if you update one, the others don't get that update; they are fully separate at the moment.
6: AFAIK there is no best practice as far as choosing a DMC server to connect to, other than choosing one on a node that tends to be available most often. You basically get individual, isolated instances to choose from. We have other DMC server implementations that possibly support High Availability and/or Data Durability, such as HBase- or Redis-backed solutions. However, neither HBase nor Redis is included with an Apache NiFi distribution; you'd have to bring your own.
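To make the isolation described in points 5 and 6 concrete, here is a minimal sketch in plain Python. It does not use any NiFi API: `NodeCache` and the host names `node1` to `node3` are hypothetical stand-ins for the per-node DMC servers, each modeled as an independent in-memory dict.

```python
class NodeCache:
    """Hypothetical stand-in for one node's DMC server (not NiFi code)."""

    def __init__(self, host):
        self.host = host
        self._entries = {}  # each server keeps its own, unshared entries

    def put(self, key, value):
        self._entries[key] = value

    def get(self, key):
        return self._entries.get(key)


# One independent cache server per cluster node (host names are made up).
servers = {host: NodeCache(host) for host in ("node1", "node2", "node3")}

# The client is configured with exactly one host:port, here node1.
client_target = servers["node1"]
client_target.put("order-42", "seen")

# Only the server the client talks to has the entry; nothing is replicated.
visible = {host: server.get("order-42") for host, server in servers.items()}
print(visible)  # {'node1': 'seen', 'node2': None, 'node3': None}
```

This is why every client that should share cache state needs to point at the same host:port, and why that node's availability matters.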
Created 02-11-2019 02:43 PM
@Matt Burgess
May I ask whether you have some answers for me (especially concerning questions 1 and 2)? Thanks.
Created 02-11-2019 04:05 PM
Hi @Matt Burgess, thanks for your quick and detailed answer!
1-3: I see, the script was not just there to illustrate the DMC; it is NECESSARY for working with it in NiFi. OK, I will use it.
4: So I have to prepend some information to the "Cache Entry Identifier" on PutDMC to identify the entries coming from different "tables". OK, this will work.
5-6: These points I have to clarify with the "techies"...
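The key-prefix idea from point 4 could look roughly like the sketch below. It is plain Python with a dict standing in for the cache, not NiFi code; the `::` separator, the helper names, and the table names are all assumptions chosen for illustration.

```python
SEPARATOR = "::"  # assumed separator; any string that never occurs in table names works


def namespaced_key(table, key):
    """Build a cache key that carries its 'table' as a prefix."""
    if SEPARATOR in table:
        raise ValueError("table name must not contain the separator")
    return f"{table}{SEPARATOR}{key}"


def split_key(cache_key):
    """Recover (table, key) from a namespaced cache key."""
    table, _, key = cache_key.partition(SEPARATOR)
    return table, key


def keys_in_table(cache, table):
    """List the original keys of all entries that belong to one 'table'."""
    prefix = table + SEPARATOR
    return [k[len(prefix):] for k in cache if k.startswith(prefix)]


# A plain dict stands in for the single shared key/value cache.
cache = {}
cache[namespaced_key("customers", "1001")] = "Alice"
cache[namespaced_key("orders", "1001")] = "3 items"
cache[namespaced_key("orders", "1002")] = "1 item"

print(keys_in_table(cache, "orders"))   # ['1001', '1002']
print(split_key("customers::1001"))     # ('customers', '1001')
```

In a flow, the same effect would presumably be achieved by putting the prefix directly into the "Cache Entry Identifier" value, for example a literal table name followed by an attribute reference (the attribute name is up to your flow).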