Member since 05-18-2017 · 10 Posts · 1 Kudos Received · 0 Solutions
10-03-2017
07:28 PM
Hello @Andy LoPresto, thank you very much for the detailed answer.
1. Got it. I will try the decrypt part and verify that.
2. I understand that it is not the recommended practice, but yes, this is what I was looking for. I will re-evaluate and raise a JIRA requesting a dynamic property for this.
3. What I meant was that the key is derived/decrypted using standard PBE, and that derived key is then used for encryption/decryption.
4. Yes, I am able to use EncryptContent successfully. It is just that, since the data is shared, I have to make sure I replicate the same logic across applications.
Thanks & Regards, Prakash
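To make item 3 concrete, this is roughly the shape of the key derivation in Java; the PBKDF2WithHmacSHA256 algorithm, salt handling, iteration count, and key length below are placeholders for illustration, not our actual PBE parameters:

```java
// Sketch only: derive an AES key from a passphrase via password-based key derivation.
// The algorithm name, iteration count, and key length are assumptions for illustration.
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class PbeKeySketch {
    public static SecretKeySpec deriveKey(char[] passphrase, byte[] salt) throws Exception {
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        PBEKeySpec spec = new PBEKeySpec(passphrase, salt, 65_536, 256); // iterations, key bits
        byte[] keyBytes = factory.generateSecret(spec).getEncoded();
        return new SecretKeySpec(keyBytes, "AES"); // this derived key then drives AES encryption/decryption
    }
}
```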
10-02-2017
04:28 PM
Hello, we are trying to use the EncryptContent processor to encrypt and decrypt PII data. We already do this encryption in other applications using Java, and we are trying to incorporate the same logic in NiFi. We are able to get EncryptContent working, but we are not able to replicate the exact logic we use in the other applications, and we need to stay in sync because we share the data. In the other applications we use AES/CBC/PKCS5Padding keyed encryption with an Initialization Vector, and the secret key used is derived using standard PBE. I would like to know how to replicate this in NiFi: how do I pass the Initialization Vector? Please help. NOTE: I have this working with ExecuteScript, and I am also able to use EncryptContent with the raw key hex value without a KDF. Thanks, Prakash
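For context, this is roughly the shape of what the other applications do in Java; the IV handling shown and the PBE-derived key are placeholders for illustration, not our exact implementation:

```java
// Sketch of the AES/CBC/PKCS5Padding flow with an explicit IV and a PBE-derived key.
// Decryption mirrors this: init with DECRYPT_MODE and the same key and IV.
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class AesCbcSketch {
    public static byte[] encrypt(SecretKey pbeDerivedKey, byte[] plaintext) throws Exception {
        byte[] iv = new byte[16];                 // AES block size
        new SecureRandom().nextBytes(iv);         // in practice the IV travels with the ciphertext
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, pbeDerivedKey, new IvParameterSpec(iv));
        return cipher.doFinal(plaintext);
    }
}
```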
06-14-2017
09:48 PM
@Matt Clarke Follow-up question: we have 10 partitions in Kafka and 9 NiFi nodes, and we are deciding on the best way to configure this. What would help us here is knowing how NiFi scales. Does it try to accommodate consumers from all the nodes (i.e. distribute them), or does it spawn the resource on the same node because it is available? From what happened, it does look like the latter, but I just wanted to know if that is how it is intended to behave. Would you suggest we configure one concurrent task per node and leave it at that?
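For reference, assuming the total consumer count is simply nodes × concurrent tasks per node: 9 nodes × 1 task = 9 consumers for 10 partitions, so one consumer would own two partitions; 9 nodes × 2 tasks = 18 consumers, of which 8 would sit idle, since Kafka assigns at most one consumer per partition within a group.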
06-14-2017
05:09 PM
Thanks much, Matt, for the quick response. We do not have 27 partitions, and what you have mentioned above makes sense. We are now looking at modifying the configuration to see how things behave. I will update this thread if we still see the issue after making the changes based on your response.
06-13-2017
09:03 PM
As part of our application, we consume events from Kafka and transform the data before sending the events to a downstream system. While running load as part of performance testing, we noticed that we are not getting an even distribution across all the nodes through ConsumeKafka. We have 9 nodes in the cluster. I have attached a document with snapshots of the Kafka properties, the count of events on all nodes from Splunk, and the FlowFile status history from ConsumeKafka in NiFi. I am trying to understand the behaviour here. We have also noticed the error below intermittently:

o.a.n.p.kafka.pubsub.ConsumeKafka
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed due to group rebalance

1) Why is the data not distributed across the nodes?
2) Why do we see the error intermittently?
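For reference, a sketch (using the plain Kafka consumer client rather than NiFi) of the consumer settings that typically interact with group rebalances and this commit error; the broker address, group id, topic, and values shown are illustrative assumptions, not our actual configuration:

```java
// Sketch of consumer settings that commonly govern group rebalances.
// If the time between polls grows too long (slow processing or large batches), the group
// coordinator can consider the consumer gone and rebalance, and the next commit then
// fails with CommitFailedException.
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class RebalanceTuningSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // hypothetical broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "nifi-consumer-group");    // hypothetical group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");        // how long before the group gives up on a member
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000");
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");            // fewer records per poll shortens time between polls

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));         // hypothetical topic
            consumer.poll(1000).forEach(record -> {
                // process each record quickly, or commit more often
            });
        }
    }
}
```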
Labels: Apache Kafka, Apache NiFi
05-19-2017
04:57 PM
@Wynner Also, how do we clear the state if we want to re-list the files (maybe because there was an issue with processing the data)?
05-19-2017
04:42 PM
@Wynner I am unable to reply to your last response. Thanks for the answers again. Is there a list of processors that use ZooKeeper by default for state management (or do all of them)? I assume ListHDFS works the same way. When, and for what scenario exactly, would one use the DistributedCacheService? I tested ListSFTP by bringing down the primary node and restarting all nodes, and it seems to work as expected for the listing without any DistributedCacheService configuration.
05-18-2017
09:45 PM
@Wynner I am using ListSFTP. But the behavior should not change based on the processor, right?
05-18-2017
09:20 PM
@Wynner Thanks much for the response. Follow-up question on Question 3: say I have 4 nodes in the cluster, Node 1 is the primary node, and Node 2 is the one the DistributedMapCacheClient controller service is configured to use. If Node 2 goes down, then the state information is lost, and the next scheduled listing will include everything that has already been processed along with anything new. Is that understanding right? If yes, this does not really seem to be distributed behavior (which is probably what the other thread is talking about).
05-18-2017
02:44 AM
I haven't gone through the code for the DistributedCache yet. After the first try today, a few questions came up.
1) Do we have to use a different distributed cache Server/Client for each processor that has state management? For example, can we use the same Distributed Cache Server/Client for both ListSFTP and ListHDFS within the process group?
2) If we specify a value for the "persistence directory" in the DistributedCacheServer, is the assumption that the cache is present both in memory and on disk? Is that understanding correct?
3) The expectation with the DistributedCachingService is that state is maintained across the cluster, meaning that even if we lose the node that was primarily running the processor, we still do not duplicate the listing. But reading the thread at http://mail-archives.apache.org/mod_mbox/nifi-users/201611.mbox/%3CCA%2BWJ-%2B%2Bqdkg-qRzP-7gUAX%2BAsjj%2Bm%2BH9zoF9LkjgriHKDMgPXw%40mail.gmail.com%3E, specifically the line "it does not implement any coordination logic that work nicely with NiFi cluster", I am not sure I exactly follow the issue. Please clarify.
Labels: Apache NiFi