Member since 04-29-2016
192 Posts
20 Kudos Received
2 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
(missing) | 1693 | 07-14-2017 05:01 PM |
(missing) | 2890 | 06-28-2017 05:20 PM |
01-12-2017
02:40 PM
1 Kudo
Also, I'm in the process of having the socket buffer (for ListenTCP) increased to 4 MB, which is the maximum the Unix admins can set.
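
A requested socket buffer size is not always what the operating system grants, because many systems cap it at a system-wide limit (on Linux, `net.core.rmem_max`). The sketch below is not NiFi code. It only shows, with Python's standard `socket` module, how one could check what a 4 MB request actually yields on a given host. The function name `effective_rcvbuf` is made up for this illustration.

```python
import socket

# 4 MB, the figure mentioned in the post.
REQUESTED_BYTES = 4 * 1024 * 1024


def effective_rcvbuf(requested: int) -> int:
    """Ask the OS for a receive buffer of `requested` bytes and return
    the size it reports back, which may be smaller (capped) or larger
    (some kernels add bookkeeping overhead)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    granted = effective_rcvbuf(REQUESTED_BYTES)
    print(f"requested {REQUESTED_BYTES} bytes, OS reports {granted} bytes")
```

If the reported value is far below the request, the system-wide cap is probably the limiting factor rather than the processor setting.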
01-12-2017
02:39 PM
1 Kudo
Thanks a lot, @Timothy Spann. I'm going to work with our admin on the JVM settings and on the number of cores we have. The flowfiles are small, about 5 KB each or less. The ListenTCP processor is throwing this error: "Internal queue at maximum capacity, could not queue event", and messages are queuing on the source system side. Below are the memory settings that I set for ListenTCP.
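
The error text suggests a bounded in-memory queue that rejects new events once it is full. As a rough illustration only (this is not how ListenTCP is implemented), the sketch below models that behaviour with Python's `queue.Queue` and estimates how long a queue lasts when events arrive faster than they are drained. The arrival rate of 1500 messages per second and the 5 KB size come from the posts; the capacity and drain rate are hypothetical numbers.

```python
import queue
from typing import Optional


def offer(q: "queue.Queue", event: bytes) -> bool:
    """Try to enqueue without blocking; return False when the queue is full."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False


def seconds_until_full(capacity: int, arrival_rate: float,
                       drain_rate: float) -> Optional[float]:
    """Time for an empty bounded queue to fill, or None if it never fills."""
    if drain_rate >= arrival_rate:
        return None
    return capacity / (arrival_rate - drain_rate)


if __name__ == "__main__":
    # Inbound volume: 1500 messages/s at up to 5 KB each.
    print(1500 * 5 * 1024, "bytes per second at most")  # 7680000
    # Hypothetical: 10,000-event queue, consumer drains 1000 events/s.
    print(seconds_until_full(10_000, 1500, 1000), "seconds")  # 20.0
```

Under these assumed numbers, a larger queue or socket buffer only delays the error; it stops only when the drain rate reaches the arrival rate.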
01-12-2017
03:54 AM
1 Kudo
Hi guys, I have a use case where we need to load near-real-time streaming data into HDFS. The incoming data is high volume, about 1500 messages per second. I have a NiFi dataflow in which the ListenTCP processor ingests the streaming data, but the requirement is to check that incoming messages have the required structure. So messages from ListenTCP go to a custom processor that does the structure checking, and only messages with the right structure move forward to the MergeContent processor and on to PutHDFS. Right now the validation processor has become a bottleneck, and the backpressure from that processor is causing messages to queue at the source system (the one sending the messages to ListenTCP). Since the validation processor cannot handle the incoming data fast enough, I'm thinking of writing the messages from ListenTCP to the file system first, and then letting the validation processor pick the messages up from the file system and continue forward. Is this the right approach to resolve this? Are there any suggestions for alternatives? Thanks in advance.
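
To make the proposed "write to disk first, validate later" idea concrete, here is a minimal sketch in plain Python, not NiFi. Everything in it is an assumption for illustration: the function names, the `.msg` file naming, and especially the structure check, which here just requires three non-empty pipe-delimited fields because the real required structure is not described in the post.

```python
import os
import tempfile
import time
import uuid
from pathlib import Path
from typing import List, Tuple


def spool_write(spool_dir: Path, payload: bytes) -> Path:
    """Persist one message. Write to a temp name first, then rename, so
    the reader never sees a half-written file."""
    tmp = spool_dir / f".{uuid.uuid4().hex}.tmp"
    final = spool_dir / f"{time.time_ns()}-{uuid.uuid4().hex}.msg"
    tmp.write_bytes(payload)
    os.replace(tmp, final)
    return final


def is_valid(payload: bytes) -> bool:
    """Hypothetical structure check: three non-empty pipe-delimited fields."""
    fields = payload.decode("utf-8", errors="replace").split("|")
    return len(fields) == 3 and all(f.strip() for f in fields)


def drain_spool(spool_dir: Path) -> Tuple[List[bytes], int]:
    """Read, validate, and delete every spooled message.
    Returns (valid payloads, number of rejected messages)."""
    valid: List[bytes] = []
    rejected = 0
    for path in sorted(spool_dir.glob("*.msg")):
        payload = path.read_bytes()
        path.unlink()
        if is_valid(payload):
            valid.append(payload)
        else:
            rejected += 1
    return valid, rejected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        spool = Path(d)
        for msg in (b"a|b|c", b"bad", b"x|y|z"):
            spool_write(spool, msg)
        print(drain_spool(spool))
```

The spool decouples the listener from the validator, but it only absorbs bursts. If validation is slower than the arrival rate on average, the spool directory grows without bound, so making validation faster or running it in parallel is still needed.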
Labels: Apache NiFi
01-11-2017
07:20 PM
1 Kudo
Hello, from using NiFi's PutHDFS processor, it seems that it creates missing directories in HDFS by default, but the documentation doesn't say whether it does or not, and there are no properties related to it (I believe PutFile has one). Does anyone know more about this "undocumented" feature, if I can call it that? My only concern is whether creating missing directories works reliably in a production environment if it is not a documented feature. Did I miss this in the documentation? Thanks.
Labels: Apache NiFi
01-11-2017
02:16 PM
Awesome explanation @Matt, thanks. Earlier I saw some examples of people retrying failed flowfiles 3 times, etc., but I was not sure where that would make sense. I see now where retrying would be appropriate. Besides flowfiles that failed because of network-related errors, at what other processors or in what other scenarios would you retry failed flowfiles? Since we have 2 types of scenarios, one where you want to retry flowfiles and one where you want to log them, I was thinking of having 2 process groups, one for each scenario. If I have a lot of processors with potential for failure, I would collect the 2 kinds of failed flowfiles (one to retry and one to log) and send them to the matching process group. Would that approach work? Thanks in advance.
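
The retry-or-log decision described above can be sketched as a small routing function. This is plain Python rather than NiFi configuration, and it assumes a hypothetical `retry.count` attribute carried on each flowfile plus a caller-supplied flag saying whether the failure is of a retryable kind (for example, a network error). The limit of 3 comes from the "retry 3 times" examples mentioned in the post.

```python
from typing import Dict, Tuple

MAX_RETRIES = 3  # from the "retry 3 times" examples in the post


def route_failure(attributes: Dict[str, str],
                  retryable: bool) -> Tuple[str, Dict[str, str]]:
    """Decide where a failed flowfile goes.

    Returns ("retry", attrs) with the hypothetical `retry.count`
    attribute incremented while attempts remain, otherwise
    ("log", attrs) with the attributes unchanged.
    """
    updated = dict(attributes)  # do not mutate the caller's dict
    attempts = int(updated.get("retry.count", "0"))
    if retryable and attempts < MAX_RETRIES:
        updated["retry.count"] = str(attempts + 1)
        return "retry", updated
    return "log", updated


if __name__ == "__main__":
    attrs: Dict[str, str] = {"filename": "event-001"}
    for _ in range(4):
        route, attrs = route_failure(attrs, retryable=True)
        print(route, attrs.get("retry.count"))
```

Keeping the count on the flowfile itself means the same two process groups can serve every processor: one path loops back after a retry, the other ends in logging once retries are exhausted or the failure is not retryable.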
01-10-2017
09:13 PM
2 Kudos
Hi all, I would appreciate it if you could point me to best practices for error handling in NiFi. Below is how I'm envisioning handling errors in my workflows. Would you suggest any enhancements or better ways to do it? My error handling requirements are simple: log the errored flowfiles to the file system and send an alert. All the processors in the dataflow that have a "failure" relationship would send their failed flowfiles to a funnel, and from there the flowfiles would go to an error handling process group, which does the logging and alerting. Thanks.
Labels: Apache NiFi
01-10-2017
07:39 PM
Greetings, I am not sure I understand why you would create multiple input and output ports for a process group (PG). What purpose would the additional ports serve? I am thinking that if you want to "call" (send data to or receive data from) the same PG from many different processors in NiFi, you would use a different port for each processor, to avoid mixing the data the PG gets from the different processors. As an example (please see the image below), if the PG has a MergeContent processor, I would not want to merge flowfiles that come into the PG from different processors. Is that one of the reasons, if there are any, for having multiple ports in a PG?
Labels: Apache NiFi
11-22-2016
06:14 PM
@jfrazee @Andrew Grande @jwitt thank you all for the ideas and information.