Member since: 07-30-2019
Posts: 3406
Kudos Received: 1622
Solutions: 1008
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 109 | 12-17-2025 05:55 AM |
|  | 170 | 12-15-2025 01:29 PM |
|  | 115 | 12-15-2025 06:50 AM |
|  | 243 | 12-05-2025 08:25 AM |
|  | 405 | 12-03-2025 10:21 AM |
07-11-2018
05:28 PM
1 Kudo
@Nikhil NiFi Site-To-Site uses two-way TLS authentication.

Check to make sure the keystore file being used on each of your NiFi nodes contains a single "PrivateKeyEntry", and make sure that PrivateKeyEntry supports both the clientAuth and serverAuth extended key usages. If the PrivateKeyEntry supports serverAuth only, the NiFi service will not be able to provide a client certificate in the TLS handshake; a quick way to verify this is shown below.

I also noticed that the timestamps for entries in your nifi-user.log do not match the timestamps in the shared nifi-app.log file. The entries specifically shared are not directly related to one another.

Thank you, Matt
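A minimal check with keytool; the keystore filename and password here are placeholders for your own values:

keytool -list -v -keystore keystore.jks -storepass <keystore-password>

In the verbose output, confirm there is exactly one entry with "Entry type: PrivateKeyEntry", and that its ExtendedKeyUsage extension lists both clientAuth and serverAuth.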
07-11-2018
04:53 PM
@David Miller I don't understand the concern with a "disconnected node" still receiving data from a load balancer. Just because a node is disconnected from the cluster does not mean it is not still running its dataflow(s). If the NiFi service or its hosting server is down, there will be no running listener to receive data from the load balancer, so the LB should fail over to a different server.
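For example, with an HAProxy load balancer in front of the nodes' listeners, a plain TCP health check stops routing to any host whose listener is actually down. A minimal sketch; the hostnames and port are hypothetical:

backend nifi_listeners
    mode tcp
    option tcp-check
    server nifi1 nifi1.example.com:9997 check
    server nifi2 nifi2.example.com:9997 check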
07-11-2018
03:49 PM
@Ilya Li Each NiFi processor component must complete its execution before the triggering FlowFile is routed to one of the outbound relationships from the incoming connection.

Using your example: the PutSQL processor would execute against an incoming FlowFile or batch of incoming FlowFiles. Should NiFi die completely in the middle of that scheduled execution, upon NiFi recovery the FlowFile or batch of FlowFiles would still be located on the incoming connection to the PutSQL processor, and the same execution would occur again.

Thank you, Matt

When an "Answer" addresses/solves your question, please select "Accept" beneath that answer. This encourages user participation in this forum.
07-11-2018
02:42 PM
@Rakesh S Short answer: the Cluster Coordinator has no role in data distribution within a cluster. Each NiFi node only works on the data it specifically receives.
07-11-2018
02:29 PM
1 Kudo
@Mohammad Soori Just to make sure I understand correctly: the TailFile is producing only 20 output FlowFiles; however, all 500 records are included within those 20 FlowFiles. Correct?

With a Run Schedule of 0 secs, the processor will be scheduled to execute, and then scheduled to execute again immediately following completion of the last execution. During each execution, it will consume all new lines seen since the last execution. There is no configuration option that will force this processor to output a separate FlowFile for each line read from the file being tailed.

You could, however, feed the output FlowFiles to a SplitText processor to split each FlowFile into a separate FlowFile per line (a sample configuration is shown below).

Thank you, Matt

When an "Answer" addresses/solves your question, please select "Accept" beneath that answer. This encourages user participation in this forum.
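A minimal SplitText configuration for one output FlowFile per line; these are the processor's standard properties, and the values here are illustrative:

Line Split Count = 1
Header Line Count = 0
Remove Trailing Newlines = true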
07-10-2018
01:18 PM
@umang s Based on your flow design above, it looks like you are trying to route FlowFiles by comparing attributes between two different FlowFiles. That will not work. NiFi expects both ${temp_array} and ${category} to exist on the same FlowFile being evaluated by the RouteOnAttribute processor.
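When both attributes are present on the same FlowFile, a RouteOnAttribute dynamic property can compare them with NiFi Expression Language. A sketch; the route name "matched" and the contains-style comparison are assumptions about the intended logic:

matched = ${temp_array:contains(${category})}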
07-10-2018
12:01 PM
@Benjamin Bouret The ListHDFS processor does not retrieve the actual content of the files. It produces 0-byte FlowFiles that carry metadata about the target content. Any hash you produce on these FlowFiles will not match the hash produced on the original source FTP server.

If I am not following the above correctly, I am not really clear on exactly where you are performing this second hash, or how you plan to compare the two hashes. Manually?

NiFi has guaranteed delivery when it writes data to HDFS. If the transfer fails for any reason, the FlowFile is routed to failure. The FetchFTP processor also has handling for failures in retrieving the content.

This check seems like a lot of overhead that should not be necessary.

Thank you, Matt

When an "Answer" addresses/solves your question, please select "Accept" beneath that answer. This encourages user participation in this forum.
07-09-2018
01:22 PM
1 Kudo
@Derek Calderon Short answer is no. The ExecuteSQL processor is written to write its output to the FlowFile's content.

There is an alternative solution. You have some processor currently feeding FlowFiles to your ExecuteSQL processor via a connection. My suggestion would be to feed that same connection to two different paths. The first connection feeds a "MergeContent" processor via a funnel, and the second feeds your "ExecuteSQL" processor. The ExecuteSQL processor performs the query and retrieves the data you are looking for, writing it to the content of the FlowFile. You then use a processor like "ExtractText" to extract that FlowFile's new content to FlowFile attributes. Next, you use a processor like "ModifyBytes" to remove all content from this FlowFile. Finally, you feed this processor to the same funnel as the other path. The MergeContent processor could then merge these two FlowFiles using the "Correlation Attribute Name" property (assuming "filename" is unique, that could be used), min/max entries set to 2, and "Attribute Strategy" set to "Keep All Unique Attributes". The result should be what you are looking for. The flow would look something like the sketch below.

Having multiple identical connections does not trigger NiFi to write the 200 MB of content twice to the content repository. A new FlowFile is created, but it points to the same content claim. New content is only generated when the ExecuteSQL runs against one of the FlowFiles. So this flow does not produce any additional write load on the content repo other than when the ExecuteSQL writes its output, which I am assuming is relatively small?

Thank you, Matt
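A rough sketch of the described flow; both paths feed the same funnel:

Upstream processor ----> Funnel ---------------------------------------> MergeContent
Upstream processor ----> ExecuteSQL --> ExtractText --> ModifyBytes --> Funnel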
07-09-2018
12:37 PM
1 Kudo
@Henrik Olsen NiFi's various timeout settings are very aggressive out of the box. They are more ideal for a standalone NiFi instance running a fairly simple dataflow. In a NiFi cluster, the following timeouts should be increased:

nifi.cluster.node.connection.timeout=5 secs (increase to 30 secs)
nifi.cluster.node.read.timeout=5 secs (increase to 30 secs)
nifi.zookeeper.connect.timeout=3 secs (increase to 60 secs)
nifi.zookeeper.session.timeout=3 secs (increase to 60 secs)

A restart of NiFi will be needed after making these changes; the edited lines are shown below. Another thing you could do while this condition is present is use the browser developer tools to try to catch which request is timing out. Are you seeing a lot of full garbage collection? If these stop-the-world events are long enough, they can also cause this behavior.

Thank you, Matt
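With the recommended values applied, the relevant lines in conf/nifi.properties would read:

nifi.cluster.node.connection.timeout=30 secs
nifi.cluster.node.read.timeout=30 secs
nifi.zookeeper.connect.timeout=60 secs
nifi.zookeeper.session.timeout=60 secs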
07-06-2018
05:09 PM
@Derek Calderon Sorry to hear that. I did share this HCC link with a few devs I know, in case they have time to assist. Thanks, Matt