Member since
07-30-2019
3406
Posts
1622
Kudos Received
1008
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 118 | 12-17-2025 05:55 AM |
| | 179 | 12-15-2025 01:29 PM |
| | 119 | 12-15-2025 06:50 AM |
| | 244 | 12-05-2025 08:25 AM |
| | 405 | 12-03-2025 10:21 AM |
10-29-2018
06:36 PM
@Bobby Harsono

Some processors are designed to use memory outside of the JVM. Processors like ExecuteProcess or ExecuteStreamCommand are good examples: they invoke a process or script external to NiFi, and those externally executed commands have a memory footprint of their own.

Listen-type processors like ListenTCP or ListenUDP are another example. These have memory footprints both inside and outside the NiFi JVM heap space, since they can be configured with a socket buffer that is allocated outside of heap space.

Thanks, Matt
10-29-2018
06:26 PM
@naveen

I would recommend getting several thread dumps from NiFi while it is in this state to see what is causing your threads to stall. This can be done using the <path to NiFi>/bin/nifi.sh script as follows:

./nifi.sh dump <name of dump file>

Some other things to try:

1. Under heavy volume the default NiFi provenance implementation (org.apache.nifi.provenance.PersistentProvenanceRepository) may not be able to keep up. If NiFi is waiting on provenance, all flows will appear to be stalled. Make sure you are instead using the newer org.apache.nifi.provenance.WriteAheadProvenanceRepository implementation, which was redesigned to be much more performant.
2. Make sure you do not have constant garbage collection occurring. Even minor/young GC is a stop-the-world event. It is possible that after some time of running and ingesting data, GC gets into a non-stop cycle of trying to free heap memory space.
3. Check whether you have changed the default Max Timer Driven Thread Count setting under "Controller Settings" in the Global menu (upper-right corner of the UI). The default is only 10.
4. Avoid configuring any of your processors to use the Event Driven scheduling strategy.

Thank you, Matt
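For reference, the provenance implementation mentioned in item 1 is switched in nifi.properties (a minimal sketch; verify the property against your NiFi version's configuration guide, and note a restart is required):

```properties
# nifi.properties -- use the faster write-ahead provenance implementation
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
```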
10-24-2018
02:57 PM
1 Kudo
@Willian Gosse

During a NiFi restart, the flow is loaded and started before the NiFi UI is made available. During this period the Remote Process Groups (RPGs) on each node will fail to connect to the configured target NiFi URL to fetch the Site-To-Site (S2S) details. This is expected behavior. The RPGs will stop logging this error once the configured target NiFi URL becomes available and the S2S details are successfully retrieved.

The choice of HTTP or RAW as the transport protocol controls how the actual FlowFiles are transferred. The recurring connection to retrieve the S2S details will always be over HTTP to the target NiFi URL configured in the RPG. When using the HTTP transport protocol, the FlowFiles will also be transferred over the same HTTP port the target NiFi UI is exposed on. Setting the transport protocol to RAW causes the RPG to use a dedicated socket port for the FlowFile transfer. The socket port used is set by the target NiFi servers in the nifi.properties file (property: nifi.remote.input.socket.port=). The advantage of RAW is that the amount of traffic going to the HTTP port used to access the UI is reduced considerably. The advantage of HTTP is that you have one less port to open through any firewalls to the NiFi nodes.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
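As a sketch of the two ports involved (port values below are examples only, not recommendations), the relevant nifi.properties entries on each node in the target cluster would look like:

```properties
# nifi.properties on each target node (example values)
# HTTP transport: FlowFiles travel over the same port the UI is served on.
nifi.web.http.port=8080
# RAW transport: FlowFiles travel over this dedicated site-to-site socket port.
nifi.remote.input.socket.port=10000
```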
10-19-2018
06:11 PM
@Andy Gisbo Yes, that guide is an accurate example of using OpenID with Google.
10-19-2018
06:05 PM
@Emma Ixiato

The list-based processors rely on file timestamps to determine whether a file should be listed. This means they may not list files in the target location if:

1. New files added to the source location do not have their timestamps updated (so the last timestamp recorded in NiFi from a previous listing is newer than the file that was added).
2. Multiple files are being written to the source location at the same time and the list-based processor did not list all of them in one execution. A second execution would miss the other files because of the timestamp recorded by the first execution.

Not sure which NiFi version you are running, but here are a few Jiras aimed at making the list-based processors work much better:

1. https://jira.apache.org/jira/browse/NIFI-3332 <-- (addressed as of Apache NiFi 1.4.0)
2. https://jira.apache.org/jira/browse/NIFI-4069 <-- (addressed as of Apache NiFi 1.4.0)
3. https://jira.apache.org/jira/browse/NIFI-5157 <-- (addressed as of Apache NiFi 1.8.0, being released soon)

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
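The concurrent-write race in item 2 can be illustrated with a minimal sketch (this is illustrative Python, not NiFi's actual implementation): a lister that records the newest timestamp it has seen, and only lists strictly newer files, will permanently miss a file that appears later with the same timestamp.

```python
# Minimal sketch of timestamp-based listing and why it can miss files.

def list_new_files(files, last_seen_ts):
    """files: dict mapping filename -> modification timestamp.
    Returns (names listed this run, updated last-seen timestamp)."""
    listed = sorted(name for name, ts in files.items() if ts > last_seen_ts)
    newest = max(files.values(), default=last_seen_ts)
    return listed, max(newest, last_seen_ts)

# First run: two files are being written concurrently, but only one is visible.
files = {"a.log": 100}
listed, last = list_new_files(files, 0)      # lists a.log, records ts 100

# b.log finishes a moment later with the SAME timestamp as a.log.
files["b.log"] = 100
listed2, last = list_new_files(files, last)  # b.log is missed: 100 is not > 100
```

This is exactly the window NIFI-3332 and related Jiras close by tracking the set of files already listed at the recorded timestamp, rather than the timestamp alone.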
10-19-2018
05:07 PM
@Stephen Greszczyszyn

The word "process" can mean many things. What kind of processing are you trying to do?

The content of your syslog data is just standard ASCII, correct? If so, it can be read by many processors, so the question is what you are trying to do with it.

I am assuming your syslog ingest may consist of many log lines per FlowFile. If that is the case, you may want to "process" these FlowFiles as records. Maybe start by looking at the various record-based processors. The GrokReader is probably what you want to configure the record-based processors to use in order to parse your syslog content.

Thanks, Matt
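To show what Grok-style parsing pulls out of a syslog line, here is an illustrative Python sketch with a simplified regex loosely mirroring the fields a syslog Grok pattern extracts (timestamp, host, program, PID, message). The sample line and regex are assumptions for demonstration, not NiFi or Grok internals.

```python
import re

# A typical BSD-style syslog line (sample data for illustration).
line = "Oct 19 05:07:01 myhost sshd[4242]: Accepted publickey for matt"

# Simplified stand-in for a syslog Grok pattern's field extraction.
pattern = re.compile(
    r"(?P<timestamp>\w{3}\s+\d+\s[\d:]+)\s"   # e.g. "Oct 19 05:07:01"
    r"(?P<host>\S+)\s"                         # e.g. "myhost"
    r"(?P<program>[\w./-]+)"                   # e.g. "sshd"
    r"(?:\[(?P<pid>\d+)\])?:\s"                # optional "[4242]"
    r"(?P<message>.*)"                         # the free-text message
)
m = pattern.match(line)
fields = m.groupdict() if m else None
```

A record reader doing this per line is what lets downstream record-based processors treat each syslog entry as a structured record instead of opaque text.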
10-18-2018
03:20 PM
@Pepelu Rico In the upcoming Apache NiFi 1.8 release, you may find the following new capability solves your use case: https://jira.apache.org/jira/browse/NIFI-5406

Thanks, Matt
10-18-2018
03:17 PM
@Pepelu Rico

Typically, files being transferred to an SFTP server are written using a dotted "." filename and then renamed to remove the leading dot once the transfer has completed. The ListSFTP processor by default has a property named "Ignore Dotted Files", which should be set to "true" so that files whose names start with a dot are ignored and not listed.

While the above is typical, it is possible that the files being written to your SFTP server are not using the standard dot/rename transfer method. Is there some other unique naming/renaming happening to indicate a transfer is complete? If so, perhaps you could set up a "File Filter Regex" to avoid listing files still being transferred.

As long as the timestamp on a file being written is updated while it is being written to, the ListSFTP processor will list the same file again. The ListSFTP processor creates a FlowFile attribute named "file.size". You could compare this attribute with the FlowFile's "fileSize" attribute after the FetchSFTP processor. If they do not match, you could discard the FlowFile and wait for the next listing of the same file, where these values will match. This option is not ideal because it means fetching the content multiple times until the complete file is fetched.

Aside from the above, there really aren't any other options here.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
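The size comparison described above could, as a sketch, be expressed as a RouteOnAttribute rule using NiFi Expression Language (the attribute names follow this answer; verify them against your own flow before relying on this):

```
${file.size:equals(${fileSize})}
```

FlowFiles matching this rule (listed size equals fetched content size) would route to a "complete" relationship; non-matching FlowFiles could be discarded to await the next listing.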
10-17-2018
02:29 PM
@pavan srikar Yes, that makes sense. When you start the DistributedMapCacheServer, it starts a server on each NiFi node. The DistributedMapCacheClient should be configured to point at one specific node, so that every node pulls cache entries from the same server.

A little history: the DistributedMapCacheServer and DistributedMapCacheClient controller services date back to the original NiFi releases. Back then there was no zero-master clustering as we have now; a dedicated server ran the NiFi Cluster Manager (NCM), and at that time the DistributedMapCacheServer could only be set up on the NCM.

Once NiFi moved away from having an NCM, the functionality of these controller services was left unchanged to avoid breaking the flows of users who moved to the latest versions. The DistributedMapCacheServer does not offer HA (if the node hosting the server goes down, the cache becomes unavailable). To provide HA, new external HA cache options have been added.

Thanks, Matt