Member since: 07-30-2019
Posts: 3472
Kudos Received: 1642
Solutions: 1020
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 278 | 06-03-2026 06:06 PM |
| | 546 | 05-06-2026 09:16 AM |
| | 1090 | 05-04-2026 05:20 AM |
| | 616 | 05-01-2026 10:15 AM |
| | 720 | 03-23-2026 05:44 AM |
04-18-2024
01:30 PM
1 Kudo
@s198 I think step one would be looking more closely at the failures. Are the failures always with the rename of the dot file? PutSFTP writes to a dot file (hidden file) and then, upon write completion, renames the file from .xyz to xyz. You also never shared your complete PutSFTP processor configuration.

1. Did you inspect the SFTP server log for any logging related to the failures you encountered?
2. What is being done with the files once they are placed on the SFTP server? Is there some other process consuming them from there?
3. Any chance that the other process is consuming the dot files (hidden files) before NiFi has a chance to rename them?
4. Do any of the queued FlowFiles have the same "filename" attribute as another FlowFile or as a file already present on the target SFTP server? (This is a common issue: a file of the same name still exists on the target when the other is written as a dot file, so the rename fails. Then, by the time of the retry, some process has consumed the duplicate and the rename of the new file succeeds.)

As far as options 3 and 4 go, both introduce some latency in your dataflow. With (3) the processor only gets scheduled once every 30 seconds, so FlowFiles will queue up for up to 30 seconds between executions. The PutSFTP processor has a batch setting for how many FlowFiles will get processed in each execution. If more FlowFiles are queued than that batch setting allows, the extras will sit until the next time the processor is scheduled. My concern is that the latency introduced by options 3 and 4 may simply be masking the actual issue that needs to be addressed.

With (4) the processor gets scheduled as fast as possible, but when it executes the thread remains active for 500ms, working on as many FlowFiles as possible in that single execution. Then at 500ms it closes out that thread and (assuming a run schedule of 0) the processor is immediately scheduled again.

As far as which is better, it is about getting the best throughput with the least amount of latency. Data volumes, sizes, etc. come into play here.
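To put rough numbers on the latency trade-off between options 3 and 4, here is a small Python sketch. It is only a toy model and not how NiFi's scheduler is implemented: it assumes executions fire at exact multiples of the run schedule, that 500ms execution windows run back to back, and that a FlowFile is released downstream when its window closes. The function names are made up for illustration.

```python
import math


def wait_with_run_schedule(arrival_s, run_schedule_s=30.0):
    """Seconds a FlowFile waits for the next scheduled execution (option 3).

    Assumes executions fire at exact multiples of run_schedule_s and that
    the batch setting is large enough to take every queued FlowFile.
    """
    next_execution = math.ceil(arrival_s / run_schedule_s) * run_schedule_s
    return next_execution - arrival_s


def wait_with_run_duration(arrival_s, run_duration_s=0.5):
    """Seconds until the current execution window closes (option 4).

    Assumes a run schedule of 0, so windows of run_duration_s run back to
    back, and a FlowFile is released downstream when its window ends.
    """
    window_end = (math.floor(arrival_s / run_duration_s) + 1) * run_duration_s
    return window_end - arrival_s


if __name__ == "__main__":
    # Hypothetical arrival times, in seconds since the flow was started.
    for arrival in (1.0, 12.3, 29.9):
        print(
            f"arrival={arrival:5.1f}s  "
            f"option 3 wait={wait_with_run_schedule(arrival):5.2f}s  "
            f"option 4 wait={wait_with_run_duration(arrival):4.2f}s"
        )
```

In this model a FlowFile can wait close to the full 30 seconds under option 3, while under option 4 the wait is bounded by the 500ms run duration.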
I typically favor option 4 myself. Option 3 can still work for you, but I would use a much lower run schedule (30 secs is a lot of latency for a continuous flow).

Please help our community grow. If you found that any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to log in and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
04-18-2024
06:44 AM
1 Kudo
@double_w Can you share some details on which specific components you are using that appear to lose state after the upgrade? The upgrade from 1.13.2 to 1.25.0 is a large leap. Did you test your dataflow after upgrading to 1.25.0? Was state still working correctly before you then migrated to 2.0.0-M3? I am unaware of any way to migrate local state to ZooKeeper. While I do not have an answer for you here, the more details you share, the more I can look into it as I have time. Thanks, Matt
04-18-2024
06:15 AM
1 Kudo
@whoknows While there is no exact date yet in the Apache NiFi community, I have seen discussions around it as recently as Apr 8th that suggest it will be happening very soon, possibly within the next week or two. Thank you, Matt
04-18-2024
06:10 AM
1 Kudo
@s198 Questions for you:
- If you leave the PutSFTP processor stopped, run your dataflow so all FlowFiles queue in front of the PutSFTP processor, and then start the PutSFTP processor, does the issue still happen?
- Does the issue only happen when the flow is in an all started/running state?

Answers to the above can help in determining if changing the run schedule will help here.

Run Schedule details:
The run schedule works in conjunction with the Timer Driven scheduling strategy. This setting controls how often a component will get scheduled to execute (different from when it actually executes; execution depends on available threads in the NiFi Timer Driven thread pool shared by all components). By default this is set to 0 secs, which means that NiFi should schedule this processor as often as possible (basically, schedule it again as soon as a concurrent task is available to it; the concurrent tasks default is 1). To avoid CPU saturation here, NiFi builds in a yield duration if upon scheduling of a processor there is no work to be done (inbound connection(s) are empty). Depending on the load on your system and dataflow and the speed of your network, scheduling could happen very quickly, meaning the processor gets scheduled, sees only one FlowFile in the inbound connection at the time of scheduling, and processes only that one FlowFile instead of a batch. It then closes that thread and starts a new one for the next FlowFile instead of processing multiple FlowFiles in one SFTP connection. By changing the run schedule you are allowing more time between schedulings for FlowFiles to queue on the inbound connection, so they get batch processed in a single SFTP connection.

Run Duration details:
Another option on processors is the run duration setting. With this setting, once a processor is scheduled, the execution will not end until the configured run duration has elapsed.
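As a rough illustration of why this matters for batching, the Python sketch below groups FlowFile arrival times into executions. It is a simplified model and not NiFi code: `group_into_sessions` is a hypothetical helper, processing time is treated as zero, and each execution is assumed to map to one SFTP connection.

```python
def group_into_sessions(arrival_times, run_duration_s):
    """Group FlowFile arrival times (in seconds) into execution sessions.

    Simplified model: a session starts when the first waiting FlowFile
    arrives and keeps accepting FlowFiles until run_duration_s has elapsed.
    """
    sessions = []
    current = []
    session_end = None
    for t in sorted(arrival_times):
        if current and t <= session_end:
            # Still inside the open session: process in the same execution.
            current.append(t)
        else:
            # The previous session (if any) has ended: start a new one.
            if current:
                sessions.append(current)
            current = [t]
            session_end = t + run_duration_s
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    # Hypothetical arrival times for five FlowFiles.
    arrivals = [0.0, 0.1, 0.2, 0.3, 0.9]
    for duration in (0.0, 0.5):
        sessions = group_into_sessions(arrivals, duration)
        print(f"run duration {duration}s: {len(sessions)} execution(s) {sessions}")
```

With a run duration of 0 every FlowFile in this example ends up in its own execution, while a 500ms run duration collapses the first four into one.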
So let's say at the time of scheduling (run schedule) there is one FlowFile in the inbound connection queue (remember we are dealing with microseconds here, so this is not something you can observe yourself via the UI). That execution thread will execute against that FlowFile, but rather than closing out the session immediately and committing the FlowFile to an outbound relationship, it will check the inbound connection for another FlowFile and process it in the same session. It will continue to do this until the run duration is satisfied, at which time all FlowFiles processed during that execution are committed to the downstream relationship(s).

So Run Duration might be another setting to try to see if it helps with your issue. If you try run duration, I'd set the run schedule back to its default. You may also want to look at your SFTP server logs to see what is happening when the file rename attempts fail.

Thank you, Matt
04-17-2024
11:30 AM
1 Kudo
@shiva239 As I dig in a bit more here, this looks like a Java version compatibility issue with this Ignite driver. Apache NiFi 2.0.0-M2 requires Java 21. I see others reporting similar "Could not initialize class org.apache.ignite.IgniteJdbcThinDriver" exceptions when using Java version 16 and newer here: https://issues.apache.org/jira/browse/IGNITE-14888

I am not familiar with Ignite and its core driver dependencies, but the above is likely your issue. In the above Ignite Jira, a commenter seemed to work around the issue as follows: "copied the jvm parameters from jvmdefaults.bat in ignite home catalog". Perhaps you can try doing the same and adding those additional JVM parameters to the NiFi bootstrap.conf file. This is not something I have ever tried or have an environment in which to test, but perhaps it will help you.

Thank you, Matt
04-17-2024
05:42 AM
@shiva239 Is it the exact same error? What is the full stack trace logged to the nifi-app.log? Thanks, Matt
04-16-2024
05:44 AM
@Jim_Steinebrey @shiva239 I would avoid adding any additional custom jars or nars to the NiFi lib directory. This will complicate future upgrade efforts. Instead, you should create a new directory outside NiFi's default directory tree for the custom jars that your specific dataflow(s) depend on. Make sure that the custom driver directory is both reachable and readable by your NiFi service user. Then, as commented above, configure the "Database Driver Location(s)" property in the DBCPConnectionPool controller service to point to that new directory. Thank you, Matt
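A quick way to sanity-check such a directory is a small Python script run as the NiFi service user. This is only a sketch: `check_driver_directory` is a hypothetical helper, and the path in the usage note is a placeholder, not something from this thread.

```python
import os
from pathlib import Path


def check_driver_directory(directory):
    """Return the readable .jar files in directory.

    Raises if the directory does not exist or if the current user cannot
    enter and list it.
    """
    path = Path(directory)
    if not path.is_dir():
        raise FileNotFoundError(f"{path} is not an existing directory")
    # R_OK lets the user list the directory, X_OK lets the user enter it.
    if not os.access(path, os.R_OK | os.X_OK):
        raise PermissionError(f"{path} is not readable by the current user")
    # Keep only jar files the current user can actually read.
    return sorted(p for p in path.glob("*.jar") if os.access(p, os.R_OK))
```

For example, `check_driver_directory("/opt/nifi-custom/drivers")` (placeholder path) should list your driver jar when run as the user that owns the NiFi process. If it raises or returns an empty list, NiFi will most likely not be able to load the driver from there either.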
04-16-2024
05:36 AM
@whoknows The Spring Framework has been upgraded to 5.3.34 as part of the Apache NiFi 1.26 release as outlined in the following jira: https://issues.apache.org/jira/browse/NIFI-13037 My guess here is that you ran into dependency issues on startup after modifying those individual 1.25 nars? I suggest upgrading to 1.26 instead when it is released. Thank you, Matt
04-15-2024
09:52 AM
@tmarkfeld The NiFi UI will not be available until you see the following lines in the nifi-app.log:

2024-03-17 16:39:59,500 INFO [main] org.apache.nifi.web.server.JettyServer NiFi has started. The UI is available at the following URLs:
2024-03-17 16:39:59,500 INFO [main] org.apache.nifi.web.server.JettyServer https://localhost:8443/nifi

Up until this time NiFi is still loading. You mentioned that it stalls at unpacking the nars but eventually completes, since you also said you eventually see NiFi listening on port 8443. You can use the developer tools in your browser to see what calls are made when you try to access your NiFi. You could then share which calls were made and which one it appears to be hanging on.

Hope this helps, Matt
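If you would rather check for those lines with a script than by eye, a minimal Python sketch could look like the following. The marker text is copied from the log lines quoted above. `find_ui_urls` is a hypothetical helper, and the assumption that URL lines directly follow the marker is based only on that two-line excerpt.

```python
import re

# Marker text taken from the JettyServer log line quoted above.
STARTED_MARKER = "NiFi has started. The UI is available at the following URLs:"
URL_PATTERN = re.compile(r"JettyServer\s+(https?://\S+)")


def find_ui_urls(log_lines):
    """Return the UI URLs logged right after the startup marker.

    Returns an empty list if the marker has not been logged yet, which in
    this sketch is taken to mean NiFi is still loading.
    """
    urls = []
    started = False
    for line in log_lines:
        if STARTED_MARKER in line:
            started = True
            continue
        if started:
            match = URL_PATTERN.search(line)
            if not match:
                break  # the first non-URL line ends the URL list
            urls.append(match.group(1))
    return urls
```

You could feed it the lines of nifi-app.log, for example `find_ui_urls(open(path_to_log))`, where `path_to_log` is wherever your installation writes that file.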
04-15-2024
09:34 AM
1 Kudo
@Ytch All components on the NiFi canvas are executed as the NiFi service user and not as the user currently authenticated into the NiFi service. So what you should do, from each host in your NiFi cluster (do this on every host, since any one of the hosts can be elected as the primary node at any given time), is open a command prompt/console window, become the user that owns the NiFi process, and manually ssh/sftp to the target SFTP server. You will likely be prompted to add the target SFTP server to the known_hosts file for the NiFi service user. The NiFi SFTP processors have no way of performing this interactive step. After successfully adding the SFTP server to the known_hosts file for the NiFi service user, go back and try to start the GetSFTP or ListSFTP processors again to see if your issue is resolved.

If not, please share your GetSFTP and ListSFTP processor component configurations. Also check the nifi-app.log for any exceptions or log output related to these processors. If there is no log output, you could also try enabling debug logging in the NiFi logback.xml for these processor classes to see what additional log output may be produced that could be useful here. The classes for these processors are:

org.apache.nifi.processors.standard.GetSFTP
org.apache.nifi.processors.standard.ListSFTP

The new log lines that you would add to logback.xml would look like this:

<logger name="org.apache.nifi.processors.standard.GetSFTP" level="DEBUG"/>
<logger name="org.apache.nifi.processors.standard.ListSFTP" level="DEBUG"/>

Simply add them in logback.xml where you see similar lines already.

Thank you, Matt