Member since: 07-30-2019
Posts: 2910
Kudos Received: 1444
Solutions: 846
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 120 | 04-23-2024 05:56 AM |
| | 33 | 04-22-2024 06:13 AM |
| | 168 | 04-17-2024 11:30 AM |
| | 125 | 04-16-2024 05:36 AM |
| | 85 | 04-15-2024 05:31 AM |
04-25-2024
06:16 AM
1 Kudo
@AlexisRub NiFi never offered an embedded user authentication management feature until the relatively recent single-user-provider. That provider was introduced only so that Apache NiFi could support an HTTPS out-of-the-box default setup. Over the years since Apache NiFi was open sourced, the community noticed unsecured instances (the previous out-of-the-box default) exposed on the internet, so a decision was made to make the out-of-the-box setup secure. A secured NiFi requires that all users/clients are both authenticated and authorized.

The single-user-provider was introduced to simplify access to a secured NiFi for evaluation purposes. As you have noticed, this authentication provider does not support multiple users. The corresponding single-user-authorizer found in the authorizers.xml configuration also does not support multi-user authorization; it simply grants the single-user-provider user complete and full authorized access to everything in the NiFi. This provider also does not support NiFi clusters. For a multi-user environment or a clustered NiFi, a different method of external authentication and authorization must be used. Apache NiFi supports numerous user/client authentication methods beyond single-user, LDAP, and Kerberos, listed in the User Authentication section of the admin guide.

Worth noting: a secured NiFi requires a keystore and truststore, and NiFi will generate the keystore and truststore files with a self-signed clientAuth/serverAuth certificate if they do not already exist at startup. When NiFi is secured (HTTPS enabled and a valid keystore and truststore configured) and no additional authentication methods have been configured, user/client authentication is required through the TLS exchange. This means that when you try to access the NiFi UI via your browser, NiFi will respond to the browser (client) within the TLS exchange that a clientAuth certificate is "REQUIRED".

If one is not provided, the connection is closed. When additional authentication methods are configured, NiFi will instead "WANT" a clientAuth certificate; if the browser does not present a client certificate, NiFi moves on to the next configured authentication method.

I wanted to point out the above since certificates are probably the next easiest way to set up multi-user authenticated access. This would require generating a unique clientAuth certificate for each unique user. These clientAuth certificates would either be self-signed or signed by some certificate authority. If self-signed, the public cert for each would need to be added to the NiFi truststore file. If signed by some authority, only that signing authority's trust chain would need to be added to NiFi's truststore. The unique users would then load their client certificate into their browser so it could be presented in the mutual TLS exchange with your NiFi.

In order to authorize multiple users, you would need to stop using the default single-user-authorizer and instead use the StandardManagedAuthorizer. This authorization provider allows you to define your initial admin user (this user will be granted the minimum required admin authorizations), so initially this would be the only user authorized to access the NiFi UI. Once logged in, this initial admin user can define additional user and group identities directly from the NiFi UI, against which authorization policies can be defined. Granting a second user the same policies granted to your initial admin user will establish that second user's admin authorizations. More information on the various policies and what they grant can be found in the Configuring Users & Access Policies section of the admin guide.

That being said, I typically set up OpenLDAP and use the ldap-provider for authentication. This requires that you have somewhere to install it (perhaps on the same server as NiFi). The advantage here is that you do not need to mess with the NiFi truststore.
You can also use this LDAP server for multiple instances of NiFi and NiFi-Registry. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
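For reference, a minimal sketch of what the authorizers.xml change might look like when moving from the single-user-authorizer to the StandardManagedAuthorizer. The identity `CN=admin, OU=NiFi` is a placeholder; it must exactly match the DN of your initial admin's client certificate, and nifi.properties must point nifi.security.user.authorizer at the managed authorizer's identifier. Consult the admin guide for the full set of properties:

```xml
<authorizers>
    <!-- Stores user and group identities (placeholder admin DN below). -->
    <userGroupProvider>
        <identifier>file-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
        <property name="Users File">./conf/users.xml</property>
        <property name="Initial User Identity 1">CN=admin, OU=NiFi</property>
    </userGroupProvider>

    <!-- Grants the initial admin the minimum required admin policies. -->
    <accessPolicyProvider>
        <identifier>file-access-policy-provider</identifier>
        <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
        <property name="User Group Provider">file-user-group-provider</property>
        <property name="Authorizations File">./conf/authorizations.xml</property>
        <property name="Initial Admin Identity">CN=admin, OU=NiFi</property>
    </accessPolicyProvider>

    <managedAuthorizer>
        <identifier>managed-authorizer</identifier>
        <class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
        <property name="Access Policy Provider">file-access-policy-provider</property>
    </managedAuthorizer>
</authorizers>
```

Note that the users.xml and authorizations.xml files are generated on first startup; the Initial User/Admin Identity properties are only honored when those files do not yet exist.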
04-23-2024
05:56 AM
@s198 The two most common scenarios for this type of failure are:

1. A file with the same name already exists when trying to rename. Typically resolved by using an UpdateAttribute processor on the failure relationship to modify the filename. Perhaps use the nextInt() NiFi Expression Language function to add an incremental number to the filename, or in your case modify the time by adding a few milliseconds to it.
2. Some process is consuming the dot (.) filename before the putSFTP processor has renamed it. This requires modifying the downstream process to ignore dot files.

While it is great that run duration and run schedule increases appear to resolve this issue, I think you are dealing with a millisecond race condition, and these two options will not always guarantee success here. The best option is to programmatically deal with the failures via a filename attribute modification, or to change how you are uniquely naming your files if possible. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
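As a sketch of option 1, an UpdateAttribute processor fed by the PutSFTP failure relationship could rewrite the "filename" attribute with an Expression Language statement along these lines (this assumes the filename has an extension; adjust to your naming scheme):

```
${filename:substringBeforeLast('.')}_${nextInt()}.${filename:substringAfterLast('.')}
```

Keep in mind that nextInt() is only unique within a single NiFi instance, so on a cluster the suffix alone does not guarantee uniqueness across nodes.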
04-22-2024
12:11 PM
@s198
- Do you have the full stack trace from the nifi-app.log when the rename fails?
- Is it always the same exact stack trace?
- Have you tried putting this processor class in DEBUG via the NiFi logback.xml to see what additional logging it may produce when the exception occurs?

Thanks, Matt
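For reference, enabling DEBUG for the processor in conf/logback.xml might look like the line below (the logger name assumes the standard PutSFTP processor class; adjust if you are using a different one):

```xml
<!-- Add inside the <configuration> element of conf/logback.xml.
     NiFi's logback configuration is periodically rescanned, so this
     is typically picked up without a restart. -->
<logger name="org.apache.nifi.processors.standard.PutSFTP" level="DEBUG"/>
```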
04-22-2024
06:13 AM
@manishg Not sure what version of Apache NiFi you are using here. I would not recommend using the InferAvroSchema processor; depending on your use case there may be better options. Most record readers (like CSVReader) have the ability to infer a schema. From the output provided, you have a CSV file that is 44 bytes in size. According to the InferAvroSchema processor documentation:

"When inferring from CSV data a 'header definition' must be present either as the first line of the incoming data or the 'header definition' must be explicitly set in the property 'CSV Header Definition'. A 'header definition' is simply a single comma separated line defining the names of each column. The 'header definition' is required in order to determine the names that should be given to each field in the resulting Avro definition."

Does your content here meet the requirements of the InferAvroSchema processor? Do you see the same issue if you try to infer the schema via the CSVReader controller service? These two components do not infer schema in the same way. The InferAvroSchema processor is built on the Kite SDK, which is not part of Apache NiFi itself and is no longer being maintained. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
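To illustrate the quoted requirement, the first line below is a valid "header definition" (a single comma separated line of column names); without it, and without the "CSV Header Definition" property set, the processor has no names to assign to the Avro fields (column names and values here are made up):

```
id,name,amount
1,widget,9.99
2,gadget,4.50
```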
04-18-2024
01:30 PM
1 Kudo
@s198 I think step one would be looking more into the failures. Are the failures always with the rename of the dot file? PutSFTP writes to a dot file (hidden file) and then, upon write completion, moves the file from .xyz to xyz. You also never shared your complete putSFTP processor configuration.

1. Did you inspect the SFTP server log for any logging related to the failures you encountered?
2. What is being done with the files once placed on the SFTP server? Is there some other process consuming them from there?
3. Any chance that other process is consuming the dot files (hidden files) before NiFi has a chance to rename them?
4. Do any of the queued FlowFiles have the same "filename" attribute as another FlowFile, or as a file already present on the target SFTP server? (This is a common issue: a file of the same name still exists on the target when the new one is written as a dot file, so the rename fails. Then, on retry, some process has consumed the duplicate and the new file succeeds on rename.)

As far as your options 3 and 4 go, both introduce some latency in your dataflow. With (3) the processor only gets scheduled once every 30 seconds, so FlowFiles will queue up between executions. The putSFTP processor has a batch setting for how many FlowFiles get processed in one execution; if more FlowFiles are queued than that batch setting, the extras will sit until the next time the processor is scheduled. My concern is that the latency introduced by options 3 and 4 may simply be masking the actual issue needing to be addressed. With (4) the processor gets scheduled as fast as possible, but when it executes, the thread remains active for 500 ms working on as many FlowFiles as possible in that single execution. Then at 500 ms it closes out that thread and (assuming a run schedule of 0) the processor is immediately scheduled again. As far as which is better, it is about getting the best throughput with the least amount of latency; data volumes, sizes, etc. come into play here.

I typically favor option 4 myself, but option 3 may still work for you with a much lower run schedule (30 secs is a lot of latency for a continuous flow). Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
04-18-2024
06:44 AM
1 Kudo
@double_w Can you share some details on which specific components appear to lose state after the upgrade? The upgrade from 1.13.2 to 1.25.0 is a large leap. Did you test out your dataflow after upgrading to 1.25.0? Was state still working correctly before then migrating to 2.0.0-M3? I am unaware of any way to migrate local state to ZooKeeper. While I do not have an answer for you here, the more details you share, the more I can look into it as I have time. Thanks, Matt
04-18-2024
06:15 AM
1 Kudo
@whoknows While there is no exact date yet in the Apache NiFi community, I have seen discussions around it as recently as Apr 8th suggesting it will be happening very soon, possibly within the next week or two. Thank you, Matt
04-18-2024
06:10 AM
1 Kudo
@s198 Questions for you:
- If you leave the putSFTP processor stopped, run your dataflow so all FlowFiles queue in front of the putSFTP processor, and then start the putSFTP processor, does the issue still happen?
- Does the issue only happen when the flow is in a fully started/running state?

Answers to the above can help in determining whether changing the run schedule will help here.

Run Schedule details: The run schedule works in conjunction with the Timer Driven scheduling strategy. This setting controls how often a component gets scheduled to execute (different from when it actually executes; execution depends on available threads in the NiFi Timer Driven thread pool shared by all components). By default this is set to 0 secs, which means NiFi should schedule this processor as often as possible (basically, schedule it again as soon as a concurrent task is available to it; the concurrent tasks default is 1). To avoid CPU saturation, NiFi builds in a yield duration if, upon scheduling of a processor, there is no work to be done (inbound connections are empty). Depending on the load on your system and dataflow and the speed of your network, this can happen very quickly: the processor is scheduled, sees only one FlowFile in the inbound connection at the time of scheduling, and processes only that one FlowFile instead of a batch. It then closes that thread and starts a new one for the next FlowFile instead of processing multiple FlowFiles in one SFTP connection. By increasing the run schedule you allow more time between executions for FlowFiles to queue on the inbound connection, so they get batch processed in a single SFTP connection.

Run Duration details: Another option on processors is the run duration setting. With this adjustment, upon scheduling of a processor the execution does not end until the configured run duration has elapsed. So let's say at the time of scheduling there is one FlowFile in the inbound connection queue (remember we are dealing with microseconds here, so not something you can visualize via the UI). The execution thread will execute against that FlowFile, but rather than closing out the session and immediately committing the FlowFile to an outbound relationship, it checks the inbound connection for another FlowFile and processes it in the same session. It continues to do this until the run duration is satisfied, at which time all FlowFiles processed during that execution are committed to the downstream relationship(s).

So run duration might be another setting to try to see if it helps with your issue. If you try run duration, I'd set the run schedule back to default. You may also want to look at your SFTP server logs to see what is happening when the file rename attempts are failing. Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
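The run-duration behavior described above can be sketched with a toy model (pure Python, purely to illustrate the batching semantics, not NiFi internals):

```python
import time
from collections import deque

def execute_with_run_duration(inbound: deque, run_duration_s: float) -> list:
    """Toy model of one processor execution: keep pulling FlowFiles
    from the inbound connection into a single session until the run
    duration elapses, then commit them all downstream as one batch."""
    session = []
    deadline = time.monotonic() + run_duration_s
    while inbound and time.monotonic() < deadline:
        session.append(inbound.popleft())
    return session  # committed together once the duration is satisfied

# With run duration 0 the session would hold at most the one FlowFile
# present at scheduling time; with a longer duration, FlowFiles that
# arrive during the execution ride along in the same session.
inbound = deque(f"flowfile-{i}" for i in range(5))
batch = execute_with_run_duration(inbound, 0.05)
```

The trade-off mirrors the real setting: a longer run duration means fewer, larger commits (better throughput, more latency per FlowFile).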
04-17-2024
11:30 AM
1 Kudo
@shiva239 As I dig in a bit more here, this looks like a Java version compatibility issue with this Ignite driver. Apache NiFi 2.0.0-M2 requires Java 21, and I see others reporting similar "Could not initialize class org.apache.ignite.IgniteJdbcThinDriver" exceptions when using Java version 16 and newer here: https://issues.apache.org/jira/browse/IGNITE-14888 I am not familiar with Ignite and its core driver dependencies, but the above is likely your issue. In that Ignite jira, a commenter seemed to work around the issue by copying the JVM parameters from jvmdefaults.bat in the Ignite home directory. Perhaps you can try doing the same and adding those additional JVM parameters to the NiFi bootstrap.conf file. This is not something I have ever tried or have an environment in which to test, but perhaps it will help you. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
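If you try that workaround: NiFi's conf/bootstrap.conf takes extra JVM arguments as numbered java.arg.N entries. The two flags below are only an example of the --add-opens style arguments Ignite's jvmdefaults script sets for newer Java versions; copy the actual list from your Ignite installation, and pick N values that do not collide with the existing entries:

```
# Example only: append after the existing java.arg.N lines in
# conf/bootstrap.conf, renumbering to avoid duplicates.
java.arg.20=--add-opens=java.base/jdk.internal.misc=ALL-UNNAMED
java.arg.21=--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
```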
04-17-2024
05:42 AM
@shiva239 Is it the exact same error? What is the full stack trace logged to the nifi-app.log? Thanks, Matt