Member since
07-30-2019
3472
Posts
1642
Kudos Received
1020
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 270 | 06-03-2026 06:06 PM |
| | 543 | 05-06-2026 09:16 AM |
| | 1078 | 05-04-2026 05:20 AM |
| | 605 | 05-01-2026 10:15 AM |
| | 714 | 03-23-2026 05:44 AM |
06-03-2024
11:09 AM
@inkerinmaa An Apache NiFi multi-node cluster is set up quite differently from a standalone NiFi installation. Your exception is related to a TLS trust issue between your nodes. In a NiFi cluster, one of the nodes is elected to the role of "cluster coordinator" through ZooKeeper (ZK). All of the nodes communicate with ZK to learn which node currently holds this role, and then begin sending heartbeats to that elected node in order to join the cluster.
It looks like you are letting each NiFi node auto-generate its own self-signed certificate? That works fine in a standalone NiFi setup; however, for a cluster you'll need to create keystores and truststores for your NiFi nodes so that proper mutual trust can be established.
I also see that you are using the single-user login provider and authorizer. For a NiFi cluster you'll also want to use more production-ready providers, such as the ldap-provider for login and the StandardManagedAuthorizer for all your authorizations.
Please help our community thrive. If you found that any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them that helped.
Thank you, Matt
05-30-2024
01:27 PM
@scoutjohn I installed an out-of-the-box Apache NiFi 1.26 using the single-user providers and the NiFi-generated self-signed certificates. I was able to send provenance events via the S2SProvenanceReportingTask back to a Remote Input Port on the same NiFi with no issues, so authorization is not the problem here. I tested both the HTTP and RAW transport protocols successfully. I also validated that S2S was working by setting up a Remote Process Group to send FlowFiles to a Remote Input Port.
In the dataflow I set up, I generated some FlowFiles that were sent over S2S to the "Input1" remote port, and my "prov" port received provenance events from the S2SProvenanceReportingTask.
My S2S settings from the nifi.properties file:
# Site to Site properties
nifi.remote.input.host=localhost
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10001
nifi.remote.input.http.enabled=true
nifi.remote.input.http.transaction.ttl=30 sec
nifi.remote.contents.cache.expiration=30 secs
In my Remote Process Group configuration, switching to the "HTTP" transport protocol also worked.
While all of this worked correctly, sending provenance events via the S2SProvenanceReportingTask back to the same NiFi is not advisable, because it creates an endless loop of provenance events. For every FlowFile received on the "prov" port, another provenance "RECEIVE" event is created, which then gets sent by the reporting task, which is received on the "prov" port again, and so on. You would certainly have authentication and authorization difficulties sending to another NiFi instance when both deployments use the out-of-the-box keystore, truststore, and single-user providers. But for testing purposes this works.
Now, I see from your configuration that you set nifi.remote.input.host=cd8e8c899db6. That makes me wonder whether that hostname is:
1. A SAN entry in the NiFi-generated keystore certificate. You could use the keytool command to check: keytool -v -list -keystore keystore.p12
2. Resolvable and reachable by your NiFi instance.
Try changing that property to "localhost" and see if it resolves your issue.
Thank you, Matt
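The two hostname checks above can be roughed out in a few lines of Python. This is only a sketch under assumptions: it assumes the verbose keytool listing prints SAN entries as "DNSName: <host>" lines, the SAMPLE text is a made-up excerpt (not output from the poster's keystore), and the function names are hypothetical. It only covers name resolution, not whether the host is actually reachable.

```python
import re
import socket


def san_dns_names(keytool_output: str) -> set[str]:
    """Collect every 'DNSName: <host>' entry found in a verbose keytool listing."""
    return {name.lower() for name in re.findall(r"DNSName:\s*(\S+)", keytool_output)}


def host_in_san(hostname: str, keytool_output: str) -> bool:
    """True when the hostname is listed as a DNS SAN entry (exact match, no wildcards)."""
    return hostname.lower() in san_dns_names(keytool_output)


def is_resolvable(hostname: str) -> bool:
    """True when the local resolver can turn the hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


# Hypothetical excerpt of `keytool -v -list` output, for illustration only.
SAMPLE = """
SubjectAlternativeName [
  DNSName: localhost
  DNSName: nifi-test
]
"""

if __name__ == "__main__":
    for host in ("localhost", "cd8e8c899db6"):
        print(host, "listed as SAN:", host_in_san(host, SAMPLE))
```

In practice you would feed host_in_san the real output of the keytool command shown above and run is_resolvable from the machine (or container) where NiFi runs.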
05-30-2024
10:00 AM
@hegdemahendra The small number in the upper right corner of any processor shows the number of active threads at the time the UI was last refreshed. The default auto-refresh of the UI is every 30 seconds. It turns red when there is an active terminated thread. So with your example above, 2(1), it is telling you that this processor has 2 active threads and 1 terminated thread.
A terminated thread is the result of manual user intervention. When a processor is asked to change its run status from "running" to "stopped" (Stopping a Component), it first transitions into a state of "stopping". It does not transition to "stopped" until all active threads complete. NiFi provides an option to "terminate" a processor that is stuck in the stopping state because of active threads. Terminate (Terminating a Component's Tasks) does not kill the active thread, since all threads belong to a single JVM. What the terminate function does is release any FlowFile tied to the active thread(s) back to the originating connection and mark the thread as terminated. The terminated thread will continue to execute until it completes or the JVM is restarted. Should the now "terminated" thread complete, all of its output is discarded (effectively sent to /dev/null) instead of being passed downstream. This lets users handle scenarios where long-running or hung threads prevent stopping, reconfiguring, and starting a processor. When a terminated processor is restarted, it will re-process the FlowFile(s) that were originally tied to the terminated thread(s). This prevents any data loss from occurring.
If a terminated thread is in a permanently hung state, the only way to get rid of it completely is a restart of NiFi, which will kill the JVM after a graceful shutdown period. As for your custom processor getting stuck, you would need to collect thread dumps and inspect them to see what your thread is waiting on that is blocking it from progressing, and then address that in your custom code.
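To make the stop/terminate sequence above easier to follow, here is a toy Python model of it. It is purely illustrative and is not how NiFi is implemented: the ToyProcessor class, its fields, and its method names are all hypothetical, and real threads are replaced by plain IDs.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyProcessor:
    """Toy model of a processor's run status; not NiFi's actual implementation."""
    queue: deque = field(default_factory=deque)   # inbound connection
    active: dict = field(default_factory=dict)    # thread id -> FlowFile in progress
    terminated: set = field(default_factory=set)  # disowned threads that are still running
    state: str = "running"
    output: list = field(default_factory=list)    # results passed downstream

    def start_thread(self, thread_id: int) -> None:
        self.active[thread_id] = self.queue.popleft()

    def stop(self) -> None:
        # "stopped" is only reached once no active threads remain.
        self.state = "stopping" if self.active else "stopped"

    def terminate(self) -> None:
        # Release each FlowFile back to the inbound connection and disown its thread.
        for thread_id, flowfile in self.active.items():
            self.queue.appendleft(flowfile)
            self.terminated.add(thread_id)
        self.active.clear()
        self.state = "stopped"

    def thread_finished(self, thread_id: int, result: str) -> None:
        if thread_id in self.terminated:
            self.terminated.discard(thread_id)  # the late result is thrown away
            return
        del self.active[thread_id]
        self.output.append(result)
        if self.state == "stopping" and not self.active:
            self.state = "stopped"

    def indicator(self) -> str:
        """Render the counter in the 'active(terminated)' form, e.g. '2(1)'."""
        text = str(len(self.active))
        return f"{text}({len(self.terminated)})" if self.terminated else text


proc = ToyProcessor(queue=deque(["ff-1", "ff-2"]))
proc.start_thread(1)   # thread 1 picks up ff-1
proc.stop()            # state becomes "stopping" because thread 1 is still active
proc.terminate()       # ff-1 goes back to the queue, thread 1 is marked terminated
```

After terminate(), the FlowFile is back in the queue for the next start, and whatever the disowned thread eventually produces never reaches the output list.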
Thank you, Matt
05-30-2024
06:17 AM
@Vikas-Nifi @ckumar is 100% correct. Only fields explicitly marked as supporting NiFi Expression Language (NEL) can accept a NEL expression such as "${schedule}".
I am, however, curious about your use case and why you would be trying to do this. From what you shared, you are extracting a cron schedule from the JSON content of some FlowFiles traversing an EvaluateJsonPath processor. That "schedule" is added to the NiFi FlowFile as a FlowFile attribute (key=value pair). That key=value pair is not accessible to any other NiFi component unless the FlowFile carrying the attribute is processed by that other component. In your shared dataflow, however, you do not mention that EvaluateJsonPath connects to your InvokeHTTP processor via an inbound connection. (Keep in mind that even if it did, it would not change the fact that the run schedule property does not support NEL.) I just wanted to clarify how FlowFile attributes are and can be used.
Also keep in mind that the "run schedule" is a scheduler only. The run schedule set on a processor controls when the NiFi controller will schedule the execution of the processor's code. It does not mean that the processor will execute immediately at the time of scheduling; execution may be delayed while waiting for an available thread from the thread pool. All scheduled components share a thread pool, and the NiFi framework assigns threads to the next scheduled component as threads become available. So the NiFi framework needs to know the scheduling for a component when it is started; otherwise, NiFi would never know when to schedule it to execute.
Unless a component property has an explicit tooltip telling you that it supports NEL, it does not. For NiFi processor components, you will find that only some processor-specific properties within the "PROPERTIES" tab support NEL. This information is available not only through property tooltips, but also in the processor's documentation.
Even when NEL is supported, there is a scope. A property may support FlowFile attributes, the Variable Registry (going away in NiFi 2.x releases), or both.
Thank you, Matt
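As a rough illustration of the point about NEL support, here is a toy Python sketch. It is not NiFi code: ToyProperty and resolve are hypothetical names, it only handles plain ${attribute} references (no NEL functions), and treating a missing attribute as an empty string is an assumption of the sketch.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyProperty:
    """Hypothetical stand-in for a component property; not a NiFi class."""
    name: str
    value: str
    supports_el: bool  # whether the property is marked as supporting NEL


def resolve(prop: ToyProperty, attributes: dict[str, str]) -> str:
    """Substitute ${name} references only when the property supports NEL."""
    if not prop.supports_el:
        return prop.value  # taken literally, "${...}" and all
    # Assumption of this sketch: a missing attribute becomes an empty string.
    return re.sub(r"\$\{(\w+)\}", lambda m: attributes.get(m.group(1), ""), prop.value)


attrs = {"schedule": "0 0/5 * * * ?"}
literal = resolve(ToyProperty("Run Schedule", "${schedule}", supports_el=False), attrs)
expanded = resolve(ToyProperty("Some EL Property", "${schedule}", supports_el=True), attrs)
```

Here literal stays as the raw text "${schedule}", which is what a scheduling field would see, while expanded picks up the attribute value.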
05-29-2024
11:23 AM
1 Kudo
@Naveen_Sagar The Bearer token is issued by a specific NiFi node for a specific user identity. That Bearer token has a limited lifetime and cannot be used to authenticate a user on any other NiFi node (even one in the same cluster as the node that issued the token).
All rest-api endpoints require some level of authorization. So simply having a valid Bearer token for an authenticated user identity does not mean that user is authorized to access or interact with every rest-api endpoint. In your case, the user would need "operate the component" or "view the component" and "modify the component" authorizations in order to change the run-status.
You should inspect the nifi-user.log on the aaa.com NiFi server to see which user identity attempted to change the run-status on that node and was not authorized. Then verify that the necessary authorization is set up for that user identity and try your curl command again. Also make sure, as @ckumar pointed out in his curl example, that you are using the "-k" flag, which tells curl to skip verification of (and so automatically trust) the serverAuth certificate presented in the TLS exchange with your secured NiFi.
Thank you, Matt
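For reference, a run-status change with a Bearer token could be put together in Python roughly like this. It is a sketch under assumptions: it targets the /nifi-api/processors/{id}/run-status endpoint of the NiFi REST API, the port, processor ID, token, and revision number are placeholders, and the request is only built here, not sent. The token must come from the same node the request is sent to.

```python
import json
import urllib.request


def build_run_status_request(base_url: str, processor_id: str, token: str,
                             revision_version: int, state: str) -> urllib.request.Request:
    """Build (but do not send) a PUT asking NiFi to change a processor's run status."""
    body = json.dumps({
        "revision": {"version": revision_version},  # current revision of the component
        "state": state,                             # e.g. "RUNNING" or "STOPPED"
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/nifi-api/processors/{processor_id}/run-status",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
req = build_run_status_request(
    base_url="https://aaa.com:8443",
    processor_id="processor-uuid-goes-here",
    token="token-issued-by-aaa.com",
    revision_version=3,
    state="STOPPED",
)
```

Sending it with urllib.request.urlopen would additionally need an SSL context that trusts the NiFi server certificate. A 403 response would point back at the authorizations described above rather than at the token itself.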
05-29-2024
05:42 AM
@scoutjohn The article you are using for reference was written back in 2016, before NiFi was changed to start secured out of the box. It is written entirely around an unsecured NiFi. You could always unsecure your NiFi to test out the S2S capability; that would at least allow you to test and evaluate the functionality.
When NiFi is secured, both authentication and authorization must be handled. This includes authentication and authorization for S2S operations. An out-of-the-box installation of NiFi uses auto-generated self-signed certificates to create the keystore and truststore files needed for mutual TLS. It also uses a very basic, non-production single-user-provider for user authentication and a single-user-authorizer for user/client authorization. These basic providers make it easy to evaluate NiFi, but they are not robust enough to support all features. Is this what you are still using, or have you created your own keystore and truststore files and set up non-single-user authentication and authorization providers?
To be honest, I always set up production-ready NiFi instances and clusters that don't use the auto-generated self-signed certificates or the single-user providers. I have not tried using S2S in such an out-of-the-box environment, so I can't say whether the single-user-authorizer supports the needed authorizations.
That being said, I see you set nifi.remote.input.http.enabled=true, but all that property does is allow the HTTP transport protocol, meaning that NiFi will support transferring FlowFiles over HTTP. That does not mean unsecured; it could be http or https depending on the destination URL. The S2S properties in the nifi.properties file need to be modified to support secure S2S by setting nifi.remote.input.secure=true (you did not say whether you made that change).
1. Is your S2SProvenanceReportingTask producing any bulletin messages?
2. Are you seeing any "not authorized" log lines in the nifi-user.log?
3. What keystore and truststore did you configure in the StandardRestrictedSSLContextService controller service?
When I have some time, I'll try experimenting with an out-of-the-box setup, if that is what you are using, to see whether what you are trying to do is possible in such a non-production-ready setup.
Thank you, Matt
05-29-2024
05:06 AM
@Dilipkumar I am not sure what you mean by backups. Backups of what? NiFi-Registry is used to version control Process Groups from one or more NiFi instances. Those version-controlled flow definitions include all configurations (minus any sensitive property values). A version-controlled flow definition can be imported into any NiFi instance or cluster that has authorized access to the NiFi-Registry bucket in which it is stored. NiFi-Registry can be configured to persist flow definitions either in local file storage or in a git repository.
Thank you, Matt
05-28-2024
06:49 AM
1 Kudo
@Alexy 100% agree with @ckumar. Why is your NiFi producing so much logging? Additional loggers? Increased log levels? Huge FlowFile volume? Why are you not compressing (gz) on rollover to save disk space? Keep in mind that compression takes longer the larger the log file is. Performance is not going to change whether you are writing/appending to 100 MB log files or larger ones, but you do have disk I/O related to the amount of logging you are producing. Matt
05-28-2024
06:42 AM
@scoutjohn The Site-To-Site (S2S) configuration properties configure how your NiFi instance handles both inbound and outbound S2S connections. It is the receiving instance of NiFi that determines whether S2S communication should be secure or not.
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=false
First you need to understand how S2S works. The instance of NiFi with a RemoteProcessGroup (RPG) or an S2S reporting task is the client side of the connection. When that client component (RPG or S2S reporting task) executes, it needs to communicate with the target NiFi. That initial communication is always going to be over HTTP(S) to the target NiFi. So if the target NiFi is secured (nifi.web.https.port configured) and the URL provided to the RPG or S2S reporting task is "HTTPS", the initial connection is going to be secure. This initial connection is used to fetch S2S details from the target NiFi. Those S2S details include numerous bits of information, such as:
1. Does the target support FlowFile transfer over http(s)? (nifi.remote.input.http.enabled)
2. Does the target support socket-based FlowFile transfer? (nifi.remote.input.socket.port)
3. Does the target enforce secure communications? (nifi.remote.input.secure)
4. The list of remote input ports and remote output ports the client is authorized to see.
5. How many nodes are in the target NiFi cluster.
6. The load on each of those nodes, etc.
With the setup you shared, your NiFi has only nifi.web.https.port configured, meaning that this NiFi can only support https communications for S2S connections. I am not sure why you would want to send your data unsecured over your network. Why not send it securely, since your NiFi is already secured over https?
Now, if you were to also configure nifi.web.http.port (which makes no sense, since you would be exposing your NiFi UI unsecured over http as well as secured over https), does it still force nifi.remote.input.secure back to true from false? I have not configured http and https at the same time for a very long time (it was only done rarely, when there were different internal and external networks). I could not find any Apache Jiras stating that this is no longer an option, but it is possible that this has changed. But even if it is possible, I still question using unsecured S2S when your NiFi is already secured.
Thank you, Matt
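If it helps to see how the three properties above map to what the receiving side offers, here is a small Python sketch that reads them from nifi.properties-style text. The function names are hypothetical, the parsing is deliberately simple (key=value lines only), and the mapping just mirrors the description above rather than NiFi's own code.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def s2s_summary(props: dict[str, str]) -> dict[str, bool]:
    """Summarize which S2S options the receiving NiFi appears to offer."""
    return {
        # Receiving side enforces secure S2S communication.
        "secure": props.get("nifi.remote.input.secure", "").lower() == "true",
        # Socket-based (RAW) transfer needs a port value to be set.
        "socket_transfer": bool(props.get("nifi.remote.input.socket.port", "")),
        # FlowFile transfer over http(s) must be explicitly enabled.
        "http_transfer": props.get("nifi.remote.input.http.enabled", "").lower() == "true",
    }


SHARED_SETTINGS = """
# Site to Site properties
nifi.remote.input.secure=true
nifi.remote.input.socket.port=10000
nifi.remote.input.http.enabled=false
"""

summary = s2s_summary(parse_properties(SHARED_SETTINGS))
```

With the values shown above, the summary reports secure socket-based transfer only, so an RPG set to the HTTP transport protocol would not be able to transfer FlowFiles to this target.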
05-28-2024
06:03 AM
@mohammed_najb Is ExecuteSQL the first processor in your dataflow, or is it being fed by an inbound connection from some upstream processor such as GenerateTableFetch? I only ask because the ExecuteSQL processor does not retain any state, so on its own it would not be the best choice for ingesting from an active table that regularly has additional rows added to it.
As for ExecuteSQL itself, it writes attributes on the FlowFiles it produces. The "executesql.row.count" attribute records the number of rows returned by the query, OR the number of rows in the content of that specific FlowFile when the "Max Rows Per Flow File" property is configured with a non-zero value. When multiple FlowFiles are being produced, you could use an UpdateCounter processor to create a counter and use the NiFi Expression Language statement "${executesql.row.count}" as the delta.
As for your question about what happens when the "process fails": ExecuteSQL will execute the SQL query and, based on configuration, create one or more FlowFiles. Also based on configuration (the Output Batch Size property), it will either release FlowFiles to the downstream connection incrementally or release them all at once (the default). Assuming the default, no FlowFiles are output until the query is complete and all FlowFiles are ready for transfer to the outbound connection. If a failure (system crash, etc.) happens prior to this transfer, no FlowFiles are output. If there is no inbound connection, the query is simply executed again on the next execution of ExecuteSQL. If ExecuteSQL is using an inbound FlowFile from an inbound connection to trigger the execution, a processing failure results in the FlowFile being routed to the failure relationship, which you could set up to retry. In the case of a system crash, the FlowFile remains in the inbound connection and execution simply starts over when the system is restored.
Hopefully this gives you some insight to experiment with. As with many use cases, NiFi often offers more than one way to build them and multiple processor options. The more detail you give about your use case, the better the feedback you may get from the community.
Thank you, Matt
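To illustrate the row-count bookkeeping described above, here is a small Python sketch. It is a toy model, not ExecuteSQL itself: FlowFiles are plain dicts, the function names are hypothetical, and the summing function only imitates what a counter fed with "${executesql.row.count}" as its delta would end up holding.

```python
def split_rows(rows, max_rows_per_flowfile: int) -> list[dict]:
    """Group rows into toy FlowFiles of at most max_rows_per_flowfile rows each.

    A value of 0 puts all rows in a single FlowFile, as described in the post.
    """
    rows = list(rows)
    if max_rows_per_flowfile <= 0:
        chunks = [rows]
    else:
        chunks = [rows[i:i + max_rows_per_flowfile]
                  for i in range(0, len(rows), max_rows_per_flowfile)]
    return [
        {"attributes": {"executesql.row.count": str(len(chunk))}, "content": chunk}
        for chunk in chunks
    ]


def total_rows(flowfiles: list[dict]) -> int:
    """Add up the per-FlowFile counts, the way a counter using the attribute as delta would."""
    return sum(int(ff["attributes"]["executesql.row.count"]) for ff in flowfiles)


flowfiles = split_rows(range(10), max_rows_per_flowfile=4)
counts = [ff["attributes"]["executesql.row.count"] for ff in flowfiles]
```

With 10 rows and a limit of 4, counts comes out as three FlowFiles holding 4, 4, and 2 rows, and total_rows(flowfiles) recovers the full row count of the query.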