Member since: 07-30-2019
Posts: 3406
Kudos Received: 1623
Solutions: 1008
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 336 | 12-17-2025 05:55 AM |
| | 397 | 12-15-2025 01:29 PM |
| | 405 | 12-15-2025 06:50 AM |
| | 367 | 12-05-2025 08:25 AM |
| | 604 | 12-03-2025 10:21 AM |
03-30-2023
08:58 AM
2 Kudos
@Breezer NiFi was historically never built to manage local users; it provides no mechanism for creating and managing multiple users locally. That being said, the Apache NiFi community found that many new users were simply starting up unsecured NiFi instances on publicly accessible networks, and decided that by default NiFi should start secured over HTTPS out of the box. This change was released as part of the Apache NiFi 1.14 release and involved the following changes to make this work:

1. The NiFi toolkit is used automatically to generate a keystore and truststore using self-signed certs to secure NiFi.
2. A secured NiFi requires users/clients to authenticate and be authorized to interact with the NiFi UI in various ways. This means that out of the box there needed to be an authorizer and a means to define some user that could then be auto-authorized to the needed policies.

These changes were all part of https://issues.apache.org/jira/browse/NIFI-8220

The single-user-authorizer and single-user-provider were never intended for use in production, as they do not provide granular multi-user authentication and authorization (which is what you are looking for). They simply provide for a single user who is authorized to every NiFi policy, allowing for a secured environment out of the box. Since NiFi never has managed users locally (creating multiple local users with passwords managed through the NiFi UI) and has no intention of doing so in the future, you'll need to utilize one of the other available user authentication methods if you want an environment which supports multiple users with unique authorizations. Those methods are explained here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#user_authentication

I see you don't want to rely on an external authentication provider like LDAP, Kerberos, Knox, etc., and that is fine. User authentication can also be achieved via a mutual TLS handshake. All this requires is generating a unique user certificate for each of your 3 users. A basic setup like this would require you to configure your NiFi as follows (a sketch of the corresponding nifi.properties entries appears at the end of this post):

Authentication: Clear the "single-user-provider" value from the "nifi.security.user.login.identity.provider" property in the nifi.properties file. Use the NiFi TLS toolkit to generate your certificates: https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#tls_toolkit. Or you could use an external free certificate provider like TinyCert to create a certificate for each of your NiFi instance(s) and a certificate for each of your users.

Authorization: Change the "single-user-authorizer" value of the "nifi.security.user.authorizer" property in the nifi.properties file to "managed-authorizer". Build a new authorizers.xml that uses the "managed-authorizer" (https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#standardmanagedauthorizer), "file-access-policy-provider" (https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#fileaccesspolicyprovider), and "file-user-group-provider" (https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#fileusergroupprovider).

Example authorizers.xml configuration:

<authorizers>
<userGroupProvider>
<identifier>file-user-group-provider</identifier>
<class>org.apache.nifi.authorization.FileUserGroupProvider</class>
<property name="Users File">./conf/users.xml</property>
<property name="Legacy Authorized Users File"></property>
<property name="Initial User Identity 1"><full DN from user certifcate 1></property>
<property name="Initial User Identity 2"><full DN from user certifcate 2></property>
<property name="Initial User Identity 3"><full DN from user certifcate 3></property>
</userGroupProvider>
<accessPolicyProvider>
<identifier>file-access-policy-provider</identifier>
<class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
<property name="User Group Provider">file-user-group-provider</property>
<property name="Authorizations File">./conf/authorizations.xml</property>
<property name="Initial Admin Identity"><full DN from user certificate 1></property>
<property name="Legacy Authorized Users File"></property>
<property name="Node Identity 1"></property>
</accessPolicyProvider>
<authorizer>
<identifier>managed-authorizer</identifier>
<class>org.apache.nifi.authorization.StandardManagedAuthorizer</class>
<property name="Access Policy Provider">file-access-policy-provider</property>
</authorizer>
</authorizers>

This authorizers.xml setup will add your three user identities to NiFi for the purpose of authorizing them against NiFi policies. One of those users is designated as the initial admin in the file-access-policy-provider; this user will be assigned the policies needed to act as admin. That admin user can then access NiFi and set up authorization policies for the other two users.

The certificates created for your users would be provided to each user. The user can then load that certificate into their browser. When the user navigates to the HTTPS NiFi URL, NiFi will request that the client provide a certificate, and the loaded certificate can be used. This handles the unique user authentication. More details on setting up additional authorization policies for your users can be found here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#config-users-access-policies

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
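For reference, a minimal sketch of the nifi.properties entries described above; the keystore/truststore paths, types, and passwords are illustrative placeholders and will differ in your environment:

# Authentication: no login identity provider, so a client certificate (mutual TLS) is required
nifi.security.user.login.identity.provider=

# Authorization: use the managed authorizer defined in authorizers.xml
nifi.security.user.authorizer=managed-authorizer

# Keystore/truststore used for the TLS handshake (placeholder values)
nifi.security.keystore=./conf/keystore.p12
nifi.security.keystoreType=PKCS12
nifi.security.keystorePasswd=changeit
nifi.security.truststore=./conf/truststore.p12
nifi.security.truststoreType=PKCS12
nifi.security.truststorePasswd=changeit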
03-30-2023
05:52 AM
@davehkd Changes committed in an earlier release such as 1.19 will persist into the next release unless a specific closed Jira exists that makes another change impacting that version. I saw no newer Jiras related to Kotlin version changes at the time of this response. Matt
03-30-2023
05:49 AM
@wffger2 NiFi Flow Definitions are meant to be reusable snippets of flow, meaning a definition can be imported to the canvas of the same NiFi cluster or to the canvas of another NiFi cluster. Since you cannot have multiple components on the canvas using the same component UUID, you will not be able to compare Flow Definitions between two different environments without seeing differences in those UUIDs.

If you want to track differences in dataflows between your UAT and DEV environments, NiFi-Registry is the method you want to use. You can install a single NiFi-Registry and configure both of your NiFi clusters to point at that same instance of NiFi-Registry. Then on your DEV environment you start version control on a process group (testNiFiDefinition). Then on your UAT environment you import that same versioned flow from the NiFi-Registry. Now both of your NiFi environments track against the same versioned flow stored in that NiFi-Registry. While the component UUIDs will still be different, they both track to the same versioned flow. If you then make changes in DEV and commit those to NiFi-Registry as a new version of the already version-controlled flow, your UAT environment will report a newer version as being available. You can see the differences directly from the NiFi UI and easily upgrade your dataflow on the UAT environment to the newer version if you choose. (A rough NiFi Toolkit CLI sketch for importing the versioned flow appears at the end of this post.)

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
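As a rough illustration only, the same import could be scripted with the NiFi Toolkit CLI; the URLs and identifiers below are placeholders, and the exact option names should be verified against the toolkit guide or the CLI's built-in help:

# list the buckets and flows available in the shared NiFi-Registry (placeholder URL)
./bin/cli.sh registry list-buckets -u http://nifi-registry.example.com:18080
./bin/cli.sh registry list-flows -b <bucket-id> -u http://nifi-registry.example.com:18080

# import version 1 of the versioned flow onto the UAT NiFi canvas (placeholder URL and identifiers)
./bin/cli.sh nifi pg-import -u https://uat-nifi.example.com:8443 -b <bucket-id> -f <flow-id> -fv 1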
03-30-2023
05:35 AM
3 Kudos
@Meeran The solution here is to set up an external load balancer in front of your 3 NiFi nodes and have your clients point at that load balancer. The load balancer is then responsible for sending each client request to one of the available NiFi nodes (if a node goes down, the load balancer simply stops sending client requests to that node). When using a load balancer in front of your NiFi cluster, it is important that the load balancer is configured to use sticky sessions. (A rough example configuration follows at the end of this post.) If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
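As one possible illustration (not an official recommendation), an HAProxy front end doing TLS passthrough with source-IP-based stickiness might look roughly like this; the hostnames and ports are placeholders:

frontend nifi_ui
    bind *:9443
    mode tcp
    default_backend nifi_nodes

backend nifi_nodes
    mode tcp
    balance roundrobin
    # sticky sessions keyed on the client source IP
    stick-table type ip size 200k expire 30m
    stick on src
    server nifi1 nifi-node1.example.com:8443 check
    server nifi2 nifi-node2.example.com:8443 check
    server nifi3 nifi-node3.example.com:8443 check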
03-29-2023
11:56 AM
1 Kudo
+1 on @cotopaul and @steven-matison being rockstars in the community. @swanifi The PutMarkLogic processor is not included out of the box with Apache NiFi releases. It is a custom component built by other members of the community (https://github.com/marklogic). You may find better help by filing an issue within the MarkLogic NiFi project here: https://github.com/marklogic/nifi/issues/new Aside from reading the MarkLogic documentation, I would be of little help here: https://marklogic.github.io/nifi/step-by-step

As @steven-matison mentioned, you may be able to limit the size of the queue on the connection feeding the PutMarkLogic processor to avoid exceeding 5000 (if that happens to be some limit on this processor's capability): https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings The Back Pressure Object Threshold is a soft limit for the connection: if the number of FlowFiles queued on the connection is >= the configured threshold, the processor feeding that connection will not get scheduled to execute until the queue drops back below the threshold. Some components process FlowFiles in batches, and some ingest processors like ListSFTP, ListFile, etc. have the potential to generate a lot of FlowFiles in a single execution. If your connection's source is one that produces a lot of FlowFiles in a single execution, you could add a processor in between (for example: UpdateAttribute, ControlRate, etc.) to have better control over the connection feeding the PutMarkLogic processor. While this does not solve your issue with the processor itself, it may help you move forward with its existing behavior.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
03-27-2023
11:12 AM
@srilakshmi The PublishKafka processor can be configured with a comma-separated list of Kafka brokers. If the processor at time of execution is able to communicate with one of these configured brokers, it will receive a destination for publishing the content. If a failure occurs during publishing, the FlowFile is routed to the failure relationship. You have configurable options to retry on failure x number of times. You should avoid auto-terminating failure relationships in your dataflow designs unless data loss is acceptable. Each retry attempt is a new execution of the processor, which means connecting to the broker again. A failure is when PublishKafka was unable to send all of the content bytes (for example, the connection gets closed). (An illustrative processor configuration appears at the end of this post.)

The Best Effort and Guarantee Single Node Delivery settings in the PublishKafka processor have nothing to do with the NiFi nodes in the NiFi cluster; they refer to the nodes in the destination Kafka setup. In a NiFi cluster, each node executes its own copy of the dataflow(s), and each node has its own content and FlowFile repositories. Nodes are unaware of FlowFiles that exist on the other nodes in the cluster. So a FlowFile whose content fails to publish on, say, node 2 will route to the failure relationship on node 2, and if you use retry, will be executed again on node 2. When a node goes down, the FlowFiles queued in connections remain on that node until it is brought back online. When the node comes back up, it will continue processing FlowFiles from the last connection in which they were queued. So it is important that the content and FlowFile repositories are protected to avoid data loss (such as by using RAID storage). A node that is disconnected from the cluster will still execute its dataflow(s) as long as NiFi is still running on that node.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
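For illustration only, a PublishKafka configuration along the lines described above might look something like this; the broker hostnames are placeholders and exact property names can vary slightly between PublishKafka processor versions:

Kafka Brokers        : kafka1.example.com:9092,kafka2.example.com:9092,kafka3.example.com:9092
Delivery Guarantee   : Guarantee Replicated Delivery   (alternatives: Best Effort, Guarantee Single Node Delivery)
failure relationship : routed for retry or to a handling flow rather than auto-terminated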
03-27-2023
10:40 AM
1 Kudo
@ManishR NiFi offers many components (processors, controller services, reporting tasks, etc.) that can be used to construct a flow-based program on the NiFi canvas (referred to as a NiFi dataflow). While the list of default available components may differ depending on the release of NiFi being used, NiFi has embedded documentation, found under Help within the NiFi UI, that shows all components available in that installed release. Apache NiFi also publishes the same info for the most current released version here: https://nifi.apache.org/docs/nifi-docs/ Selecting a component from the documentation will open a description of the component and list all configurable properties.

Building a dataflow on the NiFi canvas consists of dragging and dropping new component processors onto the canvas. You can then drag connections between these components to construct your end-to-end dataflow. There are hundreds of component processors available out of the box and even more that you can download and add to your NiFi from the Apache community. Once a dataflow is built and configured, starting those components results in the creation of FlowFiles (for testing, you can add a GenerateFlowFile processor that generates a FlowFile rather than ingesting content from an external source like the local file system, Kafka, a DB, etc.). As each component executes against a FlowFile, that FlowFile is routed to one of the available relationships the particular processor offers. These relationships are assigned to one of the connections exiting the processor and connecting to another downstream processor. The following Apache NiFi docs explain how to build a dataflow: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#building-dataflow This covers how to search for a component in your dataflow(s): https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#search

Then when it comes to looking at the detailed lineage of an individual FlowFile, you can use NiFi's Data Provenance for that. Individual processor components generate provenance events as they execute on a FlowFile (create, route, drop, etc.). You can look at the entire lineage from create to drop of a FlowFile (assuming you configure NiFi provenance with enough storage to store all the lineage). By default NiFi is configured to only use 10 GB for provenance and only store provenance for 24 hours, but this can be changed in the nifi.properties file (the relevant entries are shown at the end of this post). You can right-click on a NiFi processor component in your dataflow and select Data Provenance from the pop-up context menu. This will open a provenance search query result set that shows FlowFiles that traversed the component. You can select one and even expand the lineage of that selected FlowFile. The lineage of a FlowFile will show all events associated with that FlowFile created by the processor components it traversed. This covers how to use NiFi's Data Provenance: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data_provenance

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
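For reference, the provenance retention defaults mentioned above are controlled by these nifi.properties entries (the values shown are the defaults being described; increase them to retain more lineage):

nifi.provenance.repository.max.storage.time=24 hours
nifi.provenance.repository.max.storage.size=10 GB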
03-27-2023
10:21 AM
@apmmahesh Make sure that the nifi.properties file on all nodes is configured the same. Make sure that the "nifi.cluster.protocol.is.secure" property is set to true on all the nodes. Matt
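For reference, the cluster-related entries in nifi.properties to check on each node look roughly like this; the hostname and port are placeholders for your own values:

nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node1.example.com
nifi.cluster.node.protocol.port=11443
nifi.cluster.protocol.is.secure=true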
03-27-2023
10:04 AM
@NafeesShaikh93 Interesting use case you have. I am not all that familiar with all the methods Graylog offers for ingesting logs from other servers; I'd assume syslog is one of them? If so, NiFi offers a PutSyslog processor.

Looking at the dataflow you've built thus far, I am not sure what you are trying to accomplish. The LogAttribute and LogMessage processors allow you to write a log entry to a NiFi log defined by an appender and logger in the logback.xml NiFi configuration file. By default these log lines would end up in the nifi-app.log. You could however add an additional appender and a custom logger to send log lines produced by these processors' classes to the new appender, thus isolating them from the other logging in the nifi-app.log. There is no way to set up a specific logger per processor on the canvas, so every LogAttribute and LogMessage processor you use will write to the same destination NiFi appender log (a logback.xml sketch appears at the end of this post). The classes for the LogAttribute and LogMessage processors are: org.apache.nifi.processors.standard.LogAttribute
org.apache.nifi.processors.standard.LogMessage

NiFi also has a TailFile processor that can tail a log file and create FlowFiles with those log entries as content. You could then use the PutSyslog processor to send those log lines to your Graylog server. The above design involves extra disk I/O that may not be necessary, since you could possibly design your flow to create FlowFile attributes with all the file information you want to send to Graylog, and then use a ReplaceText processor at the end of a successful dataflow to replace the content of your FlowFile with syslog-formatted content crafted from those attributes and send it directly to Graylog via the PutSyslog processor. This removes the need to write to a new logger and consume from that new log before sending to syslog. But again, this is a matter of preference; perhaps in your case you want a local copy of these logs as well.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
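A rough sketch of the extra appender and loggers in logback.xml could look like the following; the appender name, file name, and rollover settings are illustrative placeholders:

<appender name="PROCESSOR_LOG_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${org.apache.nifi.bootstrap.config.log.dir}/nifi-processor-logging.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/nifi-processor-logging_%d.log</fileNamePattern>
        <maxHistory>5</maxHistory>
    </rollingPolicy>
    <encoder>
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
    </encoder>
</appender>

<!-- route only the LogAttribute and LogMessage classes to the new appender; additivity="false" keeps them out of nifi-app.log -->
<logger name="org.apache.nifi.processors.standard.LogAttribute" level="INFO" additivity="false">
    <appender-ref ref="PROCESSOR_LOG_FILE"/>
</logger>
<logger name="org.apache.nifi.processors.standard.LogMessage" level="INFO" additivity="false">
    <appender-ref ref="PROCESSOR_LOG_FILE"/>
</logger>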
03-27-2023
09:35 AM
3 Kudos
@davehkd Kotlin is a transitive dependency of OkHttp, so any NiFi component that has a dependency on OkHttp will also include the Kotlin stdlib jars. The Kotlin version was upgraded to 1.7.20 in Apache NiFi 1.19.0: https://issues.apache.org/jira/browse/NIFI-10655 This update impacts the following Kotlin jars: kotlin-stdlib-common-1.7.20.jar
kotlin-stdlib-jdk8-1.7.20.jar
kotlin-stdlib-jdk7-1.7.20.jar
kotlin-stdlib-1.7.20.jar

When NiFi is launched, it unpacks all of the NiFi nars into the NiFi work directory: <work path defined in nifi.properties file>/work/nar/extensions/ You can search those unpacked nars' "bundled-dependencies" for the kotlin jars to see all the nars containing components that utilize OkHttp and thus also carry the Kotlin transitive dependency (an example command appears at the end of this post).

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
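For example, a simple search of the unpacked nars might look like this; the work directory path is a placeholder, so adjust it to your installation:

find /path/to/nifi/work/nar/extensions -name 'kotlin-stdlib*.jar' 2>/dev/null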