Member since: 07-30-2019
Posts: 3434
Kudos Received: 1632
Solutions: 1012
09-12-2022 01:52 PM

@ajignacio User and group identity strings must match identically. Your ldap-user-group-provider is syncing users and groups by the identity string found in the CN AD attribute, which is why you are seeing only the CN username and CN group name strings in the Users UI within NiFi. However, when you log in to NiFi to authenticate your user via the ldap-provider, the resulting user identity string is the user's full AD Distinguished Name (DN). NiFi treats different strings as different users.

The ldap-provider can be changed to use the user identity string typed in the username field instead of using the full DN. This is done by changing the following property (a fuller sketch of where it lives follows below):

```
<property name="Identity Strategy">USE_DN</property>
```

to:

```
<property name="Identity Strategy">USE_USERNAME</property>
```

Upon successful authentication, the resulting user identity is evaluated against any identity mapping patterns that may be configured in the nifi.properties file. The resulting mapped value is then passed to the configured authorizer (managed-authorizer in your setup). There the authorizer looks up that user identity string (case sensitive) against the user strings synced by your configured user group providers. If an exact match is found, both the user string and the now-learned group string(s) are checked against the configured NiFi policies to determine authorization.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
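For context, that property lives in the ldap-provider entry of login-identity-providers.xml. A minimal sketch of the relevant portion, with the many other required properties (LDAP URL, bind credentials, TLS settings, etc.) abbreviated:

```xml
<!-- login-identity-providers.xml (abbreviated sketch; other required
     properties such as the LDAP URL and bind credentials are omitted) -->
<provider>
    <identifier>ldap-provider</identifier>
    <class>org.apache.nifi.ldap.LdapProvider</class>
    ...
    <!-- USE_USERNAME makes the authenticated identity the string typed at
         login rather than the user's full DN -->
    <property name="Identity Strategy">USE_USERNAME</property>
    ...
</provider>
```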
09-12-2022 06:36 AM

@AnkurKush It is impossible to provide a very specific solution without understanding the exact structure of your ldap user and group entries. You should obtain the output from the ldapsearch command for a sample user and a sample group you will be authorizing in NiFi. That output will help you correctly configure the empty properties needed.

Some general configuration guidance:
- Avoid syncing ALL users and groups from your ldap. An ldap can contain thousands of users and groups, and when you sync all of them to NiFi, those user and group identities are loaded into NiFi's heap memory. Limiting what is synced to the specific users and groups that will be accessing your NiFi will help reduce heap usage. This can be controlled using the correct "User Search Filter" and "Group Search Filter" settings.
- I recommend always setting the "Page Size" property to 500. An ldap server is often configured to limit the max number of returns in a single request to 500 or 1000. If the return set is larger than that, returns will be missing if you do not configure this property. It has no impact if there are fewer returns than the set page size of 500.

Specific guidance: When it comes to actually syncing user and group identity strings, the following section must be configured:

```
<property name="User Search Base">ou=people,dc=example,dc=net</property>
<property name="User Object Class">person</property>
<property name="User Search Scope">ONE_LEVEL</property>
<property name="User Search Filter"></property>
<property name="User Identity Attribute"></property>
<property name="User Group Name Attribute"></property>
<property name="User Group Name Attribute - Referenced Group Attribute"></property>
<property name="Group Search Base">ou=groups,dc=example,dc=net</property>
<property name="Group Object Class">groups</property>
<property name="Group Search Scope">ONE_LEVEL</property>
<property name="Group Search Filter"></property>
<property name="Group Name Attribute"></property>
<property name="Group Member Attribute"></property>
<property name="Group Member Attribute - Referenced User Attribute"></property>
```

1. The "User Identity Attribute" and "Group Name Attribute" properties tell NiFi which property/attribute from the ldap return to use as the identity string for the returned user or group. Without these set, you'll get no response.
2. "User Group Name Attribute" in the user sync section tells NiFi's ldap-user-group-provider which attribute from the ldap-returned user entry contains the groups that the returned user belongs to. Sometimes there is no group association in the user entries and this is left blank. Without this set, NiFi will not be able to determine the groups associated to users via the user sync, and that association must be done via the group sync.
3. "Group Member Attribute" in the group sync section tells NiFi's ldap-user-group-provider which attribute from the ldap-returned group entry contains the users that belong to this group. Without this set, NiFi will be unable to determine which users are associated to the returned groups.
4. The two "Referenced Group/User Attribute" properties are needed when the user or group strings returned from the attribute configured in 2 or 3 above are not full Distinguished Names for the user or group. In that case, these properties define the attribute that contains the actual exact-matching string (a hypothetical filled-in example is sketched below).

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
09-01-2022 03:22 AM

Thank you, Matt. Sorry for the super delayed response, but it's very helpful.
08-22-2022 06:36 PM

Correct, it is currently mounted on every node. I would have thought NiFi would have dropped the file state once log files are deleted or archived, but it does not look like it. I could see 15K file states on the tailing processor.
08-21-2022 10:06 PM

@chitrarthasur Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
08-15-2022 07:50 AM

@VJ_0082 Since your log is being generated on a remote server, you will need to use a processor that can remotely connect to the external server to retrieve that log.

Possible designs:
1. You could incorporate a FetchSFTP processor into your existing flow. I assume your existing RouteOnAttribute processor is checking for when an error happens with your script? If so, add the FetchSFTP processor between this processor and your PutEmail processor. Configure the FetchSFTP processor (with "Completion Strategy" set to DELETE) to fetch the specific log file created. This dataflow assumes the log's filename is always the same.
2. This second flow could be built as ListSFTP (configured with a filename filter) --> FetchSFTP --> any processors you want to use to manipulate the log --> PutEmail. The ListSFTP processor would be configured to execute on the "primary" node and be configured with a "File Filter Regex" (see the example below). When your 5 minute flow runs and encounters an exception resulting in the creation of the log file, this ListSFTP processor will see that file and list it (0 byte FlowFile). That FlowFile will have all the FlowFile attributes needed for the FetchSFTP processor (configured with "Completion Strategy" of DELETE) to fetch the log, which is added to the content of the existing FlowFile. If you do not need to extract from or modify that content, your next processor could just be the PutEmail processor.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
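For instance, if the script writes time-stamped error logs named like script_error_2022-08-15.log (a purely hypothetical filename pattern), the ListSFTP "File Filter Regex" might be:

```
script_error_.*\.log
```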
08-10-2022 04:43 PM

Thank you! I looked, but not in the right place. I skimmed over the bucket API but didn't look in detail, as I assumed (incorrectly) that it was not relevant to updating a flow.
08-10-2022 11:36 AM

@SandyClouds The ExecuteSQL processors do not support SSH tunnels. The expectation of these processors is that the SQL server is listening on a port reachable on the network. SSH tunnels are typically used to access a remote server and then execute a command locally on that server, utilizing the SQL client installed on that destination server.

The ExecuteSQL processor uses a DBCPConnectionPool to facilitate the connection to the database. The DBCPConnectionPool establishes a pool of connections used by one to many processors sharing this service to execute their code. A Validation Query is very important to make sure a connection from this pool is still good before it is passed to the requesting processor for use.

While I have not done this myself, I suppose you could set up an SSH tunnel on each NiFi cluster server (example: https://linuxize.com/post/mysql-ssh-tunnel/). Then you could still use the DBCPConnectionPool, except use the established tunnel address and port in the database connection URL (a sketch follows below). The downside is that NiFi has no control over that tunnel, so if the tunnel is closed, your dataflow will stop working until the tunnel is re-established. The Validation Query will verify the connection is still good; if it is not, the DBCPConnectionPool will drop it and try to establish a new connection.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
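A minimal sketch of that workaround, assuming a MySQL server; the hostnames, ports, database name, and username here are placeholders, not values from the original post:

```
# On each NiFi node: forward local port 3307 to MySQL (3306) on the DB host.
# -N = do not run a remote command, -f = put the tunnel in the background.
ssh -f -N -L 3307:localhost:3306 tunneluser@db.example.net

# The DBCPConnectionPool would then point at the local tunnel endpoint:
#   Database Connection URL: jdbc:mysql://localhost:3307/mydb
#   Validation Query:        SELECT 1
```

If the tunnel drops, the Validation Query (SELECT 1 is valid for MySQL) is what lets the pool detect the dead connections rather than handing them to processors.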
08-10-2022 05:56 AM

1 Kudo

@Nifi- You'll need to provide more detail around your use case in order to get more specific assistance. NiFi offers a number of processor components that can be used to ingest from a database:
- ExecuteSQL
- ExecuteSQLRecord
- CaptureChangeMySQL <-- probably what you are looking for

These ExecuteSQL processors utilize a DBCPConnectionPool controller service for connecting to your specific database of choice. SQL is what needs to be passed to these processors in order to fetch database table entries. The following processors are often used to generate that SQL in the different ways needed by your use case to do this in an incremental fashion (for example: generating new SQL for new entries only, so you are not fetching the entire table over and over; see the sketch below):
- GenerateTableFetch
- QueryDatabaseTable
- ListDatabaseTables

The CaptureChangeMySQL processor will output FlowFiles for each individual event. You can then construct a dataflow to write these events to your choice of location, which might be some other database. Once you have your dataflow created for ingesting entries from your table into NiFi, you'll need to use other processors within your dataflow for any routing or manipulation of that ingested data you may want to do before sending it to a processor that writes to the desired destination, possibly the PutDatabaseRecord processor, for example.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
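To illustrate the incremental pattern those processors automate, the generated SQL looks roughly like the following; the table name, column name, and the remembered value are hypothetical:

```sql
-- First run: fetch everything, and the processor remembers the highest
-- value it saw in the configured max-value column (here, "id")
SELECT * FROM orders;

-- Subsequent runs: fetch only rows added since the remembered maximum
SELECT * FROM orders WHERE id > 1042;
```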
08-08-2022 05:43 AM

1 Kudo

@nk20 I am confused by your concern about in-memory state. Can you provide more detail around what you are being told or what you have read that has led to this concern? Perhaps those concerns are about something more than component state? If so, perhaps I can address those specific concerns.

Not all NiFi components retain state. Those that do either persist that state to disk in a local state directory or write that state to ZooKeeper. As long as the local disk where the state directory is persisted is not lost and the ZooKeeper ensemble has quorum (minimum three nodes), your state is protected for the NiFi components that write state (see the sketch below for where each is configured). Out of all the components (processors, controller services, reporting tasks, etc.), there are only about 25 that record state.

The only thing that lives in memory only is component status (in, out, read, write, sent, received). These are 5-minute stats that live in memory, and thus any restart of the NiFi service sets these stats back to 0. They have nothing to do with the FlowFiles or the execution of the processor.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
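For reference, the two state locations described above are configured in NiFi's state-management.xml. A trimmed sketch, with the directory path and ZooKeeper connect string as placeholders and several optional properties omitted:

```xml
<stateManagement>
    <!-- Per-node state, persisted to disk under the local state directory -->
    <local-provider>
        <id>local-provider</id>
        <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
        <property name="Directory">./state/local</property>
    </local-provider>
    <!-- Cluster-wide state, written to the ZooKeeper ensemble -->
    <cluster-provider>
        <id>zk-provider</id>
        <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
        <property name="Connect String">zk1:2181,zk2:2181,zk3:2181</property>
        <property name="Root Node">/nifi</property>
    </cluster-provider>
</stateManagement>
```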